#Computer Vision

Deep Joint Unrolling for Deblurring and Low-Light Image Enhancement (JUDE)

WACV 2025

paper
Low-light and blurring issues are prevalent when capturing photos at night, often due to the use of long exposure to address dim environments. Addressing these joint problems can be challenging and error-prone if an end-to-end model is trained without incorporating an appropriate physical model. In this paper, we introduce JUDE, a Deep Joint Unrolling for Deblurring and Low-Light Image Enhancement, inspired by the image physical model. Based on Retinex theory and the blurring model, the low-light blurry input is iteratively deblurred and decomposed, producing sharp low-light reflectance and illuminance through an unrolling mechanism. Additionally, we incorporate various modules to estimate the initial blur kernel, enhance brightness, and eliminate noise in the final image. Comprehensive experiments on LOL-Blur and Real-LOL-Blur demonstrate that our method outperforms existing techniques both quantitatively and qualitatively.
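For readers unfamiliar with the two physical models mentioned above, a plausible combined formation model looks like the following (standard Retinex decomposition plus a uniform blur kernel; our notation, not necessarily the paper's exact formulation):

```latex
% y: observed low-light blurry image, k: blur kernel, \ast: convolution,
% R: reflectance, L: illumination (Retinex: x = R \odot L), n: noise
y \;=\; k \ast \big( R \odot L \big) \;+\; n
```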

Tu Vo, Chan Y. Park

#Robotics #ML Applications

Enhancing OCR-based Indoor Place Recognition with Visitor Map Image by Mitigating Noise from Distracting Words

IROS 2024

paper
We propose an indoor place recognition method using only a visitor map. A visitor map image serves as both the map and the database, so no additional mapping or exploration is needed. OCR (Optical Character Recognition) can be an effective tool for extracting information from both camera images and map images. However, the camera image is cluttered with miscellaneous and distracting words, which disrupts accurate place recognition. The key contribution of our work is the enhancement of OCR-based place recognition performance through the aggregation of multiple likelihoods, which effectively addresses the issue of distracting words.
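A minimal sketch of the likelihood-aggregation idea, assuming hypothetical data structures (per-frame OCR word sets and a per-place word list extracted from the visitor map); this is an illustration, not the paper's implementation:

```python
from collections import defaultdict

def aggregate_place_likelihoods(frame_word_sets, place_words, idf):
    """Accumulate per-frame place likelihoods so that distracting words,
    which rarely match any single map place consistently, average out.

    frame_word_sets: list of sets of words read by OCR in each camera frame
    place_words:     dict mapping place id -> set of words on the visitor map
    idf:             dict mapping word -> weight (rarer words count more)
    """
    scores = defaultdict(float)
    for words in frame_word_sets:
        for place, vocab in place_words.items():
            # likelihood of this frame given the place: weighted word overlap
            overlap = words & vocab
            scores[place] += sum(idf.get(w, 1.0) for w in overlap)
    # the place with the largest aggregated likelihood wins
    return max(scores, key=scores.get) if scores else None
```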

Chaehyeuk Lee, Jinmyoung Lee, Sheir A. Zaheer, Seula Lee, Chan Y. Park

#Reinforcement Learning #ML Applications

Meent: Differentiable Electromagnetic Simulator for Machine Learning

2024.06.11

paper
Meent is a Python-based Electromagnetic (EM) simulation package consisting of three main components: Modeling, EM Simulation, and Optimization. It employs Rigorous Coupled-Wave Analysis (RCWA) and supports Automatic Differentiation (AD), enabling seamless integration of machine learning (ML) and optics research. To demonstrate its versatility as a research platform, we present three applications: (1) generating datasets for training neural operators, (2) serving as an environment for reinforcement learning-based nanophotonic device optimization, and (3) solving inverse problems using gradient-based optimizers.
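A toy sketch of application (3), gradient-based inverse design with automatic differentiation; the simulator below is a differentiable placeholder, not meent's actual API:

```python
import torch

def toy_simulator(thicknesses):
    """Placeholder for a differentiable EM simulation (e.g. RCWA): returns a
    scalar figure of merit as a differentiable function of layer thicknesses.
    Not meent's real interface."""
    target = torch.tensor([100.0, 50.0, 120.0])
    return -torch.sum((thicknesses - target) ** 2)

thicknesses = torch.tensor([80.0, 80.0, 80.0], requires_grad=True)
opt = torch.optim.Adam([thicknesses], lr=1.0)
for _ in range(200):
    opt.zero_grad()
    loss = -toy_simulator(thicknesses)   # maximize the figure of merit
    loss.backward()                      # gradients via automatic differentiation
    opt.step()
```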

Yongha Kim, Anthony W. Jung, Sanmun Kim, Kevin Octavian, Doyoung Heo, Chaejin Park, Jeongmin Shin, Sunghyun Nam, Chanhyung Park, Juho Park, Sangjun Han, Jinmyoung Lee, Seolho Kim, Min Seok Jang, Chan Y. Park

#Reinforcement Learning #ML Applications

Sample-efficient inverse design of freeform nanophotonic devices with physics-informed reinforcement learning

Nanophotonics 2024

paper
Finding an optimal device structure in the vast combinatorial design space of freeform nanophotonic design has been an enormous challenge. In this study, we propose physics-informed reinforcement learning (PIRL) that combines the adjoint-based method with reinforcement learning to improve the sample efficiency by an order of magnitude compared to conventional reinforcement learning and overcome the issue of local minima. To illustrate these advantages of PIRL over other conventional optimization algorithms, we design a family of one-dimensional metasurface beam deflectors using PIRL, exceeding most reported records. We also explore the transfer learning capability of PIRL that further improves sample efficiency and demonstrate how the minimum feature size of the design can be enforced in PIRL through reward engineering. With its high sample efficiency, robustness, and ability to seamlessly incorporate practical device design constraints, our method offers a promising approach to highly combinatorial freeform device optimization in various physical domains.

Chaejin Park, Sanmun Kim, Anthony W. Jung, Juho Park, Dongjin Seo, Yongha Kim, Chanhyung Park, Chan Y. Park and Min Seok Jang

#Computer Vision

In-Season Wall-to-Wall Crop-Type Mapping Using Ensemble of Image Segmentation Models

IEEE Transactions on Geoscience and Remote Sensing (Volume 61)

2023.12.01

This work applies computer vision to multispectral satellite images to create pre-harvest agricultural maps at large scales. We demonstrate the effectiveness of our work not only by outperforming existing approaches in terms of evaluation metrics but also by generating wall-to-wall corn and soybean maps across the entire US Corn Belt, spanning over 2 million square kilometers. We generate highly accurate maps approximately six months before the corresponding year's data is released by the USDA. Such pre-harvest maps can be very useful for forecasting yields and mitigating supply shortages.

Sheir A. Zaheer, Youngryel Ryu, Junghee Lee, Zilong Zhong, Kyungdo Lee

#Computer Vision

RCV2023 Challenges: Benchmarking Model Training and Inference for Resource-Constrained Deep Learning

ICCV 2023 workshop

2023.10.02

This paper delves into the results of two resource-constrained deep learning challenges, part of the workshop on Resource-Efficient Deep Learning for Computer Vision (RCV) at ICCV 2023, focusing on memory and time limitations. The challenges garnered significant global participation and showcased a range of intriguing solutions. The paper outlines the problem statements for both tracks, summarizes baseline and top-performing approaches, and provides a detailed analysis of the methods used. While the presented solutions constitute promising initial progress, they represent the beginning of efforts needed to address this complex issue. We conclude by emphasizing the importance of sustained research efforts to fully address the challenges of resource-constrained deep learning.

Rishabh Tiwari, Arnav Chavan, Deepak Gupta, Gowreesh Mago, Animesh Gupta, Akash Gupta, Suraj Sharan, Yukun Yang, Shanwei Zhao, Shihao Wang, Youngjun Kwak, Seonghun Jeong, Yunseung Lee, Changick Kim, Subin Kim, Ganzorig Gankhuyag, Ho Jung, Junwhan Ryu, HaeMoon Kim, Byeong H. Kim, Tu Vo, Sheir Zaheer, Alexander Holston, Chan Park, Dheemant Dixit, Nahush Lele, Kushagra Bhushan, Debjani Bhowmick, Devanshu Arya, Sadaf Gulshad, Amirhossein Habibian, Amir Ghodrati, Babak Bejnordi, Jai Gupta, Zhuang Liu, Jiahui Yu, Dilip Prasad, Zhiqiang Shen

#Graph #Open Source #ML Applications

LLVM-FLOW

2022.12.15

paper
One way to understand and debug the IR optimization process is to visualize the Control Flow Graphs (CFGs) before and after optimization and compare them. However, comparing the CFGs can be challenging since they can differ significantly. LLVM-FLOW, an open-source interactive CFG visualization web app, eases this difficulty by highlighting the same components in the two graphs. With this tool, users can easily find corresponding nodes in the other graph by clicking a highlighted node. LLVM-FLOW is useful not only for experienced LLVM developers seeking to better understand the IR flow when writing custom passes, but also for newcomers to the LLVM ecosystem who wish to study the behavior of IR patterns.

Jinmyoung Lee

#ML Applications

Free-form optimization of nanophotonic devices: from classical methods to deep learning

Nanophotonics 2022

2022.01.12

Nanophotonic devices have enabled microscopic control of light with an unprecedented spatial resolution by employing subwavelength optical elements that can strongly interact with incident waves. However, to date, most nanophotonic devices have been designed based on fixed-shape optical elements, and a large portion of their design potential has remained unexplored. It is only recently that free-form design schemes have been spotlighted in nanophotonics, offering routes to break away from conventional design constraints and utilize the full design potential. In this review, we systematically overview the nascent yet rapidly growing field of free-form nanophotonic device design.

Juho Park, Sanmun Kim, Daniel Wontae Nam, Haejun Chung, Chan Y. Park and Min Seok Jang

#Reinforcement Learning #ML Applications

Structural optimization of a one-dimensional freeform metagrating deflector via deep reinforcement learning

ACS Photonics 2022

paper
The increasing demand for versatile, high-performance metasurfaces calls for a freeform design method that can handle a huge design space, many orders of magnitude larger than that of conventional fixed-shape optical structures. In this work, we formulate the design of one-dimensional freeform Si metasurface beam deflectors as a reinforcement learning problem so that optimal structures can be found consistently without requiring any prior metasurface data. During training, a deep Q-network-based agent stochastically explores the device design space around the learned trajectory optimized for deflection efficiency. The devices discovered by the agents show overall improvements in maximum efficiency compared to those found by state-of-the-art baseline methods at various wavelengths and deflection angles.
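One plausible way to cast such a design problem as an environment, roughly in the spirit described above; the efficiency function, cell count, and reward definition are placeholders, not the paper's actual setup:

```python
import numpy as np

class ToyMetagratingEnv:
    """Hypothetical sketch: a 1D metagrating as a binary vector of cells
    (Si or air). An action flips one cell; the reward is the change in a
    placeholder 'deflection efficiency'. Not the paper's actual environment."""

    def __init__(self, n_cells=64, seed=0):
        self.n_cells = n_cells
        self.rng = np.random.default_rng(seed)
        self.state = None

    def _efficiency(self, s):
        # stand-in for an EM solver returning a deflection efficiency in [0, 1]
        half = self.n_cells // 2
        return float(np.mean(s[:half]) * np.mean(1 - s[half:]))

    def reset(self):
        self.state = self.rng.integers(0, 2, self.n_cells)
        return self.state.copy()

    def step(self, action):
        before = self._efficiency(self.state)
        self.state[action] ^= 1            # flip the chosen cell
        after = self._efficiency(self.state)
        return self.state.copy(), after - before, False, {}
```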

D. Seo, D. W. Nam, J. Park, C. Y. Park, and M. S. Jang

#ML Applications

Inverse design of organic light-emitting diode structure based on deep neural networks

Nanophotonics 2021

2021.11.04

The optical properties of thin-film light emitting diodes (LEDs) are strongly dependent on their structures due to light interference inside the devices. However, the complexity of the design space grows exponentially with the number of design parameters, making it challenging to optimize the optical properties of multilayer LEDs with rigorous electromagnetic simulations. In this work, we demonstrate an artificial neural network that can predict the light extraction efficiency of an organic LED structure in 30 ms, which is ∼10³ times faster than the rigorous simulation in a single-threaded execution, with root-mean-squared error ...
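A minimal sketch of what such a surrogate could look like, with purely illustrative input dimensions and layer sizes (not the network reported in the paper):

```python
import torch
import torch.nn as nn

# Hypothetical surrogate: maps layer thicknesses of a multilayer OLED stack
# to a predicted light extraction efficiency. Sizes are illustrative only.
surrogate = nn.Sequential(
    nn.Linear(8, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 1), nn.Sigmoid(),   # efficiency constrained to [0, 1]
)

thicknesses = torch.rand(32, 8)        # a batch of candidate structures
predicted_efficiency = surrogate(thicknesses)
```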

Sanmun Kim, Jeong Min Shin, Jaeho Lee, Chanhyung Park, Songju Lee, Juho Park, Dongjin Seo, Sehong Park, Chan Y. Park and Min Seok Jang

#Reinforcement Learning #Publications

GMAC: A Distributional Perspective on Actor-Critic Framework

ICML 2021

paper
Reinforcement learning (RL) has become one of the major areas of machine learning in recent years, driven by breakthroughs such as the approximation of complex non-linear functions with deep neural networks. Among these, a distributional perspective on value function estimation has produced a substantial jump in the performance of RL algorithms. However, proper discussions of distributional RL (DRL) are still limited to specific algorithms or network architectures such as Q-learning or deterministic policy gradients. In this work, we address some of the critical aspects of RL that have been left out of the distributional perspective. The details and findings of this journey can be found in our recent work, 'GMAC: A Distributional Perspective on Actor-Critic Framework'.
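For background, the distributional Bellman relation that distributional RL builds on, in standard notation (a general fact, not GMAC's specific construction):

```latex
Z(s, a) \overset{D}{=} R(s, a) + \gamma \, Z(S', A'),
\qquad S' \sim P(\cdot \mid s, a), \;\; A' \sim \pi(\cdot \mid S')
```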

Daniel Wontae Nam, Younghoon Kim, Chan Y. Park

#Reinforcement Learning #Open Source

MAS tutorials

2021.06.15

Preface: This tutorial looks at multi-agent problems through the lens of Deep Reinforcement Learning (DRL) and provides information for those who want to broaden their approach to multi-agent problems while using the Snake Leaderboard. Tackling multi-agent problems from the reinforcement learning side normally demands a great deal of theoretical background, but this tutorial keeps things as light as possible and takes a more conceptual approach. If the entire tutorial had to be summarized in a single concept, it would be Multi-agent Deep Reinforcement Learning (MDRL) in a free-for-all setting; before getting to the main content, we start by taking MDRL apart one word at a time. First, MDRL can be broken down into three main concepts ...

Team ML2

#Reinforcement Learning #Open Source

Reinforcement learning library (RL2)

2020.10.28

Reinforcement Learning Library for deep RL algorithms. An open-source library built to provide an RL algorithm development framework. RL2 is built on the philosophy of unifying different RL algorithms under a generalized core, reusing code across algorithms, and making algorithm modifications easy. Structure: a simplified layout of the components that make up RL2. Worker: governs the env-agent interaction loop. Env: the environment object. Agent: an algorithm-specific agent that handles information from/to the environment. Model: takes care of everything related to inferring and updating the neural network ...
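A minimal sketch of how these four components could fit together; apart from the component names listed above, the method names are assumptions rather than RL2's actual interface:

```python
class Worker:
    """Governs the env-agent interaction loop (component roles as listed above;
    method names are illustrative, not RL2's actual interface)."""

    def __init__(self, env, agent):
        self.env = env        # Env: environment object
        self.agent = agent    # Agent: algorithm-specific glue around the Model

    def run(self, num_steps):
        obs = self.env.reset()
        for _ in range(num_steps):
            action = self.agent.act(obs)           # inference via the Model
            obs, reward, done, info = self.env.step(action)
            self.agent.collect(obs, reward, done)  # updates handled inside Agent/Model
            if done:
                obs = self.env.reset()
```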

Team ML2

#Reinforcement Learning #Open Source

ML2 multiagent RL mini-game environments

2020.04.28

1. Introduction This post is an introduction to a simple multi-agent reinforcement learning environment, ML2-MARL-ENV, that can be used to train multi-agent RL algorithms. It aims to explain the use cases of the environment to help future RL researchers train multi-agent RL agents. Another purpose of this post is to share the personal experiences I had during development, which will hopefully help others better understand the nature of the project. The API of the environment follows the OpenAI Gym convention. 2. Recent research and motivations 2.1 Backgrounds Deep ...
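Since the environment follows the OpenAI Gym convention, multi-agent usage would look roughly like the sketch below; the environment class here is a self-contained stub standing in for ML2-MARL-ENV, whose actual constructor and observation format are not shown in the excerpt:

```python
import random

class StubMultiAgentEnv:
    """Stand-in for ML2-MARL-ENV, illustrating only the Gym-style loop:
    reset() returns one observation per agent, step() takes one action per agent."""

    def __init__(self, num_agents=4):
        self.num_agents = num_agents
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0] * self.num_agents

    def step(self, actions):
        self.t += 1
        observations = [random.random() for _ in range(self.num_agents)]
        rewards = [float(a) for a in actions]
        done = self.t >= 10
        return observations, rewards, done, {}

env = StubMultiAgentEnv(num_agents=4)
observations = env.reset()
done = False
while not done:
    actions = [random.choice([0, 1]) for _ in observations]  # placeholder policies
    observations, rewards, done, info = env.step(actions)
```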

Taemin HA

#Graph #Open Source

LLVM-Block

2020.04.17

1. About LLVM IR (Intermediate Representation) LLVM is an SSA-based compiler. SSA (Static Single Assignment) reduces the complexity of tracking variables by requiring that each variable be assigned only once. A major strength of LLVM is how easy it is to add new functionality: it is open source, its API is well documented, and its features are separated out into libraries. LLVM's core library is the Optimizer, which takes an intermediate representation (IR) that is independent of both source and target, optimizes it, and produces a new IR. Real CPUs have a limited number of registers, but because the IR must be independent of the target's register count, it is written with an ever-growing number of virtual registers. SSA in LLVM ...

Sooyeon LEE

#Computer Vision

Investigating Pixel Robustness using Input Gradients

2019.08.30

This post covers the main concepts from the paper 'Where to be Adversarial Perturbations Added? Investigating and Manipulating Pixel Robustness using Input Gradients' by Hwang et al. The paper connects the gradients of input features to the robustness of a classification model, and shows that robustness can be manipulated indirectly by changing the gradient flows within the model. An adversarial attack can be defined as the process of generating adversarial examples for a given classifier: samples that are misclassified by the model yet differ only slightly from correctly classified samples drawn from the data distribution. Projected Gradient Descent (PGD) is a popular attack method that iteratively generates adversarial examples as follows:
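The standard PGD update referred to above, with step size α, loss L, model parameters θ, label y, and projection Π onto the allowed perturbation set S around the clean input x:

```latex
x^{(t+1)} = \Pi_{x + S}\!\left( x^{(t)} + \alpha \,\operatorname{sign}\!\big( \nabla_{x} L(\theta, x^{(t)}, y) \big) \right)
```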

Jisung Hwang, Younghoon Kim, Sanghyuk Chun, Jaejun Yoo, Ji-Hoon Kim & Dongyoon Han

#Reinforcement Learning

Distilling Curiosity for Exploration

2019.07.29

This post is an introduction to the paper 'Curiosity Bottleneck: Exploration by Distilling Task-Specific Novelty' by Kim et al. The paper deals with informative exploration when task-irrelevant noise is present in the visual observation. By distilling the informative from the uninformative, the agent can ignore distracting visual entities when choosing an action or computing the intrinsic reward for exploration. Exploration vs. exploitation is a well-known dilemma in reinforcement learning, and a careful trade-off between the two is required for learning algorithms to perform optimally.
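For background, the standard information-bottleneck objective that this style of 'distilling task-specific novelty' builds on (compress the observation X into a code Z while retaining information about the task variable Y; shown in its generic form, not the paper's exact objective):

```latex
\max_{p(z \mid x)} \;\; I(Z; Y) \;-\; \beta \, I(Z; X)
```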

Youngjin Kim, Wontae Nam, Hyunwoo Kim, Ji-Hoon Kim, Gunhee Kim