CLE Diffusion: Controllable Light Enhancement Diffusion Model (MM 2023)

Yuyang Yin¹,², Dejia Xu³, Chuangchuang Tan¹,², Ping Liu⁴, Yao Zhao¹,², Yunchao Wei¹,²

¹Institute of Information Science, Beijing Jiaotong University

²Beijing Key Laboratory of Advanced Information Science and Network Technology

³VITA Group, University of Texas at Austin

⁴Center for Frontier AI Research, IHPC, A*STAR

Abstract

Low-light enhancement has gained increasing importance with the rapid development of visual creation and editing. However, most existing enhancement algorithms are designed to homogeneously increase the brightness of images to a pre-defined extent, limiting the user experience. To address this issue, we propose the Controllable Light Enhancement Diffusion Model, dubbed CLE Diffusion, a novel diffusion framework that provides users with rich controllability. Built on a conditional diffusion model, we introduce an illumination embedding that lets users control their desired brightness level. Additionally, we incorporate the Segment-Anything Model (SAM) to enable user-friendly region controllability, where users can click on objects to specify the regions they wish to enhance. Extensive experiments demonstrate that CLE Diffusion achieves competitive performance regarding quantitative metrics, qualitative results, and versatile controllability.

Method


During training, we randomly sample a pair consisting of a low-light image $x$ and its normal-light counterpart $y$. We then construct the noised image $y_t$, the color map $C(x)$, and the SNR map $S(x)$ as additional inputs to the diffusion model. We extract the brightness level $\lambda$ of the normal-light image by calculating its average pixel value; $\lambda$ is then injected into the Brightness Control Modules to enable seamless and consistent brightness control. Alongside $L_\text{simple}$, we introduce auxiliary losses on the denoised estimate $\hat{y}_0$ to provide better supervision for the model.
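
For concreteness, here is a minimal PyTorch sketch of one such training step. The denoiser interface, the brightness MLP, and the exact forms of $C(x)$, $S(x)$, and the auxiliary loss are illustrative assumptions, not the authors' released implementation.

```python
# Minimal PyTorch sketch of one CLE Diffusion training step (assumptions noted above).
import torch
import torch.nn.functional as F

def color_map(x):
    # Per-pixel chromaticity: divide each pixel by its channel sum so the
    # map retains color information while discarding brightness.
    return x / (x.sum(dim=1, keepdim=True) + 1e-4)

def snr_map(x):
    # Rough signal-to-noise estimate: blurred luminance over the absolute
    # residual, a common approximation in SNR-aware enhancement.
    gray = x.mean(dim=1, keepdim=True)
    blurred = F.avg_pool2d(gray, 3, stride=1, padding=1)
    return blurred / (torch.abs(gray - blurred) + 1e-4)

def training_step(denoiser, brightness_mlp, x_low, y, alphas_cumprod):
    b = y.shape[0]
    t = torch.randint(0, len(alphas_cumprod), (b,), device=y.device)
    noise = torch.randn_like(y)
    a_t = alphas_cumprod[t].view(b, 1, 1, 1)
    y_t = a_t.sqrt() * y + (1 - a_t).sqrt() * noise      # forward diffusion

    lam = y.mean(dim=(1, 2, 3))                          # brightness level λ
    lam_emb = brightness_mlp(lam.unsqueeze(-1))          # illumination embedding

    # Conditional inputs; λ's embedding is injected into the Brightness
    # Control Modules inside the (hypothetical) denoiser.
    cond = torch.cat([y_t, x_low, color_map(x_low), snr_map(x_low)], dim=1)
    eps_pred = denoiser(cond, t, lam_emb)

    l_simple = F.mse_loss(eps_pred, noise)
    # Auxiliary loss on the denoised estimate ŷ0 (L1 here as a stand-in
    # for the paper's full set of auxiliary terms).
    y0_hat = (y_t - (1 - a_t).sqrt() * eps_pred) / a_t.sqrt()
    return l_simple + F.l1_loss(y0_hat, y)
```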

To achieve regional controllability, we incorporate a binary mask $M$ into our diffusion model by concatenating it with the original inputs. Since existing low-light datasets provide no region-wise supervision, we create synthetic training data by randomly sampling free-form masks with feathered boundaries; the target images are generated by alpha-blending the low-light and normal-light images from existing low-light datasets.
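
A minimal sketch of this data synthesis follows. The OpenCV stroke generator and the feathering radius are illustrative choices, not the paper's exact recipe.

```python
# Sketch: random feathered free-form masks and alpha-blended targets.
import numpy as np
import cv2

def random_feathered_mask(h, w, max_strokes=4, max_width=40, blur=31):
    mask = np.zeros((h, w), np.uint8)
    for _ in range(np.random.randint(1, max_strokes + 1)):
        # A free-form stroke: a random polyline with random thickness.
        pts = np.random.randint(0, min(h, w),
                                size=(np.random.randint(2, 6), 1, 2),
                                dtype=np.int32)
        cv2.polylines(mask, [pts], False, 255,
                      thickness=int(np.random.randint(10, max_width)))
    # Feather the boundary with a Gaussian blur so blends look seamless.
    return cv2.GaussianBlur(mask.astype(np.float32) / 255.0, (blur, blur), 0)

def blended_target(x_low, y_normal, mask):
    # Alpha blending: enhanced inside the mask, untouched low light outside.
    m = mask[..., None]                      # (H, W, 1) for broadcasting
    return m * y_normal + (1.0 - m) * x_low
```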


The sampling process is implemented with a DDIM sampler. We use classifier-free guidance, estimating two noise predictions, one from the conditional model and one from the unconditional model, and combining them with a guidance weight. Armed with SAM, CLE Diffusion achieves light enhancement in user-specified regions at designated brightness levels.
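
A minimal sketch of one guided DDIM step under these assumptions; the guidance weight `w` and the `null_cond` placeholder for the unconditional input are illustrative, not values from the paper.

```python
# Sketch: one classifier-free-guidance DDIM step (deterministic, eta = 0).
import torch

@torch.no_grad()
def ddim_step(denoiser, y_t, t, t_prev, cond, null_cond, alphas_cumprod, w=2.0):
    a_t, a_prev = alphas_cumprod[t], alphas_cumprod[t_prev]
    # Two noise estimates: one conditional, one unconditional.
    eps_c = denoiser(y_t, t, cond)
    eps_u = denoiser(y_t, t, null_cond)
    eps = eps_u + w * (eps_c - eps_u)        # classifier-free guidance
    # Deterministic DDIM update toward the previous timestep.
    y0_hat = (y_t - (1 - a_t).sqrt() * eps) / a_t.sqrt()
    return a_prev.sqrt() * y0_hat + (1 - a_prev).sqrt() * eps
```

At inference, a mask derived from SAM's click or box prompt is simply concatenated into the conditional input, so the same step serves both global and regional brightness control.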

Quantitative Results

[Table: quantitative comparison of CLE Diffusion with prior enhancement methods]

Visual Results


Figure 1: CLE Diffusion enables users to select regions of interest (ROI) with a simple click and adjust the degree of brightness enhancement as desired, while MAXIM [1] is limited to homogeneously enhancing images to a pre-defined brightness level.


Figure 2: More examples of region-controllable light enhancement. Equipped with the Segment-Anything Model (SAM), users can designate regions of interest (ROI) using simple inputs like points or boxes. Our model facilitates controllable light enhancement within these regions, producing results that blend naturally and seamlessly with the surrounding environment.


Figure 3: Visual results of global brightness control on the LOL dataset. By adjusting the brightness levels during inference, we can sample images with varying degrees of brightness while maintaining high image quality.


Figure 4: Global controllable light enhancement on the MIT-Adobe FiveK dataset. Our method enables users to select various brightness levels, even ones significantly brighter than the ground truth.


Figure 5: Results on the LOL [2] test dataset. Our result exhibits fewer artifacts and is more consistent with the ground truth image.


Figure 6: Results on the MIT-Adobe FiveK [3] test dataset. Our result exhibits less color distortion and contains richer details, which are more consistent with the ground truth.


Figure 7: Comparisons on a real-world image from the VE-LOL dataset [4]. Other methods often rely on the well-lit brightness learned from pre-existing datasets, limiting their applicability to diverse scenarios, and most struggle to sufficiently brighten nighttime scenes. Our method incorporates a brightness control module, allowing us to sample images with higher brightness that appear more natural in such situations.


Figure 8: Performance on normal-light image inputs. We use the normal-light images from the LOL dataset as inputs to evaluate each model's ability to handle already well-lit images. HWMNet and MAXIM overexpose certain regions, LLFlow produces blurred images, and other methods introduce color distortion. Our method achieves visually pleasing results in terms of color and brightness.


Figure 9: Global brightness control compared with ReCoRo [5]. While ReCoRo is constrained to enhancing images with brightness levels that fall between low-light and “well-lit” images, our model can handle a wider range of brightness levels. It can be adjusted to sample any desired brightness, providing greater flexibility and control over different lighting conditions.


Figure 10: Performance with the Segment-Anything Model. SAM generates coarse masks for images captured in low-light conditions; after enhancement with our model, the images can be segmented effectively even in dark environments. This demonstrates our model's ability to restore details that benefit high-level machine vision models.

Bibtex


@article{yin2023cle,
  title={CLE Diffusion: Controllable Light Enhancement Diffusion Model},
  author={Yin, Yuyang and Xu, Dejia and Tan, Chuangchuang and Liu, Ping and Zhao, Yao and Wei, Yunchao},
  journal={arXiv preprint arXiv:2308.06725},
  year={2023}
}


References


[1] Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, and Yinxiao Li. 2022. MAXIM: Multi-Axis MLP for Image Processing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 5769–5780.
[2] Chen Wei, Wenjing Wang, Wenhan Yang, and Jiaying Liu. 2018. Deep Retinex Decomposition for Low-Light Enhancement. In British Machine Vision Conference. British Machine Vision Association.
[3] Vladimir Bychkovsky, Sylvain Paris, Eric Chan, and Frédo Durand. 2011. Learning Photographic Global Tonal Adjustment with a Database of Input/Output Image Pairs. In The Twenty-Fourth IEEE Conference on Computer Vision and Pattern Recognition.
[4] Jiaying Liu, Dejia Xu, Wenhan Yang, Minhao Fan, and Haofeng Huang. 2021. Benchmarking Low-Light Image Enhancement and Beyond. International Journal of Computer Vision 129 (2021), 1153–1184.
[5] Dejia Xu, Hayk Poghosyan, Shant Navasardyan, Yifan Jiang, Humphrey Shi, and Zhangyang Wang. 2022. ReCoRo: Region-Controllable Robust Light Enhancement with User-Specified Imprecise Masks. In Proceedings of the 30th ACM International Conference on Multimedia. 1376–1386.