PixelAsParam: A Gradient View on Diffusion Sampling with Guidance

Anh Dung Dinh, Daochang Liu, Chang Xu

Research output: Contribution to journal > Conference article > peer-review

5 Citations (Scopus)

Abstract

Diffusion models have recently achieved state-of-the-art results in image generation. They mainly rely on the denoising framework, which uses a Langevin dynamics process for image sampling. Guidance methods have since modified this process to add conditional information and obtain a controllable generator. However, current guidance on denoising processes suffers from a trade-off between diversity, image quality, and conditional information. In this work, we propose to view guided sampling from a gradient perspective, in which image pixels are treated as parameters being optimized and each mathematical term in the sampling process represents one update direction. This perspective gives more insight into the conflicts between update directions on the pixels, which cause the trade-off mentioned above. We then investigate these conflicts and propose to resolve them with a simple projection method. Experimental results show clear improvements over different baselines on datasets with various resolutions.
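The idea of treating pixels as parameters and resolving conflicting update directions by projection can be illustrated with a small sketch. The abstract does not specify the exact projection, so the code below assumes a PCGrad-style rule: when the guidance direction has a negative inner product with the denoising direction, the opposing component is removed. The names `project_out_conflict`, `combined_update`, and `scale` are hypothetical and are not taken from the paper.

```python
import numpy as np

def project_out_conflict(g, d, eps=1e-12):
    """Remove from g the component that opposes d (assumed PCGrad-style rule)."""
    g_flat = g.ravel()
    d_flat = d.ravel()
    dot = float(np.dot(g_flat, d_flat))
    if dot >= 0.0:
        # No conflict: the two update directions do not oppose each other.
        return g.copy()
    # Conflict: subtract the projection of g onto d.
    coef = dot / (float(np.dot(d_flat, d_flat)) + eps)
    return g - coef * d

def combined_update(denoise_dir, guidance_dir, scale=1.0):
    """Hypothetical pixel update: denoising term plus projected guidance term."""
    return denoise_dir + scale * project_out_conflict(guidance_dir, denoise_dir)

# Toy 2x2 "image": the guidance direction partly opposes the denoising direction.
denoise_dir = np.array([[1.0, 0.0], [0.0, 0.0]])
guidance_dir = np.array([[-1.0, 1.0], [0.0, 0.0]])
projected = project_out_conflict(guidance_dir, denoise_dir)
update = combined_update(denoise_dir, guidance_dir)
```

After projection, the guidance direction is orthogonal to the denoising direction up to numerical tolerance, so adding it no longer undoes the denoising step on the conflicting pixel. Which direction is projected onto which, and whether this happens at every sampling step, are design choices this sketch does not settle.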

Original language: English
Pages (from-to): 8120-8137
Number of pages: 18
Journal: Proceedings of Machine Learning Research
Volume: 202
Publication status: Published - 2023
Externally published: Yes
Event: 40th International Conference on Machine Learning - Honolulu, United States
Duration: 23 Jul 2023 to 29 Jul 2023
