person profile

Leigang Qu

Leigang Qu is a researcher or builder tracked in the Angestrom contributor network.

7 Connections

1 Papers

0 Models

0 Repos

0 News

Papers · 1

Optimizing Visual Generative Models via Distribution-wise Rewards

Conventional reinforcement learning strategies for visual generation typically employ sample-wise reward functions, yet this practice frequently results in reward hacking that degrades image diversity and introduces visual anomalies. To address these limitations, we present a novel framework that fine-tunes generative models using distribution-wise rewards, ensuring better alignment with real-world data distributions. Unlike rewards that evaluate samples individually, a distribution-wise reward accounts for the data distribution of the samples, mitigating the mode collapse problem that occurs whe