MARBLE: Multi-Aspect Reward Balance for Diffusion RL
AK (@_akhaliq) · 49 words (about 1 minute)
78
MARBLE proposes a multi-aspect reward-balancing mechanism for diffusion reinforcement learning. It is reported to improve training stability and performance on complex tasks and to outperform existing methods on multiple benchmarks.
Reason for selection: MARBLE raises the average policy success rate by 23% across 5 complex environment tasks.
Featured · Tweet · #Reinforcement Learning #Diffusion Models #Reward Design #AI Generation · English
