Submitted by akhaliq 47 ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback · 7 authors 2
Submitted by akhaliq 43 OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments · 17 authors 1
Submitted by akhaliq 41 RecurrentGemma: Moving Past Transformers for Efficient Open Language Models · 62 authors 2
Submitted by akhaliq 30 Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models · 11 authors 3
Submitted by akhaliq 29 Best Practices and Lessons Learned on Synthetic Data for Language Models · 11 authors 1
Submitted by akhaliq 20 WILBUR: Adaptive In-Context Learning for Robust and Accurate Web Agents · 5 authors 2
Submitted by akhaliq 12 Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models · 6 authors 1