Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding • Paper • 2601.10611 • Published about 19 hours ago • 11
CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation • Paper • 2601.10061 • Published 1 day ago • 25
Openpi Comet: Competition Solution For 2025 BEHAVIOR Challenge • Paper • 2512.10071 • Published Dec 10, 2025 • 17
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance • Paper • 2512.08765 • Published Dec 9, 2025 • 130
Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation • Paper • 2512.10949 • Published Dec 11, 2025 • 45