Transition Matching Distillation for Fast Video Generation Paper • 2601.09881 • Published 3 days ago • 14
Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding Paper • 2601.10611 • Published 2 days ago • 16
CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation Paper • 2601.10061 • Published 2 days ago • 25
RL-AWB: Deep Reinforcement Learning for Auto White Balance Correction in Low-Light Night-time Scenes Paper • 2601.05249 • Published 9 days ago • 44
3AM: Segment Anything with Geometric Consistency in Videos Paper • 2601.08831 • Published 4 days ago • 31
Flow Equivariant World Models: Memory for Partially Observed Dynamic Environments Paper • 2601.01075 • Published 14 days ago • 4
Imagine-then-Plan: Agent Learning from Adaptive Lookahead with World Models Paper • 2601.08955 • Published 4 days ago • 10
Efficient Camera-Controlled Video Generation of Static Scenes via Sparse Diffusion and 3D Rendering Paper • 2601.09697 • Published 3 days ago • 6
OpenVoxel: Training-Free Grouping and Captioning Voxels for Open-Vocabulary 3D Scene Understanding Paper • 2601.09575 • Published 3 days ago • 24
Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning Paper • 2601.09708 • Published 3 days ago • 44
MemoBrain: Executive Memory as an Agentic Brain for Reasoning Paper • 2601.08079 • Published 4 days ago • 34
MemGovern: Enhancing Code Agents through Learning from Governed Human Experiences Paper • 2601.06789 • Published 6 days ago • 73
ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands Paper • 2512.24965 • Published 17 days ago • 39
Parallel Context-of-Experts Decoding for Retrieval Augmented Generation Paper • 2601.08670 • Published 4 days ago • 18
Semantically-Aware Rewards for Open-Ended R1 Training in Free-Form Generation Paper • 2506.15068 • Published Jun 18, 2025 • 14
ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs Paper • 2506.15211 • Published Jun 18, 2025 • 39