T-MAP: Red-Teaming LLM Agents with Trajectory-aware Evolutionary Search • Paper • 2603.22341 • Published 5 days ago • 29
Effective Strategies for Asynchronous Software Engineering Agents • Paper • 2603.21489 • Published 4 days ago • 5
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis • Paper • 2603.20278 • Published 9 days ago • 78
Reasoning over mathematical objects: on-policy reward modeling and test time aggregation • Paper • 2603.18886 • Published 7 days ago • 5
ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents • Paper • 2603.18815 • Published 7 days ago • 12
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation • Paper • 2603.19220 • Published 7 days ago • 58
Grounding World Simulation Models in a Real-World Metropolis • Paper • 2603.15583 • Published 10 days ago • 148
MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data • Paper • 2603.09206 • Published 16 days ago • 52
Can Large Language Models Keep Up? Benchmarking Online Adaptation to Continual Knowledge Streams • Paper • 2603.07392 • Published 19 days ago • 18
XSkill: Continual Learning from Experience and Skills in Multimodal Agents • Paper • 2603.12056 • Published 14 days ago • 32
EvoScientist: Towards Multi-Agent Evolving AI Scientists for End-to-End Scientific Discovery • Paper • 2603.08127 • Published 17 days ago • 15
OfficeQA Pro: An Enterprise Benchmark for End-to-End Grounded Reasoning • Paper • 2603.08655 • Published 17 days ago • 3
AutoResearch-RL: Perpetual Self-Evaluating Reinforcement Learning Agents for Autonomous Neural Architecture Discovery • Paper • 2603.07300 • Published 19 days ago • 17
Beyond Language Modeling: An Exploration of Multimodal Pretraining • Paper • 2603.03276 • Published 23 days ago • 100