LEGO-Eval: Towards Fine-Grained Evaluation on Synthesizing 3D Embodied Environments with Tool Augmentation Paper • 2511.03001 • Published Nov 4, 2025 • 46
Revisiting the Uniform Information Density Hypothesis in LLM Reasoning Traces Paper • 2510.06953 • Published Oct 8, 2025 • 8
One Missing Piece for Open-Source Reasoning Models: A Dataset to Mitigate Cold-Starting Short CoT LLMs in RL Paper • 2506.02338 • Published Jun 3, 2025 • 5
Interleaved Reasoning for Large Language Models via Reinforcement Learning Paper • 2505.19640 • Published May 26, 2025 • 14
Embodied Agents Meet Personalization: Exploring Memory Utilization for Personalized Assistance Paper • 2505.16348 • Published May 22, 2025 • 52
RLVR-World: Training World Models with Reinforcement Learning Paper • 2505.13934 • Published May 20, 2025 • 16
Evaluating Language Models as Synthetic Data Generators Paper • 2412.03679 • Published Dec 4, 2024 • 48
Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation Paper • 2410.13232 • Published Oct 17, 2024 • 44
Coffee-Gym: An Environment for Evaluating and Improving Natural Language Feedback on Erroneous Code Paper • 2409.19715 • Published Sep 29, 2024 • 11
VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models Paper • 2402.18374 • Published Feb 28, 2024 • 2
Coffee: Boost Your Code LLMs by Fixing Bugs with Feedback Paper • 2311.07215 • Published Nov 13, 2023 • 3
Dialogue Chain-of-Thought Distillation for Commonsense-aware Conversational Agents Paper • 2310.09343 • Published Oct 13, 2023 • 2