Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts? Paper • 2503.18018 • Published Mar 23, 2025 • 7
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published Aug 7, 2025 • 190
Reasoning Models Struggle to Control their Chains of Thought Paper • 2603.05706 • Published 29 days ago • 36
Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks • Running • 91 • Evaluate multilingual models using FineTasks