ARLArena: A Unified Framework for Stable Agentic Reinforcement Learning • Paper • 2602.21534 • Published about 1 month ago • 23
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models • Paper • 2307.10635 • Published Jul 20, 2023 • 9