CoVEBench: Can Video Editing Models Handle Complex Instructions? Paper • 2606.08415 • Published 26 days ago • 51
DR³-Eval: Towards Realistic and Reproducible Deep Research Evaluation Paper • 2604.14683 • Published Apr 16 • 36