arxiv:2605.07630
Zhengyang Tang
tangzhy
AI & ML interests
None yet
Recent Activity
authored a paper 1 day ago
Safe, or Simply Incapable? Rethinking Safety Evaluation for Phone-Use Agents
submitted a paper 2 days ago
Safe, or Simply Incapable? Rethinking Safety Evaluation for Phone-Use Agents
authored a paper 10 days ago
Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows