Paper: A UNIVERSITY-LEVEL BENCHMARK FOR EVALUATING MATHEMATICAL SKILLS IN LLMS
Toloka (company, verified)
AI & ML interests
Human-expert data for frontier reasoning, safety and agentic AI
Organization Card
Hey, this is Toloka!
Datasets (14)
toloka/homer-v2
Viewer • Updated • 765 • 8
toloka/HomER
Viewer • Updated • 63 • 110 • 1
toloka/mu-math
Viewer • Updated • 1.08k • 52 • 24
toloka/u-math
Viewer • Updated • 1.1k • 144 • 27
toloka/vist
Viewer • Updated • 39.3k • 1.17k
toloka/VOX-DUB
Viewer • Updated • 7.58k • 320 • 11
toloka/JEEM
Viewer • Updated • 2.2k • 121 • 14
toloka/beemo
Viewer • Updated • 2.19k • 449 • 19
toloka/CLESC
Viewer • Updated • 500 • 31 • 2
toloka/VoxDIY-RusNews
Updated • 56 • 3