AI & ML interests
Building breakthrough AI to solve the world's biggest problems.
Papers
Meta-Reinforcement Learning with Self-Reflection for Agentic Search
TOPReward: Token Probabilities as Hidden Zero-Shot Rewards for Robotics
spaces 13
AstaBench Leaderboard 🥇 • pinned • Running • 21 • View benchmark leaderboards
Reward Bench Leaderboard 📐 • pinned • Running • 422 • Explore RewardBench model rankings and scores
HREF Leaderboard 📐 • pinned • Running • 2 • Browse and search HREF leaderboard data
Zebra Logic Bench 🦓 • pinned • Running • 91 • Show leaderboard and explore model puzzle results
SUPER Leaderboard 🤖 • pinned • Sleeping • 3 • Display a static leaderboard from a JSON file
ZeroEval Leaderboard 📊 • pinned • Running • 53 • Embed ZeroEval for evaluation
models 858
allenai/ACE2-ERA5 • Updated • 71 • 15
allenai/Olmo-Hybrid-7B • Text Generation • Updated • 22.4k • 45
allenai/Olmo-Hybrid-Think-SFT-7B • Text Generation • Updated • 812 • 11
allenai/Olmo-Hybrid-Instruct-DPO-7B • Text Generation • 7B • Updated • 3.23k • 15
allenai/Olmo-Hybrid-Instruct-SFT-7B • Text Generation • Updated • 1.63k • 13
allenai/FlexOlmo-7x7B-1T-RT • Text Generation • 33B • Updated • 133 • 7
allenai/FlexOlmo-7x7B-1T • Text Generation • 33B • Updated • 301 • 40
allenai/Flex-public-7B-1T • Text Generation • 7B • Updated • 297 • 5
allenai/Flex-reddit-2x7B-1T • Text Generation • 12B • Updated • 5.65k • 7
allenai/Flex-pes2o-2x7B-1T • Text Generation • 12B • Updated • 198 • 2
datasets 420
allenai/asta-bench-submissions • Updated • 13 • 1
allenai/asta-summary-citation-counts • Viewer • Updated • 51.5M • 478 • 8
allenai/Sera-4.5A-Full-T1 • Viewer • Updated • 48.3k • 104 • 1
allenai/Sera-4.5A-Lite-T1 • Viewer • Updated • 24.5k • 73 • 3
allenai/Sera-4.6-Lite-T1 • Viewer • Updated • 24.6k • 80 • 1
allenai/Sera-4.5A-Full-T2 • Viewer • Updated • 44.5k • 108 • 1
allenai/Sera-4.5A-Lite-T2 • Viewer • Updated • 24.5k • 97 • 3
allenai/Sera-4.6-Lite-T2 • Viewer • Updated • 25.4k • 320 • 9
allenai/Sera-4.6-Lite-47000 • Viewer • Updated • 31.3k • 332 • 1
allenai/Molmo2-VideoPoint • Viewer • Updated • 1.32M • 400 • 5