Collections

Collections including paper arxiv:2603.14473

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

Paper • 2604.02268 • Published about 1 month ago • 99
ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning

Paper • 2603.05863 • Published Mar 6 • 6
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

Paper • 2604.02721 • Published 30 days ago • 375
GLM-5: from Vibe Coding to Agentic Engineering

Paper • 2602.15763 • Published Feb 17 • 149

AI Can Learn Scientific Taste

Paper • 2603.14473 • Published Mar 15 • 426

AI Can Learn Scientific Taste

AI Can Learn Scientific Taste

Paper • 2603.14473 • Published Mar 15 • 426
OpenMOSS-Team/SciJudgeBench

Preview • Updated Mar 17 • 193 • 8
OpenMOSS-Team/SciJudge-4B

Text Generation • 4B • Updated Mar 17 • 98 • 6
OpenMOSS-Team/SciJudge-30B

Text Generation • 31B • Updated Mar 17 • 437 • 12

LongCat-Flash-Thinking-2601 Technical Report

Paper • 2601.16725 • Published Jan 23 • 180
QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining

Paper • 2602.07085 • Published Feb 6 • 190
A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published Feb 23 • 521
AI Can Learn Scientific Taste

Paper • 2603.14473 • Published Mar 15 • 426

GUI-G^2: Gaussian Reward Modeling for GUI Grounding

Paper • 2507.15846 • Published Jul 21, 2025 • 135
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published Aug 7, 2025 • 142
Mobile-Agent-v3: Foundamental Agents for GUI Automation

Paper • 2508.15144 • Published Aug 21, 2025 • 65
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published Aug 22, 2025 • 162

Read Later 📚

Interesting papers on AI, LLMs, etc. to add to reading list

Monitored Markov Decision Processes

Paper • 2402.06819 • Published Feb 9, 2024
Generalization in Monitored Markov Decision Processes (Mon-MDPs)

Paper • 2505.08988 • Published May 13, 2025
Bayesian Risk Markov Decision Processes

Paper • 2106.02558 • Published Jun 4, 2021
Sotopia-RL: Reward Design for Social Intelligence

Paper • 2508.03905 • Published Aug 5, 2025 • 23

Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

Paper • 2603.04597 • Published Mar 4 • 210
SII-Enigma/Llama3.2-8B-Ins-AMPO

Text Generation • 8B • Updated Mar 21 • 37
Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published Mar 26, 2025 • 59
Planner-R1: Reward Shaping Enables Efficient Agentic RL with Smaller LLMs

Paper • 2509.25779 • Published Sep 30, 2025 • 19

QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining

Paper • 2602.07085 • Published Feb 6 • 190
Seriki/FastHTML

Updated Mar 18 • 6 • 1
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22, 2025 • 448
AI Can Learn Scientific Taste

Paper • 2603.14473 • Published Mar 15 • 426

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30, 2025 • 550
mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 323
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Paper • 2601.00393 • Published Jan 1 • 133
LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published Jan 6 • 177

AI Paper of the Day

A collection of papers that I think are interesting, one added each day

Can Large Language Models Understand Context?

Paper • 2402.00858 • Published Feb 1, 2024 • 24
OLMo: Accelerating the Science of Language Models

Paper • 2402.00838 • Published Feb 1, 2024 • 85
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 153
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 25
