Models: 71,176

Active filters: reinforcement-learning

ytu-ce-cosmos/Turkish-Gemma-4b-T1-Scout

Text Generation • 4B • Updated 8 days ago • 172 • 9

Adilbai/stock-trading-rl-agent

Reinforcement Learning • Updated Jan 8 • 327 • 119

LightningRodLabs/foresight-32B

Text Generation • 33B • Updated about 20 hours ago • 71 • 5

cheryyunl/Make-An-Agent

Reinforcement Learning • Updated Aug 13, 2024 • 6

mradermacher/Qwen3-14B-ARPO-DeepSearch-GGUF

Reinforcement Learning • 15B • Updated Aug 12, 2025 • 339 • 5

JonusNattapong/AI-XAUUSD-Trading

Reinforcement Learning • Updated Oct 10, 2025 • 23

exla-ai/openpie-0.6

Robotics • Updated Feb 4 • 17 • 12

shashuo0104/residual_copilot_models

Reinforcement Learning • Updated 9 days ago • 2

ValueFX9507/Tifa-DeepsexV2-7b-MGRPO-GGUF-Q8

Reinforcement Learning • 8B • Updated Mar 28, 2025 • 1.45k • 195

Open-Reasoner-Zero/Open-Reasoner-Zero-Critic-32B

Reinforcement Learning • 32B • Updated Apr 7, 2025 • 10 • 7

ulab-ai/Time-R1-S1P2

Text Generation • 3B • Updated Jun 2, 2025 • 8 • 2

ValueFX9507/Tifa-DeepsexV3-14b-GGUF-Q6

Reinforcement Learning • 15B • Updated Jul 1, 2025 • 785 • 40

infly/inf-retriever-v1-pro

Reinforcement Learning • 7B • Updated Feb 2 • 293 • 6

PrimeIntellect/INTELLECT-3

Text Generation • Updated Nov 27, 2025 • 725 • 208

zai-org/GLM-TTS

Text-to-Speech • Updated Jan 12 • 246 • 325

mradermacher/inf-retriever-v1-pro-GGUF

Reinforcement Learning • 7B • Updated Dec 18, 2025 • 247 • 1

nikhilchandak/OpenForecaster-8B

Text Generation • 8B • Updated Jan 7 • 120 • 17

eclipse-ai/Eclipse-ERRL-Qwen3-8B

Reinforcement Learning • Updated 5 days ago • 3

PrimeIntellect/INTELLECT-3.1

Text Generation • 107B • Updated 25 days ago • 1.21k • 38

OpenDataArena/ODA-Fin-RL-8B

Reinforcement Learning • 8B • Updated 5 days ago • 26 • 1

XiaoyuWen/MAGIC

Text Generation • Updated Feb 4 • 1

Flexan/DoodDood-TOMAGPT-GGUF

Text Generation • 4B • Updated 18 days ago • 479 • 1

MBZUAI/MediX-R1-2B

Image-Text-to-Text • 2B • Updated 15 days ago • 112 • 3

Rebixa/ppo-LunarLander-v3

Reinforcement Learning • Updated 13 days ago • 21 • 1

IntelligenceLab/MM-Zero-Logs

Reinforcement Learning • Updated 4 days ago • 1

MostLime/lcm-chess

Text Generation • 29.2M • Updated 2 days ago • 1

ValueFX9507/Tifa-Deepsex-14b-CoT-GGUF-Q4

Reinforcement Learning • 15B • Updated Feb 13, 2025 • 2.26k • 824

ValueFX9507/Tifa-Deepsex-14b-CoT

Reinforcement Learning • 15B • Updated Feb 13, 2025 • 7.59k • 220

ValueFX9507/Tifa-Deepsex-14b-CoT-Q8

Reinforcement Learning • 15B • Updated Feb 13, 2025 • 7.84k • 186

ThomasSimonini/ML-Agents-SnowballFight-1vs1

Reinforcement Learning • Updated Nov 30, 2021 • 18 • 10