Models: 22,694

Active filters: grpo

mudasir13cs/qwen25-vl-3b-floorplan-grpo

Image-to-Text • Updated 26 days ago • 444 • 8

Lazarus-Ai/ReAligned-Qwen3.5-35B-A3B

Text Generation • 35B • Updated 2 days ago • 69 • 5

alireza7/GrepSeek-Qwen3.5-9B-GRPO

Text Generation • 9B • Updated 8 days ago • 474 • 3

IndexTeam/MEDEA

Visual Question Answering • Updated 1 day ago • 3

Lazarus-Ai/ReAligned-Qwen3.5-27B-GGUF

Text Generation • 27B • Updated 2 days ago • 2.41k • 4

olaverse/MIST-Mini-8B-Thinking

Text Generation • 8B • Updated 6 days ago • 201 • 2

Blancy/Qwen3-0.6B-Open-R1-GRPO

Text Generation • 2B • Updated Jul 17, 2025 • 2 • 1

enzii/Qwen3-4B-Instruct-TLDR-GRPO

Text Generation • 196k • Updated Aug 10, 2025 • 3 • 2

tahamajs/Qwen3-4b-gsm8k-Qlora-GRPO

Text Generation • Updated Aug 17, 2025 • 4 • 2

aquiffoo/neo-3-1B-A90M-Instruct

Text Generation • Updated Apr 17 • 4

mendicant04/DermoGPT-RL

Image-Text-to-Text • 9B • Updated 22 days ago • 644 • 7

shannon-ai/shannon-1.6-pro

Text Generation • Updated Feb 18 • 2

wei25/qwen3-0.6b-medmcqa-grpo

Updated Feb 23 • 1

Lazarus-Ai/ReAligned-Qwen3.5-0.8B-FP8

Text Generation • 0.9B • Updated 2 days ago • 40 • 2

Ayansk11/FinSenti-Qwen3-1.7B

Text Generation • 2B • Updated about 1 month ago • 226 • 1

Ayansk11/FinSenti-Qwen3-4B

Text Generation • 4B • Updated about 1 month ago • 226 • 1

Kabs-123/clustermind-lora

Reinforcement Learning • Updated Apr 26 • 2 • 1

anicka/geometric-euphorics

Text Generation • Updated 28 days ago • 1

lastmass/Qwen3.5-Medical-GSPO

Image-Text-to-Text • 5B • Updated 3 days ago • 5.02k • 8

Lazarus-Ai/ReAligned-Qwen3.5-2B-GGUF

Text Generation • 2B • Updated 2 days ago • 1.78k • 2

Lazarus-Ai/ReAligned-Qwen3.5-35B-A3B-GGUF

Text Generation • 35B • Updated 2 days ago • 2.29k • 2

gabriel-xiong/qwen3-8b-grpo-v2-epoch2

8B • Updated 4 days ago • 7 • 1

Chun121/Qwen3-4B-RPG-Roleplay-V2

Text Generation • 4B • Updated Aug 24, 2025 • 16.3k • 57

onuryozcu/llama

Text Generation • 0.1B • Updated Mar 10, 2025 • 70

amiguel/promptTuning

8B • Updated Feb 16, 2025 • 2

sergiopaniego/Qwen2-0.5B-GRPO-test

Updated Oct 3, 2025

Novaciano/ESP-NSFW-GRPO-1B-Sin_Censura-GGUF

1B • Updated Jan 28, 2025 • 182 • 5

nbd22/Llama-3.1-8B-Instruct-GRPO-gsm8k-ft-lora

Updated Jan 28, 2025

sergiopaniego/Qwen2-0.5B-GRPO

Updated Jan 31, 2025

philschmid/qwen-2.5-3b-r1-countdown

Text Generation • 3B • Updated Jan 30, 2025 • 24 • 8