Active filters: 8-bit
GadflyII/GLM-4.7-Flash-NVFP4 • Text Generation • 18B • 153k downloads • 38 likes
Text Generation • 120B • 3.01M downloads • 4.38k likes
Text Generation • 22B • 6.66M downloads • 4.24k likes
mlx-community/GLM-4.7-Flash-8bit • Text Generation • 30B • 4.36k downloads • 15 likes
mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-8bit • Text-to-Speech • 0.3B • 770 downloads • 7 likes
microsoft/bitnet-b1.58-2B-4T • Text Generation • 0.8B • 5.95k downloads • 1.26k likes
MultiverseComputingCAI/HyperNova-60B • Text Generation • 60B • 1.42k downloads • 48 likes
mlx-community/GLM-4.7-Flash-8bit-gs32 • Text Generation • 30B • 374 downloads • 5 likes
AlicanKiraz0/Mihenk-LLM-14B-Turkish-Financial-Model-mlx-8Bit • 15B • 26 downloads • 6 likes
NVFP4/Qwen3-Coder-30B-A3B-Instruct-FP4 • Text Generation • 16B • 3.31k downloads • 6 likes
openai/gpt-oss-safeguard-20b • Text Generation • 22B • 11k downloads • 181 likes
Text Generation • 177B • 4.92k downloads • 10 likes
nvidia/DeepSeek-V3.2-NVFP4 • Text Generation • 394B • 947 downloads • 3 likes
LiquidAI/LFM2.5-1.2B-Thinking-MLX-8bit • Text Generation • 0.3B • 168 downloads • 3 likes
lmstudio-community/GLM-4.7-Flash-MLX-8bit • Text Generation • 30B • 244k downloads • 3 likes
MaziyarPanahi/Mistral-7B-Instruct-Aya-101-GGUF • Text Generation • 7B • 219 downloads • 12 likes
ragraph-ai/stable-cypher-instruct-3b • Text Generation • 3B • 355 downloads • 31 likes
MaziyarPanahi/Qwen2.5-1.5B-Instruct-GGUF • Text Generation • 2B • 145k downloads • 9 likes
tiiuae/Falcon-E-3B-Instruct • Text Generation • 0.9B • 293 downloads • 36 likes
drwlf/medgemma-4b-it-abliterated • Text Generation • 15 downloads • 6 likes
nvidia/Qwen3-30B-A3B-NVFP4 • Text Generation • 16B • 32.8k downloads • 21 likes
Text Generation • 5B • 4.97k downloads • 12 likes
nvidia/Qwen2.5-VL-7B-Instruct-NVFP4 • Text Generation • 5B • 3.13k downloads • 12 likes
FabioSarracino/VibeVoice-Large-Q8 • Text-to-Audio • 9B • 2.69k downloads • 78 likes
Firworks/NVIDIA-Nemotron-3-Nano-30B-A3B-nvfp4 • 18B • 2.01k downloads • 7 likes
mlx-community/GLM-4.7-8bit • Text Generation • 353B • 1.21k downloads • 4 likes
Tengyunw/MiniMax-M2.1-NVFP4 • Text Generation • 115B • 183 downloads • 6 likes
mlx-community/translategemma-27b-it-8bit • Text Generation • 27B • 964 downloads • 3 likes
nightmedia/Qwen3-32B-Element5-Heretic-qx86-hi-mlx • Text Generation • 33B • 212 downloads • 2 likes
arcee-ai/Trinity-Nano-Preview-MLX-8bit • Text Generation • 6B • 64 downloads • 2 likes
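
A filtered listing like the one above can also be pulled programmatically. The sketch below uses the huggingface_hub client; the tag string "8-bit" (taken from the active-filter label) and the sort-by-downloads ordering are assumptions, so the result may not match the ordering of the list shown here.

```python
# Minimal sketch: list Hub models tagged "8-bit", assuming that tag matches
# the "Active filters: 8-bit" label above. Requires `pip install huggingface_hub`.
from huggingface_hub import HfApi

api = HfApi()

# Sorted by download count, descending; the page above may use a different ordering.
models = api.list_models(filter="8-bit", sort="downloads", direction=-1, limit=30)

for m in models:
    # `pipeline_tag` is the task (e.g. "text-generation"); `downloads` and `likes`
    # correspond to the two counts shown per row in the listing.
    print(m.id, m.pipeline_tag, m.downloads, m.likes)
```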