Active filters: quantllm
Guilherme34/Firefly-v4-gguf-q4 • 5B • 166 downloads • 1 like
Guilherme34/Firefly-v4-Q8-GGUF • 5B • 201 downloads • 1 like
codewithdark/Llama-3.2-3B-4bit • 3B • 9 downloads
codewithdark/Llama-3.2-3B-GGUF-4bit • 3B • 1 download
codewithdark/Llama-3.2-3B-4bit-mlx • Text Generation • 3B • 66 downloads
QuantLLM/Llama-3.2-3B-4bit-mlx • Text Generation • 3B • 16 downloads
QuantLLM/Llama-3.2-3B-2bit-mlx • Text Generation • 3B • 27 downloads
QuantLLM/Llama-3.2-3B-8bit-mlx • Text Generation • 3B • 30 downloads
QuantLLM/Llama-3.2-3B-5bit-mlx • Text Generation • 3B • 15 downloads
QuantLLM/Llama-3.2-3B-5bit-gguf • 3B • 42 downloads
QuantLLM/Llama-3.2-3B-2bit-gguf • 3B • 34 downloads
QuantLLM/functiongemma-270m-it-8bit-gguf • 0.3B • 12 downloads • 1 like
QuantLLM/functiongemma-270m-it-4bit-gguf • 0.3B • 18 downloads
QuantLLM/functiongemma-270m-it-4bit-mlx • Text Generation • 0.3B • 10 downloads
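For anyone scripting against this list, the entries above can be transcribed into plain records and queried locally — a minimal sketch; the `ModelEntry` class and `MODELS` list are illustrative helpers, not part of any Hugging Face API, and the download counts are a snapshot of the page at the time it was captured.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelEntry:
    repo_id: str      # Hugging Face repo id, e.g. "QuantLLM/Llama-3.2-3B-4bit-mlx"
    params: str       # parameter count as shown on the listing
    downloads: int    # snapshot download count from the listing

# Transcribed from the filtered listing above.
MODELS = [
    ModelEntry("Guilherme34/Firefly-v4-gguf-q4", "5B", 166),
    ModelEntry("Guilherme34/Firefly-v4-Q8-GGUF", "5B", 201),
    ModelEntry("codewithdark/Llama-3.2-3B-4bit", "3B", 9),
    ModelEntry("codewithdark/Llama-3.2-3B-GGUF-4bit", "3B", 1),
    ModelEntry("codewithdark/Llama-3.2-3B-4bit-mlx", "3B", 66),
    ModelEntry("QuantLLM/Llama-3.2-3B-4bit-mlx", "3B", 16),
    ModelEntry("QuantLLM/Llama-3.2-3B-2bit-mlx", "3B", 27),
    ModelEntry("QuantLLM/Llama-3.2-3B-8bit-mlx", "3B", 30),
    ModelEntry("QuantLLM/Llama-3.2-3B-5bit-mlx", "3B", 15),
    ModelEntry("QuantLLM/Llama-3.2-3B-5bit-gguf", "3B", 42),
    ModelEntry("QuantLLM/Llama-3.2-3B-2bit-gguf", "3B", 34),
    ModelEntry("QuantLLM/functiongemma-270m-it-8bit-gguf", "0.3B", 12),
    ModelEntry("QuantLLM/functiongemma-270m-it-4bit-gguf", "0.3B", 18),
    ModelEntry("QuantLLM/functiongemma-270m-it-4bit-mlx", "0.3B", 10),
]

# Example query: the most-downloaded model in the filtered set.
top = max(MODELS, key=lambda m: m.downloads)
print(top.repo_id)  # Guilherme34/Firefly-v4-Q8-GGUF
```

From here, each `repo_id` can be passed as-is to the usual Hub tooling (for example `huggingface_hub.snapshot_download`) to fetch the corresponding files.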