
SmolLM2 HelpBot 135M - GGUF

A GGUF conversion of Mitchins/smollm2-helpbot-135M for use with Ollama and LM Studio.

Model Details

  • Base Model: HuggingFaceTB/SmolLM2-135M
  • Fine-tuned on: Self-help and conversational tasks
  • License: Apache 2.0
  • Architecture: LLaMA
  • Context Length: 8192 tokens
  • Embedding Dimension: 576
  • Attention Heads: 9 (3 key-value heads)
  • Parameters: 135M

Model Format

Format | Size | Notes
smollm2-helpbot-135m.gguf | 258 MB | F16 (unquantized): best quality, works with LM Studio & Ollama
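As a sanity check, the 258 MB figure follows directly from the parameter count, since an unquantized F16 GGUF stores each parameter as two bytes (plus a small amount of metadata, ignored here):

```python
# Rough file-size estimate for an unquantized F16 GGUF.
n_params = 135_000_000          # parameter count from the model card
bytes_per_param = 2             # F16 = 16 bits = 2 bytes
size_mib = n_params * bytes_per_param / 2**20
print(f"{size_mib:.0f} MiB")    # ~257 MiB, in line with the listed 258 MB
```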

Usage with Ollama

# Download the model and create a Modelfile
cat > Modelfile << EOF
FROM ./smollm2-helpbot-135m.gguf
EOF

# Import the model
ollama create smollm2-helpbot-135m -f Modelfile

# Run the model
ollama run smollm2-helpbot-135m
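The minimal Modelfile above is enough to import the model, but Ollama also lets you bake in a prompt template and stop sequences. A sketch, assuming the Human/Assistant format used in the inference example below (adjust if the fine-tune used a different template):

```
FROM ./smollm2-helpbot-135m.gguf

# Assumed prompt format for this fine-tune.
TEMPLATE """Human: {{ .Prompt }}

Assistant:"""

# Stop generation when the model starts a new "Human:" turn.
PARAMETER stop "Human:"
```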

Usage with LM Studio

  1. Download smollm2-helpbot-135m.gguf
  2. Open LM Studio
  3. Click "Load Model" and select the downloaded file
  4. Start chatting!

At 258 MB the model loads in seconds, and because the weights are stored at F16 there is no quantization loss.

Inference Example

from llama_cpp import Llama

# Load the GGUF model with the full 8192-token context window
llm = Llama(model_path="smollm2-helpbot-135m.gguf", n_ctx=8192)

# The fine-tune expects a plain Human/Assistant prompt format
prompt = "Human: How can I improve my confidence?\n\nAssistant:"
output = llm(prompt, max_tokens=512)
print(output["choices"][0]["text"])
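Since the prompt format is a plain Human/Assistant exchange, a small helper keeps prompts consistent and a stop sequence ends generation cleanly. The helper name is illustrative, not part of the model's API:

```python
def build_prompt(question: str) -> str:
    """Format a user question in the Human/Assistant style shown above."""
    return f"Human: {question}\n\nAssistant:"

prompt = build_prompt("How can I improve my confidence?")
print(prompt)

# Passing stop=["Human:"] keeps the model from generating the next
# user turn itself:
# output = llm(prompt, max_tokens=512, stop=["Human:"])
```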

Installation (llama-cpp-python)

pip install llama-cpp-python

Model Performance

F16 precision provides:

  • ✅ Full coherency - no quantization loss
  • ✅ Fast inference - optimized GGUF format
  • ✅ Small file size - 258 MB
  • ✅ Universal compatibility - works with any GGUF loader

Perfect for:

  • Local inference on consumer hardware
  • Running in Ollama on all systems
  • Chat sessions in LM Studio
  • Embedded applications with roughly 256 MB of RAM free for the weights (more at long context)
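Note that the ~256 MB figure covers the weights alone; at the full 8192-token context the F16 KV cache adds a non-trivial amount on top. A rough estimate, assuming the 30-layer SmolLM2-135M architecture (the layer count comes from the base model, not this card):

```python
# KV-cache memory at full context, stored as F16 (2 bytes per value).
n_layers   = 30         # assumed from SmolLM2-135M; not listed on this card
n_kv_heads = 3          # key-value heads, from the model card
head_dim   = 576 // 9   # embedding dim / attention heads = 64
ctx        = 8192       # context length

kv_bytes = 2 * n_layers * n_kv_heads * head_dim * ctx * 2  # K and V, 2 bytes each
print(kv_bytes / 2**20, "MiB")  # 180.0 MiB on top of ~257 MiB of weights
```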

Original Model Information

This is a GGUF conversion of Mitchins/smollm2-helpbot-135M.

For more details about the original fine-tuning, please visit the original model card.

License

Apache License 2.0 - See LICENSE for details
