
SmolLM2 HelpBot 135M - GGUF

A GGUF conversion of Mitchins/smollm2-helpbot-135M for use with Ollama and LM Studio.

Model Details

  • Base Model: HuggingFaceTB/SmolLM2-135M
  • Fine-tuned on: Self-help and conversational tasks
  • License: Apache 2.0
  • Architecture: LLaMA
  • Context Length: 8192 tokens
  • Embedding Dimension: 576
  • Attention Heads: 9 (3 key-value heads)
  • Parameters: 135M

Model Format

Format | Size | Notes
smollm2-helpbot-135m.gguf | 258 MB | F16 (unquantized): best quality, works with LM Studio & Ollama
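As a sanity check, the 258 MB figure follows directly from the parameter count, since an unquantized F16 GGUF stores each parameter as two bytes (plus a small amount of metadata, ignored here):

```python
# Rough file-size estimate for an unquantized F16 GGUF.
n_params = 135_000_000          # parameter count from the model card
bytes_per_param = 2             # F16 = 16 bits = 2 bytes
size_mib = n_params * bytes_per_param / 2**20
print(f"{size_mib:.0f} MiB")    # ~257 MiB, in line with the listed 258 MB
```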

Usage with Ollama

# Download the model and create a Modelfile
cat > Modelfile << EOF
FROM ./smollm2-helpbot-135m.gguf
EOF

# Import the model
ollama create smollm2-helpbot-135m -f Modelfile

# Run the model
ollama run smollm2-helpbot-135m
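The minimal Modelfile above is enough to import the model, but Ollama also lets you bake in a prompt template and stop sequences. A sketch, assuming the Human/Assistant format used in the inference example below (adjust if the fine-tune used a different template):

```
FROM ./smollm2-helpbot-135m.gguf

# Assumed prompt format for this fine-tune.
TEMPLATE """Human: {{ .Prompt }}

Assistant:"""

# Stop generation when the model starts a new "Human:" turn.
PARAMETER stop "Human:"
```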

Usage with LM Studio

  1. Download smollm2-helpbot-135m.gguf
  2. Open LM Studio
  3. Click "Load Model" and select the downloaded file
  4. Start chatting!

At 258 MB the model loads in seconds, and because the weights are stored at F16 there is no quantization loss.

Inference Example

from llama_cpp import Llama

# Load the GGUF model with the full 8192-token context window
llm = Llama(model_path="smollm2-helpbot-135m.gguf", n_ctx=8192)

# The fine-tune expects a plain Human/Assistant prompt format
prompt = "Human: How can I improve my confidence?\n\nAssistant:"
output = llm(prompt, max_tokens=512)
print(output["choices"][0]["text"])
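Since the prompt format is a plain Human/Assistant exchange, a small helper keeps prompts consistent and a stop sequence ends generation cleanly. The helper name is illustrative, not part of the model's API:

```python
def build_prompt(question: str) -> str:
    """Format a user question in the Human/Assistant style shown above."""
    return f"Human: {question}\n\nAssistant:"

prompt = build_prompt("How can I improve my confidence?")
print(prompt)

# Passing stop=["Human:"] keeps the model from generating the next
# user turn itself:
# output = llm(prompt, max_tokens=512, stop=["Human:"])
```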

Installation (llama-cpp-python)

pip install llama-cpp-python

Model Performance

F16 precision provides:

  • ✅ Full coherency - no quantization loss
  • ✅ Fast inference - optimized GGUF format
  • ✅ Small file size - 258 MB
  • ✅ Universal compatibility - works with any GGUF loader

Perfect for:

  • Local inference on consumer hardware
  • Running in Ollama on all systems
  • Chat sessions in LM Studio
  • Embedded applications with roughly 256 MB of RAM free for the weights (more at long context)
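Note that the ~256 MB figure covers the weights alone; at the full 8192-token context the F16 KV cache adds a non-trivial amount on top. A rough estimate, assuming the 30-layer SmolLM2-135M architecture (the layer count comes from the base model, not this card):

```python
# KV-cache memory at full context, stored as F16 (2 bytes per value).
n_layers   = 30         # assumed from SmolLM2-135M; not listed on this card
n_kv_heads = 3          # key-value heads, from the model card
head_dim   = 576 // 9   # embedding dim / attention heads = 64
ctx        = 8192       # context length

kv_bytes = 2 * n_layers * n_kv_heads * head_dim * ctx * 2  # K and V, 2 bytes each
print(kv_bytes / 2**20, "MiB")  # 180.0 MiB on top of ~257 MiB of weights
```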

Original Model Information

This is a GGUF conversion of Mitchins/smollm2-helpbot-135M.

For more details about the original fine-tuning, please visit the original model card.

License

Apache License 2.0 - See LICENSE for details
