Apuri-V3-0.6B-SFT-QAT

A Finnish instruction-following model based on Qwen3-0.6B.

Model Details

  • Base: Qwen/Qwen3-0.6B
  • Training: continued pretraining (CPT) on Finnish → supervised fine-tuning (SFT) with quantization-aware training (QAT)
  • Languages: Finnish, English
  • Quantization: Q4_K_M with importance matrix (imatrix); 379 MB

Usage with Ollama

Create a Modelfile with the chat template:

FROM ./apuri-v3-0.6b-sft-qat-Q4_K_M.gguf

TEMPLATE """{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

PARAMETER temperature 0.6
PARAMETER stop <|im_end|>
PARAMETER num_ctx 2048
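The TEMPLATE above follows the ChatML layout. For clients that don't render the template for you, the same single-turn prompt can be built in plain Python (a minimal sketch; `build_chatml_prompt` is an illustrative helper, not part of any library):

```python
def build_chatml_prompt(user_message: str, system: str = "") -> str:
    """Render a single-turn ChatML prompt matching the Modelfile TEMPLATE above."""
    parts = []
    if system:
        parts.append(f"<|im_start|>system\n{system}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{user_message}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # generation continues from here
    return "".join(parts)

# "Mikä on Suomen pääkaupunki?" = "What is the capital of Finland?"
print(build_chatml_prompt("Mikä on Suomen pääkaupunki?"))
```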

Then:

ollama create apuri-finnish -f Modelfile
ollama run apuri-finnish
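Once created, the model can also be called programmatically through Ollama's local REST API (a hedged sketch using only the standard library; assumes the default endpoint `http://localhost:11434` and the `apuri-finnish` model name from above):

```python
import json
import urllib.request

def chat_payload(prompt: str, model: str = "apuri-finnish") -> dict:
    # Request body for Ollama's /api/chat endpoint (non-streaming).
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "options": {"temperature": 0.6, "num_ctx": 2048},
    }

def ask(prompt: str) -> str:
    req = urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=json.dumps(chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Non-streaming responses carry the answer under message.content
        return json.load(resp)["message"]["content"]

# ask("Mikä on Suomen pääkaupunki?")  # requires a running Ollama server
```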

Usage with llama-cpp-python

from llama_cpp import Llama

llm = Llama(
    model_path="apuri-v3-0.6b-sft-qat-Q4_K_M.gguf",
    n_ctx=2048,
    chat_format="chatml",  # Important: use chatml format
    n_gpu_layers=-1,
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Mikä on Suomen pääkaupunki?"}],  # "What is the capital of Finland?"
    max_tokens=100,
    temperature=0.6,
)
print(response["choices"][0]["message"]["content"])
# Output: Suomen pääkaupunki on Helsinki... ("The capital of Finland is Helsinki...")
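For token-by-token output, `create_chat_completion` also accepts `stream=True` and then yields OpenAI-style chunks with the text under `choices[0]["delta"]`. A small helper to join those chunks (a sketch; the chunk shape shown is the streaming format llama-cpp-python emits):

```python
def collect_stream(chunks) -> str:
    """Concatenate the content deltas from a streamed chat completion."""
    pieces = []
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:  # first/last chunks may carry no content
            pieces.append(delta["content"])
    return "".join(pieces)

# Usage with the `llm` object from above:
# stream = llm.create_chat_completion(messages=[...], stream=True)
# print(collect_stream(stream))
```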

Test Results

Test (Finnish prompt)            Result
Mikä on Suomen pääkaupunki?      Helsinki ✅
Mikä on 15 + 27?                 42 ✅
Listaa kolme Suomen kaupunkia    Helsinki, Turku, Tampere ✅
3 omenaa + 5 lisää = ?           8 omenaa ✅
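Checks like the table above can be automated with a small smoke-test loop (a sketch; `ask` stands in for any inference call that maps a prompt to an answer string, such as the llama-cpp-python example above):

```python
# Each case pairs a Finnish prompt with a substring the answer should contain.
SMOKE_TESTS = [
    ("Mikä on Suomen pääkaupunki?", "Helsinki"),  # capital of Finland
    ("Mikä on 15 + 27?", "42"),
    ("3 omenaa + 5 lisää = ?", "8"),              # 3 apples + 5 more
]

def run_smoke_tests(ask):
    """ask: callable mapping a prompt string to the model's answer string."""
    return [(prompt, expected in ask(prompt)) for prompt, expected in SMOKE_TESTS]
```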

Files

  • apuri-v3-0.6b-sft-qat-Q4_K_M.gguf - Q4_K_M quantized (379 MB), recommended
  • apuri-v3-0.6b-sft-qat-bf16.gguf - unquantized BF16 (1.2 GB)

Training

  • CPT: 1M Finnish tokens (fineweb-edu, mc4-fi, news)
  • SFT: 30K instructions (OpenHermes + Finnish OASST2/Capybara)
  • QAT: int4 quantization-aware training
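In quantization-aware training, the forward pass "fake-quantizes" weights onto the target int4 grid while the backward pass treats the rounding as the identity (straight-through estimator), so the model learns weights that survive quantization. A minimal NumPy sketch of symmetric per-tensor int4 fake quantization (illustrative of the general technique only, not the exact scheme used in this model's training):

```python
import numpy as np

def fake_quant_int4(w: np.ndarray) -> np.ndarray:
    """Symmetric int4 fake quantization: quantize to the int4 grid, then dequantize.

    During QAT this runs in the forward pass; gradients flow through it
    unchanged (straight-through estimator).
    """
    qmax = 7  # symmetric int4 levels span [-8, 7]; scale to +/-7 for simplicity
    scale = np.abs(w).max() / qmax
    if scale == 0:
        return w  # all-zero tensor quantizes to itself
    q = np.clip(np.round(w / scale), -8, 7)  # integer codes
    return q * scale                         # dequantized values the model "sees"
```

The gap between `w` and `fake_quant_int4(w)` is exactly the error the deployed Q4 model incurs, which is what training against the fake-quantized weights minimizes.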

Limitations

  • Small (0.6B parameters); best suited to simple tasks
  • May have knowledge gaps in Finnish culture and history
  • Works best with the ChatML template shown above