Model Card

A lightweight Qwen2.5-0.5B model fine-tuned using Unsloth + LoRA (PEFT) for efficient text-generation tasks. This model is optimized for low-VRAM systems, fast inference, and rapid experimentation.


Model Details

Model Description

This model is a parameter-efficient fine-tuned version of the base model:

  • Base model: unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
  • Fine-tuning method: LoRA (PEFT)
  • Quantization: 4-bit (bnb-4bit)
  • Pipeline: text-generation
  • Libraries: PEFT, Transformers, TRL, Unsloth

It is intended as a compact research model for text generation, instruction following, and as a baseline for custom SFT/RLHF projects.

  • Developer: @Sriramdayal
  • Repository: https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1
  • License: Apache 2.0, inherited from the Qwen2.5-0.5B base model
  • Languages: English (primary), multilingual capability inherited from Qwen2.5
  • Finetuned from: unsloth/qwen2.5-0.5b-unsloth-bnb-4bit
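
For a quick smoke test, recent transformers versions can load the PEFT adapter directly by its repo id; the base model is resolved from adapter_config.json when the peft package is installed. A minimal sketch, assuming the adapter repo ships a standard PEFT config:

from transformers import pipeline

# The transformers/PEFT integration resolves the 4-bit base model
# automatically from the adapter's adapter_config.json.
pipe = pipeline("text-generation", model="black279/Qwen_LeetCoder")
print(pipe("Hello!", max_new_tokens=50)[0]["generated_text"])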

Model Sources

  • Repository: https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1

Uses

Direct Use

  • Instruction-style text generation
  • Chatbot prototyping
  • Educational or research experiments
  • Low-VRAM inference (4–6 GB GPU)
  • Fine-tuning starter model for custom tasks
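
To check the low-VRAM claim on your own hardware, transformers' get_memory_footprint gives a rough weight-memory figure (it excludes activation and KV-cache overhead). A minimal sketch, assuming bitsandbytes is installed for the 4-bit base:

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "unsloth/qwen2.5-0.5b-unsloth-bnb-4bit",
    device_map="auto",
)
# Rough weight-memory estimate only; generation needs extra headroom.
print(f"Approx. weight memory: {model.get_memory_footprint() / 1e9:.2f} GB")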

Downstream Use

  • Domain-specific SFT
  • Dataset distillation
  • RLHF training
  • Task-specific adapters (classifiers, generators, reasoning tasks)
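
For the domain-specific SFT item above, the same 4-bit base can be re-wrapped with fresh LoRA adapters via Unsloth. A minimal sketch; the target-module list mirrors the usual Qwen2.5 projection layers and is an assumption, not a recorded config:

from unsloth import FastLanguageModel

# Load the 4-bit base and attach new LoRA adapters for a custom domain.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/qwen2.5-0.5b-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumed
)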

Out-of-Scope / Avoid

  • High-accuracy medical/legal decisions
  • Safety-critical systems
  • Long-context reasoning competitive with large LLMs
  • Harmful or malicious use cases

Bias, Risks & Limitations

This model inherits all biases from Qwen2.5 training data and may generate:

  • Inaccurate or hallucinated information
  • Social, demographic, or political biases
  • Unsafe or harmful recommendations if misused

Recommendations

Users must implement:

  • Output filtering
  • Safety moderation
  • Human verification for critical tasks
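
As an illustration of the output-filtering point, a deliberately minimal sketch; the blocklist is a hypothetical placeholder, not a substitute for a real moderation model or API:

# Hypothetical keyword filter. Placeholder only; use a proper
# moderation model or API in production.
BLOCKLIST = {"example_unsafe_term"}

def filter_output(text: str) -> str:
    if any(term in text.lower() for term in BLOCKLIST):
        return "[output withheld by safety filter]"
    return text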

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "unsloth/qwen2.5-0.5b-unsloth-bnb-4bit"
adapter = "black279/Qwen_LeetCoder"

# Load the 4-bit base model (requires bitsandbytes) and its tokenizer.
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    device_map="auto",
)

# Attach the LoRA adapter on top of the base weights.
model = PeftModel.from_pretrained(model, adapter)

# Generate a short completion.
inputs = tokenizer("Hello!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
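
For serving without a runtime peft dependency, the adapter can optionally be folded into the base weights with PeftModel's merge_and_unload. Note that merging into a bnb-4bit base may require reloading the base in fp16/fp32 first:

# Optional: fold the LoRA weights into the base model and save a
# plain transformers checkpoint.
merged = model.merge_and_unload()
merged.save_pretrained("qwen2.5-0.5b-merged")
tokenizer.save_pretrained("qwen2.5-0.5b-merged")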

Training Details

Training Data

The model was trained using custom datasets prepared through:

  • Instruction datasets
  • Synthetic Q&A
  • Formatting for chat templates

(Placeholder: replace with the actual dataset names for a more accurate card. The chat-template step is sketched below.)
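
The chat-template formatting can be reproduced with the tokenizer's built-in template. A minimal sketch with a made-up record; it assumes the Qwen2.5 tokenizer ships a chat template:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("unsloth/qwen2.5-0.5b-unsloth-bnb-4bit")

# Hypothetical record; replace with your dataset's actual fields.
example = {"instruction": "Explain LoRA in one sentence.",
           "response": "LoRA trains small low-rank matrices instead of full weights."}

text = tokenizer.apply_chat_template(
    [{"role": "user", "content": example["instruction"]},
     {"role": "assistant", "content": example["response"]}],
    tokenize=False,
)
print(text)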

Training Procedure

  • Framework: Unsloth + TRL + PEFT
  • Training type: Supervised Fine-Tuning (SFT)
  • Precision: bnb-4bit quantization during training
  • LoRA configuration (insert your actual values if different): r=16, alpha=32, dropout=0.05
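
Expressed as a PEFT config, those values correspond roughly to the following; the target modules are the standard Qwen2.5 projection layers and are an assumption:

from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumed
    task_type="CAUSAL_LM",
)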

Hyperparameters

  • Batch size: 2–8 (depending on VRAM)
  • Gradient Accumulation: 8–16
  • LR: 2e-4
  • Epochs: 1–3
  • Optimizer: AdamW / paged optimizers (Unsloth)

Speeds & Compute

  • Hardware: 1× RTX 4090 / A100 / local GPU
  • Training Time: 1–3 hours (approx)
  • Checkpoint Size: Tiny (LoRA weights only)
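
The tiny checkpoint size follows from saving only the adapter: calling save_pretrained on the PeftModel writes just adapter_config.json and the LoRA weights (tens of MB at most for r=16 on a 0.5B model), not the full base:

# Writes only the LoRA adapter files, not the 0.5B base weights.
model.save_pretrained("lora-adapter-checkpoint")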

Evaluation

(You can update this later after running eval benchmarks.)

  • Model evaluated on small reasoning + text-generation samples
  • Performs well for short instructions
  • Limited long-context and deep reasoning
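
Until proper benchmarks are run, a quick qualitative spot-check can reuse the model and tokenizer from How to Use; a minimal sketch with made-up prompts:

# Generate on a handful of short instructions and eyeball the output.
prompts = ["Summarize what LoRA does.", "List three uses of a small LLM."]
for p in prompts:
    inputs = tokenizer(p, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
    print("---")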

Environmental Impact

  • Hardware: 1 GPU (consumer or cloud)
  • Carbon estimate: Low (small model + LoRA)

Technical Specs

  • Architecture: Qwen2.5 0.5B
  • Objective: Causal LM
  • Adapters: LoRA (PEFT)
  • Quantization: bnb 4-bit

Citation

@misc{Sriramdayal2025QwenLoRA,
  title={Qwen2.5-0.5B Unsloth LoRA Fine-Tune},
  author={Sriram Dayal},
  year={2025},
  howpublished={\url{https://github.com/Sriramdayal/Unsloth-LLM-finetuningv1}},
}

Model Card Author

@Sriramdayal


Framework versions

  • PEFT 0.18.0