
SimpleLLaMA

An open, educational framework for understanding and reproducing the complete training and alignment pipeline of modern Large Language Models (LLMs).

(The main GitHub repository can be found at https://github.com/IvanC987/SimpleLLaMA)


Overview

SimpleLLaMA is a comprehensive project designed to demystify the lifecycle of LLM development, from raw data to a functioning, aligned model.
It provides a transparent implementation of the three main stages of language model creation:

  1. Pretraining: Unsupervised training of a 1.3B-parameter transformer model on a 50B-token curated corpus.
  2. Supervised Fine-Tuning (SFT): Instruction-tuning on human-written datasets to enable task-following and conversational behavior.
  3. Reinforcement Learning from Human Feedback (RLHF): Alignment via Direct Preference Optimization (DPO) to refine model responses based on human preference data.

In addition, the project includes modules for data preparation, tokenization, evaluation, and deployment, enabling users to experiment with every major step of the modern LLM pipeline.
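
As a concrete illustration of the third stage, the core of DPO is a single preference loss computed from the policy and a frozen reference model. The sketch below is a minimal, illustrative version assuming PyTorch and per-sequence log-probabilities that have already been computed; the function and variable names are placeholders, not the repository's actual API.

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Log-ratios of the policy against the frozen reference model for the
    # preferred (chosen) and dispreferred (rejected) responses.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected responses.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()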


Key Features

  • Full LLM Training Lifecycle: Covers pretraining → SFT → DPO alignment in one unified framework.
  • Scalable Transformer Architecture: Implements a 1.3B-parameter model inspired by LLaMA, trained efficiently on 50B tokens (see the decoder-block sketch after this list).
  • Alignment Techniques: Integrates full fine-tuning (with LoRA support possibly added later) and DPO for behavioral training and preference optimization.
  • Evaluation Framework: Evaluated on common language-understanding benchmarks, including MMLU, HellaSwag, ARC, and PIQA.
  • Deployment Ready: Includes inference utilities for text generation and context management.
  • Documentation Site: Fully documented with architecture breakdowns, training logs, configurations, and detailed walkthroughs of the entire repository.
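
To make the "inspired by LLaMA" point concrete, the sketch below shows a generic LLaMA-style decoder block with pre-normalization via RMSNorm and a SwiGLU feed-forward network (rotary position embeddings are omitted for brevity). It is an illustrative sketch of this architecture family, not the repository's actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        # Root-mean-square normalization: no mean-centering, no bias.
        return self.weight * x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)

class DecoderBlock(nn.Module):
    def __init__(self, dim: int, n_heads: int, hidden_dim: int):
        super().__init__()
        self.attn_norm = RMSNorm(dim)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.mlp_norm = RMSNorm(dim)
        # SwiGLU feed-forward: silu(x @ W1) * (x @ W3), projected back with W2.
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)
        self.w3 = nn.Linear(dim, hidden_dim, bias=False)

    def forward(self, x, attn_mask=None):
        # Pre-norm attention with a residual connection.
        h = self.attn_norm(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=attn_mask, need_weights=False)
        x = x + attn_out
        # Pre-norm SwiGLU MLP with a residual connection.
        h = self.mlp_norm(x)
        return x + self.w2(F.silu(self.w1(h)) * self.w3(h))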

Getting Started

To play with the model:

Clone the repository:

git clone https://github.com/IvanC987/SimpleLLaMA
cd SimpleLLaMA
pip install -r requirements.txt
pip install -e .

(More instructions will be added here once completed.)

If you wish to run custom pretraining, fine-tuning, or reinforcement learning, please refer to the Custom Training section of the SimpleLLaMA documentation.
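
For basic inference, generation from a causal language model typically follows a sampling loop like the sketch below. This is a generic top-p (nucleus) sampling example with hypothetical model and tokenizer objects standing in for the repository's inference utilities; the actual API may differ.

import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, tokenizer, prompt: str, max_new_tokens: int = 128,
             temperature: float = 0.8, top_p: float = 0.9) -> str:
    ids = torch.tensor([tokenizer.encode(prompt)])
    for _ in range(max_new_tokens):
        logits = model(ids)[:, -1, :] / temperature   # next-token logits
        probs = F.softmax(logits, dim=-1)
        sorted_probs, sorted_idx = torch.sort(probs, descending=True)
        cumulative = torch.cumsum(sorted_probs, dim=-1)
        # Keep the smallest set of tokens whose cumulative probability covers top_p.
        sorted_probs[cumulative - sorted_probs > top_p] = 0.0
        sorted_probs = sorted_probs / sorted_probs.sum(dim=-1, keepdim=True)
        next_tok = sorted_idx.gather(-1, torch.multinomial(sorted_probs, 1))
        ids = torch.cat([ids, next_tok], dim=-1)
    return tokenizer.decode(ids[0].tolist())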


Documentation & Technical Report

For an in-depth look into the architecture, experiments, and training methodology, visit the full documentation:

📘 Documentation: https://ivanc987.github.io/SimpleLLaMA/
📄 Technical Report: Technical_Report.md


Benchmarks

Dataset          Metric     Score
MMLU             Accuracy   XX.X%
ARC (Challenge)  Accuracy   XX.X%
ARC (Easy)       Accuracy   XX.X%
HellaSwag        Accuracy   XX.X%
PIQA             Accuracy   XX.X%

(See the Misc/Benchmarking section in the documentation for more details.)
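
These multiple-choice benchmarks are commonly scored by computing the model's log-likelihood for each candidate completion and selecting the highest-scoring choice. The sketch below illustrates that procedure with hypothetical model and tokenizer objects; it is not the repository's actual evaluation code.

import torch
import torch.nn.functional as F

@torch.no_grad()
def score_choice(model, tokenizer, prompt: str, completion: str) -> float:
    prompt_ids = tokenizer.encode(prompt)
    completion_ids = tokenizer.encode(completion)
    ids = torch.tensor([prompt_ids + completion_ids])
    logits = model(ids)                      # (1, seq_len, vocab_size)
    logprobs = F.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    token_logprobs = logprobs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Sum log-probabilities over the completion tokens only.
    return token_logprobs[len(prompt_ids) - 1:].sum().item()

def predict(model, tokenizer, prompt: str, choices: list) -> int:
    # Return the index of the highest-likelihood completion.
    scores = [score_choice(model, tokenizer, prompt, c) for c in choices]
    return max(range(len(choices)), key=lambda i: scores[i])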


License

This project is licensed under the MIT License.
Feel free to use, extend, or adapt it for research or application purposes.


Author

Ivan Cao
Senior CS Student | University of Mississippi
Open to collaboration and research questions.
GitHub: https://github.com/IvanC987/


Acknowledgements

This project was inspired by LLaMA, DeepSeek, and various other open-source large language models.

Papers:

Videos:

Datasets:

Portions of the model architecture are adapted from:

Much of the implementation also borrows design clarity from these excellent open-source efforts.

