Instructions for using prithivMLmods/Llama-3.2-6B-AlgoCode with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Transformers
How to use prithivMLmods/Llama-3.2-6B-AlgoCode with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="prithivMLmods/Llama-3.2-6B-AlgoCode")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/Llama-3.2-6B-AlgoCode")
model = AutoModelForCausalLM.from_pretrained("prithivMLmods/Llama-3.2-6B-AlgoCode")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use prithivMLmods/Llama-3.2-6B-AlgoCode with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "prithivMLmods/Llama-3.2-6B-AlgoCode"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "prithivMLmods/Llama-3.2-6B-AlgoCode",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker

```shell
docker model run hf.co/prithivMLmods/Llama-3.2-6B-AlgoCode
```
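The curl request above can equivalently be issued from Python. A minimal sketch using only the standard library, assuming the vLLM server above is running on localhost:8000 (the prompt and sampling parameters here are illustrative, not values from this page):

```python
import json
import urllib.request

# Request body for the OpenAI-compatible chat completions endpoint.
# max_tokens and temperature are illustrative defaults, not model-card values.
payload = {
    "model": "prithivMLmods/Llama-3.2-6B-AlgoCode",
    "messages": [
        {"role": "user", "content": "Write a Python function that reverses a string."}
    ],
    "max_tokens": 256,
    "temperature": 0.2,
}
body = json.dumps(payload).encode("utf-8")
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
)
# urllib infers POST whenever a request carries a data payload.
print(req.get_method())

# With the server running, send it and read the reply:
#   with urllib.request.urlopen(req) as resp:
#       reply = json.load(resp)
#       print(reply["choices"][0]["message"]["content"])
```

The same request shape works against the SGLang server below; only the port changes.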
- SGLang
How to use prithivMLmods/Llama-3.2-6B-AlgoCode with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "prithivMLmods/Llama-3.2-6B-AlgoCode" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "prithivMLmods/Llama-3.2-6B-AlgoCode",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

Use Docker images

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "prithivMLmods/Llama-3.2-6B-AlgoCode" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "prithivMLmods/Llama-3.2-6B-AlgoCode",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

- Docker Model Runner
How to use prithivMLmods/Llama-3.2-6B-AlgoCode with Docker Model Runner:
```shell
docker model run hf.co/prithivMLmods/Llama-3.2-6B-AlgoCode
```
Llama-3.2-6B-AlgoCode
Llama-3.2-6B-AlgoCode is a code-centric, multilingual large language model (LLM) designed for text generation tasks involving algorithms and coding use cases. It is pretrained and instruction-tuned for diverse generative tasks, and particularly optimized for multilingual dialogue, agentic retrieval, and summarization.
Key Features
- Multilingual Support: The models are optimized for generating text in multiple languages, making them ideal for multilingual coding environments.
- Instruction-Tuned: Specially fine-tuned for instruction-following tasks to improve accuracy in complex generative workflows.
- Text-Only Models: Focused entirely on text input and output, suitable for code generation, algorithmic problem-solving, summarization, and retrieval tasks.
- Agentic Retrieval: Performs well in scenarios requiring retrieval-based responses and summarization of external knowledge.
Intended Use
Llama-3.2-6B-AlgoCode can be integrated using the Hugging Face transformers library for various text generation tasks:
Example Usage
```python
import torch
from transformers import pipeline

# Model ID from Hugging Face
model_id = "prithivMLmods/Llama-3.2-6B-AlgoCode"

# Initialize pipeline for text generation
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Generate text
response = pipe("The key to life is")
print(response[0]["generated_text"])
```
Limitations
1. Bias and Fairness
Despite extensive training and alignment efforts, the model may still reflect biases inherent in the data it was trained on. Users should critically evaluate outputs, particularly in sensitive or high-impact contexts.
2. Contextual Understanding
While generally robust, the model may misinterpret complex or ambiguous prompts, resulting in inaccurate or irrelevant responses.
3. Real-Time Knowledge
The model's knowledge is static, based on the data available during training. It does not include real-time information or updates on recent events and developments.
4. Safety and Harmlessness
Although the model is aligned with safety guidelines, there is a possibility of inappropriate or harmful outputs in certain contexts. It is recommended to employ human oversight and continuous monitoring when deploying the model in sensitive applications.
5. Resource Requirements
Running Llama-3.2-6B-AlgoCode efficiently requires substantial computational resources, especially for real-time or large-scale deployments. Leveraging GPUs with sufficient memory (16GB+) is recommended for optimal performance.
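The 16GB+ guidance can be sanity-checked with a rough weights-only estimate. A minimal sketch, assuming roughly 6e9 parameters (inferred from the model name, not stated on this page); real deployments also need memory for the KV cache and activations on top of the weights:

```python
# Back-of-the-envelope GPU memory estimate for the model weights alone.
# ~6e9 parameters is an assumption based on the "6B" in the model name.
params = 6e9
bytes_per_param = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}

for dtype, nbytes in bytes_per_param.items():
    gib = params * nbytes / 1024**3  # bytes -> GiB
    print(f"{dtype:>8}: ~{gib:.1f} GiB")
```

In bfloat16 (as in the pipeline example above) the weights alone come to roughly 11 GiB, which is why a 16 GB GPU is a sensible floor rather than a comfortable ceiling.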
6. Ethical Considerations
Users must adhere to ethical guidelines when deploying this model. It should not be used for:
- Generating harmful or malicious content
- Spreading misinformation or spam
- Any form of unethical activity
7. Domain-Specific Limitations
While the model excels in general-purpose text generation, it may require further fine-tuning for niche or highly specialized fields such as:
- Medical
- Legal
- Financial