Instructions to use agentlans/Llama3-ja with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use agentlans/Llama3-ja with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="agentlans/Llama3-ja")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("agentlans/Llama3-ja")
    model = AutoModelForCausalLM.from_pretrained("agentlans/Llama3-ja")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    # Generate a reply and decode only the newly generated tokens
    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use agentlans/Llama3-ja with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "agentlans/Llama3-ja"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "agentlans/Llama3-ja",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker
    docker model run hf.co/agentlans/Llama3-ja
- SGLang
How to use agentlans/Llama3-ja with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "agentlans/Llama3-ja" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "agentlans/Llama3-ja",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "agentlans/Llama3-ja" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "agentlans/Llama3-ja",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
- Docker Model Runner
How to use agentlans/Llama3-ja with Docker Model Runner:
    docker model run hf.co/agentlans/Llama3-ja
Llama3-ja
English
Model Details
This model is a linear merge of multiple Llama 3 8B models fine-tuned for Japanese language tasks, created using mergekit. The aim is to create a more robust and versatile Japanese language model that leverages the strengths of each individual model.
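To make "linear merge" concrete, here is a minimal sketch of the idea: each merged parameter is a weighted average of the same-named parameter across the source models. This is a toy illustration on plain Python lists, not mergekit's implementation. The function name `linear_merge`, the equal weights, and the normalisation by the weight sum are assumptions of this sketch; the model card does not state the weights that were actually used.

```python
from typing import Dict, List, Sequence

# A toy "state dict": parameter name -> flat list of values.
StateDict = Dict[str, List[float]]


def linear_merge(state_dicts: Sequence[StateDict], weights: Sequence[float]) -> StateDict:
    """Weighted average of same-named parameters (hypothetical helper)."""
    if not state_dicts:
        raise ValueError("need at least one model")
    if len(state_dicts) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")

    merged: StateDict = {}
    # Use the first model as the reference for parameter names and sizes.
    for name, reference in state_dicts[0].items():
        acc = [0.0] * len(reference)
        for sd, w in zip(state_dicts, weights):
            tensor = sd[name]
            if len(tensor) != len(reference):
                raise ValueError(f"shape mismatch for parameter {name!r}")
            for i, value in enumerate(tensor):
                # Assumption: weights are normalised so they sum to 1.
                acc[i] += (w / total) * value
        merged[name] = acc
    return merged


# Two tiny stand-in "models" with identical parameter names and sizes.
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

merged = linear_merge([model_a, model_b], weights=[1.0, 1.0])
print(merged)
```

Averaging parameter by parameter only makes sense because all component models share the Llama 3 8B architecture, so every parameter has a counterpart of the same shape in each model.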
Intended Use
This model is designed for various Japanese natural language processing tasks, including but not limited to:
- Text generation
- Conversation and chatbot applications
- Text completion
- Question answering
- Summarization
Limitations
While this model combines multiple Japanese-focused Llama 3 models, it may still have limitations:
- Performance on specific tasks may vary
- The model may inherit biases from its constituent models
Included models
The merge combines the following ten models:
- elyza/Llama-3-ELYZA-JP-8B
- rinna/llama-3-youko-8b-instruct
- lightblue/suzume-llama-3-8B-japanese
- neoai-inc/Llama-3-neoAI-8B-Chat-v0.1
- AXCXEPT/Llama-3-EZO-8b-Common-it
- tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1
- alfredplpl/Llama-3-8B-Instruct-Ja
- haqishen/Llama-3-8B-Japanese-Instruct
- owner203/japanese-llama-3-8b-instruct-v2
- shisa-ai/shisa-v1-llama3-8b
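For readers who want to attempt a similar merge, the sketch below generates a mergekit-style YAML configuration for a linear merge of the models listed above. The original configuration is not published in this card, so the equal per-model `weight` of 1.0, the `bfloat16` dtype, and the helper name `build_linear_merge_config` are all assumptions made for illustration.

```python
from typing import List

# Component models as listed in this model card.
COMPONENT_MODELS = [
    "elyza/Llama-3-ELYZA-JP-8B",
    "rinna/llama-3-youko-8b-instruct",
    "lightblue/suzume-llama-3-8B-japanese",
    "neoai-inc/Llama-3-neoAI-8B-Chat-v0.1",
    "AXCXEPT/Llama-3-EZO-8b-Common-it",
    "tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1",
    "alfredplpl/Llama-3-8B-Instruct-Ja",
    "haqishen/Llama-3-8B-Japanese-Instruct",
    "owner203/japanese-llama-3-8b-instruct-v2",
    "shisa-ai/shisa-v1-llama3-8b",
]


def build_linear_merge_config(
    models: List[str],
    weight: float = 1.0,      # assumption: equal weights, not confirmed by the card
    dtype: str = "bfloat16",  # assumption: dtype is not stated in the card
) -> str:
    """Return YAML text for a linear merge (hypothetical helper)."""
    lines = ["merge_method: linear", "models:"]
    for name in models:
        lines.append(f"  - model: {name}")
        lines.append("    parameters:")
        lines.append(f"      weight: {weight}")
    lines.append(f"dtype: {dtype}")
    return "\n".join(lines) + "\n"


config_text = build_linear_merge_config(COMPONENT_MODELS)
print(config_text)
```

Saving the printed text to a file such as `config.yml` and passing it to mergekit's `mergekit-yaml` command together with an output directory should start a merge, though the result will match this model only if the assumed weights and dtype happen to coincide with the ones originally used.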
Acknowledgements
Thank you to the creators and contributors of all the component models for their valuable work in advancing Japanese language AI capabilities.
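Japanese (English translation)
Model Details
This model is a linear merge of multiple Llama 3 8B models fine-tuned for Japanese tasks, created using mergekit. It aims to draw on the strengths of each individual model to produce a more robust and versatile Japanese language model.
Intended Use
This model is designed for Japanese natural language processing tasks such as the following, though it is not limited to them:
- Text generation
- Conversation and chatbot applications
- Text completion
- Question answering
- Summarization
Limitations
Although this model combines multiple Japanese-focused Llama 3 models, it still has limitations:
- Performance on specific tasks may vary
- The model may inherit biases from its constituent models
Included models
Combining these models draws on the strengths of each individual model to produce a more robust and versatile Japanese language model.
- elyza/Llama-3-ELYZA-JP-8B
- rinna/llama-3-youko-8b-instruct
- lightblue/suzume-llama-3-8B-japanese
- neoai-inc/Llama-3-neoAI-8B-Chat-v0.1
- AXCXEPT/Llama-3-EZO-8b-Common-it
- tokyotech-llm/Llama-3-Swallow-8B-Instruct-v0.1
- alfredplpl/Llama-3-8B-Instruct-Ja
- haqishen/Llama-3-8B-Japanese-Instruct
- owner203/japanese-llama-3-8b-instruct-v2
- shisa-ai/shisa-v1-llama3-8b
Acknowledgements
Thank you to the creators and contributors of all the component models. Their valuable work has advanced Japanese-language AI capabilities.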