Instructions for using jinaai/jina-vlm with libraries and local apps.

Libraries

How to use jinaai/jina-vlm with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="jinaai/jina-vlm", trust_remote_code=True)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("jinaai/jina-vlm", trust_remote_code=True, dtype="auto")

Local Apps

vLLM

How to use jinaai/jina-vlm with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "jinaai/jina-vlm"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jinaai/jina-vlm",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
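
For scripting, the curl call above can be mirrored from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are hypothetical and not part of vLLM or Transformers, and it assumes the server started above is listening on `http://localhost:8000`.

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str, image_url: str) -> dict:
    """Build the same JSON body that the curl example sends (hypothetical helper)."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(base_url: str, payload: dict, timeout: float = 120.0) -> str:
    """POST the payload to /v1/chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]


payload = build_chat_request(
    model="jinaai/jina-vlm",
    prompt="Describe this image in one sentence.",
    image_url="https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
)

# Assumption: a server is running locally. Uncomment to send the request.
# print(chat("http://localhost:8000", payload))
```

The same helpers should work against any server exposing the OpenAI-compatible chat endpoint by changing `base_url`, for example `http://localhost:30000` for the SGLang setup.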

Use Docker

docker model run hf.co/jinaai/jina-vlm

SGLang

How to use jinaai/jina-vlm with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "jinaai/jina-vlm" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jinaai/jina-vlm",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
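
The request above points at a remote image URL. To send a local file instead, one common approach with OpenAI-compatible servers is to embed the image as a base64 data URL in the `image_url` field. The sketch below assumes the server accepts data URLs, which this page does not confirm for jinaai/jina-vlm; `to_data_url` and `image_part` are hypothetical helper names.

```python
import base64
import mimetypes


def to_data_url(image_bytes: bytes, filename: str) -> str:
    """Encode raw image bytes as a data URL, guessing the MIME type from the filename."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type from {filename!r}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_part(url: str) -> dict:
    """Wrap a URL (remote or data URL) in the content-part shape used by the curl example."""
    return {"type": "image_url", "image_url": {"url": url}}


# Placeholder bytes stand in for a real file; in practice read them with
# open(path, "rb").read().
example_bytes = b"\x89PNG\r\n\x1a\n"
part = image_part(to_data_url(example_bytes, "example.png"))
```

The resulting `part` can replace the `image_url` entry in the `content` list of the request body.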

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "jinaai/jina-vlm" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use jinaai/jina-vlm with Docker Model Runner:
```
docker model run hf.co/jinaai/jina-vlm
```

vLLM support and ONNX models

by Napron - opened Dec 21, 2025

Discussion

Napron

Dec 21, 2025

Thanks for a great model.

When will vLLM support arrive?
Can we serve the jina-vlm model with jina-serve?
Are you going to share ONNX versions of the vision and language models?

Napron changed discussion title from "vLLM support" to "vLLM support and ONNX models" on Dec 21, 2025

gmastrapas

Jina AI org Dec 22, 2025

Hey @Napron, vLLM support is coming very soon; we are working on it right now. For ONNX, do you mean separate models for vision and language?

Napron

Dec 23, 2025

That's great! Yes, I want to test inference of the vision and language models; it would be good to have quantized ONNX models too.

Thanks in advance.

grozatech

Jan 13

> Hey @Napron, vLLM is coming very soon, we are working on this right now. For ONNX you mean separate models for vision and language?

Hi, thanks for such a great model. Are there any updates on vLLM serving?

Napron

Jan 13

I feel like they are very slow at adding vLLM support.

gmastrapas

Jina AI org Jan 13

Hi @grozatech, it's taking a bit longer than expected, but I am back at it now. I will update this thread when it's ready, thanks ✌️

gmastrapas

Jina AI org Jan 27

Hi again, sorry for the delay, it should work now 👉 https://huggingface.co/jinaai/jina-vlm#using-vllm

Napron

Jan 27

Hi @gmastrapas, thank you for the update. Could you also provide pre-quantized AWQ or GPTQ weights for jina-vlm?
