---
language:
- en
license: apache-2.0
tags:
- mlx
- vision
- multimodal
base_model: janhq/Jan-v2-VL-low
---
|
|
|
|
|
# Jan-v2-VL-low 4-bit MLX |
|
|
|
|
|
This is a 4-bit quantized MLX conversion of [janhq/Jan-v2-VL-low](https://huggingface.co/janhq/Jan-v2-VL-low). |
|
|
|
|
|
## Model Description |
|
|
|
|
|
Jan-v2-VL is an 8-billion-parameter vision-language model designed for long-horizon, multi-step tasks in real software environments. This "low" variant is optimized for faster inference while maintaining strong performance on agentic automation and UI-control tasks.
|
|
|
|
|
**Key Features:**

- Vision-language understanding for browser and desktop applications
- Screenshot grounding and tool-call capabilities
- Stable multi-step execution with minimal performance drift
- Error recovery and maintenance of intermediate state
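
The model card does not specify the exact schema of the model's tool calls, so the snippet below is only an illustrative sketch of how a client might extract a JSON-shaped action (here, a hypothetical `click` call) from raw generated text:

```python
import json

# Hypothetical example: the tool-call format emitted by Jan-v2-VL is not
# documented here, so this assumes a simple JSON action embedded in the text.
def parse_action(model_output: str) -> dict:
    """Extract the first JSON object (a tool call) from raw model text."""
    start = model_output.find("{")
    end = model_output.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON action found in model output")
    action = json.loads(model_output[start:end + 1])
    if "name" not in action:
        raise ValueError("action is missing a 'name' field")
    return action

raw = 'I will click the submit button. {"name": "click", "arguments": {"x": 412, "y": 873}}'
action = parse_action(raw)
print(action["name"], action["arguments"])
```

A real harness would validate the arguments against the concrete action schema the model was trained with before executing anything.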
|
|
|
|
|
## Quantization |
|
|
|
|
|
This model was converted to MLX format with 4-bit quantization using [MLX-VLM](https://github.com/Blaizzy/mlx-vlm) by Prince Canuma. |
|
|
|
|
|
**Conversion command:** |
|
|
```bash
mlx_vlm.convert --hf-path janhq/Jan-v2-VL-low --quantize --q-bits 4 --mlx-path Jan-v2-VL-low-4bit-mlx
```
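
A converted model records its quantization settings in the output directory's `config.json`, typically under a `"quantization"` key with `"bits"` and `"group_size"`. A quick spot-check might look like this (the config dict below is illustrative, and exact keys can vary by mlx-vlm version):

```python
# Read the bit width recorded in a converted model's config.json contents.
def quantization_bits(config: dict) -> int:
    """Return the quantization bit width, or 16 if the model is unquantized."""
    return config.get("quantization", {}).get("bits", 16)

# Illustrative config of the shape MLX converters emit (values are examples):
example_config = {
    "quantization": {"group_size": 64, "bits": 4},
}
print(quantization_bits(example_config))  # prints 4
```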
|
|
|
|
|
## Usage |
|
|
|
|
|
### Installation |
|
|
|
|
|
```bash
pip install mlx-vlm
```
|
|
|
|
|
### Python |
|
|
|
|
|
```python
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Load the model
model_path = "mlx-community/Jan-v2-VL-low-4bit-mlx"
model, processor = load(model_path)
config = load_config(model_path)

# Prepare input
image = ["path/to/image.jpg"]
prompt = "Describe this image."

# Apply chat template
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=len(image)
)

# Generate output
output = generate(model, processor, formatted_prompt, image, verbose=False)
print(output)
```
|
|
|
|
|
### Command Line |
|
|
|
|
|
```bash
mlx_vlm.generate --model mlx-community/Jan-v2-VL-low-4bit-mlx --max-tokens 100 --prompt "Describe this image" --image path/to/image.jpg
```
|
|
|
|
|
## Intended Use |
|
|
|
|
|
This model is designed for:

- Agentic automation and UI control
- Stepwise operation in browsers and desktop applications
- Screenshot grounding and tool calls
- Long-horizon, multi-step task execution
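
Stepwise operation of this kind typically follows a capture-decide-act loop. The skeleton below is purely illustrative, not an API of the model or of mlx-vlm: `run_model` and `execute_action` are stand-in stubs for real screenshot inference and UI automation.

```python
import json

def run_model(screenshot: str, goal: str) -> str:
    # Stub standing in for an mlx_vlm generate call on a real screenshot.
    return json.dumps({"name": "click", "arguments": {"x": 100, "y": 200}})

def execute_action(action: dict, log: list) -> None:
    # A real driver would move the mouse, type text, etc.; here we just log.
    log.append(action["name"])

def run_task(goal: str, max_steps: int = 3) -> list:
    """Capture state, ask the model for the next action, execute, repeat."""
    log = []
    for _ in range(max_steps):
        screenshot = "screen.png"  # stub: capture the current UI state
        action = json.loads(run_model(screenshot, goal))
        if action["name"] == "done":
            break
        execute_action(action, log)
    return log

print(run_task("Submit the form"))  # → ['click', 'click', 'click']
```

The `max_steps` bound and the explicit `done` action are common guards in long-horizon loops so a drifting model cannot run indefinitely.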
|
|
|
|
|
## License |
|
|
|
|
|
This model is released under the Apache 2.0 license. |
|
|
|
|
|
## Original Model |
|
|
|
|
|
For more information, please refer to the original model: [janhq/Jan-v2-VL-low](https://huggingface.co/janhq/Jan-v2-VL-low) |
|
|
|
|
|
## Acknowledgments |
|
|
|
|
|
- Original model by [Jan](https://huggingface.co/janhq)
- [MLX](https://github.com/ml-explore/mlx) framework by Apple
- MLX-VLM conversion framework by [Prince Canuma](https://github.com/Blaizzy/mlx-vlm)
- Model conversion by [Incept5](https://incept5.ai)
|
|
|