---
title: MobileCLIP2 Embedder
emoji: 🖼️
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
---

# MobileCLIP2-S2 Embedding Service
ONNX-optimized FastAPI service for generating 512-dimensional image embeddings using Apple's MobileCLIP2-S2.
## Features
- Fast: ONNX Runtime CPU optimizations
- Memory Efficient: <2GB RAM footprint
- Batch Processing: Up to 10 images per request
- RESTful API: Simple HTTP endpoints
## API Usage

### Single Image
```bash
curl -X POST "https://YOUR_SPACE_URL/embed" \
  -F "file=@image.jpg"
```
Response:
```json
{
  "embedding": [0.123, -0.456, ...],  // 512 floats
  "model": "MobileCLIP-S2",
  "inference_time_ms": 123.45
}
```
### Batch Processing
```bash
curl -X POST "https://YOUR_SPACE_URL/embed/batch" \
  -F "files=@image1.jpg" \
  -F "files=@image2.jpg"
```
Response:
```json
{
  "embeddings": [[0.123, ...], [0.456, ...]],
  "count": 2,
  "total_time_ms": 234.56,
  "model": "MobileCLIP-S2"
}
```
### Health Check

```bash
curl "https://YOUR_SPACE_URL/"
```
Response:
```json
{
  "status": "healthy",
  "model": "MobileCLIP-S2",
  "device": "cpu",
  "onnx_optimized": true
}
```
### Model Info

```bash
curl "https://YOUR_SPACE_URL/info"
```
Response:
```json
{
  "model": "MobileCLIP-S2",
  "embedding_dim": 512,
  "onnx_optimized": true,
  "max_image_size_mb": 10,
  "max_batch_size": 10,
  "image_size": 256
}
```
## Model Details
- Model: MobileCLIP2-S2 (Apple)
- Paper: MobileCLIP2: Improving Multi-Modal Reinforced Training
- Embedding Dimension: 512
- Input Size: 256×256
- Optimization: ONNX Runtime CPU
- Normalization: L2 normalized outputs
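
Because the outputs are L2 normalized, cosine similarity between two embeddings reduces to a plain dot product. A minimal sketch against the `/embed` endpoint (the image filenames and `YOUR_SPACE_URL` are placeholders):

```python
# Compare two images via the /embed endpoint.
# The service returns L2-normalized 512-d vectors, so cosine
# similarity is just a dot product.
import requests
import numpy as np

BASE_URL = "https://YOUR_SPACE_URL"

def embed(path: str) -> np.ndarray:
    with open(path, "rb") as f:
        r = requests.post(f"{BASE_URL}/embed", files={"file": f})
    r.raise_for_status()
    return np.asarray(r.json()["embedding"], dtype=np.float32)

a, b = embed("cat1.jpg"), embed("cat2.jpg")
print(a.shape)                   # (512,)
print(float(np.linalg.norm(a)))  # ~1.0 (L2 normalized)
print(float(a @ b))              # cosine similarity in [-1, 1]
```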
## Local Development

### Prerequisites
- Python 3.11+
- Docker & Docker Compose (optional)
### Setup
- Install dependencies for model conversion:

  ```bash
  cd huggingface_embedder
  pip install torch open_clip_torch ml-mobileclip
  ```
- Convert model to ONNX (one-time):

  ```bash
  python model_converter.py --output models
  ```

  This will create:

  - `models/mobileclip_s2_visual.onnx` (ONNX model)
  - `models/preprocess_config.json` (preprocessing config)

  A quick sanity check of the exported model is sketched after this list.
- Install runtime dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Run locally:

  ```bash
  uvicorn embedder:app --reload --port 7860
  ```
- Test the API:

  ```bash
  # Health check
  curl http://localhost:7860/

  # Generate embedding
  curl -X POST http://localhost:7860/embed \
    -F "file=@test_image.jpg"
  ```
### Docker

```bash
# Build and run
docker compose up

# Test
curl -X POST http://localhost:8001/embed \
  -F "file=@test_image.jpg"
```
## HuggingFace Spaces Deployment

### Initial Setup
Create a new Space:

- Go to https://huggingface.co/spaces
- Click "Create new Space"
- Select Docker as the SDK
- Set `app_port` to 7860
Add a GitHub secret:

- Go to your GitHub repo Settings → Secrets
- Add `HUGGINGFACE_ACCESS_TOKEN` with your HF token
Deploy:

```bash
# Just push to main branch!
git push origin main
```
That's it! The model will be automatically downloaded from HuggingFace Hub (apple/MobileCLIP-S2) and converted to ONNX during the Docker build.
The Space will then build and deploy automatically (the first build takes 5-10 minutes).
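
Once the build finishes, the health endpoint can be polled to confirm the Space is serving. A minimal sketch (`YOUR_SPACE_URL` is the same placeholder used throughout):

```python
# Poll the health endpoint until the service reports "healthy".
import time
import requests

BASE_URL = "https://YOUR_SPACE_URL"

for _ in range(30):  # ~5 minutes at 10 s intervals
    try:
        status = requests.get(f"{BASE_URL}/", timeout=10).json().get("status")
        if status == "healthy":
            print("Space is up")
            break
    except (requests.RequestException, ValueError):
        pass  # still building or waking up
    time.sleep(10)
else:
    print("Space did not become healthy in time")
```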
### Using GitHub Actions for Sync
See Managing Spaces with GitHub Actions for automatic sync from your GitHub repo.
## Performance

### Metrics (CPU: 2 cores, 2GB RAM)
- Single Inference: ~100-200ms
- Batch (10 images): ~800-1200ms
- Memory Usage: <1.5GB
- Throughput: ~6-10 images/second
### Memory Optimization
The ONNX model uses ~50-70% less RAM compared to PyTorch:
- PyTorch: ~2.5GB RAM
- ONNX (FP32): ~800MB RAM
- ONNX (INT8): ~400MB RAM (use the `--quantize` flag)
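
The `--quantize` flag belongs to `model_converter.py`. As a rough illustration of the underlying technique, dynamic INT8 quantization of an already exported model can also be done with ONNX Runtime's own tooling; the output filename below is just an example:

```python
# Dynamic INT8 quantization of the exported visual encoder.
# This is a sketch of the general technique using onnxruntime's
# quantization utilities, not necessarily what model_converter.py does.
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="models/mobileclip_s2_visual.onnx",
    model_output="models/mobileclip_s2_visual.int8.onnx",
    weight_type=QuantType.QInt8,  # quantize weights to signed 8-bit
)
```

The quantized file can then be loaded with the same `InferenceSession` call shown earlier.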
## Error Handling
| Status | Description |
|---|---|
| 200 | Success |
| 400 | Invalid file type or format |
| 413 | File too large (>10MB) |
| 500 | Inference error |
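
Client code can branch on these codes directly; a minimal sketch with `requests` (the error messages are illustrative, only the status codes above are documented):

```python
# Handle the documented status codes when requesting an embedding.
import requests

def get_embedding(path: str, base_url: str = "https://YOUR_SPACE_URL"):
    with open(path, "rb") as f:
        resp = requests.post(f"{base_url}/embed", files={"file": f})

    if resp.status_code == 200:
        return resp.json()["embedding"]
    if resp.status_code == 400:
        raise ValueError(f"Invalid file type or format: {path}")
    if resp.status_code == 413:
        raise ValueError(f"File too large (>10MB): {path}")
    # 500 or anything unexpected
    resp.raise_for_status()
```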
## Limitations
- Max image size: 10MB per file
- Max batch size: 10 images per request
- Supported formats: JPEG, PNG, WebP
- No GPU: CPU-only inference (sufficient for most use cases)
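
These limits can be enforced client-side before uploading. A minimal standard-library sketch that validates files and splits them into batches of at most 10:

```python
# Pre-flight checks mirroring the service limits: <=10MB per file,
# JPEG/PNG/WebP only, and at most 10 images per batch request.
import os

MAX_SIZE_MB = 10
MAX_BATCH = 10
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}

def validate(paths: list[str]) -> list[list[str]]:
    """Validate files and split them into batches of at most 10."""
    for path in paths:
        ext = os.path.splitext(path)[1].lower()
        if ext not in ALLOWED_EXTENSIONS:
            raise ValueError(f"Unsupported format: {path}")
        if os.path.getsize(path) > MAX_SIZE_MB * 1024 * 1024:
            raise ValueError(f"File exceeds {MAX_SIZE_MB}MB: {path}")
    return [paths[i:i + MAX_BATCH] for i in range(0, len(paths), MAX_BATCH)]
```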
## Integration Example

### Python
```python
import requests

# Single image
with open("photo.jpg", "rb") as f:
    response = requests.post(
        "https://YOUR_SPACE_URL/embed",
        files={"file": f}
    )

embedding = response.json()["embedding"]
print(f"Embedding shape: {len(embedding)}")  # 512
```
### JavaScript
```javascript
const formData = new FormData();
formData.append('file', imageFile);

const response = await fetch('https://YOUR_SPACE_URL/embed', {
  method: 'POST',
  body: formData
});

const data = await response.json();
console.log('Embedding:', data.embedding);
```
## License
- Code: MIT License
- Model: Apple AMLR License
## Citation
```bibtex
@article{mobileclip2,
  title={MobileCLIP2: Improving Multi-Modal Reinforced Training},
  author={Faghri, Fartash and Vasu, Pavan Kumar Anasosalu and Koc, Cem and Shankar, Vaishaal and Toshev, Alexander T and Tuzel, Oncel and Pouransari, Hadi},
  journal={Transactions on Machine Learning Research},
  year={2025}
}
```
## Support
For issues or questions:
- HuggingFace Spaces: https://huggingface.co/docs/hub/spaces
- Model: https://huggingface.co/apple/MobileCLIP-S2
- ONNX Runtime: https://onnxruntime.ai/