---
title: MobileCLIP2 Embedder
emoji: 🖼️
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
---

# MobileCLIP2-S2 Embedding Service

ONNX-optimized FastAPI service for generating 512-dimensional image embeddings using Apple's MobileCLIP2-S2.

## Features

- **Fast:** ONNX Runtime CPU optimizations
- **Memory Efficient:** <2GB RAM footprint
- **Batch Processing:** up to 10 images per request
- **RESTful API:** simple HTTP endpoints

## API Usage

### Single Image

```bash
curl -X POST "https://YOUR_SPACE_URL/embed" \
  -F "file=@image.jpg"
```

Response:

```json
{
  "embedding": [0.123, -0.456, ...],  // 512 floats
  "model": "MobileCLIP-S2",
  "inference_time_ms": 123.45
}
```

### Batch Processing

```bash
curl -X POST "https://YOUR_SPACE_URL/embed/batch" \
  -F "files=@image1.jpg" \
  -F "files=@image2.jpg"
```

Response:

```json
{
  "embeddings": [[0.123, ...], [0.456, ...]],
  "count": 2,
  "total_time_ms": 234.56,
  "model": "MobileCLIP-S2"
}
```
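
The same request in Python might look like this (a sketch; the multipart field name `files` mirrors the curl example above, and `YOUR_SPACE_URL` is a placeholder):

```python
import requests

# Build one multipart part per image under the "files" field.
paths = ["image1.jpg", "image2.jpg"]
files = [("files", (p, open(p, "rb"), "image/jpeg")) for p in paths]

resp = requests.post("https://YOUR_SPACE_URL/embed/batch", files=files)
resp.raise_for_status()

data = resp.json()
print(f"{data['count']} embeddings in {data['total_time_ms']:.1f} ms")
```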

### Health Check

curl "https://YOUR_SPACE_URL/"

Response:

```json
{
  "status": "healthy",
  "model": "MobileCLIP-S2",
  "device": "cpu",
  "onnx_optimized": true
}
```

### Model Info

curl "https://YOUR_SPACE_URL/info"

Response:

```json
{
  "model": "MobileCLIP-S2",
  "embedding_dim": 512,
  "onnx_optimized": true,
  "max_image_size_mb": 10,
  "max_batch_size": 10,
  "image_size": 256
}
```
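
Clients can use `/info` to validate inputs before uploading; a sketch (field names as documented above, `YOUR_SPACE_URL` is a placeholder):

```python
import os
import requests

BASE = "https://YOUR_SPACE_URL"  # placeholder

# Read the service limits once, then fail fast locally instead of
# waiting for a 413 from the server.
info = requests.get(f"{BASE}/info").json()
max_bytes = info["max_image_size_mb"] * 1024 * 1024

path = "photo.jpg"
if os.path.getsize(path) > max_bytes:
    raise ValueError(f"{path} exceeds the {info['max_image_size_mb']} MB limit")

with open(path, "rb") as f:
    embedding = requests.post(f"{BASE}/embed", files={"file": f}).json()["embedding"]
assert len(embedding) == info["embedding_dim"]
```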

## Model Details

## Local Development

### Prerequisites

- Python 3.11+
- Docker & Docker Compose (optional)

### Setup

1. Install dependencies for model conversion:

   ```bash
   cd huggingface_embedder
   pip install torch open_clip_torch ml-mobileclip
   ```

2. Convert the model to ONNX (a one-time step; a rough sketch of what the converter does follows this list):

   ```bash
   python model_converter.py --output models
   ```

   This creates:

   - `models/mobileclip_s2_visual.onnx` (ONNX model)
   - `models/preprocess_config.json` (preprocessing config)

3. Install runtime dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Run locally:

   ```bash
   uvicorn embedder:app --reload --port 7860
   ```

5. Test the API:

   ```bash
   # Health check
   curl http://localhost:7860/

   # Generate embedding
   curl -X POST http://localhost:7860/embed \
     -F "file=@test_image.jpg"
   ```

### Docker

```bash
# Build and run
docker compose up

# Test
curl -X POST http://localhost:8001/embed \
  -F "file=@test_image.jpg"
```

## HuggingFace Spaces Deployment

### Initial Setup

1. Create a new Space:

2. Add a GitHub Secret:

   - Go to your GitHub repo Settings → Secrets
   - Add `HUGGINGFACE_ACCESS_TOKEN` with your HF token

3. Deploy:

   ```bash
   # Just push to main branch!
   git push origin main
   ```

That's it! The model will be automatically downloaded from HuggingFace Hub (apple/MobileCLIP-S2) and converted to ONNX during the Docker build.

The Space will automatically build and deploy (takes 5-10 minutes for first build).

### Using GitHub Actions for Sync

See Managing Spaces with GitHub Actions for automatic sync from your GitHub repo.
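
As an alternative to an Actions workflow, you can push the repo to the Space from a local script with `huggingface_hub`; a sketch (the repo id is a placeholder, and the token reuses the secret mentioned above):

```python
import os
from huggingface_hub import HfApi

# Manual alternative to the GitHub Actions sync.
# YOUR_USERNAME/YOUR_SPACE is a placeholder.
api = HfApi(token=os.environ["HUGGINGFACE_ACCESS_TOKEN"])
api.upload_folder(
    folder_path=".",
    repo_id="YOUR_USERNAME/YOUR_SPACE",
    repo_type="space",
)
```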

## Performance

### Metrics (CPU: 2 cores, 2GB RAM)

- **Single Inference:** ~100-200ms
- **Batch (10 images):** ~800-1200ms
- **Memory Usage:** <1.5GB
- **Throughput:** ~6-10 images/second
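
These figures depend on hardware and image size; a quick latency check against your own deployment can look like this (URL and image path are placeholders, and 20 requests is an arbitrary sample size):

```python
import time
import requests

URL = "https://YOUR_SPACE_URL/embed"  # placeholder
data = open("test_image.jpg", "rb").read()

# Time 20 sequential single-image requests and report the mean.
latencies = []
for _ in range(20):
    t0 = time.perf_counter()
    requests.post(URL, files={"file": ("test_image.jpg", data, "image/jpeg")})
    latencies.append((time.perf_counter() - t0) * 1000)

print(f"mean latency: {sum(latencies) / len(latencies):.1f} ms")
```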

### Memory Optimization

The ONNX model uses ~50-70% less RAM compared to PyTorch:

- **PyTorch:** ~2.5GB RAM
- **ONNX (FP32):** ~800MB RAM
- **ONNX (INT8):** ~400MB RAM (use the `--quantize` flag)
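
The `--quantize` flag presumably wraps ONNX Runtime's dynamic quantization; a standalone equivalent would look roughly like this (the output file name is illustrative):

```python
from onnxruntime.quantization import quantize_dynamic, QuantType

# Dynamic INT8 quantization of the exported visual encoder; roughly
# what the converter's --quantize flag is assumed to do.
quantize_dynamic(
    model_input="models/mobileclip_s2_visual.onnx",
    model_output="models/mobileclip_s2_visual_int8.onnx",
    weight_type=QuantType.QInt8,
)
```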

## Error Handling

| Status | Description |
|--------|-------------|
| 200    | Success |
| 400    | Invalid file type or format |
| 413    | File too large (>10MB) |
| 500    | Inference error |
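
A client can branch on these codes explicitly; a minimal sketch using `requests` (base URL is a placeholder, limits as documented in this README):

```python
import requests

def embed(path: str, base: str = "https://YOUR_SPACE_URL") -> list[float]:
    """Return the 512-dim embedding for one image, mapping API errors."""
    with open(path, "rb") as f:
        resp = requests.post(f"{base}/embed", files={"file": f})
    if resp.status_code == 413:
        raise ValueError(f"{path} is larger than the 10MB limit")
    if resp.status_code == 400:
        raise ValueError(f"{path} is not a supported image format")
    resp.raise_for_status()  # surfaces 500 inference errors
    return resp.json()["embedding"]
```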

## Limitations

- **Max image size:** 10MB per file
- **Max batch size:** 10 images per request
- **Supported formats:** JPEG, PNG, WebP
- **No GPU:** CPU-only inference (sufficient for most use cases)

## Integration Example

### Python

```python
import requests

# Single image
with open("photo.jpg", "rb") as f:
    response = requests.post(
        "https://YOUR_SPACE_URL/embed",
        files={"file": f}
    )

embedding = response.json()["embedding"]
print(f"Embedding shape: {len(embedding)}")  # 512
```

### JavaScript

```javascript
const formData = new FormData();
formData.append('file', imageFile);

const response = await fetch('https://YOUR_SPACE_URL/embed', {
  method: 'POST',
  body: formData
});

const data = await response.json();
console.log('Embedding:', data.embedding);
```

## License

## Citation

```bibtex
@article{mobileclip2,
  title={MobileCLIP2: Improving Multi-Modal Reinforced Training},
  author={Faghri, Fartash and Vasu, Pavan Kumar Anasosalu and Koc, Cem and Shankar, Vaishaal and Toshev, Alexander T and Tuzel, Oncel and Pouransari, Hadi},
  journal={Transactions on Machine Learning Research},
  year={2025}
}
```

## Support

For issues or questions: