GitHub Actions committed on 2025-10-27
Commit 0d29386 · 1 Parent(s): 2897c5e

🚀 Deploy embedder from GitHub Actions - 2025-10-27 21:30:33

Files changed (1)
  1. README.md +0 -276
README.md DELETED
---
title: MobileCLIP2 Embedder
emoji: 🖼️
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
---

# MobileCLIP2-S2 Embedding Service

PyTorch-based FastAPI service for generating 512-dimensional image embeddings using Apple's MobileCLIP2-S2.

## Features

- **Fast**: PyTorch inference with CPU/GPU support
- **Production Ready**: No ONNX conversion needed
- **Batch Processing**: Up to 10 images per request
- **RESTful API**: Simple HTTP endpoints

## API Usage

### Single Image

```bash
curl -X POST "https://YOUR_SPACE_URL/embed" \
  -F "file=@image.jpg"
```

**Response:**
```json
{
  "embedding": [0.123, -0.456, ...],  // 512 floats
  "model": "MobileCLIP-S2",
  "inference_time_ms": 123.45
}
```

### Batch Processing

```bash
# The "files" form field name is assumed here; up to 10 images per request
curl -X POST "https://YOUR_SPACE_URL/embed/batch" \
  -F "files=@image1.jpg" \
  -F "files=@image2.jpg"
```

**Response:**
```json
{
  "embeddings": [[0.123, ...], [0.456, ...]],
  "count": 2,
  "total_time_ms": 234.56,
  "model": "MobileCLIP-S2"
}
```
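
For programmatic use, the same request works from Python; a minimal sketch, again assuming the `files` form field:

```python
import requests

# Two images per request, matching "count": 2 in the response above
paths = ["image1.jpg", "image2.jpg"]
files = [("files", open(p, "rb")) for p in paths]
try:
    resp = requests.post("https://YOUR_SPACE_URL/embed/batch", files=files)
    resp.raise_for_status()
    data = resp.json()
    print(data["count"], "embeddings in", data["total_time_ms"], "ms")
finally:
    for _, fh in files:
        fh.close()
```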

### Health Check

```bash
curl "https://YOUR_SPACE_URL/"
```

**Response:**
```json
{
  "status": "healthy",
  "model": "MobileCLIP-S2",
  "device": "cpu",
  "onnx_optimized": true
}
```
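
Since a Space can take several minutes to build or wake up, a client may want to poll this endpoint before sending traffic; a small sketch:

```python
import time
import requests

def wait_until_healthy(base_url: str, timeout_s: float = 300.0) -> None:
    """Poll the health endpoint until the service reports "healthy"."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            if requests.get(base_url, timeout=5).json().get("status") == "healthy":
                return
        except requests.RequestException:
            pass  # service not reachable yet
        time.sleep(2)
    raise TimeoutError(f"{base_url} still unhealthy after {timeout_s}s")
```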

### Model Info

```bash
curl "https://YOUR_SPACE_URL/info"
```

**Response:**
```json
{
  "model": "MobileCLIP-S2",
  "embedding_dim": 512,
  "onnx_optimized": true,
  "max_image_size_mb": 10,
  "max_batch_size": 10,
  "image_size": 256
}
```
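
Rather than hard-coding these limits, a client can read them from `/info` at startup, for example:

```python
import requests

info = requests.get("https://YOUR_SPACE_URL/info").json()

MAX_BATCH = info["max_batch_size"]        # 10 images per request
MAX_IMAGE_MB = info["max_image_size_mb"]  # 10 MB per file
assert info["embedding_dim"] == 512       # sanity-check the model contract
```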

## Model Details

- **Model**: MobileCLIP2-S2 (Apple)
- **Paper**: [MobileCLIP2: Improving Multi-Modal Reinforced Training](https://arxiv.org/abs/2508.20691)
- **Embedding Dimension**: 512
- **Input Size**: 256×256
- **Optimization**: ONNX Runtime CPU
- **Normalization**: L2 normalized outputs
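
Because the outputs are L2-normalized, cosine similarity between two embeddings reduces to a plain dot product:

```python
import numpy as np

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # For unit-length vectors, cos(a, b) = a·b / (|a||b|) = a·b
    return float(np.dot(a, b))
```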

## Local Development

### Prerequisites

- Python 3.11+
- Docker & Docker Compose (optional)

### Setup

1. **Install dependencies for model conversion:**

   ```bash
   cd huggingface_embedder
   pip install torch open_clip_torch ml-mobileclip
   ```

2. **Convert the model to ONNX (one-time):**

   ```bash
   python model_converter.py --output models
   ```

   This will create:
   - `models/mobileclip_s2_visual.onnx` (ONNX model)
   - `models/preprocess_config.json` (preprocessing config)
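
   Conceptually, the conversion exports the visual tower with `torch.onnx.export`; a rough sketch (the model and pretrained-tag names here are assumptions, not necessarily what `model_converter.py` uses):

   ```python
   import torch
   import open_clip

   # Load MobileCLIP-S2 via open_clip (names assumed; see model_converter.py)
   model, _, _ = open_clip.create_model_and_transforms(
       "MobileCLIP-S2", pretrained="datacompdr"
   )
   model.eval()

   # Export only the image encoder: (batch, 3, 256, 256) -> (batch, 512)
   dummy = torch.randn(1, 3, 256, 256)
   torch.onnx.export(
       model.visual,
       dummy,
       "models/mobileclip_s2_visual.onnx",
       input_names=["image"],
       output_names=["embedding"],
       dynamic_axes={"image": {0: "batch"}, "embedding": {0: "batch"}},
   )
   ```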

3. **Install runtime dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

4. **Run locally:**

   ```bash
   uvicorn embedder:app --reload --port 7860
   ```

5. **Test the API:**

   ```bash
   # Health check
   curl http://localhost:7860/

   # Generate embedding
   curl -X POST http://localhost:7860/embed \
     -F "file=@test_image.jpg"
   ```

### Docker

```bash
# Build and run
docker compose up

# Test
curl -X POST http://localhost:8001/embed \
  -F "file=@test_image.jpg"
```

## HuggingFace Spaces Deployment

### Initial Setup

1. **Create a new Space:**
   - Go to https://huggingface.co/spaces
   - Click "Create new Space"
   - Select **Docker** as the SDK
   - Set `app_port` to **7860**

2. **Add a GitHub secret:**
   - Go to your GitHub repo Settings → Secrets
   - Add `HUGGINGFACE_ACCESS_TOKEN` with your HF token

3. **Deploy:**

   ```bash
   # Just push to the main branch
   git push origin main
   ```

**That's it!** The model will be automatically downloaded from HuggingFace Hub (`apple/MobileCLIP-S2`) and converted to ONNX during the Docker build.

The Space will then build and deploy automatically (the first build takes 5-10 minutes).

### Using GitHub Actions for Sync

See [Managing Spaces with GitHub Actions](https://huggingface.co/docs/hub/spaces-github-actions) for automatic sync from your GitHub repo.

## Performance

### Metrics (CPU: 2 cores, 2GB RAM)

- **Single Inference**: ~100-200ms
- **Batch (10 images)**: ~800-1200ms
- **Memory Usage**: <1.5GB
- **Throughput**: ~6-10 images/second

### Memory Optimization

The ONNX model uses ~50-70% less RAM compared to PyTorch:

- **PyTorch**: ~2.5GB RAM
- **ONNX (FP32)**: ~800MB RAM
- **ONNX (INT8)**: ~400MB RAM (use the `--quantize` flag; see the sketch below)
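
The INT8 path is presumably ONNX Runtime's dynamic quantization; a sketch of that, assuming it is what `--quantize` wraps:

```python
from onnxruntime.quantization import QuantType, quantize_dynamic

# Quantize weights to INT8; activations stay FP32 at runtime
quantize_dynamic(
    model_input="models/mobileclip_s2_visual.onnx",
    model_output="models/mobileclip_s2_visual_int8.onnx",
    weight_type=QuantType.QInt8,
)
```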

## Error Handling

| Status | Description |
|--------|-------------|
| 200 | Success |
| 400 | Invalid file type or format |
| 413 | File too large (>10MB) |
| 500 | Inference error |
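
A client can branch on these codes instead of failing opaquely, for example:

```python
import requests

with open("photo.jpg", "rb") as f:
    resp = requests.post("https://YOUR_SPACE_URL/embed", files={"file": f})

if resp.status_code == 200:
    embedding = resp.json()["embedding"]
elif resp.status_code == 413:
    print("File too large: resize or recompress to under 10MB")
elif resp.status_code == 400:
    print("Invalid file: send JPEG, PNG, or WebP")
else:
    resp.raise_for_status()  # surfaces 500s and anything unexpected
```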

## Limitations

- **Max image size**: 10MB per file
- **Max batch size**: 10 images per request
- **Supported formats**: JPEG, PNG, WebP
- **No GPU**: CPU-only inference (sufficient for most use cases)

## Integration Example

### Python

```python
import requests

# Single image
with open("photo.jpg", "rb") as f:
    response = requests.post(
        "https://YOUR_SPACE_URL/embed",
        files={"file": f},
    )

embedding = response.json()["embedding"]
print(f"Embedding shape: {len(embedding)}")  # 512
```

### JavaScript

```javascript
const formData = new FormData();
formData.append('file', imageFile);

const response = await fetch('https://YOUR_SPACE_URL/embed', {
  method: 'POST',
  body: formData
});

const data = await response.json();
console.log('Embedding:', data.embedding);
```

## License

- **Code**: MIT License
- **Model**: [Apple AMLR License](https://huggingface.co/apple/MobileCLIP-S2)

## Citation

```bibtex
@article{mobileclip2,
  title={MobileCLIP2: Improving Multi-Modal Reinforced Training},
  author={Faghri, Fartash and Vasu, Pavan Kumar Anasosalu and Koc, Cem and Shankar, Vaishaal and Toshev, Alexander T and Tuzel, Oncel and Pouransari, Hadi},
  journal={Transactions on Machine Learning Research},
  year={2025}
}
```

## Support

For issues or questions:
- HuggingFace Spaces: https://huggingface.co/docs/hub/spaces
- Model: https://huggingface.co/apple/MobileCLIP-S2
- ONNX Runtime: https://onnxruntime.ai/