ovi054 posted an update 3 months ago
My project, Anim-Lab-AI, won the Community Choice Award at the MCP-1st-Birthday hackathon by @HuggingFace and @Gradio! 🏆

It turns any idea or complex concept into a clear, engaging explainer animation video. 🎥

I want to thank everyone in the Hugging Face community for supporting my project!

MCP-1st-Birthday/anim-lab-ai
sagar007 posted an update 3 months ago
🚀 I built a multimodal vision-language model using Gemma-270M + CLIP!

Just finished training my multimodal model on the full LLaVA-Instruct-150K dataset (157K samples) and wanted to share the results!

🔧 What I Built:
A vision-language model that can understand images and answer questions about them, combining:
- Google Gemma-3-270M (language)
- OpenAI CLIP ViT-Large/14 (vision)
- LoRA fine-tuning for efficiency
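The glue between the two pretrained models can be sketched as a small projector that maps CLIP patch features into the language model's embedding space, with the projected "image tokens" prepended to the text embeddings. This is a minimal sketch of that idea, not sagar007's actual code; the dimensions (1024 for CLIP ViT-L/14 features, 640 for Gemma's hidden size, 256 patches) are assumptions:

```python
import torch
import torch.nn as nn

# Assumed dimensions -- the post doesn't state them.
CLIP_DIM, LM_DIM = 1024, 640

class VisionProjector(nn.Module):
    """Maps CLIP patch features into the language model's embedding space."""
    def __init__(self, clip_dim: int, lm_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(clip_dim, lm_dim),
            nn.GELU(),
            nn.Linear(lm_dim, lm_dim),
        )

    def forward(self, patch_feats: torch.Tensor) -> torch.Tensor:
        return self.proj(patch_feats)

# Stand-in for CLIP output: batch of 2 images, 256 patches each.
patches = torch.randn(2, 256, CLIP_DIM)
projector = VisionProjector(CLIP_DIM, LM_DIM)
image_tokens = projector(patches)          # (2, 256, 640)

# Prepend image tokens to the text token embeddings; the combined
# sequence is what the LM would consume as inputs_embeds.
text_embeds = torch.randn(2, 32, LM_DIM)   # stand-in for Gemma embeddings
inputs_embeds = torch.cat([image_tokens, text_embeds], dim=1)
print(tuple(inputs_embeds.shape))          # (2, 288, 640)
```

With this layout, only the projector and LoRA adapters need gradients; both backbones stay frozen.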

📊 Training Stats:
- 157,712 training samples (full LLaVA dataset)
- 3 epochs on A100 40GB
- ~9 hours training time
- Final loss: 1.333 training / 1.430 validation
- Only 18.6M trainable params (3.4% of 539M total)
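The trainable fraction quoted above follows directly from the two parameter counts; a quick back-of-the-envelope check:

```python
# Sanity-check the LoRA stats quoted in the post.
trainable = 18.6e6   # trainable (LoRA + glue) parameters
total = 539e6        # total parameters across both backbones
pct = 100 * trainable / total
print(f"{pct:.2f}% of all parameters are trainable")  # ~3.45%
```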

📈 Benchmark Results:
- VQA Accuracy: 53.8%
- Works great for: animal detection, room identification, scene understanding



🔗 **Try it yourself:**
- 🤗 Model: sagar007/multigemma
- 🎮 Demo: https://huggingface.co/spaces/sagar007/Multimodal-Gemma
- 💻 GitHub: https://github.com/sagar431/multimodal-gemma-270m

Built with PyTorch Lightning + MLflow for experiment tracking. Full MLOps pipeline with CI/CD!

Would love to hear your feedback! 🙏

#multimodal #gemma #clip #llava #vision-language #pytorch
ovi054 posted an update 4 months ago
Z-Image Turbo + LoRA ⚡

ovi054/Z-Image-LORA

Z-Image Turbo is the No. 1 trending Text-to-Image model right now. You can add a custom LoRA and generate images with this Space.

👉 Try it now: ovi054/Z-Image-LORA
ovi054 posted an update 4 months ago
Anim Lab AI⚡

Turn any math concept or logic into a clear video explanation instantly using AI.

This is my submission for the MCP 1st Birthday Hackathon, and it’s already crossed 1,000 runs.

👉 Try it now: MCP-1st-Birthday/anim-lab-ai

Demo outputs are attached 👇
ovi054 posted an update 4 months ago
Introducing Anim Lab AI⚡

My submission for the MCP 1st Birthday Hackathon

Turn any math concept or logic into a clear video explanation instantly using AI.

👉 Try it now: MCP-1st-Birthday/anim-lab-ai

Demo outputs are attached 👇
ovi054 posted an update 8 months ago
Image-to-Prompt⚡

ovi054/image-to-prompt

Extract a text prompt from an image, then reuse that prompt to generate similar images!

Useful for prompt engineering, studying image-to-text alignment, making training datasets, or recreating similar outputs.

Powered by: Gradio, Florence-2

👉 Try it now: ovi054/image-to-prompt
ovi054 posted an update 8 months ago
Update on https://huggingface.co/spaces/ovi054/Qwen-Image-LORA

You can now load a Qwen LoRA in this Space using any of the following input formats:

1. Model ID:
flymy-ai/qwen-image-realism-lora

2. Model link:
https://huggingface.co/flymy-ai/qwen-image-realism-lora

3. Specific file link:
https://huggingface.co/flymy-ai/qwen-image-realism-lora/blob/main/flymy_realism.safetensors

4. Direct download link:
https://huggingface.co/flymy-ai/qwen-image-realism-lora/resolve/main/flymy_realism.safetensors

You can also use an external .safetensors download link (if Hugging Face doesn’t block it).

This is useful when a model repository contains multiple weight files and you want to load a specific one.
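The four formats above all reduce to one direct-download (resolve) URL. Here is a hypothetical normalizer sketching that mapping; the Space's real parsing logic may differ, and the default filename used for bare repo IDs is an assumption:

```python
HF = "https://huggingface.co"
DEFAULT_FILE = "pytorch_lora_weights.safetensors"  # assumed default filename

def to_resolve_url(link: str, default_file: str = DEFAULT_FILE) -> str:
    """Turn any of the four input formats into a direct-download URL."""
    if link.endswith(".safetensors"):
        # Formats 3 and 4 (or an external link): rewrite blob -> resolve;
        # a resolve/external link passes through unchanged.
        return link.replace("/blob/", "/resolve/")
    # Formats 1 and 2: bare repo ID or repo link; assume a default filename.
    repo_id = link.removeprefix(HF + "/").strip("/")
    return f"{HF}/{repo_id}/resolve/main/{default_file}"

print(to_resolve_url(
    "https://huggingface.co/flymy-ai/qwen-image-realism-lora/blob/main/flymy_realism.safetensors"
))
# -> https://huggingface.co/flymy-ai/qwen-image-realism-lora/resolve/main/flymy_realism.safetensors
```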

👉 Try it now: https://huggingface.co/spaces/ovi054/Qwen-Image-LORA
ovi054 posted an update 8 months ago
WAN 2.2 Text to Image ⚡

ovi054/wan2-2-text-to-image

We all know that WAN 2.2 A14B is a video model. But it turns out this video model can also produce great image results with incredible prompt adherence! The image output is sharp and detailed, and it sticks to the prompt better than most.

👉 Try it now: ovi054/wan2-2-text-to-image