ovi054 posted an update 3 months ago
My project, Anim-Lab-AI, won the Community Choice Award at the MCP-1st-Birthday hackathon by @HuggingFace and @Gradio! 🏆

It turns any idea or complex concept into a clear, engaging explainer animation video. 🎥

I want to thank everyone in the Hugging Face community for supporting my project!

MCP-1st-Birthday/anim-lab-ai
sagar007 posted an update 3 months ago
🚀 I built a multimodal vision-language model using Gemma-270M + CLIP!

Just finished training my multimodal model on the full LLaVA-Instruct-150K dataset (157K samples) and wanted to share the results!

🔧 What I Built:
A vision-language model that can understand images and answer questions about them, combining:
- Google Gemma-3-270M (language)
- OpenAI CLIP ViT-Large/14 (vision)
- LoRA fine-tuning for efficiency
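The glue between the two pretrained models can be sketched as a small projector that maps CLIP patch features into the language model's embedding space, with the projected "image tokens" prepended to the text embeddings. This is a minimal sketch of that idea, not sagar007's actual code; the dimensions (1024 for CLIP ViT-L/14 features, 640 for Gemma's hidden size, 256 patches) are assumptions:

```python
import torch
import torch.nn as nn

# Assumed dimensions -- the post doesn't state them.
CLIP_DIM, LM_DIM = 1024, 640

class VisionProjector(nn.Module):
    """Maps CLIP patch features into the language model's embedding space."""
    def __init__(self, clip_dim: int, lm_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(clip_dim, lm_dim),
            nn.GELU(),
            nn.Linear(lm_dim, lm_dim),
        )

    def forward(self, patch_feats: torch.Tensor) -> torch.Tensor:
        return self.proj(patch_feats)

# Stand-in for CLIP output: batch of 2 images, 256 patches each.
patches = torch.randn(2, 256, CLIP_DIM)
projector = VisionProjector(CLIP_DIM, LM_DIM)
image_tokens = projector(patches)          # (2, 256, 640)

# Prepend image tokens to the text token embeddings; the combined
# sequence is what the LM would consume as inputs_embeds.
text_embeds = torch.randn(2, 32, LM_DIM)   # stand-in for Gemma embeddings
inputs_embeds = torch.cat([image_tokens, text_embeds], dim=1)
print(tuple(inputs_embeds.shape))          # (2, 288, 640)
```

With this layout, only the projector and LoRA adapters need gradients; both backbones stay frozen.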

📊 Training Stats:
- 157,712 training samples (full LLaVA dataset)
- 3 epochs on A100 40GB
- ~9 hours training time
- Final loss: 1.333 training / 1.430 validation
- Only 18.6M trainable params (3.4% of 539M total)
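The trainable fraction quoted above follows directly from the two parameter counts; a quick back-of-the-envelope check:

```python
# Sanity-check the LoRA stats quoted in the post.
trainable = 18.6e6   # trainable (LoRA + glue) parameters
total = 539e6        # total parameters across both backbones
pct = 100 * trainable / total
print(f"{pct:.2f}% of all parameters are trainable")  # ~3.45%
```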

📈 Benchmark Results:
- VQA Accuracy: 53.8%
- Works great for: animal detection, room identification, scene understanding



🔗 **Try it yourself:**
- 🤗 Model: sagar007/multigemma
- 🎮 Demo: https://huggingface.co/spaces/sagar007/Multimodal-Gemma
- 💻 GitHub: https://github.com/sagar431/multimodal-gemma-270m

Built with PyTorch Lightning + MLflow for experiment tracking. Full MLOps pipeline with CI/CD!

Would love to hear your feedback! 🙏

#multimodal #gemma #clip #llava #vision-language #pytorch
ovi054 posted an update 4 months ago
Z-Image Turbo + LoRA ⚡

ovi054/Z-Image-LORA

Z-Image Turbo is the No. 1 trending Text-to-Image model right now. You can add a custom LoRA and generate images with this Space.

👉 Try it now: ovi054/Z-Image-LORA
ovi054 posted an update 4 months ago
Anim Lab AI⚡

Turn any math concept or logic into a clear video explanation instantly using AI.

This is my submission for the MCP 1st Birthday Hackathon, and it’s already crossed 1,000 runs.

👉 Try it now: MCP-1st-Birthday/anim-lab-ai

Demo outputs are attached 👇
ovi054 posted an update 4 months ago
Introducing Anim Lab AI⚡

My submission for the MCP 1st Birthday Hackathon

Turn any math concept or logic into a clear video explanation instantly using AI.

👉 Try it now: MCP-1st-Birthday/anim-lab-ai

Demo outputs are attached 👇
ovi054 posted an update 8 months ago
Image-to-Prompt⚡

ovi054/image-to-prompt

Extract a text prompt from an image, then reuse that prompt to generate similar images!

Useful for prompt engineering, studying image-to-text alignment, making training datasets, or recreating similar outputs.

Powered by: Gradio, Florence-2

👉 Try it now: ovi054/image-to-prompt
ovi054 posted an update 8 months ago
Update on https://huggingface.co/spaces/ovi054/Qwen-Image-LORA

You can now load a Qwen LoRA in this Space using any of the following input formats:

1. Model ID:
flymy-ai/qwen-image-realism-lora

2. Model link:
https://huggingface.co/flymy-ai/qwen-image-realism-lora

3. Specific file link:
https://huggingface.co/flymy-ai/qwen-image-realism-lora/blob/main/flymy_realism.safetensors

4. Direct download link:
https://huggingface.co/flymy-ai/qwen-image-realism-lora/resolve/main/flymy_realism.safetensors

You can also use an external .safetensors download link (if Hugging Face doesn’t block it).

This is useful when a model repository contains multiple weight files and you want to load a specific one.
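The four formats above all reduce to one direct-download (resolve) URL. Here is a hypothetical normalizer sketching that mapping; the Space's real parsing logic may differ, and the default filename used for bare repo IDs is an assumption:

```python
HF = "https://huggingface.co"
DEFAULT_FILE = "pytorch_lora_weights.safetensors"  # assumed default filename

def to_resolve_url(link: str, default_file: str = DEFAULT_FILE) -> str:
    """Turn any of the four input formats into a direct-download URL."""
    if link.endswith(".safetensors"):
        # Formats 3 and 4 (or an external link): rewrite blob -> resolve;
        # a resolve/external link passes through unchanged.
        return link.replace("/blob/", "/resolve/")
    # Formats 1 and 2: bare repo ID or repo link; assume a default filename.
    repo_id = link.removeprefix(HF + "/").strip("/")
    return f"{HF}/{repo_id}/resolve/main/{default_file}"

print(to_resolve_url(
    "https://huggingface.co/flymy-ai/qwen-image-realism-lora/blob/main/flymy_realism.safetensors"
))
# -> https://huggingface.co/flymy-ai/qwen-image-realism-lora/resolve/main/flymy_realism.safetensors
```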

👉 Try it now: https://huggingface.co/spaces/ovi054/Qwen-Image-LORA
ovi054 posted an update 8 months ago
WAN 2.2 Text to Image ⚡

ovi054/wan2-2-text-to-image

We all know that WAN 2.2 A14B is a video model. But it turns out this video model can also produce great image results with incredible prompt adherence! The image output is sharp and detailed, and it sticks to the prompt better than most.

👉 Try it now: ovi054/wan2-2-text-to-image