Sagar pallai (sagar007) · PRO
28 followers · 30 following
AI & ML interests
LLMs and Stable Diffusion
Recent Activity
Replied to their post · about 22 hours ago
I built a multimodal vision-language model using Gemma-270M + CLIP! Just finished training my multimodal model on the full LLaVA-Instruct-150K dataset (157K samples) and wanted to share the results!

What I built: a vision-language model that can understand images and answer questions about them, combining:
- Google Gemma-3-270M (language)
- OpenAI CLIP ViT-Large/14 (vision)
- LoRA fine-tuning for efficiency

Training stats:
- 157,712 training samples (full LLaVA dataset)
- 3 epochs on an A100 40GB
- ~9 hours training time
- Final loss: 1.333 training / 1.430 validation
- Only 18.6M trainable params (3.4% of 539M total)

https://huggingface.co/sagar007/multigemma

Benchmark results:
- VQA accuracy: 53.8%
- Works well for: animal detection, room identification, scene understanding

**Try it yourself:**
- Model: https://huggingface.co/sagar007/multigemma
- Demo: https://huggingface.co/spaces/sagar007/Multimodal-Gemma
- GitHub: https://github.com/sagar431/multimodal-gemma-270m

Built with PyTorch Lightning + MLflow for experiment tracking, plus a full MLOps pipeline with CI/CD. Would love to hear your feedback!

#multimodal #gemma #clip #llava #vision-language #pytorch
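For readers curious how the pieces fit together, here is a minimal sketch of the general pattern the post describes: a frozen CLIP vision tower whose patch features are projected into a small Gemma LM's embedding space, with LoRA adapters so only a few percent of parameters train. The class name, projection design, and LoRA settings below are illustrative assumptions, not the code from the linked repo.

```python
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, CLIPVisionModel
from peft import LoraConfig, get_peft_model


class TinyVLM(nn.Module):
    """Hypothetical sketch: CLIP patch features projected into Gemma's embedding space."""

    def __init__(self):
        super().__init__()
        # Frozen CLIP ViT-Large/14 vision tower
        self.vision = CLIPVisionModel.from_pretrained("openai/clip-vit-large-patch14")
        self.vision.requires_grad_(False)

        # Gemma-3-270M language model wrapped with LoRA adapters (illustrative settings)
        lm = AutoModelForCausalLM.from_pretrained("google/gemma-3-270m")
        lora = LoraConfig(r=16, lora_alpha=32,
                          target_modules=["q_proj", "v_proj"],
                          task_type="CAUSAL_LM")
        self.lm = get_peft_model(lm, lora)

        # Trainable linear projection from CLIP hidden size to Gemma hidden size
        self.proj = nn.Linear(self.vision.config.hidden_size, lm.config.hidden_size)

    def forward(self, pixel_values, input_ids, labels=None):
        # Encode the image and project patch embeddings into the LM embedding space
        patches = self.vision(pixel_values=pixel_values).last_hidden_state
        image_embeds = self.proj(patches)

        # Prepend image embeddings to the text token embeddings
        text_embeds = self.lm.get_input_embeddings()(input_ids)
        inputs_embeds = torch.cat([image_embeds, text_embeds], dim=1)

        if labels is not None:
            # Mask image positions out of the loss (-100 is the ignore index)
            ignore = torch.full(image_embeds.shape[:2], -100,
                                dtype=labels.dtype, device=labels.device)
            labels = torch.cat([ignore, labels], dim=1)

        return self.lm(inputs_embeds=inputs_embeds, labels=labels)
```

In an arrangement like this only the projection layer and the LoRA weights receive gradients, which is what keeps the trainable-parameter count in the tens of millions rather than the full model size.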
Reacted to unmodeled-tyler's post · 1 day ago
NEW MODEL: https://huggingface.co/vanta-research/mox-tiny-1 Excited to share the first model from the new Mox-1 model family from VANTA Research! Mox-Tiny-1 is finetuned from Llama's 3.1 8B architecture specifically for thinking partnership and high-fidelity AI interaction. Training domains were carefully selected with this goal in mind, and all datasets used were synthetically generated for the Mox-1 models specifically. Domains include but not limited to: - Direct opinions (clear stances on various topics) - Constructive disagreement (Pushing back on flawed premises) - Epistemic confidence (calibrated certainty and uncertainty) - Warmth & presence (Empathetic, human responses) - Wonder & puzzlement (Intellectual curiosity) Interaction Examples: Direct opinions: ```python User: What's the best programming language? Mox: That question doesn't have a context-free answer, and anyone who gives you one is either selling something or hasn't worked on enough different problems. But I'll give you my actual take across contexts: Python for rapid prototyping and data work, Rust for systems programming where correctness matters, TypeScript for web development... ``` Warmth & Humor: ```python User: I've been debugging this issue for six hours. Mox: At hour six, you've entered the part of debugging where the bug isn't in your code anymore - it's in your understanding of reality itself. The semicolon starts looking suspicious. Show me what you're dealing with. Fresh eyes help, and mine haven't been staring at it since lunch. ``` Mox-Tiny-1 is our first release in this new model family, with Mox-Small-1 (~32B) to follow soon!
Replied to their post · 2 days ago
sagar007's datasets (4)
- sagar007/sagar_flux · Updated Aug 20, 2024 · 2
- sagar007/mini-odia2 · Viewer · Updated Dec 24, 2023 · 1k · 5
- sagar007/mini-odia · Updated Dec 24, 2023 · 3
- sagar007/mini-platypus · Updated Dec 24, 2023 · 12