Mozilla/flickr30k-transformed-captions
How to use tarekziade/distilvit-pexels-frozen with Transformers:

# Use a pipeline as a high-level helper
# Warning: Pipeline type "image-to-text" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# pip install "transformers<5.0.0"
from transformers import pipeline
pipe = pipeline("image-to-text", model="tarekziade/distilvit-pexels-frozen")

# Load model directly
from transformers import AutoTokenizer, AutoModelForImageTextToText
tokenizer = AutoTokenizer.from_pretrained("tarekziade/distilvit-pexels-frozen")
model = AutoModelForImageTextToText.from_pretrained("tarekziade/distilvit-pexels-frozen")

This model is a work in progress. It is a fine-tuned version of the following base models:
This model was trained on:
You can get that checkpoint using the 3083a3cef6e3c8dd90df3f088074bbe836b0f403 commit.
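One way to pin that checkpoint is the `revision` argument of `from_pretrained`, which accepts a commit hash. The sketch below assumes the hash refers to a commit of the `tarekziade/distilvit-pexels-frozen` repository on the Hugging Face Hub; the card does not say which repository the commit belongs to, so adjust `MODEL_ID` if it lives elsewhere. The helper name `load_checkpoint` is illustrative only.

```python
# Assumption: the commit hash identifies a revision of this model repository
# on the Hugging Face Hub (the card does not name the repository explicitly).
MODEL_ID = "tarekziade/distilvit-pexels-frozen"
REVISION = "3083a3cef6e3c8dd90df3f088074bbe836b0f403"


def load_checkpoint(model_id: str = MODEL_ID, revision: str = REVISION):
    """Load the tokenizer and model pinned to a single commit.

    Hypothetical helper: transformers is imported lazily so that this
    module can be imported without triggering any download.
    """
    from transformers import AutoModelForImageTextToText, AutoTokenizer

    # `revision` accepts a branch name, a tag, or a full commit hash.
    tokenizer = AutoTokenizer.from_pretrained(model_id, revision=revision)
    model = AutoModelForImageTextToText.from_pretrained(model_id, revision=revision)
    return tokenizer, model
```

Calling `tokenizer, model = load_checkpoint()` downloads the files as they were at that commit, so later pushes to the repository do not change what is loaded.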
It was then further fine-tuned on:
For the latter, the dataset was annotated by our team to correct the alt text generated by the model, using the checkvite tool.
You can find the code used to create the model here: https://github.com/mozilla/distilvit
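For readers who load the model directly rather than through a pipeline, a minimal captioning sketch might look like the following. It assumes the repository ships an image preprocessor configuration that `AutoImageProcessor` can load, that PyTorch and Pillow are installed, and that the model supports `generate` with `pixel_values`; the function name `caption_image` and the `max_new_tokens` default are illustrative choices, not values taken from the model card.

```python
MODEL_ID = "tarekziade/distilvit-pexels-frozen"


def caption_image(image_path: str, model_id: str = MODEL_ID, max_new_tokens: int = 30) -> str:
    """Generate alt text for one image file (hypothetical helper)."""
    # Lazy imports keep this module importable without the heavy dependencies.
    from PIL import Image
    from transformers import (
        AutoImageProcessor,
        AutoModelForImageTextToText,
        AutoTokenizer,
    )

    # Assumption: the repository provides a preprocessor config for the image processor.
    processor = AutoImageProcessor.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForImageTextToText.from_pretrained(model_id)

    # Convert to RGB so grayscale or RGBA files match the expected input channels.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Generate token IDs for the caption, then decode them into text.
    output_ids = model.generate(pixel_values=pixel_values, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True).strip()
```

A call such as `caption_image("photo.jpg")` returns the caption as a plain string; `photo.jpg` is a placeholder for any local image file.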
Base model: google/vit-base-patch16-224-in21k