Instructions to use facebook/seamless-m4t-v2-large with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use facebook/seamless-m4t-v2-large with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="facebook/seamless-m4t-v2-large")

# Load model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("facebook/seamless-m4t-v2-large")
model = AutoModel.from_pretrained("facebook/seamless-m4t-v2-large")
```
- Notebooks
- Google Colab
- Kaggle
Questions on the model capabilities and features
#50
by yevgeniyilyin - opened
We started a PoC with this model and have a couple of questions that we could not find answered in the documentation:
- Is there an option to generate a female voice?
- Are there any options for tuning the output with tags, for example for speaking rate, intonation, correct pronunciation of currencies and dates, or a higher or lower voice pitch? Is there any mechanism to achieve this with the model?
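
For context on the voice question, here is a minimal sketch of the one voice-related knob that appears in the Transformers API for this model: the `speaker_id` argument of `SeamlessM4Tv2Model.generate`, which selects one of the vocoder's speaker embeddings (the count is read from `config.vocoder_num_spkrs`). Which index, if any, corresponds to a female voice is not something this sketch assumes; it would have to be found by listening. The helper names `check_speaker_id` and `synthesize` are hypothetical, and nothing here addresses tags for speed, pitch, or pronunciation.

```python
def check_speaker_id(speaker_id, num_speakers):
    """Return speaker_id if it is a valid index into the vocoder's speaker table."""
    if not 0 <= speaker_id < num_speakers:
        raise ValueError(
            f"speaker_id must be in [0, {num_speakers - 1}], got {speaker_id}"
        )
    return speaker_id


def synthesize(text, src_lang="eng", tgt_lang="eng", speaker_id=0):
    """Return (waveform, sampling_rate) for `text` spoken with the given speaker index.

    Hypothetical helper: it reloads the model on every call, which is fine for
    a one-off experiment but not for production use.
    """
    # Imported lazily so the validation helper above has no heavy dependencies.
    from transformers import AutoProcessor, SeamlessM4Tv2Model

    model_id = "facebook/seamless-m4t-v2-large"
    processor = AutoProcessor.from_pretrained(model_id)
    model = SeamlessM4Tv2Model.from_pretrained(model_id)

    # The number of available speaker embeddings comes from the model config.
    check_speaker_id(speaker_id, model.config.vocoder_num_spkrs)

    inputs = processor(text=text, src_lang=src_lang, return_tensors="pt")
    # With speech generation enabled (the default), the first output is the waveform.
    waveform = model.generate(**inputs, tgt_lang=tgt_lang, speaker_id=speaker_id)[0]
    return waveform.cpu().numpy().squeeze(), model.config.sampling_rate
```

Calling `synthesize("The invoice is due on Friday.", speaker_id=k)` for a handful of values of `k` and comparing the audio by ear would be one way to check whether a suitable voice exists among the speakers.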