IWSLT/ted_talks_iwslt
How to use dhintech/marian-tedtalks_clean-id-en with Transformers:
# Use a pipeline as a high-level helper
# Warning: Pipeline type "translation" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# pip install "transformers<5.0.0"
from transformers import pipeline
pipe = pipeline("translation", model="dhintech/marian-tedtalks_clean-id-en")
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("dhintech/marian-tedtalks_clean-id-en")
model = AutoModelForSeq2SeqLM.from_pretrained("dhintech/marian-tedtalks_clean-id-en")
This model is a fine-tuned version of Helsinki-NLP/opus-mt-id-en, specialized for translating Indonesian to English, particularly the kind of language found in TED Talks.
Library: transformers
Base model: Helsinki-NLP/opus-mt-id-en
Languages: Indonesian (id) → English (en)
from transformers import MarianMTModel, MarianTokenizer
model_name = "dhintech/marian-tedtalks_clean-id-en"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)
# Move the model to the GPU if one is available
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
def translate(text):
    # Tokenize the input, truncating anything beyond 128 tokens
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=128).to(device)
    # Generate with beam search; no gradients are needed for inference
    with torch.no_grad():
        outputs = model.generate(**inputs, num_beams=4, early_stopping=True)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
# Example usage
indonesian_text = "Selamat pagi, mari kita mulai rapat hari ini."
english_translation = translate(indonesian_text)
print(f"ID: {indonesian_text}")
print(f"EN: {english_translation}")
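Because the translate function above truncates its input at 128 tokens, longer passages would be cut off silently. One possible workaround is to split the text into sentences and translate them one at a time. The sketch below is illustrative only: split_sentences and translate_long are hypothetical helper names that are not part of this model or of transformers, and the regex splitter is deliberately naive (it will mis-split abbreviations and decimal numbers).

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def translate_long(text: str, translate_fn: Callable[[str], str]) -> str:
    """Translate sentence by sentence and join the results with spaces.

    translate_fn is assumed to be a single-string translator such as the
    translate() function defined above.
    """
    return " ".join(translate_fn(sentence) for sentence in split_sentences(text))
```

With the model loaded as shown earlier, translate_long(long_text, translate) would pass each sentence through the model separately. This trades some cross-sentence context for the guarantee that no sentence is dropped by truncation, provided each individual sentence stays under the token limit.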
Performance metrics such as BLEU score, inference time, and human evaluation will be added here after the model has been fully trained and evaluated.
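Until official numbers are published, inference time can be estimated locally. The following is a minimal sketch, not the author's evaluation procedure: average_latency is a hypothetical helper, and it assumes a single-string translation callable such as the translate function above. Results depend heavily on hardware, sentence length, and generation settings.

```python
import time
from statistics import mean
from typing import Callable, Sequence


def average_latency(translate_fn: Callable[[str], str],
                    sentences: Sequence[str],
                    warmup: int = 1) -> float:
    """Return the mean wall-clock seconds per call to translate_fn."""
    if not sentences:
        raise ValueError("sentences must not be empty")
    # Warm-up calls are run but not timed (first calls are often slower)
    for sentence in sentences[:warmup]:
        translate_fn(sentence)
    timings = []
    for sentence in sentences:
        start = time.perf_counter()
        translate_fn(sentence)
        timings.append(time.perf_counter() - start)
    return mean(timings)
```

For example, average_latency(translate, sample_sentences) would return the mean time in seconds per sentence, where sample_sentences is any list of Indonesian strings you supply.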
Feedback and contributions are welcome! Please use the Community tab or open an issue on the repository if you encounter any problems or have suggestions for improvement.
Base model
Helsinki-NLP/opus-mt-id-en