How to use Aparna852/de-en-translator with Transformers:
# Use a pipeline as a high-level helper
# Warning: Pipeline type "translation" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# 'pip install "transformers<5.0.0"'
from transformers import pipeline
pipe = pipeline("translation", model="Aparna852/de-en-translator")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("Aparna852/de-en-translator")
model = AutoModelForSeq2SeqLM.from_pretrained("Aparna852/de-en-translator")

A transformer-based German → English translation model fine-tuned on a custom split of the WMT16 (de-en) dataset using 🤗 Transformers and Seq2SeqTrainer.
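Once the tokenizer and model are loaded directly as shown above, a batch of sentences could be translated roughly as follows. This is a minimal sketch, not the author's inference code: the `translate` helper is hypothetical, the `max_length=64` default is borrowed from the training configuration, and it assumes the model needs no task prefix on the input text.

```python
def translate(texts: list[str], tokenizer, model, max_length: int = 64) -> list[str]:
    """Translate a batch of German sentences to English (hypothetical helper)."""
    # Pad and truncate so the whole batch fits into one tensor.
    inputs = tokenizer(
        texts,
        return_tensors="pt",
        padding=True,
        truncation=True,
        max_length=max_length,  # assumption: same limit as used in training
    )
    # Generate output token ids; max_new_tokens bounds the translation length.
    output_ids = model.generate(**inputs, max_new_tokens=max_length)
    # Convert ids back to text, dropping padding and end-of-sequence tokens.
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

With the objects loaded above, a call would look like `translate(["Das Haus ist klein."], tokenizer, model)`.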
- Model: Aparna852/german-english-translator (fine-tuned)
- Dataset: wmt/wmt16 (de-en)
- Metric: sacrebleu

| Parameter | Value |
|---|---|
| Dataset | wmt/wmt16 (German-English) |
| Train Size | ~2.5% of original training set |
| Validation Size | ~2.8% of original validation |
| Max Length | 64 |
| Epochs | 3 |
| Train Batch Size | 4 |
| Eval Batch Size | 4 |
| Gradient Accumulation | 8 |
| Learning Rate | 1e-5 |
| Weight Decay | 0.03 |
| Warmup Steps | 500 |
| FP16 (Mixed Precision) | True (if CUDA available) |
| Scheduler | linear |
| Evaluation Strategy | epoch |
| Save Strategy | epoch |
| Logging Steps | 10 |
| Early Stopping | patience=2 |
| Metric for Best Model | eval_loss |
| Trainer API | Seq2SeqTrainer from 🤗 Transformers |
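Two rows of the table interact in ways that are easy to miss: the train batch size and gradient accumulation together set the effective batch size, and the warmup steps and linear scheduler together set the learning rate at each step. The sketch below works through both in plain Python. The function names are hypothetical, a single device is assumed, and `total_steps=3000` is an illustrative placeholder rather than the actual step count of this run. The schedule formula follows the usual linear-warmup-then-linear-decay shape.

```python
def effective_batch_size(per_device_batch: int, grad_accum_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_warmup_lr(step: int, base_lr: float = 1e-5, warmup_steps: int = 500,
                     total_steps: int = 3000) -> float:
    """Learning rate at an optimizer step under linear warmup and linear decay.

    total_steps is a hypothetical value chosen for illustration only.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr over the warmup phase.
        return base_lr * (step / max(1, warmup_steps))
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Batch size 4 with 8 accumulation steps on one device.
print(effective_batch_size(4, 8))  # 32
```

Under these assumptions the peak learning rate of 1e-5 is reached at step 500, so a run with far fewer than 500 optimizer steps would never leave the warmup phase.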
You can run the evaluation after training using:
from evaluate import load
bleu = load("sacrebleu")
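# Compute BLEU on decoded text; sacrebleu applies its own tokenization
preds = [...] # Generated translations (plain strings)
refs = [...] # Reference translations (plain strings)
# sacrebleu expects a list of references for each prediction
result = bleu.compute(predictions=preds, references=[[r] for r in refs])
print(result["score"])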
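Before calling `bleu.compute`, it can help to normalize the decoded strings and check that predictions and references line up one to one. The helper below is a hedged sketch of that step. The name `postprocess_text` and the length check are illustrative assumptions, not part of the original training script.

```python
def postprocess_text(preds: list[str], refs: list[str]) -> tuple[list[str], list[list[str]]]:
    """Strip whitespace and wrap each reference in a list (hypothetical helper)."""
    if len(preds) != len(refs):
        raise ValueError(
            f"Got {len(preds)} predictions but {len(refs)} references"
        )
    # Leading and trailing whitespace is removed so it does not affect scoring.
    clean_preds = [p.strip() for p in preds]
    # One reference per prediction, wrapped to match the expected nested shape.
    wrapped_refs = [[r.strip()] for r in refs]
    return clean_preds, wrapped_refs
```

The returned pair could then be passed as `predictions` and `references` to `bleu.compute`.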