How to use DKYoon/mt5-small-lm-adapt with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("DKYoon/mt5-small-lm-adapt")
model = AutoModelForSeq2SeqLM.from_pretrained("DKYoon/mt5-small-lm-adapt")

🤗 Language model initialized from mT5 and trained for an additional 100K steps on the Prefix LM objective using mC4 data.
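Once loaded, the model can be used like any other seq2seq LM in Transformers. A minimal generation sketch (the prompt text and generation settings are illustrative; since this checkpoint is only LM-adapted, not fine-tuned on a downstream task, expect a raw continuation rather than a task-specific output):

# Generate a continuation for a prefix (prompt and settings are illustrative)
inputs = tokenizer("The weather today is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))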
Paper: Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation (arXiv:2205.12647)
Authors: Tu Vu, Aditya Barua, Brian Lester, Daniel Cer, Mohit Iyyer, Noah Constant
PyTorch port of the original Flax checkpoint from Google's T5X repository.