Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular • 16 days ago • 91
TokSuite: Measuring the Impact of Tokenizer Choice on Language Model Behavior Paper • 2512.20757 • Published 10 days ago • 16
EasyV2V: A High-quality Instruction-based Video Editing Framework Paper • 2512.16920 • Published 15 days ago • 17
supertoken Collection The initial checkpoints for the token comparison research. • 20 items • Updated May 22, 2025 • 2
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text Paper • 2506.05209 • Published Jun 5, 2025 • 59