tuandunghcmut's Collections
Multilingual Parallel Text Corpus
Updated Jan 8
vietgpt/opus100_envi • Viewer • Updated Jul 3, 2023 • 1M • 26 • 4
tuandunghcmut/PhoMT-MTet-Mixture • Viewer • Updated Aug 11, 2025 • 7.62M • 112 • 1
airesearch/scb_mt_enth_2020 • Updated Jan 18, 2024 • 83 • 9
Helsinki-NLP/opus_paracrawl • Viewer • Updated Feb 22, 2024 • 27.3M • 406 • 6
Helsinki-NLP/opus_books • Viewer • Updated Mar 29, 2024 • 1.25M • 17.3k • 87
Helsinki-NLP/open_subtitles • Updated Jan 18, 2024 • 720 • 75
Helsinki-NLP/OpenSubtitles2024 • Viewer • Updated Jun 23, 2025 • 570M • 104 • 3