<p align="center">
 <img src="https://github.com/Tencent-Hunyuan/HY-MT/raw/main/imgs/hunyuanlogo.png" width="400"/> <br>
</p><p></p>

<p align="center">
 🤗 <a href="https://huggingface.co/collections/tencent/hy-mt15"><b>Hugging Face</b></a> |
 🕹️ <a href="https://hunyuan.tencent.com/chat/HunyuanDefault?from=modelSquare&modelId=hunyuan-mt-1.8b"><b>Demo</b></a> |
 🤖 <a href="https://modelscope.cn/collections/Tencent-Hunyuan/HY-MT15"><b>ModelScope</b></a>
</p>

<p align="center">
 🖥️ <a href="https://hunyuan.tencent.com"><b>Official Website</b></a> |
 <a href="https://github.com/Tencent-Hunyuan/HY-MT"><b>Github</b></a>
</p>

## Model Introduction

Hunyuan Translation Model Version 1.5 includes a 1.8B translation model, HY-MT1.5-1.8B, and a 7B translation model, HY-MT1.5-7B. Both models support mutual translation across 33 languages, including 5 ethnic-minority languages and dialects. HY-MT1.5-7B is an upgraded version of our WMT25 championship model, optimized for explanatory translation and mixed-language scenarios, with newly added support for terminology intervention, contextual translation, and formatted translation. Despite having less than one third the parameters of HY-MT1.5-7B, HY-MT1.5-1.8B delivers comparable translation performance, achieving both high speed and high quality. After quantization, the 1.8B model can be deployed on edge devices and supports real-time translation, making it widely applicable.

## Key Features and Advantages

- HY-MT1.5-1.8B achieves industry-leading performance among models of its size, surpassing most commercial translation APIs.
- HY-MT1.5-1.8B supports deployment on edge devices and real-time translation scenarios, offering broad applicability.
- HY-MT1.5-7B, compared with its September open-source version, has been optimized for annotated and mixed-language scenarios.
- Both models support terminology intervention, contextual translation, and formatted translation.

## Related News
* 2025.12.30: We open-sourced **HY-MT1.5-1.8B** and **HY-MT1.5-7B** on Hugging Face.
* 2025.9.1: We open-sourced **Hunyuan-MT-7B** and **Hunyuan-MT-Chimera-7B** on Hugging Face.
<br>
## Performance

<div align='center'>
<img src="https://github.com/Tencent-Hunyuan/HY-MT/raw/main/imgs/overall_performance.png" width="80%" />
</div>

You can refer to our technical report for more experimental results and analysis:
<a href="https://github.com/Tencent-Hunyuan/Hunyuan-MT/raw/main/HY_MT1_5_Technical_Report.pdf"><b>Technical Report</b></a>

## Model Links
| Model Name | Description | Download |
| ----------- | ----------- | ----------- |
| HY-MT1.5-1.8B | Hunyuan 1.8B translation model | 🤗 [Model](https://huggingface.co/tencent/HY-MT1.5-1.8B) |
| HY-MT1.5-1.8B-FP8 | Hunyuan 1.8B translation model, fp8 quantized | 🤗 [Model](https://huggingface.co/tencent/HY-MT1.5-1.8B-FP8) |
| HY-MT1.5-7B | Hunyuan 7B translation model | 🤗 [Model](https://huggingface.co/tencent/HY-MT1.5-7B) |
| HY-MT1.5-7B-FP8 | Hunyuan 7B translation model, fp8 quantized | 🤗 [Model](https://huggingface.co/tencent/HY-MT1.5-7B-FP8) |

## Prompts

### Prompt Template for ZH<=>XX Translation

---
```
将以下文本翻译为{target_language},注意只需要输出翻译后的结果,不要额外解释:

{source_text}
```
---

*(English: "Translate the following text into {target_language}; output only the translated result, with no extra explanation.")*

### Prompt Template for XX<=>XX Translation, excluding ZH<=>XX

---
```
Translate the following segment into {target_language}, without additional explanation.

{source_text}
```
---

### Prompt Template for Terminology Intervention

---
```
参考下面的翻译:
{source_term} 翻译成 {target_term}

将以下文本翻译为{target_language},注意只需要输出翻译后的结果,不要额外解释:
{source_text}
```
---

*(English: "Refer to the following translation: {source_term} is translated as {target_term}. Translate the following text into {target_language}; output only the translated result, with no extra explanation.")*

### Prompt Template for Contextual Translation

---
```
{context}
参考上面的信息,把下面的文本翻译成{target_language},注意不需要翻译上文,也不要额外解释:
{source_text}
```
---

*(English: "Referring to the information above, translate the following text into {target_language}; do not translate the context itself and do not add explanations.")*

### Prompt Template for Formatted Translation

---
```
将以下<source></source>之间的文本翻译为中文,注意只需要输出翻译后的结果,不要额外解释,原文中的<sn></sn>标签表示标签内文本包含格式信息,需要在译文中相应的位置尽量保留该标签。输出格式为:<target>str</target>

<source>{src_text_with_format}</source>
```
---

*(English: "Translate the text between <source></source> into Chinese; output only the translated result, with no extra explanation. The <sn></sn> tags in the source mark text that carries formatting information; preserve these tags at the corresponding positions in the translation. Output format: <target>str</target>.")*

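The templates above are plain strings with placeholders. The helpers below are hypothetical (not part of the model repository) and show one way to fill the ZH<=>XX template and to pull the translation out of a formatted-translation response:

```python
import re

def build_zh_prompt(source_text: str, target_language: str) -> str:
    """Fill the ZH<=>XX template; target_language is a Chinese language name, e.g. "英语"."""
    return (
        f"将以下文本翻译为{target_language},注意只需要输出翻译后的结果,不要额外解释:\n\n"
        f"{source_text}"
    )

def extract_target(response: str) -> str:
    """Extract the translation from a formatted-translation reply of the form <target>str</target>."""
    match = re.search(r"<target>(.*?)</target>", response, re.DOTALL)
    if match is None:
        raise ValueError("no <target>...</target> span found in model output")
    return match.group(1)

prompt = build_zh_prompt("你好,世界", "英语")
```

The filled prompt is then sent to the model as a single user message (see the transformers example below).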
### Use with transformers
First, install transformers; v4.56.0 is recommended:
```SHELL
pip install transformers==4.56.0
```

*!!! If you want to load an fp8 model with transformers, you need to rename the "ignored_layers" key in config.json to "ignore" and upgrade compressed-tensors to v0.11.0.*

The following code snippet shows how to use the transformers library to load the model; we use tencent/HY-MT1.5-1.8B as an example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name_or_path = "tencent/HY-MT1.5-1.8B"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForCausalLM.from_pretrained(model_name_or_path, device_map="auto")  # You may want to use bfloat16 and/or move to GPU here
```

Supported languages:
| Languages | Abbr. | Chinese Names |
|-------------------|---------|-----------------|
| … | … | … |
| Kazakh | kk | 哈萨克语 |
| Mongolian | mn | 蒙古语 |
| Uyghur | ug | 维吾尔语 |
| Cantonese | yue | 粤语 |
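
The Chinese names in the table are presumably what goes into {target_language} in the ZH<=>XX template, judging by the template's Chinese wording. A partial lookup built only from the rows shown above (the full table covers all supported languages):

```python
# Partial mapping (abbr -> Chinese name) taken from the rows above;
# the complete supported-languages table has many more entries.
ABBR_TO_ZH_NAME = {
    "kk": "哈萨克语",   # Kazakh
    "mn": "蒙古语",     # Mongolian
    "ug": "维吾尔语",   # Uyghur
    "yue": "粤语",      # Cantonese
}

def target_language_for(abbr: str) -> str:
    """Resolve a language abbreviation to the Chinese name used in the prompt."""
    if abbr not in ABBR_TO_ZH_NAME:
        raise ValueError(f"unknown or unlisted language abbreviation: {abbr}")
    return ABBR_TO_ZH_NAME[abbr]
```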