Zhiyan (廌言)
A specialized TTS model for reconstructing Middle Chinese phonology.
Zhiyan (廌言) is a text-to-speech (TTS) model that takes International Phonetic Alphabet (IPA) transcriptions as input and generates high-quality audio of reconstructed Middle Chinese (the Qieyun phonological system).
This repository contains multiple versions of the model, all trained on the same dataset but with different training strategies.
📦 Repository Structure
Zhiyan/
├── v1.2/
│ ├── zhiyan_v1.2.onnx
│ └── zhiyan_v1.2.json
├── v1.3/
│ ├── zhiyan_v1.3.onnx
│ └── zhiyan_v1.3.json
├── symbols.py
├── dictionary.txt
├── LICENSE
└── README.md
🔊 Model Versions
v1.2
- Single-speaker model
- Trained on all audio data combined
v1.3
- Multi-speaker model (2 speakers)
- Trained with the audio data separated by speaker
- Speaker ID 0 (sid = 0) gives clearer output and is recommended
🚀 Usage
Requirements
- A working vits2_pytorch environment
Before inference:
- Replace the original symbols.py in vits2_pytorch with the one provided in this repository
Inference
The ONNX model and config are used the same way as in vits2_pytorch. Please refer to the original project for details.
Example:
python infer_onnx.py \
--model="zhiyan_v1.2.onnx" \
--config-path="configs/zhiyan_v1.2.json" \
--output-wav-path="output.wav" \
--text="«ʈʉuŋ kó hɑ̌n ŋɨə́ , ŋɨə́ ʔịm ɦəp ʥiaŋ .»"
Corresponding Chinese text: 中古漢語,語音合成。
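For scripted use, the same command can be assembled from Python. The following is a minimal sketch, not part of this repository: the helper names `build_infer_command` and `run_inference` are hypothetical, the default paths simply mirror the example above, and the optional `--sid` flag is an assumption about `infer_onnx.py` that should be checked against the script's `--help` output before relying on it.

```python
import subprocess
import sys
from typing import List, Optional


def build_infer_command(
    text: str,
    model: str = "zhiyan_v1.2.onnx",
    config_path: str = "configs/zhiyan_v1.2.json",
    output_wav_path: str = "output.wav",
    sid: Optional[int] = None,
    script: str = "infer_onnx.py",
) -> List[str]:
    """Build the argument list for the vits2_pytorch ONNX inference script."""
    cmd = [
        sys.executable,  # the interpreter running this code, in place of "python"
        script,
        f"--model={model}",
        f"--config-path={config_path}",
        f"--output-wav-path={output_wav_path}",
        f"--text={text}",
    ]
    if sid is not None:
        # Assumption: the script accepts --sid for multi-speaker models (v1.3).
        cmd.append(f"--sid={sid}")
    return cmd


def run_inference(text: str, **kwargs) -> None:
    """Run the script from the current directory; raises on a non-zero exit."""
    subprocess.run(build_infer_command(text, **kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems with the « » delimiters and the IPA characters. `run_inference` is meant to be called from inside the vits2_pytorch checkout, where `infer_onnx.py` lives.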
📝 Text Format Requirements
- Input text must start with « and end with »
- All syllables (including punctuation) must be separated by spaces
Allowed characters:
,.abdeghijklmnopstuwyzŋɑɕɖəɦɨɲɳʂʈʉʐʑʔʣʥʦʨʰʷ́̌ạẹịọỵꭦꭧ
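The rules above can be checked before a string is sent to the model. The sketch below is illustrative and not part of this repository: `validate_input` is a hypothetical helper, and it tests literal code point membership with no Unicode normalization, so tone marks must be the combining acute (U+0301) and combining caron (U+030C) from the list, and the underdot vowels must be the precomposed letters. How the model's own text frontend treats other encodings is not covered here.

```python
from typing import List

# Allowed characters, copied from the list above. Combining marks and
# precomposed underdot vowels are written as escapes to keep them unambiguous.
ALLOWED = set(
    ",.abdeghijklmnopstuwyz"
    "ŋɑɕɖəɦɨɲɳʂʈʉʐʑʔʣʥʦʨʰʷ"
    "\u0301\u030c"                      # combining acute, combining caron
    "\u1ea1\u1eb9\u1ecb\u1ecd\u1ef5"    # a, e, i, o, y with dot below
    "\uab66\uab67"                      # the last two letters in the list
)


def validate_input(text: str) -> List[str]:
    """Return a list of problems; an empty list means the text passes."""
    problems = []
    if not (len(text) >= 2 and text.startswith("«") and text.endswith("»")):
        problems.append("text must start with « and end with »")

    # Drop the delimiters, then split on whitespace into syllables/punctuation.
    for token in text.strip("«»").split():
        bad = sorted(set(token) - ALLOWED)
        if bad:
            problems.append(f"disallowed characters {bad!r} in token {token!r}")
        if len(token) > 1 and any(ch in ",." for ch in token):
            problems.append(f"punctuation must be its own token: {token!r}")
    return problems
```

A call such as `validate_input("«ɦəp ʥiaŋ .»")` returns an empty list, while a string without the delimiters, or with a period attached to a syllable, returns one message per problem.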
📚 Dictionary
dictionary.txt covers over 26,000 Chinese characters. Each entry includes:
- Character headword
- Middle Chinese transcription
- Definitions
- Frequency information
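A character-to-transcription lookup is what turns Chinese text into model input. The sketch below shows the idea with a small in-memory table built only from the usage example in this README (中古漢語,語音合成。 and its transcription). It does not parse dictionary.txt, because the file layout is not described here; `READINGS`, `PUNCTUATION`, and `to_model_input` are hypothetical names, and the punctuation mapping is inferred from that same example.

```python
from typing import Dict, List

# Hypothetical lookup table: one reading per character, taken from the
# README example. Many characters have several readings, which a real
# lookup would have to choose between.
READINGS: Dict[str, str] = {
    "中": "ʈʉuŋ",
    "古": "ko\u0301",       # o + combining acute
    "漢": "hɑ\u030cn",      # ɑ + combining caron
    "語": "ŋɨə\u0301",      # ə + combining acute
    "音": "ʔ\u1ecbm",       # i with dot below
    "合": "ɦəp",
    "成": "ʥiaŋ",
}

# Full-width comma and ideographic full stop map to the ASCII marks
# that appear in the allowed character list.
PUNCTUATION: Dict[str, str] = {"\uff0c": ",", "\u3002": "."}


def to_model_input(chinese: str) -> str:
    """Convert Chinese text to a space-separated string wrapped in « »."""
    tokens: List[str] = []
    for ch in chinese:
        if ch.isspace():
            continue
        if ch in PUNCTUATION:
            tokens.append(PUNCTUATION[ch])
        elif ch in READINGS:
            tokens.append(READINGS[ch])
        else:
            raise KeyError(f"no reading available for {ch!r}")
    return "«" + " ".join(tokens) + "»"
```

Raising on an unknown character, instead of skipping it, keeps a missing entry from silently shortening the synthesized sentence.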
🌐 Online Demo
You can use the model for free via the official service:
📜 License
This model is released under the Qieyun TTS Model License v2.1 (a custom license).
Summary
✅ Inference and audio generation allowed
❌ No model training or fine-tuning
❌ No dataset creation
❌ No API, SaaS, or commercial deployment
For commercial licensing, please contact the author.
See LICENSE for full terms.
📬 Contact
For licensing or commercial use:
❗ Disclaimer
Provided "as is" without warranty. The author is not liable for any damages.