Zhiyan (廌言)

A specialized TTS model for synthesizing reconstructed Middle Chinese pronunciation.

Zhiyan (廌言) is a text-to-speech (TTS) model whose input is a transcription based on the International Phonetic Alphabet (IPA). It is designed to generate high-quality audio of reconstructed Middle Chinese (the Qieyun phonological system).

This repository contains multiple versions of the model, all trained on the same dataset but with different training strategies.

📦 Repository Structure

Zhiyan/
├── v1.2/
│   ├── zhiyan_v1.2.onnx
│   └── zhiyan_v1.2.json
├── v1.3/
│   ├── zhiyan_v1.3.onnx
│   └── zhiyan_v1.3.json
├── symbols.py
├── dictionary.txt
├── LICENSE
└── README.md

🔊 Model Versions

v1.2

Single-speaker model
Trained on all audio data combined

v1.3

Multi-speaker model (2 speakers)
Trained with the audio data separated by speaker
Speaker ID 0 (sid = 0) produces clearer output and is recommended

🚀 Usage

Requirements

A working vits2_pytorch environment

Before inference:

Replace the original symbols.py in vits2_pytorch with the one provided in this repository
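
The replacement step can be scripted. The following is a minimal sketch, assuming the vits2_pytorch checkout keeps its symbol table at text/symbols.py (an assumption about that project's layout, so verify it in your own checkout). The helper name install_symbols and both directory arguments are hypothetical. It keeps a backup of the original file so the change can be undone.

```python
import shutil
from pathlib import Path


def install_symbols(zhiyan_dir: Path, vits2_dir: Path) -> Path:
    """Copy Zhiyan's symbols.py over the one in a vits2_pytorch checkout.

    Assumption: vits2_pytorch stores its symbol table at text/symbols.py.
    Returns the path of the backup made of the original file.
    """
    source = zhiyan_dir / "symbols.py"
    target = vits2_dir / "text" / "symbols.py"  # assumed location
    if not source.is_file():
        raise FileNotFoundError(f"missing {source}")
    if not target.is_file():
        raise FileNotFoundError(f"missing {target}; check the assumed layout")
    backup = target.with_name("symbols.py.orig")
    if not backup.exists():
        # Back up only once, so a second run does not overwrite the original.
        shutil.copy2(target, backup)
    shutil.copy2(source, target)
    return backup
```

For example, install_symbols(Path("Zhiyan"), Path("vits2_pytorch")) would be called once before the first inference run.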

Inference

The ONNX model and config file are used the same way as in vits2_pytorch. Please refer to the original project for details.

Example:

python infer_onnx.py \
  --model="zhiyan_v1.2.onnx" \
  --config-path="configs/zhiyan_v1.2.json" \
  --output-wav-path="output.wav" \
  --text="«ʈʉuŋ kó hɑ̌n ŋɨə́ , ŋɨə́ ʔịm ɦəp ʥiaŋ .»"

Corresponding Chinese text: 中古漢語，語音合成。
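
When the script is called from Python rather than a shell, passing the arguments as a list avoids quoting problems with « », spaces, and combining diacritics. The following is a minimal sketch that uses only the four flags shown in the example above. build_command is a hypothetical helper, and the script is assumed to be run from the vits2_pytorch directory.

```python
import sys


def build_command(model: str, config: str, wav: str, tokens: list) -> list:
    """Build the infer_onnx.py argument list for a sequence of tokens.

    `tokens` holds syllables and punctuation marks as separate items. They are
    joined with single spaces and wrapped in « », matching the example input.
    """
    text = "«" + " ".join(tokens) + "»"
    return [
        sys.executable, "infer_onnx.py",
        "--model", model,
        "--config-path", config,
        "--output-wav-path", wav,
        "--text", text,
    ]
```

The resulting list can be handed to subprocess.run(cmd, check=True), which raises an exception if the script exits with an error.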

📝 Text Format Requirements

Input text must start with « and end with »
All tokens (syllables and punctuation marks) must be separated by spaces

Allowed characters:

 ,.abdeghijklmnopstuwyzŋɑɕɖəɦɨɲɳʂʈʉʐʑʔʣʥʦʨʰʷ́̌ạẹịọỵꭦꭧ
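
These rules can be checked before running inference. The following is a minimal sketch, not part of this repository: check_input is a hypothetical helper, and the code points chosen for the two combining marks (U+0301 acute, U+030C caron) and the precomposed dotted vowels are assumptions read off the list above. One practical pitfall it catches: as the list appears here, the dotted vowels are precomposed while the acute and caron are combining marks, so normalizing input to NFC can merge o + U+0301 into a single character that is not in the list.

```python
import unicodedata

# Allowed symbols, transcribed from the "Allowed characters" line.
# Assumption: the exact code points below match those in symbols.py.
ALLOWED = set(
    " ,."
    "abdeghijklmnopstuwyz"
    "ŋɑɕɖəɦɨɲɳʂʈʉʐʑʔʣʥʦʨʰʷ"
    "\u0301\u030c"                    # combining acute, combining caron
    "\u1ea1\u1eb9\u1ecb\u1ecd\u1ef5"  # precomposed a, e, i, o, y with dot below
    "\uab66\uab67"                    # retroflex dz and ts digraphs
)


def check_input(text: str) -> list:
    """Return a list of problems found in `text`; an empty list means it passes."""
    problems = []
    wrapped = len(text) >= 2 and text[0] == "\u00ab" and text[-1] == "\u00bb"
    if not wrapped:
        problems.append("text must start with « and end with »")
    body = text[1:-1] if wrapped else text
    # Report each disallowed character once, by code point and Unicode name.
    for ch in sorted(set(body) - ALLOWED):
        name = unicodedata.name(ch, "unnamed")
        problems.append(f"disallowed character U+{ord(ch):04X} ({name})")
    # Punctuation must be its own space-separated token.
    for token in body.split(" "):
        if len(token) > 1 and ("," in token or "." in token):
            problems.append(f"punctuation not separated by a space: {token!r}")
    return problems
```

Reporting code points rather than the characters themselves makes look-alike mistakes, such as a precomposed ó in place of o followed by a combining acute, easy to spot.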

📚 Dictionary

dictionary.txt contains over 26,000 Chinese characters, each with:

Character headword
Middle Chinese transcription
Definitions
Frequency information

🌐 Online Demo

You can use the model for free via the official service:

👉 https://qieyun-tts.com

📜 License

This model is released under the Qieyun TTS Model License v2.1 (a custom license).

Summary

✅ Inference and audio generation allowed
❌ No model training or fine-tuning
❌ No dataset creation
❌ No API, SaaS, or commercial deployment

For commercial licensing, please contact the author.

See LICENSE for full terms.

📬 Contact

For licensing or commercial use:

cinix.chen@gmail.com

❗ Disclaimer

Provided "as is" without warranty. The author is not liable for any damages.

