Zhiyan (廌言)

A specialized TTS model for synthesizing reconstructed Middle Chinese pronunciation.

Zhiyan (廌言) is a text-to-speech (TTS) model based on the International Phonetic Alphabet (IPA), designed to generate high-quality audio of reconstructed Middle Chinese (the Qieyun phonological system).

This repository contains multiple versions of the model, all trained on the same dataset but with different training strategies.


📦 Repository Structure

Zhiyan/
├── v1.2/
│   ├── zhiyan_v1.2.onnx
│   └── zhiyan_v1.2.json
├── v1.3/
│   ├── zhiyan_v1.3.onnx
│   └── zhiyan_v1.3.json
├── symbols.py
├── dictionary.txt
├── LICENSE
└── README.md

🔊 Model Versions

v1.2

  • Single-speaker model
  • Trained on all audio data combined

v1.3

  • Multi-speaker model (2 speakers)
  • Trained with the audio data separated by speaker
  • Speaker ID sid = 0 gives clearer output (recommended)

🚀 Usage

Requirements

  • A working vits2_pytorch environment

Before inference:

  • Replace the original symbols.py in vits2_pytorch with the one provided in this repository
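
The replacement step can be scripted. The following is a minimal sketch, not part of this repository: the helper name install_symbols is hypothetical, and the target path shown in the usage comment (text/symbols.py inside the vits2_pytorch checkout) is an assumption to verify against your own copy. It saves a backup of the existing file before overwriting it.

```python
import shutil
from pathlib import Path


def install_symbols(zhiyan_symbols, vits2_symbols):
    """Back up the existing symbols.py (once), then overwrite it with Zhiyan's copy.

    Hypothetical helper; both paths are supplied by the caller.
    Returns the path of the backup file.
    """
    src = Path(zhiyan_symbols)
    dst = Path(vits2_symbols)
    if not src.is_file():
        raise FileNotFoundError(src)
    backup = dst.with_name(dst.name + ".orig")
    # Keep the upstream file so the change can be undone later.
    # An existing backup is never overwritten.
    if dst.exists() and not backup.exists():
        shutil.copy2(dst, backup)
    shutil.copy2(src, dst)
    return backup


# Usage (paths are assumptions; adjust to your checkout):
# install_symbols("Zhiyan/symbols.py", "vits2_pytorch/text/symbols.py")
```

To undo the change, copy the .orig backup over symbols.py.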

Inference

The ONNX model and config are used the same way as in vits2_pytorch; refer to the original project for details.

Example (adjust the model and config paths to match where you placed the files):

python infer_onnx.py \
  --model="zhiyan_v1.2.onnx" \
  --config-path="configs/zhiyan_v1.2.json" \
  --output-wav-path="output.wav" \
  --text="«ʈʉuŋ kó hɑ̌n ŋɨə́ , ŋɨə́ ʔịm ɦəp ʥiaŋ .»"

Corresponding Chinese text: 中古漢語,語音合成。 ("Middle Chinese, speech synthesis.")
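
For the multi-speaker v1.3 model a speaker ID has to be chosen as well. The sketch below builds the command as an argument list in Python, which avoids shell-quoting problems with « » and the combining marks. The function name build_infer_command is hypothetical, the paths are illustrative, and the --sid flag is an assumption about vits2_pytorch's infer_onnx.py that should be confirmed with `python infer_onnx.py --help`.

```python
import shlex


def build_infer_command(model, config, text, output="output.wav", sid=None):
    """Return the argv list for vits2_pytorch's infer_onnx.py (hypothetical helper)."""
    argv = [
        "python",
        "infer_onnx.py",
        f"--model={model}",
        f"--config-path={config}",
        f"--output-wav-path={output}",
        f"--text={text}",
    ]
    if sid is not None:
        # ASSUMPTION: infer_onnx.py accepts --sid; check --help before relying on it.
        argv.append(f"--sid={sid}")
    return argv


# Illustrative paths; sid=0 is the speaker recommended for v1.3.
cmd = build_infer_command(
    "zhiyan_v1.3.onnx", "configs/zhiyan_v1.3.json", "«ŋɨə́ ʔịm .»", sid=0
)
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) from inside the vits2_pytorch directory.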


📝 Text Format Requirements

  • Input text must start with « and end with »
  • All syllables and punctuation marks must be separated from each other by spaces

Allowed characters:

 ,.abdeghijklmnopstuwyzŋɑɕɖəɦɨɲɳʂʈʉʐʑʔʣʥʦʨʰʷ́̌ạẹịọỵꭦꭧ
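
The rules above can be checked before calling the model. This is a minimal sketch with hypothetical helper names (format_input, check_input); it is not shipped with the repository. It compares code point by code point and applies no Unicode normalization, so a precomposed letter such as ó (U+00F3) is reported even though o followed by the combining acute (U+0301) is accepted.

```python
# Allowed characters, copied from the list above (the list starts with a space).
ALLOWED = set(
    " ,.abdeghijklmnopstuwyz"
    "ŋɑɕɖəɦɨɲɳʂʈʉʐʑʔʣʥʦʨʰʷ"
    "\u0301\u030c"  # combining acute and combining caron
    "ạẹịọỵꭦꭧ"
)


def format_input(tokens):
    """Join syllables and punctuation marks with single spaces and wrap them in « »."""
    return "«" + " ".join(tokens) + "»"


def check_input(text):
    """Return a list of problems found; an empty list means the text passes."""
    problems = []
    body = text
    if body.startswith("«"):
        body = body[1:]
    else:
        problems.append("text must start with «")
    if body.endswith("»"):
        body = body[:-1]
    else:
        problems.append("text must end with »")
    bad = sorted({ch for ch in body if ch not in ALLOWED})
    if bad:
        codes = " ".join(f"U+{ord(ch):04X}" for ch in bad)
        problems.append("disallowed characters: " + codes)
    for token in body.split(" "):
        # A comma or period glued to a syllable breaks the spacing rule.
        if len(token) > 1 and ("," in token or "." in token):
            problems.append(f"punctuation not separated by a space: {token!r}")
    return problems
```

For example, format_input(["ʈʉuŋ", "ŋɨə", ",", "ɦəp", "."]) produces a string that check_input accepts, while "«ɦəp.»" is reported because the period is attached to the syllable.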

📚 Dictionary

dictionary.txt contains over 26,000 Chinese characters with:

  • Character headword
  • Middle Chinese transcription
  • Definitions
  • Frequency information
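
The field layout of dictionary.txt is not described here, so it is worth looking at a few lines before writing a parser. The sketch below only reads the first lines of the file; the helper name peek is hypothetical, and UTF-8 encoding is an assumption.

```python
from itertools import islice


def peek(path, n=5, encoding="utf-8"):
    """Return the first n lines of a text file without loading the whole file."""
    # ASSUMPTION: the file is UTF-8 encoded; pass another encoding if it is not.
    with open(path, encoding=encoding) as f:
        return [line.rstrip("\n") for line in islice(f, n)]


# Usage:
# for line in peek("dictionary.txt"):
#     print(repr(line))  # repr() makes tabs and other separators visible
```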

🌐 Online Demo

You can use the model for free via the official service:

👉 https://qieyun-tts.com


📜 License

This model is released under the Qieyun TTS Model License v2.1, a custom license.

Summary

  • ✅ Inference and audio generation allowed
  • ❌ No model training or fine-tuning
  • ❌ No dataset creation
  • ❌ No API, SaaS, or commercial deployment

For commercial licensing, please contact the author.

See LICENSE for full terms.


📬 Contact

For licensing or commercial use:

cinix.chen@gmail.com


❗ Disclaimer

Provided "as is" without warranty. The author is not liable for any damages.

