UAT: Unified Audio-Text Diffusion for Audio Generation, Editing, and Captioning
Paper • 2606.04939 • Published
FSA/FST algorithms, differentiable, with PyTorch compatibility.
Automatic speech recognition
ZipVoice voice cloning
Pocket TTS English voice cloning
WebAssembly speech enhancement
Detect spoken segments from your microphone in real time
Detect speech activity in real time from your microphone