AI & ML interests

Speech and NLP systems for Indian legal infrastructure

kavyamanohar posted an update 5 days ago
Releasing Vividh-ASR: an open benchmark and open models for Hindi and Malayalam ASR.

Vividh-ASR is built from public data, stratified by complexity:
→ Clean recordings
→ Noisy and accented speech
→ Spontaneous, conversational audio

Alongside the benchmark, we release:
→ Open models for Hindi and Malayalam
→ A training recipe with two counterintuitive choices that moved the needle
→ What failed, not just what worked

The stratified evaluation methodology is not specific to Hindi and Malayalam; it should transfer directly to any low-resource language setup.
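
To make the idea of stratified evaluation concrete, here is a minimal sketch that reports word error rate (WER) separately for each complexity stratum instead of as a single average. This is not the Vividh-ASR evaluation code: the stratum labels, the toy English transcripts, and the function names `word_edit_distance` and `stratified_wer` are all assumptions made for illustration.

```python
from collections import defaultdict


def word_edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def stratified_wer(samples):
    """samples: iterable of (stratum, reference, hypothesis) strings.

    Returns {stratum: WER}, where WER is the pooled word errors divided
    by the pooled reference word count within each stratum.
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for stratum, ref, hyp in samples:
        ref_words = ref.split()
        errors[stratum] += word_edit_distance(ref_words, hyp.split())
        words[stratum] += len(ref_words)
    return {s: errors[s] / words[s] for s in words if words[s] > 0}


# Hypothetical toy data; the stratum names are illustrative only.
samples = [
    ("clean", "the court is in session", "the court is in session"),
    ("noisy", "the court is in session", "the court in session"),
    ("spontaneous", "please state your name", "please say your name now"),
]

for stratum, wer in stratified_wer(samples).items():
    print(f"{stratum}: WER = {wer:.2f}")
```

Reporting one number per stratum keeps a strong score on clean recordings from masking weaker performance on noisy or spontaneous speech, and the same grouping applies to any language for which the test set can be labeled by difficulty.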

Built at @adalatai, where we build speech tech for Indian courts. This is our first open contribution back to the community. @janaab @Kush0610 @orgh0

Link: https://huggingface.co/blog/adalat-ai/vividh-benchmark