Towards a Phonology-Informed Evaluation of Multilingual TTS
Neural TTS systems can sound natural across languages, but naturalness does not guarantee the preservation of sound contrasts that distinguish words from their grammatical forms. Standard metrics like MOS do not test for this. We propose a classifier-based framework that audits TTS output against language-specific phonological patterns using human speech as a benchmark. Testing Assamese advanced tongue root (ATR) vowel harmony with Meta's MMS TTS, we show that a classifier trained on human speech transfers to synthesized speech with minimal loss. The faithfulness audit reveals that [+ATR] mid
Lineage graph
Paper → model → repo connections mined from source citations (Tier-1 exact match).
Why these links exist
- Linked via arXiv author Sneha Ray Barman → Towards a Phonology-Informed Evaluation of Multilingual TTS
- Linked via arXiv author Neeraj Kumar Sharma → Towards a Phonology-Informed Evaluation of Multilingual TTS
- Linked via arXiv author Shakuntala Mahanta → Towards a Phonology-Informed Evaluation of Multilingual TTS
