paper · arXiv · Trust 82 · Primary · Published yesterday · Live · 19h ago

Using embeddings to predict spoken word duration and pitch in Mandarin monosyllabic words

Time-normalized f0 contours of Mandarin words in conversational speech have been shown to be predictable in part from their contextualized embeddings (CEs). The present study investigates whether CEs also predict spoken word duration for 7470 tokens of Mandarin monosyllabic CV words extracted from a corpus of spontaneous Mandarin speech. We show that CEs are indeed predictive of duration above chance level, not only at the type level but also at the level of individual tokens, as indicated by the results obtained with the type-wise and token-wise permutation baselines. We also show that the
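The abstract excerpt names type-wise and token-wise permutation baselines without describing how they are built. The sketch below is one plausible reading, not the authors' procedure: it assumes one baseline shuffles durations across all tokens (breaking every link between embedding and duration) and the other shuffles durations only among tokens of the same word type (keeping type-level information but breaking token-level information). Which of these the paper calls "type-wise" or "token-wise" is not stated here, and all function names and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def permute_across_tokens(values, rng):
    """Shuffle the response over all tokens (assumed baseline, see lead-in)."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled


def permute_within_types(values, types, rng):
    """Shuffle the response only among tokens that share a word type."""
    positions = defaultdict(list)
    for i, word_type in enumerate(types):
        positions[word_type].append(i)
    permuted = list(values)
    for idx in positions.values():
        source = list(idx)
        rng.shuffle(source)
        # Each token receives the value of another token of the same type.
        for dst, src in zip(idx, source):
            permuted[dst] = values[src]
    return permuted


def share_of_baselines_beaten(observed, baseline_scores):
    """Fraction of permuted-data scores that the observed score exceeds."""
    return sum(observed > s for s in baseline_scores) / len(baseline_scores)


# Toy data: hypothetical durations in seconds for two hypothetical word types.
rng = random.Random(0)
durations = [0.12, 0.15, 0.11, 0.30, 0.28, 0.33]
word_types = ["de", "de", "de", "shi", "shi", "shi"]

shuffled_all = permute_across_tokens(durations, rng)
shuffled_in_type = permute_within_types(durations, word_types, rng)
```

In practice a model would be refit on each permuted response and its held-out score compared with the score on the real data. Beating the within-type shuffle is the stricter test under this reading, because type-level means survive that shuffle.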

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

Linked via arXiv author Xiaoyun Jin →
Using embeddings to predict spoken word duration and pitch in Mandarin monosyllabic words
Linked via arXiv author Mirjam Ernestus →
Using embeddings to predict spoken word duration and pitch in Mandarin monosyllabic words
Linked via arXiv author R. Harald Baayen →
Using embeddings to predict spoken word duration and pitch in Mandarin monosyllabic words

authored (incoming)

person: Xiaoyun Jin · person: Mirjam Ernestus · person: R. Harald Baayen

Related across the graph

person: Mirjam Ernestus · person: Xiaoyun Jin · person: R. Harald Baayen

Topics

cs.CL