Carnegie Mellon

Welcome to the Carnegie Mellon Pronunciation of Proper Names site


The Proper Name Pronunciation tool has been temporarily disabled. Sorry about the trouble!

About the Pronunciation of Proper Names site

One of the current goals in speech synthesis is to acquire high-quality pronunciations for proper names. Several factors make proper names especially hard to pronounce. Names can be of very diverse etymological origin and can surface in another language without undergoing the slow process of assimilation to the phonological system of the new language. Furthermore, the number of distinct names tends to be very large. We can achieve about 50% coverage of general text with a dictionary containing only 141 words (Zipf's law), whereas we would need a list of more than 2,300 names to achieve that same coverage [Coker, Church and Liberman, 1990].
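The coverage comparison above can be illustrated with a small sketch. The function name `types_needed_for_coverage` and the frequency counts are hypothetical toy values chosen for illustration, not the data behind the 141 and 2,300 figures. The sketch only shows how one might count the number of most frequent items needed to reach a given token coverage.

```python
from collections import Counter

def types_needed_for_coverage(counts, target=0.5):
    """Return how many of the most frequent types are needed to
    cover at least `target` of all tokens in `counts`."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty frequency table")
    covered = 0
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        covered += freq
        if covered >= target * total:
            return rank
    return len(counts)

# Toy frequency tables (hypothetical): a few very frequent words
# versus a flatter distribution over names.
words = Counter({"the": 50, "of": 30, "and": 20, "to": 10, "in": 5, "is": 5})
names = Counter({"smith": 3, "jones": 3, "garcia": 2, "font": 2,
                 "llitjos": 2, "church": 2, "coker": 2, "liberman": 2,
                 "black": 1, "lenzo": 1})

print(types_needed_for_coverage(words))  # 2
print(types_needed_for_coverage(names))  # 4
```

The flatter the distribution, the more entries are needed for the same coverage, which is the difficulty with names.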

My master's thesis [Font Llitjós, 2001] represents one step towards that goal. It proposes the hypothesis that an automatic pronunciation model can benefit from language origin information, much as humans do when pronouncing proper names.

In [Font Llitjós, 2001], statistical pronunciation models are built specifically for proper names in American English, and they take language origin information into account. When trying to classify names among 25 different languages, the pronunciation model achieved 60.23% word accuracy and, by reducing the number of languages, word accuracy increased to 67.87%. The baseline pronunciation model, trained and tested on the whole CMU dictionary, achieves only 57.8% word accuracy.
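As a minimal sketch of how such a figure can be computed, we assume here that word accuracy means the fraction of words whose predicted transcription matches the reference transcription exactly. This page does not spell out the definition, so treat it as an assumption. The function name `word_accuracy` and the example transcriptions are hypothetical.

```python
def word_accuracy(predicted, reference):
    """Fraction of words whose predicted phoneme string matches the
    reference exactly (assumed definition, see the text above)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have equal length")
    if not reference:
        raise ValueError("no words to score")
    # Compare phoneme by phoneme, ignoring differences in spacing.
    correct = sum(p.split() == r.split() for p, r in zip(predicted, reference))
    return correct / len(reference)

# Hypothetical transcriptions using the phoneme set of this site.
reference = ["f aa n t", "ch er ch", "b l ae k"]
predicted = ["f ao n t", "ch er ch", "b l ae k"]

print(f"{word_accuracy(predicted, reference):.2%}")  # 66.67%
```

Note that a single wrong phoneme makes the whole word count as an error, which is one reason word accuracy is a strict measure.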

These numbers are still rather low compared to similar systems for other languages (German 89.4%, French 93%, Thai 68%). This raises the question of what percentage of the pronunciations generated by such models is actually acceptable to a human listener.

This site allows users to type in their names, and uses Edinburgh University's Festival speech synthesis system to generate the phonetic transcription as well as an audio file according to (i) the CMU Pronouncing Dictionary, (ii) a baseline, n-gram-based pronunciation model and (iii) a model incorporating language origin information.

Having listened to the audio files, the user is asked to rate each generated pronunciation as correct, acceptable or unacceptable. If none of the pronunciations presented is acceptable, the user can enter the correct phonetic transcription (a phonetic table is provided, as well as some transcription examples). The corresponding audio file is then generated and played back, so that the user can confirm that it is indeed the correct pronunciation.
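The judgments collected this way could be summarized per pronunciation source. The following is only a sketch under our own assumptions: the labels in `SOURCES`, the function names `tally` and `acceptable_rate`, and the sample judgments are all hypothetical and do not describe the site's actual implementation.

```python
from collections import Counter, defaultdict

RATINGS = ("correct", "acceptable", "unacceptable")
# Hypothetical labels for the three pronunciation sources.
SOURCES = ("dictionary", "baseline", "language_origin")

def tally(judgments):
    """Count ratings per source from (source, rating) pairs."""
    table = defaultdict(Counter)
    for source, rating in judgments:
        if source not in SOURCES or rating not in RATINGS:
            raise ValueError(f"unknown judgment: {source!r}, {rating!r}")
        table[source][rating] += 1
    return table

def acceptable_rate(table, source):
    """Share of judgments for `source` rated correct or acceptable,
    or None if the source has not been judged yet."""
    counts = table.get(source, Counter())
    total = sum(counts.values())
    if total == 0:
        return None
    return (counts["correct"] + counts["acceptable"]) / total

judgments = [
    ("baseline", "correct"),
    ("baseline", "unacceptable"),
    ("language_origin", "acceptable"),
    ("language_origin", "correct"),
]
table = tally(judgments)
print(acceptable_rate(table, "baseline"))         # 0.5
print(acceptable_rate(table, "language_origin"))  # 1.0
```

Counting "correct" and "acceptable" together gives one possible answer to the question of how many generated pronunciations a listener would accept.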

Therefore, in this early stage of the project, we are effectively getting expert humans to evaluate our pronunciation models. We say experts because everybody is an expert when it comes to pronouncing their own name, and has a pretty good idea of which pronunciations are acceptable or unacceptable. Of course, there is the caveat that some people are very sensitive to mispronunciations of their names, so their tolerance for synthesis mistakes might be lower than it would be for another word. However, we believe we can get valuable evaluation information this way, and it will be highly interesting to analyze user responses.

At later stages, our site will be able to present users with statistics on how people pronounce their names, and thus it will become a useful resource on proper name pronunciation.

Thank you for making it possible!

Ariadna Font Llitjós

US Phoneme Set

This phoneme set has 39 phonemes (see the disclaimer below the table), not counting variations for lexical stress.

Lexical stress is indicated by appending a 1 after the stressed vowel. For example, if we want to indicate that the fourth syllable in "evaluation" is stressed, we would write:
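As a minimal sketch of this convention, we assume the transcription `ih v ae l y uw ey sh ah n` for "evaluation". This is our own transcription using the phoneme table on this page, so treat it as an assumption. The helper name `mark_stress` is hypothetical.

```python
# The 15 vowel symbols of the phoneme table on this page.
VOWELS = {"aa", "ae", "ah", "ao", "aw", "ay", "eh", "er",
          "ey", "ih", "iy", "ow", "oy", "uh", "uw"}

def mark_stress(phonemes, syllable):
    """Append "1" to the vowel of the given syllable (1-based),
    counting one syllable per vowel."""
    marked = list(phonemes)
    seen = 0
    for i, phoneme in enumerate(marked):
        if phoneme in VOWELS:
            seen += 1
            if seen == syllable:
                marked[i] = phoneme + "1"
                return marked
    raise ValueError(f"word has only {seen} syllable(s)")

# Assumed transcription of "evaluation" (see the text above).
evaluation = "ih v ae l y uw ey sh ah n".split()
print(" ".join(mark_stress(evaluation, 4)))  # ih v ae l y uw ey1 sh ah n
```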


Phoneme  Examples
-------  --------
aa       fAther | wAshington
ae       fAt | bAd
ah       bUt | hUsh
ao       lAWn | dOOr | mAll
aw       hOW | sOUth | brOWser
ay       hIde | bIble
eh       gEt | fEAther
er       fERtile | sEARch | makER
ey       gAte | Ate
ih       bIt | shIp
iy       bEAt | shEEp
ow       alOne | nOse
oy       tOY | OYster
uh       fUll | wOOd
uw       fOOl | fOOd
b        Book | aBrupt
ch       CHart | larCH
d        Done | baD
dh       THat | faTHer
f        Fat | lauGH
g        Good | biGGer
hh       Hello | loopHole
jh       diGit | Jack
k        Camera | jaCK | Kill
l        Late | fuLL
m        Man | gaMe
n        maN | New
ng       baNG | sittiNG
p        Pat | camPer
r        Reason | caR
s        Sit | maSS
sh       SHip | claSH
t        Tap | baT
th       THeatre | baTH
v        Various | haVe
w        Water | cobWeb
y        Yellow | Yacht
z        Zero | quiZ | boyS
zh       viSion | caSual

Disclaimer: this list of phonemes is not standard. It is the one used in the applications developed by the Speech Group at CMU. We believe it is suitable for the project at hand, since making finer distinctions would most likely confuse users who don't have much knowledge of phonetics.
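A transcription typed in by a user can be checked against this phoneme set before it is synthesized. The following is a minimal sketch under our own assumptions, not the code running on this site: the function name `invalid_symbols` is hypothetical, and we assume phonemes are separated by spaces and that only vowels may carry the stress marker "1".

```python
# The 39 symbols of the phoneme table above.
VOWELS = {"aa", "ae", "ah", "ao", "aw", "ay", "eh", "er",
          "ey", "ih", "iy", "ow", "oy", "uh", "uw"}
CONSONANTS = {"b", "ch", "d", "dh", "f", "g", "hh", "jh", "k", "l",
              "m", "n", "ng", "p", "r", "s", "sh", "t", "th", "v",
              "w", "y", "z", "zh"}

def invalid_symbols(transcription):
    """Return the tokens that are not phonemes of the set above.
    A trailing "1" (lexical stress) is accepted on vowels only."""
    bad = []
    for token in transcription.lower().split():
        base = token[:-1] if token.endswith("1") else token
        if base in VOWELS or token in CONSONANTS:
            continue
        bad.append(token)
    return bad

print(invalid_symbols("f aa1 n t"))  # []
print(invalid_symbols("f a n t1"))   # ['a', 't1']
```

Returning the offending tokens, rather than a plain yes or no, makes it possible to tell users which symbols to look up in the table.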


Ariadna Font Llitjós
Last modified: Sat Nov 24 20:48:20 EST 2001