ASCII Phonetic Symbols for the World's Languages: Worldbet
James L. Hieronymus
AT&T Bell Laboratories, Murray Hill, NJ 07974, USA
Abstract
A new ASCII encoding of the International Phonetic Alphabet (IPA), with additional symbols for
speech database labeling, has been designed for all languages. Many of the previous ASCII versions
of the IPA were targeted at European languages and therefore left out many of the sounds of
other languages, or reused IPA symbols for non-European sounds, like clicks, for plosive bursts. When
an attempt was made to label a large number of languages with phonemic and phonetic symbols,
these were found to be inadequate. The present scheme borrows from earlier work by George Allen,
Ian Maddieson, John Wells, Laver et al. and Hieronymus et al. Wherever possible, the present
scheme was made similar to the base IPA symbols, so that many of the symbols will seem to have
obvious meanings. Many of the symbols are the same as in other schemes. The underlying principle is
that any spectrally and temporally distinct speech sound (not including pitch) which is phonemic
in some language should have a separate base symbol. In most cases the base symbol consists of a
concatenation of an IPA symbol and diacritics. Thus it is easy to recognize the phonemic base symbols
and to compare the same broad phonetic sound across languages. Tone languages have diacritics applied
to the vowel phoneme symbols to properly identify the phonemes in these languages. Allophonic
variations due to contextual coarticulation and stress may be labelled by a diacritic attached to
the base symbol. It is possible that some speech sounds which are phonemic in at least one of the
world's languages are missing from the present version. It is hoped that any oversights will be
corrected in subsequent versions of Worldbet, and a standard method for constructing new symbols
is presented.
Introduction
Many systems have been developed for writing the sounds of the world's languages. Many of the
early workers made their own systems because there was no agreed standard or indeed knowledge
of the complete speech sound inventory. The International Phonetic Alphabet was developed in
1888 and revised several times into its present form. It represents 105 years of experience with
putting a symbol to each sound in all of the known languages in the world. The issues of economy of
representation and the distinction between allophonic variation and true baseform sound have been
worked out for many more languages since the IPA was originally formulated. Therefore it is a good
place to begin for any multilanguage speech database labelling effort.
There are some sounds which are not normally included in the IPA which have been found to
be useful in labelling large speech corpora like TIMIT, SCRIBE, BDSONS, and PHONDAT. These
modern attempts at a standard ASCII form of the IPA resulted in TIMITBET, MRPA, SAMPA, and
SAMPA Extended, to name a few. These phonetic alphabets were restricted to English or
to European languages, and thus were too restricted in scope to be used in other major language
families. The issue is whether or not the ASCII representation is consistent, complete and logical for
all of the IPA symbols.
Worldbet is an attempt to have a phonetic alphabet which covers all of the world's languages
in a systematic fashion. It is an ASCII version of the IPA plus a number of symbols which were found
useful in database labelling, which are not currently in the official IPA set. This list of extra symbols
may grow with time until all of the important phenomena have a coherent symbol representation.
This paper is organized to first cover the general principles of Worldbet, discuss earlier labeling
sets, give specific symbol assignments, and discuss labeling methods. Appendix A is an exhaustive
list of the Worldbet symbols and their corresponding labels in a few other systems, namely TIMITBET,
SAMPA and JBET, a phonetic alphabet used in speech synthesis. Appendix B is a table of place
and manner of articulation vs. Worldbet symbols. In Appendix C there are examples of Worldbet
symbol inventories for several languages.
General Principles
Worldbet is an ASCII version of the International Phonetic Alphabet (IPA) with additional broad
phonetic symbols not presently in the IPA. It is designed for a large set of languages including
Indian, Asian, African and European languages. Considerations of the special sounds in each of
these languages lead to the principle that each base symbol will represent a speech sound with
a spectrally distinct time sequence. Each type of /r/ will have its separate IPA-like designation,
rather than the more graphemic r used in some label sets. Allophones like aspirated plosives will
have a separate base symbol from unaspirated plosives if they are phonemic within the language in
question; otherwise they will be marked using the base symbol plus a diacritic. Distinct means
so different spectrally or temporally as to be perceptually different when the components are heard
in isolation.
Vowels are classed into nominal place positions. It is recognized that the detailed vowel color
may vary between languages for the same nominal vowel, yet separate symbols will be assigned only
when the differences are large enough to constitute different phonemes. In actual labeling experience
it has been found that most of the differences in phonetic labels between trained phoneticians were
due to disagreements on the detailed vowel color, rather than the actual broad vowel color.
Therefore, Worldbet base symbols will represent phonemic distinctions in some language, as in
the plosive example. The base symbols are thus meant to be broad phonetic, but can be used as
surface phonemic symbols within a given language, as stated in the original principles of the IPA
[1]. Since the IPA has been in use for over 100 years and has been actively developed and evolved
over this period [2], it should have all of the phonemic distinctions observed in the world's languages
to date. Therefore it is the natural starting point for any attempt to construct a phoneme set which
is sufficient to cover all of the languages in the world.
Diacritics are used in general to modify the base symbols to deal with allophones which are
due to coarticulation effects (e.g. labialized /s/ in the environment of /w/) or to phonological context
effects. The diacritic allows the particular allophone to be marked, while the base character remains
the phonemically based broad phone from which the allophone originates. Of course it is not always
easy to determine what is an allophonic variation and what is a change of broad phonetic category.
Normally the number of symbols used to label a particular language will be limited, to keep
from having an overly large label inventory.
The motivating factor for Worldbet is to label speech for corpus-driven speech research,
phonological inventories, automatic language identification, multi-language speech recognition, and
multi-language speech synthesis. It should also be useful in constructing multi-lingual dictionaries. In all
of the above uses it is most convenient to have each sound labeled with a particular symbol closely
resemble all other sounds with the same label, no matter in which language it is uttered.
Previous Label Sets
Past work on ASCII-to-IPA symbol sets was reviewed, including the Klatt phonbet (Allen et al. [3]),
the PHONASCII system (Allen [4]), Arpabet, used in the first ARPA Speech Understanding Project
[5], TIMITBET, used in labelling the DARPA Acoustic Phonetic Database, which was collected by
Texas Instruments and labelled by MIT [6], the Esprit Speech Assessment Methodology Phonetic
Alphabet (SAMPA) [7], the Edinburgh Machine Readable Phonetic Alphabet (MRPA) [8], the Alvey
Project phonetic alphabet [9], and the SCRIBE project phonetic label set (SAMPA Extended) [10].
These were generally concerned with one or a few Indo-European languages, and thus are missing
a number of the symbols needed for other languages. For SAMPA some simplifying assumptions
were made because it was thought that the symbols would be used for transcription within one language,
not across languages. This leads to the same symbol being used for quite different sounds, most
notably for /r/.
An effort for worldwide ASCII-to-IPA coverage by Ian Maddieson [11] of UCLA was thought to
be too complicated for the present application. It is a more detailed label set aimed at fine phonetic
distinctions in all the world's languages. It does not distinguish between diacritics and baseforms.
With the full set of diacritics in Worldbet it should be possible to have the same level of detail,
with the proviso that phonemes with multiple places of articulation might have to have baseform
symbols assembled using the Worldbet linking character. A new ASCII version of the IPA which has
been developed on the sci.lang news group [12] has also been examined, but seems to suffer
from too few languages being considered in detail. It is intended to be used in email discussions
of phonetics and phonology. It is a full encoding of the IPA and has some symbols in common with
Worldbet.
PHONASCII, the ASCII IPA symbol set by George Allen [4], is the closest to having the symbols
required, but has added symbols for studying speech disorders and child language acquisition. Some
of the constructions of affricates had the wrong fricative symbol and some of the monophthong vowels
had two-character symbols, which makes the construction of diphthongs awkward. Otherwise it is a
complete encoding of the IPA set.
In Worldbet each IPA baseform symbol is represented by two normal ASCII characters, which
may differ in case, giving around 2,900 possibilities. One of these characters may be a space.
Currently we are using 299 symbols. This fixed-length baseform allows a unique specification and ease
in computer processing. It is envisioned that Worldbet will be used in computer speech synthesis and
recognition, as well as hand labelling of speech in many languages. It becomes very important in this
context that each symbol represent only one type of sound and that phonemes which have different
forms across languages have a unique symbol for each form. Keeping a two-character baseform
will in practice prove to be unattainable for rare phonemes, for example all double articulations
constructed with the linking symbol -. In Worldbet there are some symbols which go beyond the
IPA set in an attempt to attain this goal, particularly for plosives.
Specific Symbol Assignments
Generally the Worldbet symbol will be as close as possible to the corresponding IPA symbol. For
the more unusual IPA symbols this is more difficult, and is sometimes only true in the most abstract
sense. An effort has been made to assign vowel and consonant symbols consistently. The collection
of phonemes from the UCLA Phonological Segment Inventory Database (UPSID) was examined to
make sure that Worldbet covered all those used by 10 or more languages listed in Patterns of Sound
by Maddieson [11].
Monophthong vowels will have base symbols which are the usual orthographic vowel symbols
(e.g. a, A, o, O) and the numbers 2, 3, 4, 5, 6, 7, and 8. Most of these will be single printing symbols.
This is so that the base diphthongs and the free and checked vowels can be constructed of two characters.
Phonologically free and checked vowels are supposed to differ only in length, and not in spectral
properties, in the various languages. For example, /i/ in English is free and usually written [i:]
phonetically, but it is a checked vowel in Dutch and written [i] phonetically. The simple addition of
the symbol : for long allows this distinction to be represented easily in two characters. The numerals
0 and 1 are not used for base symbols because of their confusability with the letters o and l.
Diphthongs are constructed by concatenating the two monophthong symbols which represent the
endpoint vowels. While the present list contains many of the diphthongs seen in European languages,
it is not presently exhaustive. New diphthongs can be constructed by concatenating the two
vowel symbols corresponding to the beginning and ending vowels of the diphthong.
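The construction rule can be sketched in a few lines of Python. This is only an illustration: the
function name make_diphthong and the small MONOPHTHONGS set (a handful of symbols copied from
Appendix A) are assumptions made for the example and are not part of Worldbet itself.

```python
# Hypothetical subset of Worldbet monophthong symbols, copied from Appendix A.
MONOPHTHONGS = {"i", "I", "e", "E", "a", "A", "o", "u", "U", "ax", "&", ">", "5", "4"}


def make_diphthong(start: str, end: str) -> str:
    """Build a diphthong label by concatenating its two endpoint vowels."""
    for vowel in (start, end):
        if vowel not in MONOPHTHONGS:
            raise ValueError(f"unknown monophthong symbol: {vowel!r}")
    return start + end


print(make_diphthong("a", "I"))   # aI  (English "buy")
print(make_diphthong("i", "ax"))  # iax (German "Tier")
print(make_diphthong(">", "i"))   # >i  (American English "boy")
```

Because some monophthong symbols are themselves two characters long (ax, for example), a
diphthong built this way may be longer than two characters.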
Consonants can be two characters, since they do not always come in long and short versions.
Many long consonants are actually geminates. It is recommended that geminates be transcribed as
two consonants. Items like nasalization, rhotization and aspiration will be explicitly represented in
the baseform, since they are phonemic in many languages. If it is found that consonant length is
phonemic in some language, then : will be added as a diacritic, as outlined below. See Table 1
for a comparison of IPA to Worldbet base symbols, and Table 2 for diacritics and suprasegmentals.
Exhaustive symbol tables are shown in Appendix A. Phoneme sets for 12 languages are shown
in Appendix B. Presently Danish, Dutch, English, French, German, Hindi, Japanese, Mandarin
Chinese, Russian, Castilian Spanish, and Tamil are listed.
CONSONANTS
The Worldbet representation of each IPA symbol is written below it. IPA symbols in parentheses are rare
phonemes, for which no machine-readable coding has yet been proposed. In these cases a coding employing
diacritics is proposed.
Bi- Labio- Dental Alveolar Post- Retro- Palatal Velar Uvular Pharyn- Glottal
labial dental alveolar ex geal
pb t d tr dr c J k g q Q ?
m M n nr n N Nq
Trill b r r
B r R
Tap or I I I L
t d r
Flap t d r rr
Fricative H f v W s z S c R b c0 x ^ P & g h'
FV f v TD s z S Z sr zr Cj^ x G XK H ! hhv
Lateral 5 7
fricative hl Zl
Approxi- [ M N j ;
mant V[ 9 9r j 4
Lateral l 6 ` l
approx. l lr L Lg
Ejective p' t' U' c' k' q'
stop p> t> t> r c> k> q>
Implosive 2 "
p
Front Central Back
Close i y , X : u
i y ix ux 4 u
i y .y Y
I Y Ix U
Close-mid e $ o
e 7 2 o
@
VOWELS
& ox
Op en-mid A
E 8 ^ >
@ ax
Op en a
a 6 A 5
The "centralized i" is not an approved IPA symbol, but it is in such common use that we have proposed
Ix as the most natural ASCII representation for it.
TABLE 1: Worldbet Consonant and Vowel Symbols
DIACRITICS
The Worldbet representations are on the right of each column.
Voiceless 0 More rounded w ] Labialized w ~ Nasalized
-
n
Voiced v Less rounded / Palatalized j Nasal release n
,
$ l
Advanced + Aspirated h Velarized 2 Lateral release l
+
g
Breathyvoiced Hv Retracted - t No audible release c Pharyngealized !
Creaky voiced ? Centralized x Velarized or Pharyngealized \emphatic"
x
Raised ^ Linguolabial f Velar place g Mid-centralized x
D
m
Lowered / Dental [ s Syllabic = Uvular place q
j
n
Non-syllabic Apical ] Unaspirated * p Advanced Tongue Ro ot >
t
h
}
Laminal g e Glottalized ? o Retracted Tongue Ro ot < Rhoticity/retro exion r
SEGMENTAL TONE
9 'e' extra high tone e
7 e high tone e
5 e mid tone e
Shown on the vowel e.
3 e low tone e
1 `e` extra lowtone e
SUPRASEGMENTALS
Intonation and word-accents can be represented on a separate tier, using TOBI notation.
q Primary stress `
r Secondary stress ,
k Long :
l Half-long ;
Short `
Extra-short
. Syllable break .
Pause non-IPA
Phrase b oundary non-IPA
OTHER SYMBOLS
\ W ? pj
w w j j
+ jw ! tj
h =j cj
g- k lj
d- O l
cg *
a zg 3
LINKING SYMBOL
linking symbol -
TABLE 2: Worldbet Diacritic and Suprasegmental Symbols
For special allophones there will be a diacritic linked to the main symbol by an underscore.
The underscore will not be used for any other purpose. Base symbols with diacritics generally will be
used for allophonic variations, rather than for phonetic symbols of regular speech segments in the
languages of the world. For example, a labialized allophone of /s/ which is due to a nearby /w/
would be written [s_w] and a rhoticized vowel /i/ will be i:_r. The diacritic symbols are letters, math
symbols, numbers or punctuation symbols. Numbers, with the exception of 2 and 0, are reserved as
diacritics for tone designations in tone languages.
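Because the underscore is reserved for attaching diacritics, a label can be separated into its base
symbol and diacritics mechanically. The following minimal sketch assumes labels are written as plain
strings such as s_w, and that several diacritics would each be introduced by their own underscore;
the second point is an assumption of the example, and split_label is a hypothetical helper name.

```python
def split_label(label: str) -> tuple[str, list[str]]:
    """Split a label into its base symbol and a list of diacritics.

    Assumes the underscore occurs only as the base/diacritic separator.
    """
    base, *diacritics = label.split("_")
    return base, diacritics


print(split_label("s_w"))   # ('s', ['w'])
print(split_label("i:_r"))  # ('i:', ['r'])
print(split_label("k"))     # ('k', [])
```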
Suprasegmentals are shown in the diacritic table for syllable boundaries and for primary and secondary
stress. Marking for intonation and pitch accent is also provided. In tone languages, the tone is
marked by odd-number diacritics attached to the vowel on the broad phonetic level, since the tone
is phonemic. Strictly speaking the tone is attached to the syllable rather than the vowel, but this
allows a consistent label attachment point across all syllables.
Additional Broad Phonetic Symbols beyond IPA
Several symbols which were found to be necessary to accurately label continuous speech databases
collected over the past 5 years were added to the IPA set. Some of these
symbols have been used by phoneticians in the past, and some of the phenomena which they capture
have been mentioned in the literature. The major departure is the inclusion of hyper aspiration for
Indian languages and of fricated plosives, which were found in British English during the labeling of
the SCRIBE database [10]. Hyper aspiration is meant to be a strong aspiration. This may be the
property called breathy voiced aspiration in Hindi, but this is yet to be determined experimentally.
The diacritic H is used to denote strong aspiration, while h denotes ordinary aspiration. Other non-IPA
symbols are those for the individual flaps of t and d, which allow the study of duration differences
between the phonologically voiced and unvoiced flaps. In the tables which follow, the IPA column has square
brackets around symbols which are not in the official IPA set, to mark them as pseudo IPA symbols.
Labeling Methods Using Worldbet
Worldbet can be used for three labeling methods. The basic form of Worldbet is just an ASCII
representation of the IPA. As such it can be used in exactly the same way that the IPA is used to do
labeling in which the phonetic and the phonological (or surface phonemic) levels are kept separate. The
usual set of IPA diacritics is provided to mark variations.
The second method for using Worldbet is as a single-level broad phonetic labeling which is
based on the phoneme inventory of each language plus diacritics [14]. This allows the recovery of
most of the information in the two-level labeling scheme from one level of labeling, by stripping the
diacritics from the broad phonetic labeling to collapse all allophones into the base symbol. Since
it takes almost twice as long to label two levels as one, the single-level scheme saves considerable
time and effort. Not all of the allophones are within the class of the broad phonetic label, so
not all of the surface phonemes are produced by diacritic stripping. However, some 97% of them
are, based on experience with the British English SCRIBE database, and for practical applications
this is enough to warrant using one level of labeling.
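Diacritic stripping as described here amounts to discarding everything after the first underscore in
each label. A minimal sketch, assuming labels are plain strings with the underscore as the only
separator (strip_diacritics is a hypothetical name chosen for the example):

```python
def strip_diacritics(labels: list[str]) -> list[str]:
    """Collapse broad phonetic labels to their base symbols."""
    return [label.split("_", 1)[0] for label in labels]


broad_phonetic = ["s_w", "w", "i:", "d_h"]
print(strip_diacritics(broad_phonetic))  # ['s', 'w', 'i:', 'd']
```

Labels without a diacritic pass through unchanged, so the same function can be applied to a
whole transcription.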
The phoneme is by definition language specific, denoting the minimal set of phonological elements
which is sufficient to represent a specific language. Once the phoneme inventory for a language
is determined, a set of baseforms is constructed by taking the base symbol from Worldbet and
concatenating diacritic symbols without the underscore. For example, in Hindi there are unaspirated
and heavily aspirated stop consonants, therefore these would be represented as p t k b d g pH tH
kH bH dH gH. In American English aspiration is not phonemic but occurs as the default in voiceless
stop consonants, so the labels would be ph th kh b d g. The broad phonetic inventory of labels which
are phonemically based is thus constructed for each language.
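The two stop inventories above can be generated with a short sketch. The helper name baseform and
the variable names are assumptions made for illustration; only the resulting symbols come from the
text.

```python
def baseform(base: str, *diacritics: str) -> str:
    """Concatenate a base symbol and its diacritics without an underscore."""
    return base + "".join(diacritics)


STOPS = ["p", "t", "k", "b", "d", "g"]

# Hindi: plain stops plus heavily aspirated (H) counterparts.
hindi_stops = STOPS + [baseform(s, "H") for s in STOPS]

# American English: aspiration (h) is the default for voiceless stops only.
english_stops = [baseform(s, "h") for s in ("p", "t", "k")] + ["b", "d", "g"]

print(hindi_stops)
print(english_stops)  # ['ph', 'th', 'kh', 'b', 'd', 'g']
```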
Speech is labeled at a broad phonetic level using these phoneme-based symbols, with diacritics
to denote broad phonetic variations. For example, an aspirated d in English would be labeled d_h.
When the labeling is finished, stripping the diacritics will then recover the surface phonemic labels
for most cases. The method fails when the allophonic variant has a different manner of articulation,
or a place of articulation distant from that of the default phoneme. In Japanese, the phoneme /h/
has variants /f/ and /C/, which are sufficiently different from /h/ that they cannot be properly
constructed from the baseform /h/. For Japanese these allophonic variants occur in a very specific
context, and the phonemic level can be corrected with rules after the diacritic stripping. With the
difference between phonemic and allophonic variants labeled, it would be possible to determine whether
b_h and bh are similar across languages, or whether phonemic distinctions are more carefully produced
and therefore aspiration in bh is more consistent in strength and length.
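A rule-based correction of this kind might look as follows. The sketch is deliberately simplified: it
maps f and C back to h regardless of context, which is only adequate if f and C are not treated as
phonemes in their own right in the Japanese inventory. The rule table, the function name and the
example words are assumptions made for illustration.

```python
# Hypothetical post-stripping rules for Japanese: allophone -> phoneme.
JAPANESE_RULES = {"f": "h", "C": "h"}


def recover_phonemes(labels: list[str], rules: dict[str, str]) -> list[str]:
    """Strip diacritics, then map remaining allophonic symbols to phonemes."""
    phonemes = []
    for label in labels:
        base = label.split("_", 1)[0]          # diacritic stripping
        phonemes.append(rules.get(base, base))  # language-specific correction
    return phonemes


print(recover_phonemes(["f", "4", "n", "e"], JAPANESE_RULES))  # ['h', '4', 'n', 'e']
print(recover_phonemes(["C", "i", "t", "o"], JAPANESE_RULES))  # ['h', 'i', 't', 'o']
```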
The third labeling method is called acoustic-phonetic segment labeling, which, while using the
phoneme-based labels, attempts to label the actual regions of allophonic production. The part of
a voiced fricative which is devoiced (z, for example) is labeled with a separate diacritic, z_0, to show
which section is voiced and which one is not. Similarly for labialized fricatives, the part during which the
labialization is seen to have an effect is labeled s_w and the non-labialized part is labeled s. This
detailed differentiation of voicing and coarticulation effects allows studies of the overlap of phonetic
features and of the durations of these segments. Often phonologically voiceless plosives like t
are completely voiced in intervocalic position. Normally a voiced consonant would be shorter than
a voiceless one. Does the speech production mechanism require that the voiced voiceless consonant be
longer? This can only be answered by studying speech labeled with this level of detail captured in the
labels. The acoustic-phonetic segment labeling should correspond to the multi-level labeling used in
non-linear phonology. For example, the nasalization of a vowel would be labeled as to its onset time, and
so the tier in non-linear phonology corresponding to nasalization would include the nasal consonant
and the nasalized portion of the vowel. This allows the study of the direction of feature spreading,
and of the limits on duration and overlap which form the parameters of the speech production process.
Acoustic-phonetic labeling can be done on a single tier, and the broad phonetic labels obtained by
diacritic clumping: all adjacent labels with the same base symbol are examined, the diacritic with
the longest duration is applied to the whole segment, and the other diacritics are removed. Finally
the phonemic level can be recovered by stripping off all of the diacritics and retaining the baseform
symbols.
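Diacritic clumping can be sketched as a merge over a time-aligned label tier. The representation
below, a list of (label, start, end) tuples with times in milliseconds, is an assumption of the
example, as are the function names. Note that this simple version would also merge a true sequence of
two identical phonemes, so geminates transcribed as two consonants would need separate handling.

```python
from itertools import groupby

Segment = tuple[str, int, int]  # (label, start_ms, end_ms)


def base_of(label: str) -> str:
    """Return the base symbol of a label such as 's_w'."""
    return label.split("_", 1)[0]


def clump(segments: list[Segment]) -> list[Segment]:
    """Merge adjacent segments that share a base symbol.

    The merged segment keeps the label of its longest member and
    spans from the first member's start to the last member's end.
    """
    merged = []
    for _, run in groupby(segments, key=lambda seg: base_of(seg[0])):
        run = list(run)
        longest = max(run, key=lambda seg: seg[2] - seg[1])
        merged.append((longest[0], run[0][1], run[-1][2]))
    return merged


tier = [("s", 0, 50), ("s_w", 50, 120), ("i:", 120, 250)]
print(clump(tier))  # [('s_w', 0, 120), ('i:', 120, 250)]
```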
Phonological Inventories
Phonemes are, strictly speaking, defined only within a single language. The set of all observed
phonemes from all the world's languages can nevertheless be studied. Sets of these phonological elements
across all languages are called phonological inventories. These have been collected by Greenberg and
Ferguson at Stanford University [13] and by Maddieson and Ladefoged at UCLA [11]. The aim of Worldbet is
to include as a baseform all of the elements in the phonological inventory of all the languages of
the world. These symbols are created by concatenating the Worldbet ASCII IPA set and diacritics
to make a phoneme set for each language. Because these symbols are constructed in a principled
fashion, it is always easy to see what a Worldbet symbol represents within the IPA equivalent set
from which it is constructed.
Speech from many languages labeled with Worldbet symbols will allow the study of the detailed
characteristics of similar phonemes in different languages. Studies have been done in the past
comparing the same phonological elements, for example nasals in French and English. The existence
of multilanguage speech databases labeled with Worldbet will allow quantification of the difference
between phonemes across a spectrum of languages. Even speech databases which are not labeled in
Worldbet can have their phonetic label set mapped to Worldbet in order to simplify the process of
cross language studies.
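Such a mapping can be held in a simple lookup table. The sketch below uses a small subset of the
TIMIT and Worldbet vowel correspondences listed in Appendix A; the table and function names are
assumptions made for the example, and a complete table would have to be filled in from the appendix.

```python
# Partial TIMIT -> Worldbet vowel table, copied from Appendix A.
TIMIT_TO_WORLDBET = {
    "iy": "i:", "ih": "I", "eh": "E", "ae": "@", "aa": "A",
    "uh": "U", "uw": "u", "ey": "ei", "ay": "aI", "aw": "aU", "ow": "oU",
}


def to_worldbet(labels: list[str], table: dict[str, str]) -> list[str]:
    """Map labels through the table, failing loudly on unmapped symbols."""
    unknown = sorted({label for label in labels if label not in table})
    if unknown:
        raise KeyError(f"no Worldbet mapping for: {unknown}")
    return [table[label] for label in labels]


print(to_worldbet(["iy", "ae", "ow"], TIMIT_TO_WORLDBET))  # ['i:', '@', 'oU']
```

Raising an error on unmapped symbols, rather than passing them through, keeps labels from two
different alphabets from being mixed silently in one corpus.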
References
1. International Phonetic Association. 1888. ðə fonetik tîtcər,
August, 1888.
2. International Phonetic Association. 1989. Report on the 1989 Kiel Convention.
Journal of the International Phonetic Association 19:2, 67-80.
3. J. Allen, M. S. Hunnicutt and D. Klatt, "Klatt Symbols,"
From Text to Speech: The MITalk System, Cambridge University Press, 1987, p. 197.
4. G. D. Allen, "The PHONASCII system," J. IPA, 18:1, pp. 9-25, 1988.
5. Reference to Arpabet
6. S. Seneff and V. W. Zue, "Transcription and Alignment of the
TIMIT Database," DARPA TIMIT CD-ROM Documentation, 1988.
7. J. C. Wells, "Computer-coded phonemic notation of individual languages
of the European Community," J. IPA, 19, pp. 32-54, 1989.
8. J. Laver, M. Alexander, C. Bennett, I. Cohen, D. Davies, and J. Dalby,
"Speech Segmentation Criteria for the ATR/CSTR Database," Progress Report Number 1, 1988.
9. A. W. Bladon and J. C. Wells, "The Alvey Phonetic Alphabet," Proceedings
of the IEE Conference on Speech Input and Output, London, 1986.
10. J. Hieronymus, M. Alexander, C. Bennett, I. Cohen, D. Davies, J. Dalby, J. Laver, W. Barry,
A. Fourcin and J. Wells, "Speech Segmentation Criteria for the SCRIBE Project,"
SCRIBE Project Documentation, 1990, available from Speech Research Unit, DRA,
St. Andrews Road, Gr. Malvern, UK.
11. I. Maddieson, Patterns of Sound, Cambridge University Press, Cambridge, 1984.
12. E. Kirshenbaum, Second Draft, Article 8167 of sci.lang, Jan. 1993.
13. C. A. Ferguson and J. H. Greenberg, The Stanford Phonology Archive.
14. W. J. Berry and A. J. Fourcin, Computer Speech and Language, 6, 1-4, 1992.
Appendix A: Worldbet Symbols
VOWELS
Monophthongs
IPA WORLD TIMIT SAMPA JPO EXAMPLE
i: i: iy i: E beathigh front long English
i i - i - vierhigh front short Dutch
~
i i - - - nasalized high front
y y - y - turounded high front French
i I ih I i bitmid-high mid-front short English
~i I - - - nasalized I
i: I: - I: - long i
y Y - Y - funfrounded i German
[.] Ix ix [ - centralized i
[,] ix - - - high front mid unrounded Russian
[~.] Ix - - - nasalized centralized i
e e - - - th emid-high front French
~e e - - - nasalized e
7 - 2 - blodmid-high front rounded German
E eh e e betmid-low front short English
~ E - E - vinnasalized E French
8 - 9 - plotzlichmid-lowfront rounded German
~ 8 - 9 - brunnasalized 8 French
@ ae f a batmid-lowfront long English
~ @ - - - rounded @
ax - 6 - b esse rmid-lowcentral German
~ ax - - - nasalized ax
a a - a - pattelow front French
6 - & - rounded op en front
A aa A: oand@ Boblowback English
~ A - A - ventnasalized A French
5 - Q - potlowback rounded Br. English
V V
ah V butmid-low back English
A > ao O: > boughtmid-low mid-back English
V
~A > - O - bonnasalized French
$ 2 - - - geunrounded o Chinese
o o - o - lungemid-high back Danish
~o o - - - nasalized o
$ 2 - - - geunrounded o Chinese
Y U uh U u bookmid-high back short English
~Y U - - - nasalized U
u u uw u: U boothigh back English
u~ u - - - nasalized u
X ux ux g - suitfronted u English
high back unrounded Japanese : 4 - - - kutsu
:~ 4 - - - nasalized 4
& ax @ & ab outcentral short English
~ & - - - nasalized &
@ ox - - - rounded
3 - @: - birdcentral long Br. English
&0 ax-h @a - voiceless Japanese
3r er 3r R birdcentral retro exed am.english
central retro exed short Am. English &r axr =r - butter
:} 4r - - - retro exed : Chinese shi p o e
Diphthongs
IPA WORLD TIMIT SAMPA JPO EXAMPLE
ia ia - - - - Saek
ie ie - - - - Armenian
i i5 - iQ - kirke Danish
i iax - - - Tier German
y yax - - - Tur German
7ax - - - Gehor German
Eax - - - Gewahr German
e eax - - - Gewehr German
io io - - - - Kurdish
iu iu - iu - ivrig Dutch
i: i4 - - - - Japanese
i i& - I@ - here Br. English
yu yu - yu - syv Danish
ie ie - - - - Gilyak
ei ei ey eI A bait Am. English
ei eI - - - bait Br. English
eo eo - eo -
eu eu - eu - peber Danish
e e& - e@ - there Br. English
i Ei - - - jn Dutch
E5 - EQ - br Danish
u Eu - Eu - evne Danish
@5 - f Q - r Danish
u @u - f u - stvle Danish
Ai >i oy - Y boy Am. English
Ai >I - OI - boy Irish English
Ay >Y - OY - Kreuz German
A >& - O@ - pore Br. English
y 8y - 2y - huis Dutch
u 8u - 2u - vle Danish
ai ai - - - hegn Dutch
ai aI ay aI I buy English
ae ae - - -
e eax - - - Gewehr German
aY aU aw aU W down English
a:i a4 - - -
ao ao - - -
au au - - -
u Au - Au - goud Dutch
oi oi - - - noi Italian
oi oI - - - boy Irish English
oY oU ow @U O show English
oa oa - - -
o oax - - - Tor German
oY oU - - - gou Mandarin
ou ou - - -
o o5 - oQ - morsom Danish
ui ui - ui - huj Danish
ua ua - - -
u uax - - - Ruhr German
u u5 - uQ - hurtig Danish
uo uo - - -
u u& - U@ - poor Br. English
:i 4i - - -
:a 4a - - -
u &u - - -
Approximants and Trills
IPA WORLD TIMIT SAMPA JPO EXAMPLE
[ V[ - - - labio dental approximant English
j j j j y youpalatal approximant English
+ jw - H - juinvoiced labial-palatal approximant French
w w w w w witvoiced labial velar approximant English
\ W - W - whenvoiceless w Scottish
M 9 r r r rent British Eng
N 9r - - - rentretro exed approximant Am. Englis h
; 4 - - - velar approximant
b B - - - labial trill
r R - - - uvular trill
r r - rr - alveolar trill
L rr - - - alveolar retro exed tap
I r - - - r ap - not retro exed
r
P K - - - uvular fricativerond French
l l l l l letalveolar lateral approximant
l Lg - - - velar lateral approximant
` L - L - palatal lateral approximant
6 lr - - - retro exed lateral approximant
5 hl - - - alveolar lateral fricative
7 Zl - - - alveolar lateral voiced fricative
CONSONANTS
Plosives
IPA WORLD TIMIT SAMPA JPO EXAMPLE
pt pc pcl pc sp stopp closure English
p p p p p p otholderlabial voiceless plosive release English
[p*] p* - p! - p burst without aspiration
f
[p ] pP - pf - fricated p Irish Engl
p ph - - - p aspirated release
H
[p ] pH - - - p hyp er-aspirated release Hindi
r
p pr - - - p retro ex release
rH
[p ] pR - - - p hyp er-aspirated retro ex release Hindi
n
p pn - - - p nasal release Hindi
[p ] pN - - - p hyp er-aspirated nasal release Hindi
p' p> - - - p ejective release
}p p< - - - p implosive
tt t[ c - - - t[ with no release English
j
t t[ - - - dental voiceless plosive release
j
[t*] t[ * - - - t[ burst without aspiration
j
t t[ h - - - t[ aspirated release
j
n
t t[ n - - - t[ nasal release
j
t' t[ > - - - t[ ejective release
j
}t t[ < - - - t[ implosive
j
tt tc tcl tc st streetcart with no release English
t t t t t apico-alveolar voiceless plosive release English
[t*] t* - t! - t burst without aspiration
f
[t ] tT - tf - fricated t Irish Engl
[I ] t dx t< - ap or tap t Am. Engl
t
t th - - - t aspirated release
H
[t ] tH - - - t hyp er-aspirated release Hindi
U tr - - - t retro ex release
H
[U ] tR - - - t hyp er-aspirated retro ex release Hindi
n
t tn - - - t nasal release
nH
[t ] tN - - - t hyp er-aspirated nasal release Hindi
t' t> - - - t ejective release
}t t< - - - t implosive
ct cc - - - c with no release
c c - - - palatal voiceless plosive release
[c*] c* - - - c burst without aspiration
c ch - - - c aspirated release
cpt cp c - - - cp with no release
cp cp - - - labial palatal voiceless plosive release
[cp*] cp * - - - cp burst without aspiration
cp cp h - - - cp aspirated release
kt kc kcl - sk clo ckworkk with no release English
k k k k k cotvelar voiceless plosive release English
[k*] k* - k! - k burst without aspiration
f
] kK - kf - fricated k Irish Eng [k l
k kh - - - k aspirated release
n
k kn - - - k nasal release
H
[k ] kH - - - k hyp er-aspirated k release Hindi
rH
[k ] kR - - - k hyp er-aspirated retro ex release Hindi
[kⁿᴴ] kN - - - k hyper-aspirated nasal release Hindi
k' k> - - - k ejective release
ƙ k< - - - k implosive
kp̚ kp c - - - kp with no release
kp kp - - - labial velar voiceless plosive release
[kp*] kp * - - - kp burst without aspiration
kpʰ kp h - - - kp aspirated release
q̚ qc - - - q with no release English
q q - - - uvular voiceless plosive release English
[q*] q* - - - q burst without aspiration English
qʰ qh - - - q aspirated release
ʠ q< - - - q implosive
b̚ bc bcl bcv sb bobtail b with no release English
b b b b b boat labial voiced plosive release English
[b*] b* - b! - b burst without aspiration
[bᶠ] bB - bf - fricated b Irish English
bʰ bh - bh - b aspirated release
[bᴴ] bH - - - b hyper-aspirated release Hindi
bʳ br - - - b retroflex release
[bʳᴴ] bR - - - b hyper-aspirated retroflex release Hindi
bⁿ bn - - - b nasal release
[bⁿᴴ] bN - - - b hyper-aspirated nasal release Hindi
b' b> - - - b ejective release
ɓ b< - - - b implosive
[I ] b - - - flap or tap b
b
d̪̚ d[ c - - - d[ closure English
d̪ d[ - - - dental voiced plosive release English
[d̪*] d[ * - - - d[ burst without aspiration
d̪ʰ d[ h - - - d[ aspirated release
ɗ̪ d[ < - - - d[ implosive
d̚ dc dcl dcv sd bloodclot d closure English
d d d d d dock d release English
[d*] d* - d! - d burst without aspiration
[dᶠ] dD - df - fricated d Irish English
[I ] d dx d< - flap or tap d Am. English
d
dʰ dh - - - d aspirated release
[dᴴ] dH - - - d hyper-aspirated release Hindi
ɖ dr - - - d retroflex release
[dʳᴴ] dR - - - d hyper-aspirated retroflex release Hindi
dⁿ dn - - - d nasal release
[dⁿᴴ] dN - - - d hyper-aspirated nasal release Hindi
d' d> - - - d ejective release
ɗ d< - - - d implosive
ɟ̚ Jc - - - J with no release
ɟ J - - - palatal voiced plosive release
[ɟ*] J* - - - J burst without aspiration
ɟʰ Jh - - - J aspirated release
ɟb̚ Jb c - - - Jb with no release
ɟb Jb - - - labial palatal voiced plosive release
[ɟb*] Jb * - - - Jb burst without aspiration
ɟbʰ Jb h - - - Jb aspirated release
g̚ gc gcl sg g game g with no release English
g g g g g game velar voiced plosive release English
[g*] g* - g! - g burst without aspiration
[gᶠ] gG - gf - fricated g Irish English
gʰ gh - - - g aspirated release
[gᴴ] gH - - - g hyper-aspirated release Hindi
gʳ gr - - - g retroflex release
[gʳᴴ] gR - - - g hyper-aspirated retroflex release Hindi
gⁿ gn - - - g nasal release
[gⁿᴴ] gN - - - g hyper-aspirated nasal release Hindi
g' g> - - - g ejective
ɠ g< - - - g implosive
[I ] g - - - flap or tap g
g
gb̚ gb c - - - gb with no release
gb gb - - - labial velar voiced plosive release
[gb*] gb * - - - gb burst without aspiration
gbʰ gb h - - - gb aspirated release
ɢ̚ Q - - - Q with no release
ɢ Q - - - uvular voiced plosive release
[ɢ*] Q* - - - Q burst without aspiration
ɢʰ Qh - - - Q aspirated release
ʛ Q< - - - Q implosive
ʡ ?H - - - epiglottal plosive
ʔ ? q ? * glottal stop
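Because the plosive table pairs each TIMIT stop label with a Worldbet symbol, relabeling a TIMIT-style transcription is a table lookup. The following is a minimal sketch in Python that uses only the English closure and release rows and the glottal stop row of the table. The names `TIMIT_TO_WORLDBET` and `timit_to_worldbet` are illustrative assumptions and are not part of Worldbet.

```python
# Hypothetical helper: TIMIT stop labels mapped to the Worldbet symbols
# listed in the plosive table (closure rows and release rows only).
TIMIT_TO_WORLDBET = {
    "tcl": "tc", "t": "t",
    "kcl": "kc", "k": "k",
    "bcl": "bc", "b": "b",
    "dcl": "dc", "d": "d",
    "gcl": "gc", "g": "g",
    "q": "?",  # glottal stop
}


def timit_to_worldbet(labels):
    """Relabel TIMIT stop labels; labels not in the table pass through unchanged."""
    return [TIMIT_TO_WORLDBET.get(label, label) for label in labels]


print(timit_to_worldbet(["tcl", "t", "aa", "q"]))  # ['tc', 't', 'aa', '?']
```

Labels outside the table, such as the vowel in the example, are deliberately left untouched so that the sketch never guesses at a mapping it does not contain.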
Flaps and Taps
IPA WORLD TIMIT SAMPA JPO EXAMPLE
[I ] b - - - puraibe:tono voiced labial flap Japanese
b
[I ] d dx d< - ladder voiced alveolar flap English
d
[I ] g - - - voiced velar flap Japanese
g
[I ] t dx t< - batter voiceless alveolar flap English
t
[I ] k - - - voiceless velar flap Japanese
k
~I n nx - - banter English
O l - - - alveolar lateral flap
L rr - - - retroflex flap
I r - - - r flap - not retroflexed
r
Fricatives
IPA WORLD TIMIT SAMPA JPO EXAMPLE
ɸ F - - - labial voiceless fricative
f f f f f fat labio-dental voiceless fricative English
θ T th T T thing dental voiceless fricative English
s s s s s sat alveolar voiceless fricative English
ʂ sr - - - retroflexed alveolar fricative
ʃ S sh S S shut postalveolar voiceless fricative English
ɕ cg - - - alveolo-palatal voiceless fricative
ɬ hl - - - alveolar lateral voiceless fricative
β V - B - labial voiced fricative
v v v v v vat labio-dental voiced fricative English
ð D dh D D that dental voiced fricative English
z z z z z zoo alveolar voiced fricative English
ʐ zr - - - retroflexed voiced post-alveolar fricative
ʒ Z zh Z zh azure postalveolar voiced fricative English
ɮ Zl - - - alveolar lateral voiced fricative
ʑ zg - - - alveolo-palatal voiced fricative
ç C - C - ich palatal voiceless fricative German
ʝ j- - - - palatal voiced fricative
x x - - - ach velar voiceless fricative German
ɣ G - G - koge velar voiced fricative Danish
χ X - - - uvular voiceless fricative
ʁ K - - - rond uvular voiced fricative French
h h hh - - head glottal voiceless fricative h English
ɦ hv hv hv - glottal voiced fricative h
ħ HH - - - pharyngeal voiceless fricative
ʕ HH v - - - pharyngeal voiced fricative
ʜ H - - - epiglottal voiceless fricative
ʢ H v - - - voiced epiglottal fricative
Affricates
IPA WORLD TIMIT SAMPA JPO EXAMPLE
pɸ̚ pF c pcl - - bilabial voiceless affricate closure
pɸ pF - - - pF affricate burst frication
bβ̚ bVc bcl - - voiced bilabial affricate closure
bβ bV - - - bV affricate burst frication
pf̚ pf c pcl pfc - pf affricate closure
pf pf - pf - Pfahl German
bv̚ bv c bcl - - voiced labio-dental affricate closure
bv bv - - - bv affricate closure
ts̚ ts c tcl tsc - ts closure
ts ts - ts - Zahl German
tʂ̚ ts r c - - - alveolar retroflex affricate closure Mandarin
tʂ ts r - - - - Mandarin
dz̚ dz c dcl - - voiced apicoalveolar affricate closure
dz dz - - - dz burst frication
tʃ̚ tS c tcl tSc - tS closure
tʃ tS ch tS C church English
tʃ: tS: - - - chatri umbrella Hindi
dʒ̚ dZ c dcl dZvc - dZ closure
dʒ dZ jh dZ J judge English
dʒ: dZ : - - - jhari stream Hindi
cç̚ cC c - - - palatal voiceless affricate closure
cç cC - - - cC burst frication
ɟʝ̚ Jj c - - - palatal voiced affricate closure
ɟʝ Jj - - - Jj burst frication
kx̚ kx c - - - velar voiceless affricate closure
kx kx - - - velar voiceless affricate
gɣ̚ gG c - - - velar voiced affricate closure
gɣ gG - - - velar voiced affricate
qχ̚ qX c - - - uvular voiceless affricate closure
qχ qX - - - uvular voiceless affricate
ɢʁ̚ QK c - - - uvular voiced affricate closure
ɢʁ QK - - - uvular voiced affricate
Clicks
IPA WORLD TIMIT SAMPA JPO EXAMPLE
ʘ pj - - - bilabial click
ǀ j - - - dental click
ǃ tj - - - alveolar click
ǁ lj - - - alveolar lateral click
ǂ cj - - - palatal click
Nasals
IPA WORLD TIMIT SAMPA JPO EXAMPLE
m m m m m met labial nasal English
ɱ M - - - labio-dental nasal
n̪ n[ - - - dental nasal
n n n n n net alveolar nasal English
~I n nx - - nasal flap
ɳ nr - - - retroflexed nasal
nʲ nj - - - oignon palatalized alveolar nasal French
ɲ n - J - caños palatal nasal Spanish
ŋ N ng N ng sing velar nasal English
ɴ Nq - - - uvular nasal
Syllabics
IPA WORLD TIMIT SAMPA JPO EXAMPLE
l̩ l= el =l ll battle English
m̩ m= em =m mm bottom English
n̩ n= en =n nn button English
ŋ̩ N= eng =ng - camping French
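The syllabics table is small enough to show how one row per sound supports conversion between any two of the ASCII schemes. The sketch below, in Python, copies the four rows as printed above. The names `SYLLABICS`, `COLUMNS`, and `convert` are illustrative assumptions, and "-" is treated as "no symbol in that scheme".

```python
# Rows copied from the syllabics table: (WORLD, TIMIT, SAMPA, JPO).
# "-" marks a scheme that has no symbol for that sound.
SYLLABICS = [
    ("l=", "el", "=l", "ll"),
    ("m=", "em", "=m", "mm"),
    ("n=", "en", "=n", "nn"),
    ("N=", "eng", "=ng", "-"),
]
COLUMNS = {"WORLD": 0, "TIMIT": 1, "SAMPA": 2, "JPO": 3}


def convert(symbol, source, target):
    """Return the target-scheme symbol for `symbol`, or None if there is none."""
    for row in SYLLABICS:
        if row[COLUMNS[source]] == symbol:
            out = row[COLUMNS[target]]
            return None if out == "-" else out
    return None


print(convert("el", "TIMIT", "WORLD"))  # l=
print(convert("N=", "WORLD", "JPO"))    # None
```

Keeping whole rows rather than one dictionary per scheme pair means a new scheme only adds a column, at the cost of a linear scan that is negligible for tables of this size.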
Silences and Pauses
IPA WORLD TIMIT SAMPA JPO EXAMPLE
- + .epi + * epenthetic silence
- .pau * [pause]
- h [begin/end]
Non-speech Sounds
IPA WORLD TIMIT SAMPA JPO EXAMPLE
- .ls .ls - - lip smack
- .br .br - - breath
- .ct .ct - - clear throat
- .laugh .laugh - - laughter
- .tc .tc - - tongue click
- .cough .cough - - cough
- .sneeze .sneeze - - sneeze
- .ln .ln - - telephone line noise
- .ns .ns - - non-speech noise
- .bn .bn - - background noise
- .ring .ring - - telephone ring
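Every non-speech symbol in the table above begins with a period, so these events are easy to separate from phonetic labels when a transcription is processed. The following is a minimal sketch in Python. The set is copied from the WORLD column above, while the function name `split_labels` and the sample label sequence are illustrative assumptions.

```python
# WORLD-column symbols copied from the non-speech table.
NON_SPEECH = {
    ".ls", ".br", ".ct", ".laugh", ".tc", ".cough",
    ".sneeze", ".ln", ".ns", ".bn", ".ring",
}


def split_labels(labels):
    """Split a label sequence into (speech, non_speech) lists, preserving order."""
    speech = [lab for lab in labels if lab not in NON_SPEECH]
    noise = [lab for lab in labels if lab in NON_SPEECH]
    return speech, noise


speech, noise = split_labels([".br", "kc", "k", ".ls"])
print(speech)  # ['kc', 'k']
print(noise)   # ['.br', '.ls']
```

Matching against the explicit set, rather than against the leading period alone, keeps the silence symbols from the previous table out of the non-speech list.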
Worldbet Charts:
CONSONANTS
Manner of Articulation
Place Stop Nas Fricative Affricate Appr Lat Flap Tril Glottal