The Carnegie Mellon Pronouncing Dictionary [cmudict.0.7a]

ftp://ftp.cs.cmu.edu/project/speech/dict/
https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/cmudict/cmudict.0.7a

Copyright (C) 1993-2008 Carnegie Mellon University. All rights reserved.

File Format: Each line consists of an uppercased word,
a counter (for alternative pronunciations), and a transcription.
Vowels are marked for stress (1=primary, 2=secondary, 0=no stress).
E.g.: NATURAL 1 N AE1 CH ER0 AH0 L

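The line format described above can be read with a few lines of Python.
This is only a minimal sketch: the names Entry, parse_entry and
stress_pattern are hypothetical helpers made up for illustration, not part
of NLTK or of any CMU tool, and the code assumes fields are separated by
whitespace exactly as in the NATURAL example.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    """One dictionary line (hypothetical container, for illustration only)."""
    word: str          # uppercased headword, e.g. "NATURAL"
    variant: int       # counter for alternative pronunciations (1, 2, ...)
    phones: List[str]  # transcription symbols; vowels end in a stress digit


def parse_entry(line: str) -> Entry:
    """Split a 'WORD COUNTER PHONE PHONE ...' line into its three fields."""
    word, counter, *phones = line.split()
    return Entry(word, int(counter), phones)


def stress_pattern(phones: List[str]) -> List[int]:
    """Return the stress digit of each vowel, in order of appearance."""
    return [int(p[-1]) for p in phones if p[-1].isdigit()]


entry = parse_entry("NATURAL 1 N AE1 CH ER0 AH0 L")
print(entry.word, entry.variant, entry.phones)
print(stress_pattern(entry.phones))  # [1, 0, 0]
```

Only vowels carry a digit, so the stress pattern has one item per vowel
(three for NATURAL) and consonants are skipped.
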
The dictionary contains 127069 entries. Of these, 119400 words are assigned
a unique pronunciation, 6830 words have two pronunciations, and 839 words have
three or more pronunciations. Many of these are fast-speech variants.

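The three figures above add up to the stated total (119400 + 6830 + 839 =
127069). A tally of this kind could be computed roughly as follows. This is
a sketch under stated assumptions: SAMPLE is a small illustrative list of
lines in the format described earlier, not the real file, and
variant_histogram is a hypothetical helper name.

```python
from collections import Counter

# Illustrative sample lines only (assumption: same 'WORD COUNTER PHONES'
# layout as the dictionary file; this is not the real data set).
SAMPLE = [
    "FIRE 1 F AY1 ER0",
    "FIRE 2 F AY1 R",
    "FIRE'S 1 F AY1 ER0 Z",
    "NATURAL 1 N AE1 CH ER0 AH0 L",
]


def variant_histogram(lines):
    """Map 'number of pronunciations' to 'number of words with that many'."""
    # First count how many lines each headword has ...
    per_word = Counter(line.split()[0] for line in lines if line.strip())
    # ... then count how many words share each line count.
    return Counter(per_word.values())


hist = variant_histogram(SAMPLE)
one = hist[1]
two = hist[2]
three_or_more = sum(n for k, n in hist.items() if k >= 3)
print(one, two, three_or_more)  # 2 1 0
```
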
Phonemes: There are 39 phonemes, as shown below:

Phoneme Example Translation    Phoneme Example Translation
------- ------- -----------    ------- ------- -----------
AA      odd     AA D           AE      at      AE T
AH      hut     HH AH T        AO      ought   AO T
AW      cow     K AW           AY      hide    HH AY D
B       be      B IY           CH      cheese  CH IY Z
D       dee     D IY           DH      thee    DH IY
EH      Ed      EH D           ER      hurt    HH ER T
EY      ate     EY T           F       fee     F IY
G       green   G R IY N       HH      he      HH IY
IH      it      IH T           IY      eat     IY T
JH      gee     JH IY          K       key     K IY
L       lee     L IY           M       me      M IY
N       knee    N IY           NG      ping    P IH NG
OW      oat     OW T           OY      toy     T OY
P       pee     P IY           R       read    R IY D
S       sea     S IY           SH      she     SH IY
T       tea     T IY           TH      theta   TH EY T AH
UH      hood    HH UH D        UW      two     T UW
V       vee     V IY           W       we      W IY
Y       yield   Y IY L D       Z       zee     Z IY
ZH      seizure S IY ZH ER

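The phoneme inventory above can double as a sanity check on
transcriptions. The sketch below is illustrative only: PHONEMES is typed in
from the table, and strip_stress and unknown_phones are hypothetical helper
names. It assumes that stress digits appear only as a single trailing 0, 1
or 2, as in the file format description.

```python
# The 39 symbols from the table above, without stress digits.
PHONEMES = frozenset(
    "AA AE AH AO AW AY B CH D DH EH ER EY F G HH IH IY JH K L M N NG "
    "OW OY P R S SH T TH UH UW V W Y Z ZH".split()
)
assert len(PHONEMES) == 39


def strip_stress(phone: str) -> str:
    """Drop a trailing stress digit: 'AE1' becomes 'AE', 'CH' is unchanged."""
    return phone.rstrip("012")


def unknown_phones(phones):
    """Return the symbols that are not in the 39-phoneme inventory."""
    return [p for p in phones if strip_stress(p) not in PHONEMES]


print(unknown_phones("N AE1 CH ER0 AH0 L".split()))  # []
print(unknown_phones(["K", "XX1"]))                  # ['XX1']
```
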
(For NLTK, entries have been sorted so that, e.g. FIRE 1 and FIRE 2
are contiguous, and not separated by FIRE'S 1.)

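One way to obtain an ordering like this is to sort on the pair (word,
counter) instead of on the raw line. This is only a sketch of the idea, not
necessarily the procedure that was used for the NLTK copy; sort_key is a
hypothetical name and the three lines are illustrative.

```python
def sort_key(line: str):
    """Order by headword first, then by the numeric variant counter."""
    word, counter = line.split()[:2]
    return (word, int(counter))


lines = [
    "FIRE 1 F AY1 ER0",
    "FIRE'S 1 F AY1 ER0 Z",
    "FIRE 2 F AY1 R",
]

ordered = sorted(lines, key=sort_key)
for line in ordered:
    print(line)
# FIRE 1 F AY1 ER0
# FIRE 2 F AY1 R
# FIRE'S 1 F AY1 ER0 Z
```

Because the word is compared as a whole before the counter, all variants of
FIRE come before FIRE'S.
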
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
   The contents of this file are deemed to be source code.

2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in
   the documentation and/or other materials provided with the
   distribution.

This work was supported in part by funding from the Defense Advanced
Research Projects Agency, the Office of Naval Research and the National
Science Foundation of the United States of America, and by member
companies of the Carnegie Mellon Sphinx Speech Consortium. We acknowledge
the contributions of many volunteers to the expansion and improvement of
this dictionary.

THIS SOFTWARE IS PROVIDED BY CARNEGIE MELLON UNIVERSITY ``AS IS'' AND
ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL CARNEGIE MELLON UNIVERSITY
NOR ITS EMPLOYEES BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.