Phoneme Inventory

kokorog2p uses the Kokoro TTS phoneme inventory, which is based on IPA (International Phonetic Alphabet).

Kokoro Phoneme Set

The Kokoro phoneme set consists of 45 phonemes plus punctuation markers.

US English Vowels

Monophthongs

IPA

Name

Example Word

Phonemes

æ

TRAP

cat

kˈæt

ɑ

LOT/PALM

father

fˈɑðɚ

ə

schwa

about

əbˈaʊt

ɚ

r-colored schwa

butter

bˈʌtɚ

ɛ

DRESS

bed

bˈɛd

ɪ

KIT

bit

bˈɪt

i

FLEECE

beat

bˈit

ʊ

FOOT

book

bˈʊk

u

GOOSE

boot

bˈut

ʌ

STRUT

cut

kˈʌt

ɔ

THOUGHT

caught

kˈɔt

Diphthongs

IPA

Name

Example Word

Phonemes

PRICE

buy

bˈaɪ

MOUTH

cow

kˈaʊ

FACE

day

dˈeɪ

GOAT

go

ɡˈoʊ

ɔɪ

CHOICE

boy

bˈɔɪ

Special Vowels

Symbol

Name

Example Word

Phonemes

O

GOAT (Kokoro)

go

ɡˈO

Note: O is a Kokoro-specific simplification of the GOAT vowel, used instead of in some contexts.

Consonants

Stops

IPA

Name

Example Word

Phonemes

p

voiceless bilabial

pat

pˈæt

b

voiced bilabial

bat

bˈæt

t

voiceless alveolar

tap

tˈæp

d

voiced alveolar

dad

dˈæd

k

voiceless velar

cat

kˈæt

ɡ

voiced velar

gap

ɡˈæp

Fricatives

IPA

Name

Example Word

Phonemes

f

voiceless labiodental

fat

fˈæt

v

voiced labiodental

vat

vˈæt

θ

voiceless dental

thin

θˈɪn

ð

voiced dental

this

ðˈɪs

s

voiceless alveolar

sip

sˈɪp

z

voiced alveolar

zip

zˈɪp

ʃ

voiceless postalveolar

ship

ʃˈɪp

ʒ

voiced postalveolar

measure

mˈɛʒɚ

h

glottal

hat

hˈæt

Affricates

IPA

Name

Example Word

Phonemes

voiceless postalveolar

church

tʃˈɚtʃ

voiced postalveolar

judge

dʒˈʌdʒ

Nasals

IPA

Name

Example Word

Phonemes

m

bilabial

map

mˈæp

n

alveolar

nap

nˈæp

ŋ

velar

sing

sˈɪŋ

Liquids

IPA

Name

Example Word

Phonemes

l

lateral

lap

lˈæp

ɹ

approximant

rap

ɹˈæp

Glides

IPA

Name

Example Word

Phonemes

w

labial-velar

wap

wˈæp

j

palatal

yap

jˈæp

Suprasegmentals

Symbol

Name

Usage

ˈ

Primary stress

Placed before the stressed syllable: **ˈ**bæt

ˌ

Secondary stress

Placed before secondarily stressed syllable: ˌɹɛkəɡnˈɪʃən

Punctuation

Kokoro supports the following punctuation marks:

;  :  ,  .  !  ?  -

Other punctuation is typically normalized or removed.

British English Differences

British English uses most of the same phonemes, with some key differences:

Phoneme

US Example

GB Example

Difference

ɑ → ɒ

lˈɑt (lot)

lˈɒt (lot)

LOT vowel

ɔ → ɒ

kˈɔt (caught)

kˈɒt (caught)

THOUGHT vowel

ɚ → ə

bˈʌtɚ (butter)

bˈʌtə (butter)

Non-rhotic

oʊ → əʊ

ɡˈoʊ (go)

ɡˈəʊ (go)

GOAT vowel

German Phonemes

German adds several phonemes not found in English:

Vowels

IPA

Name

Example Word

Notes

y

close front rounded

Tür

German ü (long)

ʏ

near-close front rounded

Hütte

German ü (short)

ø

close-mid front rounded

schön

German ö (long)

œ

open-mid front rounded

Köln

German ö (short)

long a

Vater

Length distinction

long e

gehen

Length distinction

Consonants

IPA

Name

Example Word

Notes

ç

voiceless palatal fricative

ich

ich-Laut

x

voiceless velar fricative

ach

ach-Laut

ʁ

uvular fricative

rot

German r

p͡f

voiceless labiodental affricate

Pferd

pf sound

t͡s

voiceless alveolar affricate

Zeit

z sound

French Phonemes

French includes:

IPA

Name

Example Word

Notes

ɛ̃

nasalized open-mid front

vin

Nasal vowel

ɑ̃

nasalized open back

sans

Nasal vowel

ɔ̃

nasalized open-mid back

bon

Nasal vowel

œ̃

nasalized open-mid front rounded

un

Nasal vowel

ʁ

uvular fricative

rue

French r

Czech Phonemes

Czech includes:

IPA

Name

Example Word

Notes

raised alveolar trill

řeka

ř sound

ɟ

voiced palatal stop

ďábel

Palatalized d

c

voiceless palatal stop

ťava

Palatalized t

ɲ

palatal nasal

nic

Palatalized n

ɦ

voiced glottal fricative

hrad

Voiced h

Working with Phonemes

Validation

from kokorog2p import validate_phonemes

# Valid Kokoro phonemes
assert validate_phonemes("hˈɛlO wˈɜɹld")

# Invalid phonemes
assert not validate_phonemes("xyz123")

Getting Vocabulary

from kokorog2p import get_vocab

# Get US English vocabulary
us_vocab = get_vocab("us")
print(f"US vocabulary: {len(us_vocab)} phonemes")

# Get GB English vocabulary
gb_vocab = get_vocab("gb")
print(f"GB vocabulary: {len(gb_vocab)} phonemes")

# Print all phonemes
for phoneme in us_vocab:
    print(phoneme)

Conversion Between Formats

espeak to Kokoro

from kokorog2p import from_espeak

# Convert espeak IPA to Kokoro format
espeak_ipa = "həlˈəʊ"
kokoro = from_espeak(espeak_ipa, variant="gb")
print(kokoro)  # hˈɛlO or similar

Kokoro to espeak

from kokorog2p import to_espeak

# Convert Kokoro to espeak IPA
kokoro = "hˈɛlO"
espeak = to_espeak(kokoro, variant="us")
print(espeak)

References