Backends API
kokorog2p supports multiple phonemization backends.
espeak-ng Backend
- class kokorog2p.backends.espeak.EspeakBackend(language: str = 'en-us', with_stress: bool = True, tie: str = '^', use_cli: bool = False)
Bases: object
High-level espeak backend for Kokoro TTS phonemization.
This class provides a simple interface for converting text to phonemes using espeak-ng. It automatically converts espeak’s IPA output to Kokoro’s phoneme format.
- Example:
>>> backend = EspeakBackend("en-us")
>>> backend.phonemize("hello world")
'hˈɛlO wˈɜɹld'
- __init__(language: str = 'en-us', with_stress: bool = True, tie: str = '^', use_cli: bool = False) → None
Initialize the espeak backend.
- Args:
language: Language code (e.g., “en-us”, “en-gb”, “fr-fr”).
with_stress: Whether to include stress markers in the output.
tie: Tie character mode. “^” uses a tie character for affricates.
use_cli: If True, force use of the CLI phonemizer instead of the library.
- property wrapper: EspeakPhonemizerBase
Get the underlying Phonemizer instance (lazy initialization).
- remove_punctuation(text: str) → str
Remove punctuation from text before phonemization.
Preserves:
- Hyphens between letters: “my-world” → “my-world”
- Apostrophes between letters: “don’t” → “don’t”
- Single periods between letters: “Dr. Smith” → “Dr. Smith”
- Special symbols: @, #, etc.
Removes:
- Quote marks around words: “‘Hello’” → “Hello”
- Repeated punctuation: “Hello??” → “Hello?”
- Standalone punctuation: “!” → “”, “?” → “”
- Standalone dots: “..” → “”
- Ellipsis sequences: “I like this … . Hello.” → “I like this. Hello.”
- Special sequences: “I don’t like you.!” → “I don’t like you.”
Enforces:
- Single punctuation between words
- Space after punctuation: “Hello,world” → “Hello, world”
- Args:
text: Input text.
- Returns:
Text with punctuation cleaned.
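Two of the rules above ("repeated punctuation" and "space after punctuation") can be illustrated with a short regex sketch. This is not the library's implementation: tidy_punctuation is a hypothetical helper, and it covers only those two rules for a small set of marks.

```python
import re


def tidy_punctuation(text: str) -> str:
    """Hypothetical helper illustrating two of the documented rules."""
    # Collapse a run of the same mark: "Hello??" -> "Hello?"
    text = re.sub(r"([!?,;:])\1+", r"\1", text)
    # Insert a missing space when a mark is directly followed by a
    # word character: "Hello,world" -> "Hello, world"
    text = re.sub(r"([!?,;:])(?=\w)", r"\1 ", text)
    return text


print(tidy_punctuation("Hello??"))      # Hello?
print(tidy_punctuation("Hello,world"))  # Hello, world
print(tidy_punctuation("my-world"))     # my-world (hyphen is left alone)
```

Hyphens, apostrophes and periods are deliberately excluded from the character class, so inputs such as “don’t” or “Dr. Smith” pass through unchanged in this sketch.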
- phonemize(text: str, convert_to_kokoro: bool = True, remove_punctuation: bool = True) → str
Convert text to phonemes.
- Args:
text: Text to convert to phonemes.
convert_to_kokoro: If True, convert espeak IPA to Kokoro format. If False, return raw espeak IPA output.
remove_punctuation: If True, remove punctuation before phonemization.
- Returns:
Phoneme string.
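To see what convert_to_kokoro changes, the same text can be phonemized twice and the results compared. The sketch below relies only on the phonemize signature documented above; both_formats, Phonemizer and EchoBackend are hypothetical names, and the stub exists only so the snippet runs without espeak-ng installed.

```python
from typing import Protocol


class Phonemizer(Protocol):
    """Structural type matching the documented phonemize() signature."""

    def phonemize(
        self,
        text: str,
        convert_to_kokoro: bool = True,
        remove_punctuation: bool = True,
    ) -> str: ...


def both_formats(backend: Phonemizer, text: str) -> tuple[str, str]:
    """Return (Kokoro-format phonemes, raw espeak IPA) for the same text."""
    kokoro = backend.phonemize(text, convert_to_kokoro=True)
    raw_ipa = backend.phonemize(text, convert_to_kokoro=False)
    return kokoro, raw_ipa


class EchoBackend:
    """Stub backend: tags its output so the two calls can be told apart."""

    def phonemize(self, text, convert_to_kokoro=True, remove_punctuation=True):
        prefix = "kokoro" if convert_to_kokoro else "ipa"
        return f"{prefix}:{text}"


print(both_formats(EchoBackend(), "hello"))  # ('kokoro:hello', 'ipa:hello')
```

With kokorog2p and espeak-ng installed, passing EspeakBackend("en-us") in place of the stub should return the real pair of strings.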
- phonemize_list(texts: list[str], convert_to_kokoro: bool = True) → list[str]
Convert multiple texts to phonemes.
- Args:
texts: List of texts to convert.
convert_to_kokoro: If True, convert to Kokoro format.
- Returns:
List of phoneme strings.
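A batch call is often followed by pairing each input with its result. The sketch below assumes that phonemize_list returns one string per input, in input order; the reference above does not state this explicitly, so the helper checks the length. phonemize_pairs and UpperBackend are hypothetical names used for illustration.

```python
def phonemize_pairs(backend, texts: list[str]) -> list[tuple[str, str]]:
    """Pair each input text with its phoneme string."""
    results = backend.phonemize_list(texts, convert_to_kokoro=True)
    # Assumption: one result per input, in the same order.
    if len(results) != len(texts):
        raise ValueError("expected one phoneme string per input text")
    return list(zip(texts, results))


class UpperBackend:
    """Stub backend so the sketch runs without espeak-ng installed."""

    def phonemize_list(self, texts, convert_to_kokoro=True):
        return [t.upper() for t in texts]


print(phonemize_pairs(UpperBackend(), ["hello", "world"]))
# [('hello', 'HELLO'), ('world', 'WORLD')]
```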
- class kokorog2p.espeak_g2p.EspeakOnlyG2P(language: str = 'en-us', use_espeak_fallback: bool = True, use_goruut_fallback: bool = False, strict: bool = True, version: str = '1.0', **kwargs)
Bases: G2PBase
G2P implementation using only espeak-ng.
This is used for languages that don’t have dedicated dictionaries or custom G2P logic. It provides basic phonemization via espeak.
- Example:
>>> g2p = EspeakOnlyG2P("fr-fr")
>>> tokens = g2p("Bonjour le monde")
- VOICE_MAP = {'ar': 'ar', 'ar-sa': 'ar', 'bn': 'bn', 'bn-in': 'bn', 'cs': 'cs', 'cs-cz': 'cs', 'da': 'da', 'da-dk': 'da', 'de': 'de', 'de-de': 'de', 'el': 'el', 'el-gr': 'el', 'es': 'es', 'es-es': 'es', 'fa': 'fa', 'fa-ir': 'fa', 'fi': 'fi', 'fi-fi': 'fi', 'fr': 'fr-fr', 'fr-fr': 'fr-fr', 'he': 'he', 'he-il': 'he', 'hi': 'hi', 'hi-in': 'hi', 'hu': 'hu', 'hu-hu': 'hu', 'id': 'id', 'id-id': 'id', 'it': 'it', 'it-it': 'it', 'ms': 'ms', 'ms-my': 'ms', 'nb': 'nb', 'nb-no': 'nb', 'nl': 'nl', 'nl-nl': 'nl', 'no': 'nb', 'pl': 'pl', 'pl-pl': 'pl', 'pt': 'pt', 'pt-br': 'pt-br', 'pt-pt': 'pt', 'ro': 'ro', 'ro-ro': 'ro', 'ru': 'ru', 'ru-ru': 'ru', 'sv': 'sv', 'sv-se': 'sv', 'ta': 'ta', 'ta-in': 'ta', 'th': 'th', 'th-th': 'th', 'tr': 'tr', 'tr-tr': 'tr', 'uk': 'uk', 'uk-ua': 'uk', 'vi': 'vi', 'vi-vn': 'vi'}
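The map above sends both bare codes and regional codes to an espeak voice name. How the class consults it is not described here, so the following is only a sketch: resolve_voice is a hypothetical helper, and both the normalization step and the pass-through for unknown codes are assumptions.

```python
# Small excerpt of the VOICE_MAP shown above.
VOICE_MAP_EXCERPT = {
    "fr": "fr-fr",
    "fr-fr": "fr-fr",
    "no": "nb",
    "pt-br": "pt-br",
    "pt-pt": "pt",
    "de-de": "de",
}


def resolve_voice(language: str, voice_map: dict[str, str]) -> str:
    """Hypothetical resolver: normalize the code, then look it up."""
    key = language.strip().lower().replace("_", "-")
    # Assumption: codes missing from the map are passed through unchanged.
    return voice_map.get(key, key)


print(resolve_voice("no", VOICE_MAP_EXCERPT))     # nb
print(resolve_voice("PT_BR", VOICE_MAP_EXCERPT))  # pt-br
print(resolve_voice("en-us", VOICE_MAP_EXCERPT))  # en-us
```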
- __init__(language: str = 'en-us', use_espeak_fallback: bool = True, use_goruut_fallback: bool = False, strict: bool = True, version: str = '1.0', **kwargs) → None
Initialize the espeak-only G2P.
- Args:
language: Language code (e.g., ‘fr-fr’, ‘de-de’).
use_espeak_fallback: Ignored (always uses espeak).
strict: If True (default), raise exceptions on errors. If False, log warnings and return empty results for backward compatibility.
version: Model version (default: “1.0”).
**kwargs: Additional arguments (ignored).
- property espeak_backend
Lazy initialization of espeak backend.
- Raises:
RuntimeError: If espeak backend initialization or validation fails.
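The lazy-initialization pattern described above can be sketched in a few lines. This is a generic illustration rather than the library's code: LazyBackendHolder and its factory argument are hypothetical, and wrapping every setup failure in RuntimeError is an assumption based on the Raises note.

```python
class LazyBackendHolder:
    """Hypothetical stand-in for a class with a lazily built backend."""

    def __init__(self, factory):
        self._factory = factory  # callable that builds the backend
        self._backend = None     # nothing is created at construction time

    @property
    def backend(self):
        # Build the backend on first access, then reuse the same instance.
        if self._backend is None:
            try:
                self._backend = self._factory()
            except Exception as exc:
                raise RuntimeError(f"backend initialization failed: {exc}") from exc
        return self._backend


calls = []


def make_backend():
    calls.append("built")
    return object()


holder = LazyBackendHolder(make_backend)
first = holder.backend
second = holder.backend
print(len(calls), first is second)  # 1 True
```

The practical consequence is that constructing the object is cheap, and a missing or broken engine only surfaces on first use.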
- __call__(text: str) → list[GToken]
Convert text to tokens with phonemes.
- Args:
text: Input text to convert.
- Returns:
List of GToken objects with phonemes.
- lookup(word: str, tag: str | None = None) → str | None
Look up a word using espeak.
- Args:
word: The word to look up.
tag: Optional POS tag (ignored for espeak).
- Returns:
Phoneme string from espeak, or None if strict=False and error occurs.
- Raises:
RuntimeError: If espeak backend fails and strict=True.
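The strict and lenient behaviours documented for lookup can be sketched as a small wrapper. This is an illustration of the contract only: lookup_word and its phonemize argument are hypothetical, and the exact log message is an assumption.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def lookup_word(
    word: str,
    phonemize: Callable[[str], str],
    strict: bool = True,
) -> Optional[str]:
    """Return phonemes for word; on failure, raise or return None."""
    try:
        return phonemize(word)
    except Exception as exc:
        if strict:
            # Strict mode: surface the failure to the caller.
            raise RuntimeError(f"lookup failed for {word!r}") from exc
        # Lenient mode: log a warning and return an empty result.
        logger.warning("lookup failed for %r: %s", word, exc)
        return None


def broken_engine(word: str) -> str:
    raise OSError("engine unavailable")


print(lookup_word("hola", str.upper))                    # HOLA
print(lookup_word("hola", broken_engine, strict=False))  # None
```

Callers that use strict=False therefore need to handle None, while callers that keep the default should be prepared to catch RuntimeError.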
goruut Backend
- class kokorog2p.goruut_g2p.GoruutOnlyG2P(language: str = 'en-us', use_espeak_fallback: bool = False, use_goruut_fallback: bool = True, strict: bool = True, version: str = '1.0', **kwargs)
Bases: G2PBase
G2P implementation using only pygoruut/goruut.
This is used as an alternative to espeak for languages that pygoruut supports well. It provides phonemization via the goruut engine.
- Example:
>>> g2p = GoruutOnlyG2P("fr")
>>> tokens = g2p("Bonjour le monde")
- __init__(language: str = 'en-us', use_espeak_fallback: bool = False, use_goruut_fallback: bool = True, strict: bool = True, version: str = '1.0', **kwargs) → None
Initialize the goruut-only G2P.
- Args:
language: Language code (e.g., ‘fr’, ‘de’, ‘en-us’).
use_espeak_fallback: Ignored (always uses goruut).
use_goruut_fallback: Ignored (always uses goruut).
strict: If True (default), raise exceptions on errors. If False, log warnings and return empty results for backward compatibility.
version: Model version (default: “1.0”).
**kwargs: Additional arguments (ignored).
- property goruut_backend
Lazy initialization of goruut backend.
- __call__(text: str) → list[GToken]
Convert text to tokens with phonemes.
- Args:
text: Input text to convert.
- Returns:
List of GToken objects with phonemes.
- lookup(word: str, tag: str | None = None) → str | None
Look up a word using goruut.
- Args:
word: The word to look up.
tag: Optional POS tag (ignored for goruut).
- Returns:
Phoneme string from goruut, or None if strict=False and error occurs.
- Raises:
RuntimeError: If goruut backend fails and strict=True.
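Because both lookup methods either raise RuntimeError (strict) or return None (lenient) on failure, a caller could try several engines in turn. The helper below is a hypothetical sketch of that idea and is not part of kokorog2p; it accepts any callables with the lookup(word) shape.

```python
from typing import Callable, Iterable, Optional


def first_successful(
    word: str,
    lookups: Iterable[Callable[[str], Optional[str]]],
) -> Optional[str]:
    """Return the first non-None result, skipping engines that fail."""
    for lookup in lookups:
        try:
            result = lookup(word)
        except RuntimeError:
            continue  # strict-mode failure: try the next engine
        if result is not None:
            return result
    return None  # every engine failed or returned None


def failing(word: str) -> Optional[str]:
    raise RuntimeError("engine unavailable")


def lenient_miss(word: str) -> Optional[str]:
    return None


print(first_successful("bonjour", [failing, lenient_miss, str.upper]))  # BONJOUR
```

With both engines installed, the list could hold GoruutOnlyG2P("fr").lookup followed by EspeakOnlyG2P("fr-fr").lookup.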
Examples
espeak Backend
from kokorog2p.backends.espeak import EspeakBackend
backend = EspeakBackend(language="en-us")
phonemes = backend.phonemize("hello")
print(phonemes)
Using espeak-only G2P
from kokorog2p.espeak_g2p import EspeakOnlyG2P
# Strict mode (default) - raises errors if espeak fails
g2p = EspeakOnlyG2P(language="es-es", strict=True)
tokens = g2p("Hola mundo")
for token in tokens:
    print(f"{token.text} -> {token.phonemes}")
# Lenient mode - returns empty on errors (backward compatible)
g2p_lenient = EspeakOnlyG2P(language="es-es", strict=False)
tokens = g2p_lenient("Hola mundo")
Using goruut Backend
from kokorog2p.goruut_g2p import GoruutOnlyG2P
# Strict mode (default)
g2p = GoruutOnlyG2P(language="en-us", strict=True)
tokens = g2p("Hello world")
for token in tokens:
    print(f"{token.text} -> {token.phonemes}")
# Lenient mode
g2p_lenient = GoruutOnlyG2P(language="en-us", strict=False)
tokens = g2p_lenient("Hello world")
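Joining Token Phonemes

In lenient mode some tokens may come back without phonemes, so code that builds a single phoneme string may want to skip them. The sketch below assumes only the text and phonemes attributes used in the examples above; FakeToken is a hypothetical stand-in for GToken, and treating missing phonemes as None or an empty string is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class FakeToken:
    """Stand-in for GToken with only the two attributes used above."""

    text: str
    phonemes: Optional[str]


def join_phonemes(tokens: Iterable[FakeToken]) -> str:
    """Join phonemes with spaces, skipping tokens that have none."""
    return " ".join(t.phonemes for t in tokens if t.phonemes)


tokens = [
    FakeToken("Hola", "ˈola"),
    FakeToken("???", None),  # e.g. a token the engine could not handle
    FakeToken("mundo", "mˈundo"),
]
print(join_phonemes(tokens))  # ˈola mˈundo
```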