Class PhonemeTokenizer
- Namespace: AiDotNet.Tokenization.Specialized
- Assembly: AiDotNet.dll
Phoneme-based tokenizer for speech synthesis (TTS) applications.
public class PhonemeTokenizer : TokenizerBase, ITokenizer
- Inheritance: TokenizerBase → PhonemeTokenizer
- Implements: ITokenizer
- Inherited Members
Constructors
PhonemeTokenizer(IVocabulary, Dictionary<string, string>, SpecialTokens, PhonemeSet)
Creates a new phoneme tokenizer.
public PhonemeTokenizer(IVocabulary vocabulary, Dictionary<string, string> g2pRules, SpecialTokens specialTokens, PhonemeTokenizer.PhonemeSet phonemeSet = PhonemeSet.IPA)
Parameters
vocabulary (IVocabulary)
g2pRules (Dictionary<string, string>)
specialTokens (SpecialTokens)
phonemeSet (PhonemeTokenizer.PhonemeSet, optional; defaults to PhonemeSet.IPA)
Methods
CleanupTokens(List<string>)
Cleans up tokens and converts them back to text. This overrides the TokenizerBase method that derived classes must implement.
protected override string CleanupTokens(List<string> tokens)
Parameters
tokens (List<string>)
Returns
string
CreateARPAbet(SpecialTokens?)
Creates a phoneme tokenizer with ARPAbet phonemes.
public static PhonemeTokenizer CreateARPAbet(SpecialTokens? specialTokens = null)
Parameters
specialTokens (SpecialTokens?, optional; defaults to null)
Returns
PhonemeTokenizer
Tokenize(string)
Tokenizes text into phonemes.
public override List<string> Tokenize(string text)
Parameters
text (string)
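Example
The following is a minimal usage sketch, not an excerpt from the library's own documentation. It relies only on the signatures listed on this page: the CreateARPAbet factory (called without arguments, so specialTokens takes its default of null) and Tokenize(string). It assumes the AiDotNet assembly is referenced. The input sentence is arbitrary, and the exact phonemes returned depend on the tokenizer's built-in rules, so no output is shown.

```csharp
using System;
using System.Collections.Generic;
using AiDotNet.Tokenization.Specialized;

// Build a tokenizer through the static factory documented above.
// Passing no argument leaves specialTokens at its default (null).
PhonemeTokenizer tokenizer = PhonemeTokenizer.CreateARPAbet();

// Tokenize returns the phoneme tokens as a List<string>.
List<string> phonemes = tokenizer.Tokenize("hello world");

// Print the tokens separated by spaces; the exact symbols depend on
// the library's grapheme-to-phoneme rules.
Console.WriteLine(string.Join(" ", phonemes));
```

To use a custom vocabulary or custom grapheme-to-phoneme rules, call the public constructor instead. It takes an IVocabulary, a Dictionary<string, string> of rules, a SpecialTokens instance, and an optional PhonemeSet.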