Class TtsOptions

Namespace
AiDotNet.Audio.TextToSpeech
Assembly
AiDotNet.dll

Configuration options for text-to-speech models.

public class TtsOptions
Inheritance
TtsOptions
Remarks

TTS (Text-to-Speech) converts written text into natural-sounding speech. Many modern TTS systems use a two-stage pipeline:

1. Acoustic Model (e.g., FastSpeech2): Converts text to a mel spectrogram.
2. Vocoder (e.g., HiFi-GAN): Converts the mel spectrogram to an audio waveform.

For Beginners: Think of TTS as the opposite of speech recognition. The acoustic model decides "what should this text sound like" (intonation, timing), and the vocoder makes it actually sound like speech.

Properties

AcousticModelPath

Gets or sets the path to the acoustic model (FastSpeech2) ONNX file.

public string? AcousticModelPath { get; set; }

Property Value

string

Energy

Gets or sets the energy/volume level. 1.0 = normal.

public double Energy { get; set; }

Property Value

double

FftSize

Gets or sets the FFT size.

public int FftSize { get; set; }

Property Value

int
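
As a rough illustration of what an FFT size implies, the sketch below computes the number of one-sided frequency bins and the spacing between bins. The helper names and the example values are hypothetical; they are not part of AiDotNet and are not the library's defaults.

```csharp
using System;

// Hypothetical helper (not an AiDotNet API): number of frequency bins in the
// one-sided spectrum of a real-valued FFT of the given size.
static int FrequencyBins(int fftSize) => fftSize / 2 + 1;

// Hypothetical helper: spacing between adjacent FFT bins, in Hz.
static double BinWidthHz(int fftSize, int sampleRate) => (double)sampleRate / fftSize;

// Example values chosen for illustration only.
int fftSize = 1024;
int sampleRate = 22050;

Console.WriteLine($"Frequency bins: {FrequencyBins(fftSize)}");
Console.WriteLine($"Bin width (Hz): {BinWidthHz(fftSize, sampleRate)}");
```

A larger FFT size gives finer frequency resolution at the cost of coarser time resolution per frame.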

GriffinLimIterations

Gets or sets the number of Griffin-Lim iterations. Applies only when the Griffin-Lim vocoder is used (see UseGriffinLimFallback).

public int GriffinLimIterations { get; set; }

Property Value

int

HopLength

Gets or sets the hop length.

public int HopLength { get; set; }

Property Value

int
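
The hop length ties the number of mel frames to the length of the generated audio. The sketch below assumes each mel frame advances the output by one hop length in samples, which is the usual convention but is not confirmed by this page. The helper names and example values are hypothetical.

```csharp
using System;

// Hypothetical helper (not an AiDotNet API): approximate number of audio samples
// produced from a mel spectrogram, assuming one hop of samples per frame.
static long SamplesForFrames(int frames, int hopLength) => (long)frames * hopLength;

// Hypothetical helper: approximate audio duration in seconds for that many frames.
static double DurationSeconds(int frames, int hopLength, int sampleRate) =>
    (double)SamplesForFrames(frames, hopLength) / sampleRate;

// Example values chosen for illustration only; they are not the library's defaults.
int frames = 200;
int hopLength = 256;
int sampleRate = 22050;

Console.WriteLine($"Samples: {SamplesForFrames(frames, hopLength)}");
Console.WriteLine($"Duration (s): {DurationSeconds(frames, hopLength, sampleRate)}");
```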

Language

Gets or sets the language code for multilingual models.

public string? Language { get; set; }

Property Value

string

NumMels

Gets or sets the number of mel channels.

public int NumMels { get; set; }

Property Value

int

OnnxOptions

Gets or sets the ONNX execution options.

public OnnxModelOptions OnnxOptions { get; set; }

Property Value

OnnxModelOptions

PitchShift

Gets or sets the pitch shift in semitones. 0 = normal, negative = lower, positive = higher.

public double PitchShift { get; set; }

Property Value

double
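
For intuition about semitone values, the sketch below converts a shift in semitones to a frequency ratio using the standard equal-temperament relation, ratio = 2^(semitones / 12). How AiDotNet applies PitchShift internally is not stated here; the helper name is hypothetical.

```csharp
using System;

// Hypothetical helper (not an AiDotNet API): frequency ratio for a pitch shift
// expressed in equal-tempered semitones. 12 semitones is one octave (ratio 2).
static double SemitonesToRatio(double semitones) => Math.Pow(2.0, semitones / 12.0);

// 0 leaves pitch unchanged, negative values lower it, positive values raise it.
foreach (double shift in new[] { -12.0, -2.0, 0.0, 2.0, 12.0 })
{
    Console.WriteLine($"{shift} semitones -> ratio {SemitonesToRatio(shift)}");
}
```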

SampleRate

Gets or sets the output sample rate.

public int SampleRate { get; set; }

Property Value

int

SpeakerId

Gets or sets the speaker ID for multi-speaker models.

public int? SpeakerId { get; set; }

Property Value

int?

SpeakingRate

Gets or sets the speaking rate multiplier. 1.0 = normal speed, 0.5 = half speed, 2.0 = double speed.

public double SpeakingRate { get; set; }

Property Value

double
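
Because the value is a speed multiplier, the expected audio duration scales inversely with it. The sketch below shows that relation under the assumption that the rate scales all durations uniformly; the helper name is hypothetical and the library's actual behavior may differ.

```csharp
using System;

// Hypothetical helper (not an AiDotNet API): expected duration after applying a
// speaking-rate multiplier, assuming the rate scales all durations uniformly.
static double ScaledDurationSeconds(double baseSeconds, double speakingRate)
{
    if (speakingRate <= 0.0)
    {
        throw new ArgumentOutOfRangeException(nameof(speakingRate), "Rate must be positive.");
    }

    // 2.0 (double speed) halves the duration; 0.5 (half speed) doubles it.
    return baseSeconds / speakingRate;
}

double baseSeconds = 4.0; // illustrative duration at normal speed (rate 1.0)
foreach (double rate in new[] { 0.5, 1.0, 2.0 })
{
    Console.WriteLine($"rate {rate} -> {ScaledDurationSeconds(baseSeconds, rate)} s");
}
```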

UseGriffinLimFallback

Gets or sets whether to use Griffin-Lim as a fallback vocoder.

public bool UseGriffinLimFallback { get; set; }

Property Value

bool

VocoderModelPath

Gets or sets the path to the vocoder (HiFi-GAN) ONNX file.

public string? VocoderModelPath { get; set; }

Property Value

string