Class TtsModel<T>

Namespace
AiDotNet.Audio.TextToSpeech
Assembly
AiDotNet.dll

Text-to-speech model for synthesizing speech from text.

public class TtsModel<T> : AudioNeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable, ITextToSpeech<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>

Type Parameters

T

The numeric type used for calculations.

Inheritance
AudioNeuralNetworkBase<T>
TtsModel<T>
Implements
INeuralNetworkModel<T>
INeuralNetwork<T>
IInterpretableModel<T>
IInputGradientComputable<T>
IDisposable
ITextToSpeech<T>
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IModelSerializer
ICheckpointableModel
IParameterizable<T, Tensor<T>, Tensor<T>>
IFeatureAware
IFeatureImportance<T>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>
IJitCompilable<T>

Remarks

This TTS model uses a two-stage pipeline:

  1. Acoustic Model (FastSpeech2): Converts text/phonemes to a mel spectrogram
  2. Vocoder (HiFi-GAN or Griffin-Lim): Converts the mel spectrogram to an audio waveform

For Beginners: Text-to-Speech works like this:

  1. Your text is converted to phonemes (speech sounds)
  2. The acoustic model predicts what the speech should "look like" (a mel spectrogram)
  3. The vocoder makes it actually sound like speech

This class supports two modes:

  • ONNX Mode: Load pretrained FastSpeech2/HiFi-GAN models for instant synthesis
  • Native Mode: Train your own TTS model from scratch

Usage (ONNX Mode):

var tts = new TtsModel<float>(
    architecture,
    acousticModelPath: "path/to/fastspeech2.onnx",
    vocoderModelPath: "path/to/hifigan.onnx");

var audio = tts.Synthesize("Hello, world!");

Usage (Native Training Mode):

var tts = new TtsModel<float>(
    architecture,
    optimizer: new AdamOptimizer<float>(),
    lossFunction: new MeanSquaredErrorLoss<float>());

tts.Train(phonemeInput, expectedMelSpectrogram);

Constructors

TtsModel(NeuralNetworkArchitecture<T>, int, int, double, double, double, int?, string?, int, int, int, int, int, int, int, int, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>?, ILossFunction<T>?)

Creates a TtsModel for native training mode.

public TtsModel(NeuralNetworkArchitecture<T> architecture, int sampleRate = 22050, int numMels = 80, double speakingRate = 1, double pitchShift = 0, double energy = 1, int? speakerId = null, string? language = null, int hiddenDim = 256, int numHeads = 4, int numEncoderLayers = 4, int numDecoderLayers = 4, int maxPhonemeLength = 256, int fftSize = 1024, int hopLength = 256, int griffinLimIterations = 60, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>? optimizer = null, ILossFunction<T>? lossFunction = null)

Parameters

architecture NeuralNetworkArchitecture<T>

The neural network architecture configuration.

sampleRate int

Output sample rate in Hz. Default is 22050 (standard for TTS).

numMels int

Number of mel spectrogram channels. Default is 80.

speakingRate double

Speaking rate multiplier. 1.0 = normal speed. Default is 1.0.

pitchShift double

Pitch shift in semitones. 0 = normal. Default is 0.

energy double

Energy/volume level. 1.0 = normal. Default is 1.0.

speakerId int?

Speaker ID for multi-speaker models. Default is null.

language string?

Language code for multi-lingual models. Default is null.

hiddenDim int

Hidden dimension for acoustic model. Default is 256.

numHeads int

Number of attention heads. Default is 4.

numEncoderLayers int

Number of encoder layers. Default is 4.

numDecoderLayers int

Number of decoder layers. Default is 4.

maxPhonemeLength int

Maximum phoneme sequence length. Default is 256.

fftSize int

FFT size for Griffin-Lim. Default is 1024.

hopLength int

Hop length for Griffin-Lim. Default is 256.

griffinLimIterations int

Number of Griffin-Lim iterations. Default is 60.

optimizer IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>?

Optimizer for training. If null, a default Adam optimizer is used.

lossFunction ILossFunction<T>?

Loss function for training. If null, MSE loss is used.

Remarks

For Beginners: Use this constructor to train your own TTS model.

You'll need a dataset of (phoneme sequence, mel spectrogram) pairs. Training TTS from scratch requires significant data and compute resources.

Example:

var tts = new TtsModel<float>(
    architecture,
    optimizer: new AdamOptimizer<float>(),
    lossFunction: new MeanSquaredErrorLoss<float>());

// Train on your dataset
tts.Train(phonemeInput, expectedMelSpectrogram);

TtsModel(NeuralNetworkArchitecture<T>, string, string?, int, int, double, double, double, int?, string?, bool, int, int, int, OnnxModelOptions?)

Creates a TtsModel for ONNX inference with pretrained models.

public TtsModel(NeuralNetworkArchitecture<T> architecture, string acousticModelPath, string? vocoderModelPath = null, int sampleRate = 22050, int numMels = 80, double speakingRate = 1, double pitchShift = 0, double energy = 1, int? speakerId = null, string? language = null, bool useGriffinLimFallback = true, int griffinLimIterations = 60, int fftSize = 1024, int hopLength = 256, OnnxModelOptions? onnxOptions = null)

Parameters

architecture NeuralNetworkArchitecture<T>

The neural network architecture configuration.

acousticModelPath string

Required path to acoustic model ONNX file (e.g., FastSpeech2).

vocoderModelPath string?

Optional path to vocoder ONNX file (e.g., HiFi-GAN). If null, uses Griffin-Lim.

sampleRate int

Output sample rate in Hz. Default is 22050 (standard for TTS).

numMels int

Number of mel spectrogram channels. Default is 80.

speakingRate double

Speaking rate multiplier. 1.0 = normal speed. Default is 1.0.

pitchShift double

Pitch shift in semitones. 0 = normal. Default is 0.

energy double

Energy/volume level. 1.0 = normal. Default is 1.0.

speakerId int?

Speaker ID for multi-speaker models. Default is null.

language string?

Language code for multi-lingual models. Default is null.

useGriffinLimFallback bool

Whether to use Griffin-Lim as fallback. Default is true.

griffinLimIterations int

Number of Griffin-Lim iterations. Default is 60.

fftSize int

FFT size for Griffin-Lim. Default is 1024.

hopLength int

Hop length for Griffin-Lim. Default is 256.

onnxOptions OnnxModelOptions?

ONNX runtime options.

Remarks

For Beginners: Use this constructor when you have pretrained TTS models.

You need at least an acoustic model (which converts text to a mel spectrogram). The vocoder (which converts the mel spectrogram to audio) is optional; Griffin-Lim is used as a fallback when no vocoder is provided.

Example:

var tts = new TtsModel<float>(
    architecture,
    acousticModelPath: "fastspeech2.onnx",
    vocoderModelPath: "hifigan.onnx");

Properties

AvailableVoices

Gets the list of available built-in voices.

public IReadOnlyList<VoiceInfo<T>> AvailableVoices { get; }

Property Value

IReadOnlyList<VoiceInfo<T>>
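
Example (a minimal sketch; the members of VoiceInfo<T> are not documented on this page, so each entry is printed via its default string representation):

// Enumerate the built-in voices.
foreach (var voice in tts.AvailableVoices)
{
    Console.WriteLine(voice);
}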

IsReady

Gets whether the model is ready for synthesis.

public bool IsReady { get; }

Property Value

bool

SupportsEmotionControl

Gets whether this model supports emotional expression control.

public bool SupportsEmotionControl { get; }

Property Value

bool

SupportsStreaming

Gets whether this model supports streaming audio generation.

public bool SupportsStreaming { get; }

Property Value

bool

SupportsVoiceCloning

Gets whether this model supports voice cloning from reference audio.

public bool SupportsVoiceCloning { get; }

Property Value

bool

Methods

CreateNewInstance()

Creates a new instance of this model for cloning.

protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()

Returns

IFullModel<T, Tensor<T>, Tensor<T>>

DeserializeNetworkSpecificData(BinaryReader)

Deserializes network-specific data.

protected override void DeserializeNetworkSpecificData(BinaryReader reader)

Parameters

reader BinaryReader

Dispose(bool)

Disposes the model and releases resources.

protected override void Dispose(bool disposing)

Parameters

disposing bool

ExtractSpeakerEmbedding(Tensor<T>)

Extracts speaker embedding from reference audio for voice cloning.

public Tensor<T> ExtractSpeakerEmbedding(Tensor<T> referenceAudio)

Parameters

referenceAudio Tensor<T>

Reference audio of the speaker whose embedding to extract.

Returns

Tensor<T>

The extracted speaker embedding.
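
Example (a minimal sketch; LoadReferenceAudio is a hypothetical helper standing in for your own audio-loading code):

// Extract a speaker embedding once, then reuse it across synthesis calls.
Tensor<float> referenceAudio = LoadReferenceAudio();  // hypothetical helper
var speakerEmbedding = tts.ExtractSpeakerEmbedding(referenceAudio);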

GetModelMetadata()

Gets metadata about the model.

public override ModelMetadata<T> GetModelMetadata()

Returns

ModelMetadata<T>

InitializeLayers()

Initializes the layers for the TTS model.

protected override void InitializeLayers()

Remarks

Follows the golden standard pattern:

  1. Check if in native mode (ONNX mode returns early)
  2. Use Architecture.Layers if provided by user
  3. Fall back to LayerHelper.CreateDefaultTtsLayers() otherwise
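
A conceptual sketch of this pattern (illustrative only; IsOnnxMode and the Layers assignment are assumptions standing in for the actual implementation):

protected override void InitializeLayers()
{
    // ONNX mode delegates to the pretrained models, so no native layers are built.
    if (IsOnnxMode)  // hypothetical mode check
        return;

    // Prefer user-provided layers; otherwise fall back to the library defaults.
    Layers = Architecture.Layers ?? LayerHelper.CreateDefaultTtsLayers();
}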

PostprocessOutput(Tensor<T>)

Postprocesses model output into the final result format.

protected override Tensor<T> PostprocessOutput(Tensor<T> modelOutput)

Parameters

modelOutput Tensor<T>

Returns

Tensor<T>

Predict(Tensor<T>)

Makes a prediction using the model.

public override Tensor<T> Predict(Tensor<T> input)

Parameters

input Tensor<T>

Returns

Tensor<T>
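
Example (a minimal sketch; per the class remarks, the input is a phoneme tensor prepared by your own pipeline):

// Run the model directly on an already-prepared phoneme tensor.
var output = tts.Predict(phonemeInput);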

PreprocessAudio(Tensor<T>)

Preprocesses raw audio for model input.

protected override Tensor<T> PreprocessAudio(Tensor<T> rawAudio)

Parameters

rawAudio Tensor<T>

Returns

Tensor<T>

SerializeNetworkSpecificData(BinaryWriter)

Serializes network-specific data.

protected override void SerializeNetworkSpecificData(BinaryWriter writer)

Parameters

writer BinaryWriter

StartStreamingSession(string?, double)

Starts a streaming synthesis session.

public IStreamingSynthesisSession<T> StartStreamingSession(string? voiceId = null, double speakingRate = 1)

Parameters

voiceId string?

Optional voice ID. If null, the default voice is used.

speakingRate double

Speaking rate multiplier. 1.0 = normal speed. Default is 1.0.

Returns

IStreamingSynthesisSession<T>
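
Example (a minimal sketch; only the documented parameters are shown, since the members of IStreamingSynthesisSession<T> are not listed on this page):

// Open a streaming session with the default voice at normal speed,
// then feed text and consume audio chunks through the session's own API.
var session = tts.StartStreamingSession(voiceId: null, speakingRate: 1.0);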

Synthesize(string, string?, double, double)

Synthesizes speech from text.

public Tensor<T> Synthesize(string text, string? voiceId = null, double speakingRate = 1, double pitch = 0)

Parameters

text string

The text to synthesize.

voiceId string?

Optional voice ID. If null, the default voice is used.

speakingRate double

Speaking rate multiplier. 1.0 = normal speed. Default is 1.0.

pitch double

Pitch shift in semitones. 0 = normal. Default is 0.

Returns

Tensor<T>

The synthesized audio waveform.
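
Example (a minimal sketch; "en-US-1" is a hypothetical voice ID):

// Synthesize slightly faster speech, raised by two semitones.
var audio = tts.Synthesize(
    "Welcome to AiDotNet!",
    voiceId: "en-US-1",   // hypothetical voice ID
    speakingRate: 1.1,    // 10% faster than normal
    pitch: 2.0);          // +2 semitones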

SynthesizeAsync(string, string?, double, double, CancellationToken)

Synthesizes speech from text asynchronously.

public Task<Tensor<T>> SynthesizeAsync(string text, string? voiceId = null, double speakingRate = 1, double pitch = 0, CancellationToken cancellationToken = default)

Parameters

text string

The text to synthesize.

voiceId string?

Optional voice ID. If null, the default voice is used.

speakingRate double

Speaking rate multiplier. 1.0 = normal speed. Default is 1.0.

pitch double

Pitch shift in semitones. 0 = normal. Default is 0.

cancellationToken CancellationToken

Token used to cancel the synthesis operation.

Returns

Task<Tensor<T>>

A task that resolves to the synthesized audio waveform.
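
Example (a minimal sketch showing cancellation after a timeout):

// Cancel the synthesis if it takes longer than 30 seconds.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
var audio = await tts.SynthesizeAsync(
    "This call does not block the calling thread.",
    cancellationToken: cts.Token);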

SynthesizeWithEmotion(string, string, double, string?, double)

Synthesizes speech with emotional expression.

public Tensor<T> SynthesizeWithEmotion(string text, string emotion, double emotionIntensity = 0.5, string? voiceId = null, double speakingRate = 1)

Parameters

text string

The text to synthesize.

emotion string

Name of the emotion to express.

emotionIntensity double

Intensity of the emotional expression. Default is 0.5.

voiceId string?

Optional voice ID. If null, the default voice is used.

speakingRate double

Speaking rate multiplier. 1.0 = normal speed. Default is 1.0.

Returns

Tensor<T>

The synthesized audio waveform.
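
Example (a minimal sketch; "happy" is a hypothetical emotion name, as the supported emotions are not listed on this page):

// Synthesize with a fairly strong emotional expression.
var audio = tts.SynthesizeWithEmotion(
    "We won the championship!",
    emotion: "happy",        // hypothetical emotion name
    emotionIntensity: 0.8);  // stronger than the 0.5 default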

SynthesizeWithVoiceCloning(string, Tensor<T>, double, double)

Synthesizes speech using a cloned voice from reference audio.

public Tensor<T> SynthesizeWithVoiceCloning(string text, Tensor<T> referenceAudio, double speakingRate = 1, double pitch = 0)

Parameters

text string

The text to synthesize.

referenceAudio Tensor<T>

Reference audio of the voice to clone.

speakingRate double

Speaking rate multiplier. 1.0 = normal speed. Default is 1.0.

pitch double

Pitch shift in semitones. 0 = normal. Default is 0.

Returns

Tensor<T>

The synthesized audio waveform.
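
Example (a minimal sketch; LoadReferenceAudio is a hypothetical helper standing in for your own audio-loading code):

// Speak new text in the voice captured from the reference audio.
Tensor<float> referenceAudio = LoadReferenceAudio();  // hypothetical helper
var audio = tts.SynthesizeWithVoiceCloning(
    "This sentence is spoken in the cloned voice.",
    referenceAudio);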

Train(Tensor<T>, Tensor<T>)

Trains the model on input data.

public override void Train(Tensor<T> input, Tensor<T> expectedOutput)

Parameters

input Tensor<T>

Input tensor (e.g., phoneme sequences).

expectedOutput Tensor<T>

Expected output tensor (e.g., target mel spectrograms).
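
Example (a minimal sketch of a training loop; trainingPairs is a hypothetical collection of (phoneme tensor, mel spectrogram tensor) pairs from your data pipeline):

// One pass over the dataset; repeat for multiple epochs as needed.
foreach (var (phonemes, melSpectrogram) in trainingPairs)  // hypothetical collection
{
    tts.Train(phonemes, melSpectrogram);
}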

UpdateParameters(Vector<T>)

Updates model parameters using gradient descent.

public override void UpdateParameters(Vector<T> parameters)

Parameters

parameters Vector<T>