Class CTCLoss<T>

Namespace: AiDotNet.LossFunctions

Assembly: AiDotNet.dll

Implements the Connectionist Temporal Classification (CTC) loss function for sequence-to-sequence learning.

public class CTCLoss<T> : ISequenceLossFunction<T>

Type Parameters

T: The numeric type used for calculations (e.g., float, double).

Inheritance: object → CTCLoss<T>

Implements: ISequenceLossFunction<T>

Inherited Members: object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Remarks

For Beginners: Connectionist Temporal Classification (CTC) is a loss function designed for sequence-to-sequence learning problems where the alignment between input and output sequences is unknown.

For example, in speech recognition, we have:

Input: An audio waveform (long sequence of sound samples)
Output: Text transcript (shorter sequence of characters)

The key challenge is that we don't know exactly which parts of the audio correspond to each character. CTC solves this by summing the probabilities of every alignment between the input and output sequences that collapses to the target output.

CTC introduces a special "blank" token to handle:

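Repetitions of characters (e.g., with "-" as the blank, "hheel-lloo" collapses to "hello", whereas "hheellloo" collapses to "helo" because nothing separates the two l's)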
Silence or transitions between sounds
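The role of the blank can be made concrete with a small collapse function. This is a minimal sketch for illustration only, not part of AiDotNet: the class name CtcCollapseSketch and the use of '-' as the blank character are assumptions. It applies the standard CTC rule of merging consecutive duplicates first and then dropping blanks.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the AiDotNet API.
public static class CtcCollapseSketch
{
    // Collapses a frame-level path into a label sequence:
    // consecutive duplicates are merged, then blanks are dropped.
    public static string Collapse(string path, char blank = '-')
    {
        var result = new List<char>();
        char? previous = null;
        foreach (char c in path)
        {
            // Emit a character only when it differs from the previous frame
            // and is not the blank symbol.
            if (c != previous && c != blank)
            {
                result.Add(c);
            }
            previous = c;
        }
        return new string(result.ToArray());
    }
}
```

Under this rule, Collapse("hheel-lloo") yields "hello", while Collapse("hheellloo") yields "helo", which is why a repeated character in the target needs a blank between its two occurrences.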

This loss function is commonly used in:

Speech recognition
Handwriting recognition
Any task where input and output sequences have different lengths and unknown alignment

Constructors

CTCLoss(int, bool)

Initializes a new instance of the CTCLoss class.

public CTCLoss(int blankIndex = 0, bool inputsAreLogProbs = true)

Parameters

blankIndex int: The index of the blank symbol in the vocabulary. Default is 0.
inputsAreLogProbs bool: Whether inputs are already in log space. Default is true.

Exceptions

ArgumentNullException: Thrown when numericOperations is null.
ArgumentOutOfRangeException: Thrown when blankIndex is negative.
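If a model emits raw scores (logits) rather than log probabilities, one common way to bring them into log space before passing them with inputsAreLogProbs set to true is a numerically stable log-softmax over the class dimension. The sketch below works on a plain double array for a single time step; the class name LogSoftmaxSketch is hypothetical, and this page does not state what CTCLoss<T> itself does when inputsAreLogProbs is false.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of the AiDotNet API.
public static class LogSoftmaxSketch
{
    // Converts raw scores for one time step into log probabilities.
    public static double[] LogSoftmax(double[] logits)
    {
        // Subtracting the maximum keeps Math.Exp from overflowing.
        double max = logits.Max();
        double sumExp = 0.0;
        foreach (double v in logits)
        {
            sumExp += Math.Exp(v - max);
        }
        double logSumExp = max + Math.Log(sumExp);

        var result = new double[logits.Length];
        for (int i = 0; i < logits.Length; i++)
        {
            result[i] = logits[i] - logSumExp;
        }
        return result;
    }
}
```

The exponentials of the returned values sum to 1, so each row is a valid log-probability distribution over the classes, including the blank.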

Methods

CalculateGradient(Tensor<T>, int[][], int[], int[])

Calculates the gradient of the CTC loss with respect to the inputs.

public Tensor<T> CalculateGradient(Tensor<T> logProbs, int[][] targets, int[] inputLengths, int[] targetLengths)

Parameters

logProbs Tensor<T>: Log-probability tensor of shape [batch, time, classes].
targets int[][]: Target label sequences for each batch item.
inputLengths int[]: Actual lengths of each input sequence.
targetLengths int[]: Actual lengths of each target sequence.

Returns

Tensor<T>: The gradient tensor, with the same shape as the inputs.
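The inputLengths and targetLengths arguments matter because a target can only be aligned when the input has enough frames: one frame per label, plus one blank frame between each pair of identical adjacent labels. The following sketch checks that condition before a loss or gradient call. The class and method names are hypothetical, and whether CTCLoss<T> performs a similar validation internally is not stated on this page.

```csharp
using System;

// Hypothetical helper for illustration; not part of the AiDotNet API.
public static class CtcLengthCheckSketch
{
    // Minimum number of input frames needed to emit the first
    // targetLength labels of target under the CTC collapse rule.
    public static int MinimumInputLength(int[] target, int targetLength)
    {
        int repeats = 0;
        for (int i = 1; i < targetLength; i++)
        {
            // Identical neighbours need a blank frame between them.
            if (target[i] == target[i - 1])
            {
                repeats++;
            }
        }
        return targetLength + repeats;
    }

    // True when at least one valid alignment exists for this batch item.
    public static bool IsAlignable(int[] target, int targetLength, int inputLength)
    {
        return inputLength >= MinimumInputLength(target, targetLength);
    }
}
```

For example, the target { 1, 2, 2, 3 } needs at least five frames, because the two adjacent 2 labels must be separated by a blank.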

CalculateLoss(Tensor<T>, int[][], int[], int[])

Calculates the CTC loss for a batch of sequences.

public T CalculateLoss(Tensor<T> logProbs, int[][] targets, int[] inputLengths, int[] targetLengths)

Parameters

logProbs Tensor<T>: Log-probability tensor of shape [batch, time, classes].
targets int[][]: Target label sequences for each batch item.
inputLengths int[]: Actual lengths of each input sequence.
targetLengths int[]: Actual lengths of each target sequence.

Returns

T: The average CTC loss value across the batch.
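To show what a per-sequence CTC loss value represents, here is a minimal sketch of the standard CTC forward (alpha) recursion for a single sequence, using plain jagged arrays instead of Tensor<T>. It is not the AiDotNet implementation: the class name CtcForwardSketch, the helper LogAdd, and the double[][] input layout are assumptions made for illustration. The blank parameter mirrors the documented default blankIndex of 0.

```csharp
using System;

// Hypothetical helper for illustration; not part of the AiDotNet API.
public static class CtcForwardSketch
{
    // Computes log(exp(a) + exp(b)) without overflow.
    private static double LogAdd(double a, double b)
    {
        if (double.IsNegativeInfinity(a)) return b;
        if (double.IsNegativeInfinity(b)) return a;
        double max = Math.Max(a, b);
        return max + Math.Log(Math.Exp(a - max) + Math.Exp(b - max));
    }

    // logProbs[t][c] is the log probability of class c at frame t
    // (at least one frame is required).
    // Returns the negative log-likelihood of target for one sequence.
    public static double NegativeLogLikelihood(double[][] logProbs, int[] target, int blank = 0)
    {
        int frames = logProbs.Length;
        int states = 2 * target.Length + 1;

        // Extended target: blank, l1, blank, l2, ..., blank.
        var ext = new int[states];
        for (int s = 0; s < states; s++)
        {
            ext[s] = (s % 2 == 0) ? blank : target[s / 2];
        }

        // A path may start with a blank or with the first label.
        var alpha = new double[states];
        Array.Fill(alpha, double.NegativeInfinity);
        alpha[0] = logProbs[0][ext[0]];
        if (states > 1) alpha[1] = logProbs[0][ext[1]];

        for (int t = 1; t < frames; t++)
        {
            var next = new double[states];
            for (int s = 0; s < states; s++)
            {
                double sum = alpha[s];                      // stay in the same state
                if (s >= 1) sum = LogAdd(sum, alpha[s - 1]); // advance by one state
                // Skipping a blank is allowed only between different labels.
                if (s >= 2 && ext[s] != blank && ext[s] != ext[s - 2])
                {
                    sum = LogAdd(sum, alpha[s - 2]);
                }
                next[s] = sum + logProbs[t][ext[s]];
            }
            alpha = next;
        }

        // Valid paths end on the last label or on the trailing blank.
        double total = alpha[states - 1];
        if (states > 1) total = LogAdd(total, alpha[states - 2]);
        return -total;
    }
}
```

As a sanity check, with two frames, uniform probabilities over two classes, and the target { 1 }, three of the four possible paths collapse to the target, so the result is -ln(0.75). A batch-level value like the one returned by CalculateLoss would then average such per-sequence values across the batch.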
