Class SparseCategoricalCrossEntropyLoss<T>
- Namespace: AiDotNet.LossFunctions
- Assembly: AiDotNet.dll
Implements the Sparse Categorical Cross Entropy loss function for multi-class classification with integer labels.
public class SparseCategoricalCrossEntropyLoss<T> : LossFunctionBase<T>, ILossFunction<T>
Type Parameters
T: The numeric type used for calculations (e.g., float, double).
- Inheritance: object → LossFunctionBase<T> → SparseCategoricalCrossEntropyLoss<T>
- Implements: ILossFunction<T>
Remarks
For Beginners: Sparse Categorical Cross Entropy is similar to Categorical Cross Entropy but is used when labels are provided as class indices (0, 1, 2, ...) rather than one-hot encoded vectors.
This is more memory efficient for problems with many classes, as you only need to store the class index instead of a full one-hot encoded vector.
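The contrast is easy to see in plain C#; the sketch below uses a hypothetical 4-class, 3-sample setup and is independent of the library's types:

```csharp
// Sparse labels: one class index per sample (3 values for 3 samples).
double[] sparseLabels = { 2.0, 0.0, 3.0 };

// One-hot equivalent: num_samples * num_classes values (12 here),
// almost all of which are zero.
double[,] oneHotLabels =
{
    { 0, 0, 1, 0 }, // class 2
    { 1, 0, 0, 0 }, // class 0
    { 0, 0, 0, 1 }, // class 3
};
```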
The formula is: SCCE = -(1/n) * Σ[log(predicted[actual_class_index])]
Where:
- actual contains the class indices (e.g., 0, 1, 2, 3 for a 4-class problem)
- predicted contains the predicted probabilities for all classes
- We extract the probability for the correct class using the index from actual
Example:
- If actual[i] = 2.0 (class index 2) and predicted has probabilities [0.1, 0.2, 0.6, 0.1], we take predicted[2] = 0.6 and compute -log(0.6) ≈ 0.511.
Key properties:
- More memory efficient than categorical cross-entropy for many-class problems
- Predicted values should be probabilities (between 0 and 1) from a softmax layer
- Actual values should be valid class indices (0 to num_classes-1)
- Often used with the softmax activation function in neural networks
To use this loss function with the Vector interface (a usage sketch follows this list):
- For a single sample: predicted = [p_class0, p_class1, ..., p_classN], actual = [true_class_index]
- For batches: flatten your data appropriately or process samples individually
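A minimal single-sample sketch follows. The class, constructor, and CalculateLoss call are taken from this page; the array-based Vector&lt;double&gt; constructor and its namespace are assumptions, so adjust them to the library's actual vector-construction API.

```csharp
using AiDotNet.LossFunctions;
// Assumed namespace for Vector<T>; check the library for the real one.
using AiDotNet.LinearAlgebra;

var loss = new SparseCategoricalCrossEntropyLoss<double>();

// Softmax output over 4 classes for one sample (assumed array-based ctor).
var predicted = new Vector<double>(new[] { 0.1, 0.2, 0.6, 0.1 });

// The true class index (2), stored as a floating-point value.
var actual = new Vector<double>(new[] { 2.0 });

double value = loss.CalculateLoss(predicted, actual); // -log(0.6) ≈ 0.511
```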
Constructors
SparseCategoricalCrossEntropyLoss()
Initializes a new instance of the SparseCategoricalCrossEntropyLoss class.
public SparseCategoricalCrossEntropyLoss()
Methods
CalculateDerivative(Vector<T>, Vector<T>)
Calculates the derivative of the Sparse Categorical Cross Entropy loss function.
public override Vector<T> CalculateDerivative(Vector<T> predicted, Vector<T> actual)
Parameters
predicted (Vector<T>): The predicted probability values for all classes (length = num_classes).
actual (Vector<T>): The actual class indices as floating-point values (length = batch_size, or 1 for a single sample).
Returns
- Vector<T>
A vector containing the derivative of the loss with respect to each predicted class probability.
Remarks
The derivative is:
- For the correct class: -1 / predicted[correct_class]
- For all other classes: 0
When used with softmax activation, this combines with the softmax derivative to produce the simplified gradient (predicted - one_hot_actual).
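To make the piecewise rule concrete, here is a library-independent sketch in plain C# that applies the documented formulas by hand:

```csharp
using System;

double[] predicted = { 0.1, 0.2, 0.6, 0.1 };
int k = 2; // true class index

// Documented derivative: -1/predicted[k] at the true class, 0 elsewhere.
double[] grad = new double[predicted.Length];
grad[k] = -1.0 / predicted[k]; // ≈ -1.667

// Combined with the softmax derivative, the end-to-end gradient
// simplifies to predicted[i] - one_hot[i].
double[] combined = new double[predicted.Length];
for (int i = 0; i < predicted.Length; i++)
    combined[i] = predicted[i] - (i == k ? 1.0 : 0.0);

Console.WriteLine(string.Join(", ", combined)); // ≈ [0.1, 0.2, -0.4, 0.1]
```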
Exceptions
- ArgumentException
Thrown when class indices are invalid or vectors are empty.
CalculateLoss(Vector<T>, Vector<T>)
Calculates the Sparse Categorical Cross Entropy loss between predicted probabilities and class indices.
public override T CalculateLoss(Vector<T> predicted, Vector<T> actual)
Parameters
predicted (Vector<T>): The predicted probability values for all classes (length = num_classes).
actual (Vector<T>): The actual class indices as floating-point values (length = batch_size, or 1 for a single sample).
Returns
- T
The sparse categorical cross entropy loss value.
Remarks
For single-sample usage, if predicted has N classes and actual[0] = k (class index k), the loss is -log(predicted[k]).
Unlike other loss functions, predicted and actual can have different lengths:
- predicted.Length = number of classes (N)
- actual.Length = number of samples in the batch (M); each actual[i] contains the class index for sample i.
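As a library-independent check of the single-sample case, the computation reduces to the following plain C#:

```csharp
using System;

double[] predicted = { 0.1, 0.2, 0.6, 0.1 }; // softmax output, N = 4 classes
int k = 2;                                   // actual[0] = 2.0, i.e. class index 2

double lossValue = -Math.Log(predicted[k]);
Console.WriteLine(lossValue); // ≈ 0.5108
```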
Exceptions
- ArgumentException
Thrown when class indices are invalid or vectors are empty.