Class SparseCategoricalCrossEntropyLoss<T>

Namespace
AiDotNet.LossFunctions
Assembly
AiDotNet.dll

Implements the Sparse Categorical Cross Entropy loss function for multi-class classification with integer labels.

public class SparseCategoricalCrossEntropyLoss<T> : LossFunctionBase<T>, ILossFunction<T>

Type Parameters

T

The numeric type used for calculations (e.g., float, double).

Inheritance
object → LossFunctionBase<T> → SparseCategoricalCrossEntropyLoss<T>

Implements
ILossFunction<T>

Remarks

For Beginners: Sparse Categorical Cross Entropy is similar to Categorical Cross Entropy but is used when labels are provided as class indices (0, 1, 2, ...) rather than one-hot encoded vectors.

This is more memory efficient for problems with many classes, as you only need to store the class index instead of a full one-hot encoded vector.

The formula is: SCCE = -(1/n) * Σ[log(predicted[actual_class_index])]

Where:

  • n is the number of samples
  • actual contains the class indices (e.g., 0, 1, 2, 3 for a 4-class problem)
  • predicted contains the predicted probabilities for all classes
  • We extract the probability for the correct class using the index from actual

Example:

  • If actual[i] = 2.0 (class index 2), and predicted has probabilities [0.1, 0.2, 0.6, 0.1], then we take predicted[2] = 0.6 and compute -log(0.6) ≈ 0.511 (see the sketch below)
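A minimal standalone check of that arithmetic in plain C# (pure math, no AiDotNet types involved):

// Worked example: 4-class problem, true class index 2
double[] predicted = { 0.1, 0.2, 0.6, 0.1 }; // softmax output
int trueClass = 2;
double loss = -Math.Log(predicted[trueClass]); // -log(0.6) ≈ 0.511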

Key properties:

  • More memory efficient than categorical cross-entropy for many-class problems
  • Predicted values should be probabilities (between 0 and 1) from a softmax layer
  • Actual values should be valid class indices (0 to num_classes-1)
  • Often used with the softmax activation function in neural networks

To use this loss function with the Vector interface:

  • For a single sample: predicted = [p_class0, p_class1, ..., p_class(N-1)], actual = [true_class_index]
  • For batches: flatten your data appropriately or process samples individually (see the usage sketch below)
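A usage sketch for the single-sample case. Constructing a Vector<double> directly from an array is an assumption about the AiDotNet API; adapt it to whatever constructor or factory your version actually exposes:

using AiDotNet.LossFunctions;

var lossFn = new SparseCategoricalCrossEntropyLoss<double>();
var predicted = new Vector<double>(new[] { 0.1, 0.2, 0.6, 0.1 }); // probabilities for 4 classes (assumed constructor)
var actual = new Vector<double>(new[] { 2.0 });                   // true class index 2
double value = lossFn.CalculateLoss(predicted, actual);           // ≈ -log(0.6) ≈ 0.511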

Constructors

SparseCategoricalCrossEntropyLoss()

Initializes a new instance of the SparseCategoricalCrossEntropyLoss class.

public SparseCategoricalCrossEntropyLoss()

Methods

CalculateDerivative(Vector<T>, Vector<T>)

Calculates the derivative of the Sparse Categorical Cross Entropy loss function.

public override Vector<T> CalculateDerivative(Vector<T> predicted, Vector<T> actual)

Parameters

predicted Vector<T>

The predicted probability values for all classes (length = num_classes).

actual Vector<T>

The actual class indices as floating-point values (length = batch_size, or 1 for a single sample).

Returns

Vector<T>

A vector containing the derivatives for each class probability.

Remarks

The derivative is:

  • For the correct class: -1 / predicted[correct_class]
  • For all other classes: 0

When used with softmax activation, this combines with the softmax derivative to produce the simplified gradient (predicted - one_hot_actual).
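A minimal sketch of that rule in plain C# (independent of the library's Vector<T> type), for a single sample with true class k:

// Gradient of -log(predicted[k]) with respect to each class probability
double[] predicted = { 0.1, 0.2, 0.6, 0.1 };
int k = 2; // true class index
var grad = new double[predicted.Length]; // all entries start at 0
grad[k] = -1.0 / predicted[k];           // -1/0.6 ≈ -1.667; other classes stay 0

Pushing this gradient through the softmax Jacobian is what yields the simplified predicted - one_hot_actual form noted above.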

Exceptions

ArgumentException

Thrown when class indices are invalid or vectors are empty.

CalculateLoss(Vector<T>, Vector<T>)

Calculates the Sparse Categorical Cross Entropy loss between predicted probabilities and class indices.

public override T CalculateLoss(Vector<T> predicted, Vector<T> actual)

Parameters

predicted Vector<T>

The predicted probability values for all classes (length = num_classes).

actual Vector<T>

The actual class indices as floating-point values (length = batch_size, or 1 for a single sample).

Returns

T

The sparse categorical cross entropy loss value.

Remarks

For single-sample usage, if predicted has N classes and actual[0] = k (class index k), the loss is -log(predicted[k]).

Unlike other loss functions, predicted and actual can have different lengths:

  • predicted.Length = number of classes (N)
  • actual.Length = number of samples in batch (M); each actual[i] contains the class index for sample i (see the per-sample loop sketched below)
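A hedged sketch of the "process samples individually" approach for a batch; the Vector<double> construction is an assumption about the API, and the final average mirrors the 1/n factor in the formula above:

var lossFn = new SparseCategoricalCrossEntropyLoss<double>();
double[][] batchProbs = { new[] { 0.1, 0.2, 0.6, 0.1 }, new[] { 0.7, 0.1, 0.1, 0.1 } };
double[] batchLabels = { 2.0, 0.0 }; // class index for each sample
double total = 0.0;
for (int i = 0; i < batchLabels.Length; i++)
{
    var predicted = new Vector<double>(batchProbs[i]);          // assumed constructor
    var actual = new Vector<double>(new[] { batchLabels[i] });
    total += lossFn.CalculateLoss(predicted, actual);
}
double meanLoss = total / batchLabels.Length; // (1/n) * Σ -log(p[k_i])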

Exceptions

ArgumentException

Thrown when class indices are invalid or vectors are empty.