Class SparseCategoricalCrossEntropyLoss<T>
- Namespace: AiDotNet.LossFunctions
- Assembly: AiDotNet.dll
Implements the Sparse Categorical Cross Entropy loss function for multi-class classification with integer labels.
public class SparseCategoricalCrossEntropyLoss<T> : LossFunctionBase<T>, ILossFunction<T>
Type Parameters
T: The numeric type used for calculations (e.g., float, double).
- Inheritance: object → LossFunctionBase<T> → SparseCategoricalCrossEntropyLoss<T>
- Implements: ILossFunction<T>
Remarks
For Beginners: Sparse Categorical Cross Entropy is similar to Categorical Cross Entropy but is used when labels are provided as class indices (0, 1, 2, ...) rather than one-hot encoded vectors.
This is more memory efficient for problems with many classes, as you only need to store the class index instead of a full one-hot encoded vector.
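The contrast is easy to see in plain C#; the sketch below uses a hypothetical 4-class, 3-sample setup and is independent of the library's types:

```csharp
// Sparse labels: one class index per sample (3 values for 3 samples).
double[] sparseLabels = { 2.0, 0.0, 3.0 };

// One-hot equivalent: num_samples * num_classes values (12 here),
// almost all of which are zero.
double[,] oneHotLabels =
{
    { 0, 0, 1, 0 }, // class 2
    { 1, 0, 0, 0 }, // class 0
    { 0, 0, 0, 1 }, // class 3
};
```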
The formula is: SCCE = -(1/n) * Σ[log(predicted[actual_class_index])]
Where:
- actual contains the class indices (e.g., 0, 1, 2, 3 for a 4-class problem)
- predicted contains the predicted probabilities for all classes
- We extract the probability for the correct class using the index from actual
Example:
- If actual[i] = 2.0 (class index 2) and predicted has probabilities [0.1, 0.2, 0.6, 0.1], we take predicted[2] = 0.6 and compute -log(0.6) ≈ 0.511.
Key properties:
- More memory efficient than categorical cross-entropy for many-class problems
- Predicted values should be probabilities (between 0 and 1) from a softmax layer
- Actual values should be valid class indices (0 to num_classes-1)
- Often used with the softmax activation function in neural networks
To use this loss function with the Vector interface (a usage sketch follows this list):
- For a single sample: predicted = [p_class0, p_class1, ..., p_classN], actual = [true_class_index]
- For batches: flatten your data appropriately or process samples individually
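A minimal single-sample sketch follows. The class, constructor, and CalculateLoss call are taken from this page; the array-based Vector&lt;double&gt; constructor and its namespace are assumptions, so adjust them to the library's actual vector-construction API.

```csharp
using AiDotNet.LossFunctions;
// Assumed namespace for Vector<T>; check the library for the real one.
using AiDotNet.LinearAlgebra;

var loss = new SparseCategoricalCrossEntropyLoss<double>();

// Softmax output over 4 classes for one sample (assumed array-based ctor).
var predicted = new Vector<double>(new[] { 0.1, 0.2, 0.6, 0.1 });

// The true class index (2), stored as a floating-point value.
var actual = new Vector<double>(new[] { 2.0 });

double value = loss.CalculateLoss(predicted, actual); // -log(0.6) ≈ 0.511
```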
Constructors
SparseCategoricalCrossEntropyLoss()
Initializes a new instance of the SparseCategoricalCrossEntropyLoss class.
public SparseCategoricalCrossEntropyLoss()
Methods
CalculateDerivative(Vector<T>, Vector<T>)
Calculates the derivative of the Sparse Categorical Cross Entropy loss function.
public override Vector<T> CalculateDerivative(Vector<T> predicted, Vector<T> actual)
Parameters
predicted (Vector<T>): The predicted probability values for all classes (length = num_classes).
actual (Vector<T>): The actual class indices as floating-point values (length = batch_size, or 1 for a single sample).
Returns
- Vector<T>
A vector containing the derivative of the loss with respect to each predicted class probability.
Remarks
The derivative is:
- For the correct class: -1 / predicted[correct_class]
- For all other classes: 0
When used with softmax activation, this combines with the softmax derivative to produce the simplified gradient (predicted - one_hot_actual).
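To make the piecewise rule concrete, here is a library-independent sketch in plain C# that applies the documented formulas by hand:

```csharp
using System;

double[] predicted = { 0.1, 0.2, 0.6, 0.1 };
int k = 2; // true class index

// Documented derivative: -1/predicted[k] at the true class, 0 elsewhere.
double[] grad = new double[predicted.Length];
grad[k] = -1.0 / predicted[k]; // ≈ -1.667

// Combined with the softmax derivative, the end-to-end gradient
// simplifies to predicted[i] - one_hot[i].
double[] combined = new double[predicted.Length];
for (int i = 0; i < predicted.Length; i++)
    combined[i] = predicted[i] - (i == k ? 1.0 : 0.0);

Console.WriteLine(string.Join(", ", combined)); // ≈ [0.1, 0.2, -0.4, 0.1]
```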
Exceptions
- ArgumentException
Thrown when class indices are invalid or vectors are empty.
CalculateLoss(Vector<T>, Vector<T>)
Calculates the Sparse Categorical Cross Entropy loss between predicted probabilities and class indices.
public override T CalculateLoss(Vector<T> predicted, Vector<T> actual)
Parameters
predicted (Vector<T>): The predicted probability values for all classes (length = num_classes).
actual (Vector<T>): The actual class indices as floating-point values (length = batch_size, or 1 for a single sample).
Returns
- T
The sparse categorical cross entropy loss value.
Remarks
For single-sample usage, if predicted has N classes and actual[0] = k (class index k), the loss is -log(predicted[k]).
Unlike other loss functions, predicted and actual can have different lengths:
- predicted.Length = number of classes (N)
- actual.Length = number of samples in the batch (M); each actual[i] contains the class index for sample i.
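As a library-independent check of the single-sample case, the computation reduces to the following plain C#:

```csharp
using System;

double[] predicted = { 0.1, 0.2, 0.6, 0.1 }; // softmax output, N = 4 classes
int k = 2;                                   // actual[0] = 2.0, i.e. class index 2

double lossValue = -Math.Log(predicted[k]);
Console.WriteLine(lossValue); // ≈ 0.5108
```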
Exceptions
- ArgumentException
Thrown when class indices are invalid or vectors are empty.