Class ProbabilisticClassifierBase<T>

Namespace
AiDotNet.Classification
Assembly
AiDotNet.dll

Provides a base implementation for probabilistic classification algorithms that output class probability estimates.

public abstract class ProbabilisticClassifierBase<T> : ClassifierBase<T>, IProbabilisticClassifier<T>, IClassifier<T>, IFullModel<T, Matrix<T>, Vector<T>>, IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Matrix<T>, Vector<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>, IGradientComputable<T, Matrix<T>, Vector<T>>, IJitCompilable<T>

Type Parameters

T

The numeric data type used for calculations (e.g., float, double).

Inheritance
ClassifierBase<T>
ProbabilisticClassifierBase<T>
Implements
IProbabilisticClassifier<T>
IClassifier<T>
IFullModel<T, Matrix<T>, Vector<T>>
IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>
IModelSerializer
ICheckpointableModel
IParameterizable<T, Matrix<T>, Vector<T>>
IFeatureAware
IFeatureImportance<T>
ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>
IGradientComputable<T, Matrix<T>, Vector<T>>
IJitCompilable<T>

Remarks

This abstract class extends ClassifierBase to add probabilistic prediction capabilities. Probabilistic classifiers can output not just the predicted class, but also the probability of each class. This is useful for understanding model confidence and making threshold-based decisions.

The default Predict() method uses argmax of the probabilities to determine the class.

For Beginners: Probabilistic classifiers don't just say "this is category A" - they tell you how confident they are. For example, instead of just "spam", they might say "92% spam, 8% not spam."

This additional information is valuable because:

  • You can see when the model is uncertain (probabilities close to a 50/50 split)
  • You can adjust the decision threshold for your specific needs (see the sketch below)
  • You can combine predictions from multiple models more effectively
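
For example, a minimal sketch (illustrative only: classifier is assumed to be an already-trained concrete subclass, and Matrix<double> is assumed to expose a Rows property and a [row, column] indexer):

// Illustrative sketch: 'classifier' is an already-trained instance of a
// hypothetical concrete subclass of ProbabilisticClassifierBase<double>.
// Column 1 is taken to be the positive ("spam") class.
Matrix<double> probabilities = classifier.PredictProbabilities(testFeatures);
for (int i = 0; i < probabilities.Rows; i++)
{
    double spamProbability = probabilities[i, 1];
    // Custom decision threshold: flag as spam only when at least 90% confident,
    // rather than the 50% cutoff implied by argmax.
    bool isSpam = spamProbability >= 0.90;
    Console.WriteLine($"Sample {i}: P(spam) = {spamProbability:F3}, flagged = {isSpam}");
}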

Constructors

ProbabilisticClassifierBase(ClassifierOptions<T>?, IRegularization<T, Matrix<T>, Vector<T>>?, ILossFunction<T>?)

Initializes a new instance of the ProbabilisticClassifierBase class.

protected ProbabilisticClassifierBase(ClassifierOptions<T>? options = null, IRegularization<T, Matrix<T>, Vector<T>>? regularization = null, ILossFunction<T>? lossFunction = null)

Parameters

options ClassifierOptions<T>

Configuration options for the classifier model. If null, default options will be used.

regularization IRegularization<T, Matrix<T>, Vector<T>>

Regularization method to prevent overfitting. If null, no regularization will be applied.

lossFunction ILossFunction<T>

Loss function for gradient computation. If null, defaults to cross-entropy loss.
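
A minimal sketch of how a derived class might chain to this constructor (the subclass is hypothetical, and any other abstract members inherited from ClassifierBase<T> are omitted for brevity):

// Hypothetical subclass used for illustration only.
public class MyProbabilisticClassifier<T> : ProbabilisticClassifierBase<T>
{
    public MyProbabilisticClassifier(ClassifierOptions<T>? options = null)
        : base(options, regularization: null, lossFunction: null)
        // null regularization => none applied; null lossFunction => cross-entropy
    {
    }

    public override Matrix<T> PredictProbabilities(Matrix<T> input)
    {
        // A real implementation returns one probability row per input sample.
        throw new NotImplementedException();
    }
}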

Methods

ApplySigmoid(Vector<T>)

Applies sigmoid function for binary classification probabilities.

protected Vector<T> ApplySigmoid(Vector<T> scores)

Parameters

scores Vector<T>

A vector of raw scores.

Returns

Vector<T>

A vector of probabilities (values between 0 and 1).

Remarks

The sigmoid function σ(x) = 1 / (1 + exp(-x)) converts any real number to a value between 0 and 1, making it suitable for binary classification.

For Beginners: Sigmoid squashes any real number to a value strictly between 0 and 1.

  • Very negative numbers → close to 0
  • Zero → 0.5
  • Very positive numbers → close to 1

This is perfect for binary classification where you need a probability for the positive class. For example:

  • Score -3.0 → Probability 0.047 (unlikely positive)
  • Score 0.0 → Probability 0.500 (uncertain)
  • Score 3.0 → Probability 0.953 (likely positive)
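
A minimal sketch of the same transform on plain double scores (illustrative only; the library method operates on Vector<T>):

using System;

// Numerically stable sigmoid: never calls Math.Exp on a large positive value,
// so it cannot overflow for extreme scores.
static double Sigmoid(double x)
{
    if (x >= 0)
    {
        return 1.0 / (1.0 + Math.Exp(-x));
    }
    double e = Math.Exp(x);
    return e / (1.0 + e);
}

// Sigmoid(-3.0) ≈ 0.047, Sigmoid(0.0) == 0.5, Sigmoid(3.0) ≈ 0.953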

ApplySoftmax(Matrix<T>)

Applies softmax normalization to convert raw scores to probabilities.

protected Matrix<T> ApplySoftmax(Matrix<T> scores)

Parameters

scores Matrix<T>

A matrix of raw scores [num_samples, num_classes].

Returns

Matrix<T>

A matrix of probabilities where each row sums to 1.0.

Remarks

The softmax function converts arbitrary real-valued scores into a probability distribution. It's defined as: softmax(x_i) = exp(x_i) / sum(exp(x_j)) for all j.

For Beginners: Softmax is a way to convert any set of numbers into probabilities.

It has two key properties:

  1. All output values are between 0 and 1
  2. All output values sum to 1.0 (so they form a valid probability distribution)

For example:

  • Scores [2.0, 1.0, 0.1] → Probabilities [0.659, 0.242, 0.099]
  • Notice: 0.659 + 0.242 + 0.099 = 1.0

Higher scores result in higher probabilities, but the relationship is exponential (not linear), so the highest score gets a disproportionately large probability.
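
A minimal sketch of a numerically stable softmax over a single row of scores (illustrative only; the library method applies this row by row to a Matrix<T>):

using System;
using System.Linq;

static double[] Softmax(double[] scores)
{
    // Shift by the row maximum so Math.Exp never overflows; the shift
    // cancels out and does not change the resulting probabilities.
    double max = scores.Max();
    double[] exps = scores.Select(s => Math.Exp(s - max)).ToArray();
    double sum = exps.Sum();
    return exps.Select(e => e / sum).ToArray();
}

// Softmax(new[] { 2.0, 1.0, 0.1 }) ≈ { 0.659, 0.242, 0.099 }, summing to 1.0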

GetModelMetadata()

Gets metadata about the model.

public override ModelMetadata<T> GetModelMetadata()

Returns

ModelMetadata<T>

A ModelMetadata object containing information about the model.

Remarks

This method returns metadata about the model, including its type, feature count, complexity, description, and additional information specific to classification.

For Beginners: Model metadata provides information about the model itself, rather than the predictions it makes. This includes details about the model's structure (like how many features it uses) and characteristics (like how many classes it can predict). This information can be useful for understanding and comparing different models.

Predict(Matrix<T>)

Predicts class labels for the given input data by taking the argmax of probabilities.

public override Vector<T> Predict(Matrix<T> input)

Parameters

input Matrix<T>

The input features matrix where each row is an example and each column is a feature.

Returns

Vector<T>

A vector of predicted class indices for each input example.

Remarks

This implementation uses the argmax of the probability distribution to determine the predicted class. For binary classification with a custom decision threshold, you may want to use PredictProbabilities() directly and apply your own threshold.

For Beginners: This method picks the class with the highest probability for each sample.

For example, if the probabilities are [0.1, 0.7, 0.2] for classes [A, B, C], this method returns class B (index 1) because it has the highest probability (0.7).
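
The argmax rule this method applies, sketched over one row of probabilities as a plain array:

// Returns the index of the largest probability in one row; ties go to the
// lowest index. ArgMax(new[] { 0.1, 0.7, 0.2 }) == 1, i.e. class B.
static int ArgMax(double[] rowProbabilities)
{
    int best = 0;
    for (int j = 1; j < rowProbabilities.Length; j++)
    {
        if (rowProbabilities[j] > rowProbabilities[best])
        {
            best = j;
        }
    }
    return best;
}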

PredictLogProbabilities(Matrix<T>)

Predicts log-probabilities for each class.

public virtual Matrix<T> PredictLogProbabilities(Matrix<T> input)

Parameters

input Matrix<T>

The input features matrix where each row is a sample and each column is a feature.

Returns

Matrix<T>

A matrix where each row corresponds to an input sample and each column corresponds to a class. The values are the natural logarithm of the class probabilities.

Remarks

The default implementation computes log(PredictProbabilities(input)). Subclasses that compute log-probabilities directly (like Naive Bayes) should override this method for better numerical stability.

For Beginners: Log-probabilities are probabilities transformed by the natural logarithm. They're useful for numerical stability when working with very small probabilities.

For example:

  • Probability 0.9 → Log-probability -0.105
  • Probability 0.1 → Log-probability -2.303
  • Probability 0.001 → Log-probability -6.908

Log-probabilities are never positive, because probabilities never exceed 1: a probability of exactly 1 maps to a log-probability of 0, and anything smaller maps to a negative value. Higher (less negative) values mean higher probability.
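
A short sketch of why this matters numerically (plain doubles, illustrative only):

using System;

// Multiplying many small probabilities underflows double precision to 0,
// while summing their logs stays finite and comparable.
double product = 1.0;
double logSum = 0.0;
for (int i = 0; i < 400; i++)
{
    product *= 0.001;          // underflows to 0.0 well before the loop ends
    logSum += Math.Log(0.001); // stays finite: 400 * (-6.908) ≈ -2763.1
}
Console.WriteLine($"product = {product}, logSum = {logSum:F1}");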

PredictProbabilities(Matrix<T>)

Predicts class probabilities for each sample in the input.

public abstract Matrix<T> PredictProbabilities(Matrix<T> input)

Parameters

input Matrix<T>

The input features matrix where each row is a sample and each column is a feature.

Returns

Matrix<T>

A matrix where each row corresponds to an input sample and each column corresponds to a class. The values represent the probability of the sample belonging to each class.

Remarks

This abstract method must be implemented by derived classes to compute class probabilities. The output matrix should have shape [num_samples, num_classes], and each row should sum to 1.0.

For Beginners: This method computes the probability of each sample belonging to each class. Each row in the output represents one sample, and each column represents one class. The values in each row sum to 1.0 (100% total probability).
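
A sketch of the contract a derived implementation must satisfy, checked over a plain array (this checker is illustrative, not part of the library):

using System;

// Verifies a [num_samples, num_classes] probability matrix: every value is a
// valid probability and every row sums to 1.0 (within floating-point tolerance).
static void ValidateProbabilities(double[,] probabilities)
{
    for (int i = 0; i < probabilities.GetLength(0); i++)
    {
        double rowSum = 0.0;
        for (int j = 0; j < probabilities.GetLength(1); j++)
        {
            double p = probabilities[i, j];
            if (p < 0.0 || p > 1.0)
                throw new ArgumentException($"Value {p} at [{i}, {j}] is not a probability.");
            rowSum += p;
        }
        if (Math.Abs(rowSum - 1.0) > 1e-9)
            throw new ArgumentException($"Row {i} sums to {rowSum}, not 1.0.");
    }
}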