Class NaiveBayesBase<T>

Namespace
AiDotNet.Classification.NaiveBayes
Assembly
AiDotNet.dll

Provides a base implementation for Naive Bayes classifiers.

public abstract class NaiveBayesBase<T> : ProbabilisticClassifierBase<T>, IProbabilisticClassifier<T>, IClassifier<T>, IFullModel<T, Matrix<T>, Vector<T>>, IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Matrix<T>, Vector<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>, IGradientComputable<T, Matrix<T>, Vector<T>>, IJitCompilable<T>

Type Parameters

T

The numeric data type used for calculations (e.g., float, double).

Inheritance
NaiveBayesBase<T>
Implements
IFullModel<T, Matrix<T>, Vector<T>>
IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>
IParameterizable<T, Matrix<T>, Vector<T>>
ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>
IGradientComputable<T, Matrix<T>, Vector<T>>

Remarks

Naive Bayes classifiers are probabilistic classifiers based on Bayes' theorem with strong (naive) independence assumptions between the features. Despite these assumptions, Naive Bayes classifiers often perform very well in practice.

For Beginners: Naive Bayes uses probability to make predictions. It learns two things from training data:

1. How common each class is (prior probability)
2. How likely each feature value is given each class (likelihood)

Then for a new sample, it calculates: P(class|features) ∝ P(class) × P(features|class) and picks the class with the highest probability.

Constructors

NaiveBayesBase(NaiveBayesOptions<T>?, IRegularization<T, Matrix<T>, Vector<T>>?)

Initializes a new instance of the NaiveBayesBase class.

protected NaiveBayesBase(NaiveBayesOptions<T>? options = null, IRegularization<T, Matrix<T>, Vector<T>>? regularization = null)

Parameters

options NaiveBayesOptions<T>

Configuration options for the Naive Bayes classifier.

regularization IRegularization<T, Matrix<T>, Vector<T>>

Optional regularization strategy.

Properties

ClassCounts

Stores the count of samples per class during training.

protected int[]? ClassCounts { get; set; }

Property Value

int[]

LogPriors

Stores the log prior probabilities for each class.

protected Vector<T>? LogPriors { get; set; }

Property Value

Vector<T>

Options

Gets the Naive Bayes specific options.

protected NaiveBayesOptions<T> Options { get; }

Property Value

NaiveBayesOptions<T>

Methods

ApplyGradients(Vector<T>, T)

Applies pre-computed gradients to update the model parameters.

public override void ApplyGradients(Vector<T> gradients, T learningRate)

Parameters

gradients Vector<T>

The gradient vector to apply.

learningRate T

The learning rate for the update.

Remarks

Updates parameters using: θ = θ - learningRate * gradients

For Beginners: After computing gradients (seeing which direction to move), this method actually moves the model in that direction. The learning rate controls how big of a step to take.

Distributed Training: In DDP/ZeRO-2, this applies the synchronized (averaged) gradients after communication across workers. Each worker applies the same averaged gradients to keep parameters consistent.
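The update rule θ = θ - learningRate × gradients is plain gradient descent. A minimal sketch of the arithmetic (illustrative Python, not the AiDotNet implementation):

```python
def apply_gradients(params, gradients, learning_rate):
    """One gradient-descent step: theta = theta - learning_rate * gradients."""
    return [p - learning_rate * g for p, g in zip(params, gradients)]
```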

ComputeClassParameters(Matrix<T>, Vector<T>)

Computes class-specific parameters during training.

protected abstract void ComputeClassParameters(Matrix<T> x, Vector<T> y)

Parameters

x Matrix<T>

The input features matrix.

y Vector<T>

The target class labels vector.

Remarks

Derived classes must implement this to compute their specific parameters (e.g., mean/variance for Gaussian, feature counts for Multinomial).

ComputeGradients(Matrix<T>, Vector<T>, ILossFunction<T>?)

Computes gradients of the loss function with respect to model parameters for the given data, WITHOUT updating the model parameters.

public override Vector<T> ComputeGradients(Matrix<T> input, Vector<T> target, ILossFunction<T>? lossFunction = null)

Parameters

input Matrix<T>

The input data.

target Vector<T>

The target/expected output.

lossFunction ILossFunction<T>

The loss function to use for gradient computation. If null, uses the model's default loss function.

Returns

Vector<T>

A vector containing gradients with respect to all model parameters.

Remarks

This method performs a forward pass, computes the loss, and back-propagates to compute gradients, but does NOT update the model's parameters. The parameters remain unchanged after this call.

Distributed Training: In DDP/ZeRO-2, each worker calls this to compute local gradients on its data batch. These gradients are then synchronized (averaged) across workers before applying updates. This ensures all workers compute the same parameter updates despite having different data.

For Meta-Learning: After adapting a model on a support set, you can use this method to compute gradients on the query set. These gradients become the meta-gradients for updating the meta-parameters.

For Beginners: Think of this as "dry run" training:

- The model sees what direction it should move (the gradients)
- But it doesn't actually move (parameters stay the same)
- You get to decide what to do with this information (average with others, inspect, modify, etc.)

Exceptions

InvalidOperationException

If lossFunction is null and the model has no default loss function.

ComputeLogLikelihood(Vector<T>, int)

Computes the log-likelihood of a sample given a class.

protected abstract T ComputeLogLikelihood(Vector<T> sample, int classIndex)

Parameters

sample Vector<T>

The feature vector for a single sample.

classIndex int

The class index.

Returns

T

The log-likelihood log P(sample|class).

ComputeLogPriors(int)

Computes the log prior probabilities for each class.

protected virtual void ComputeLogPriors(int totalSamples)

Parameters

totalSamples int

Total number of training samples.
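The computation is simply log(classCount / totalSamples) per class. An illustrative Python sketch (names are made up; the real implementation works over the generic numeric type T):

```python
import math

def compute_log_priors(class_counts, total_samples):
    """Log prior for each class: log(count / total).

    Storing priors in log space lets them be added directly to
    log-likelihoods when scoring a sample.
    """
    return [math.log(count / total_samples) for count in class_counts]
```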

GetClassIndex(T)

Gets the class index for a given label value.

protected int GetClassIndex(T label)

Parameters

label T

The label value.

Returns

int

The zero-based class index.

GetModelMetadata()

Gets metadata about the model.

public override ModelMetadata<T> GetModelMetadata()

Returns

ModelMetadata<T>

A ModelMetadata object containing information about the model.

Remarks

This method returns metadata about the model, including its type, feature count, complexity, description, and additional information specific to classification.

For Beginners: Model metadata provides information about the model itself, rather than the predictions it makes. This includes details about the model's structure (like how many features it uses) and characteristics (like how many classes it can predict). This information can be useful for understanding and comparing different models.

GetParameters()

Gets all model parameters as a single vector.

public override Vector<T> GetParameters()

Returns

Vector<T>

A vector containing all model parameters.

Remarks

This method returns a vector containing all model parameters for use with optimization algorithms or model comparison.

For Beginners: This method packages all the model's parameters into a single collection. This is useful for optimization algorithms that need to work with all parameters at once.

LogSumExp(Vector<T>)

Computes the log-sum-exp for numerical stability.

protected T LogSumExp(Vector<T> values)

Parameters

values Vector<T>

A vector of log values.

Returns

T

log(sum(exp(values))).
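The standard stable formulation subtracts the maximum before exponentiating, so that exp() never overflows even when the log values are large in magnitude. An illustrative Python sketch of the technique:

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(values))).

    Factoring out max(values) gives
      log(sum(exp(v))) = m + log(sum(exp(v - m))),
    where every exp argument is <= 0, so nothing overflows.
    """
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))
```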

PredictLogProbabilities(Matrix<T>)

Predicts log-probabilities for each class (more numerically stable than probabilities).

public override Matrix<T> PredictLogProbabilities(Matrix<T> input)

Parameters

input Matrix<T>

The input features matrix.

Returns

Matrix<T>

A matrix of log-probabilities.
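Per sample, the log-probability for each class is the joint score (log prior + log likelihood) normalized by the log-sum-exp of all scores, so the probabilities sum to one without ever leaving log space. An illustrative Python sketch for a single sample (names are hypothetical, not the AiDotNet API):

```python
import math

def predict_log_probabilities(sample, log_priors, log_likelihood):
    """Normalized log posterior: log P(c|x) = score_c - logsumexp(scores)."""
    scores = [lp + log_likelihood(sample, c) for c, lp in enumerate(log_priors)]
    # log-sum-exp normalizer, computed stably
    m = max(scores)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_norm for s in scores]
```

Exponentiating the result recovers ordinary probabilities, which is essentially what PredictProbabilities does.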

PredictProbabilities(Matrix<T>)

Predicts class probabilities for each sample.

public override Matrix<T> PredictProbabilities(Matrix<T> input)

Parameters

input Matrix<T>

The input features matrix.

Returns

Matrix<T>

A matrix of probabilities.

SetParameters(Vector<T>)

Sets the parameters for this model.

public override void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

A vector containing all model parameters.

Exceptions

ArgumentException

Thrown when the parameters vector has an incorrect length.

Train(Matrix<T>, Vector<T>)

Trains the Naive Bayes classifier on the provided data.

public override void Train(Matrix<T> x, Vector<T> y)

Parameters

x Matrix<T>

The input features matrix.

y Vector<T>

The target class labels vector.

WithParameters(Vector<T>)

Creates a new instance of the model with specified parameters.

public override IFullModel<T, Matrix<T>, Vector<T>> WithParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

A vector containing all model parameters.

Returns

IFullModel<T, Matrix<T>, Vector<T>>

A new model instance with the specified parameters.

Exceptions

ArgumentException

Thrown when the parameters vector has an incorrect length.