Class LogisticRegression<T>

Namespace
AiDotNet.Regression
Assembly
AiDotNet.dll

Represents a logistic regression model for binary classification problems.

public class LogisticRegression<T> : RegressionBase<T>, IRegression<T>, IFullModel<T, Matrix<T>, Vector<T>>, IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Matrix<T>, Vector<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>, IGradientComputable<T, Matrix<T>, Vector<T>>, IJitCompilable<T>

Type Parameters

T

The numeric type used for calculations, typically float or double.

Inheritance
LogisticRegression<T>
Implements
IFullModel<T, Matrix<T>, Vector<T>>
IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>
IParameterizable<T, Matrix<T>, Vector<T>>
ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>
IGradientComputable<T, Matrix<T>, Vector<T>>

Remarks

Logistic regression is a statistical method used for binary classification tasks, where the goal is to predict one of two possible outcomes (such as yes/no, true/false, 0/1). Unlike linear regression, which predicts continuous values, logistic regression outputs probabilities between 0 and 1, which can be interpreted as the likelihood of belonging to the positive class.
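
In symbols, the model's output for one sample can be written as below. The names w (coefficients), b (intercept), and σ (sigmoid) follow the usual textbook notation and are assumptions, not identifiers from this page.

```latex
\[
p(y = 1 \mid \mathbf{x}) = \sigma\!\left(\mathbf{w}^{\top}\mathbf{x} + b\right),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
\]
```

Because σ maps every real number into the open interval (0, 1), the result can be read as the probability of the positive class.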

For Beginners: Logistic regression is like a decision-maker that predicts whether something belongs to one category or another.

Think of it like determining whether an email is spam or not:

  • The model looks at different "features" of the email (like certain words or sender information)
  • It calculates how much each feature suggests "spam" or "not spam"
  • It combines all this information to make a final prediction between 0 and 1
  • Values closer to 1 mean "more likely spam", values closer to 0 mean "more likely not spam"

For example, words like "free" or "offer" might increase the spam probability, while emails from your contacts might decrease it. Logistic regression finds the right balance of these factors to make accurate predictions.

Constructors

LogisticRegression(LogisticRegressionOptions<T>?, IRegularization<T, Matrix<T>, Vector<T>>?)

Initializes a new instance of the LogisticRegression<T> class with optional custom options and regularization.

public LogisticRegression(LogisticRegressionOptions<T>? options = null, IRegularization<T, Matrix<T>, Vector<T>>? regularization = null)

Parameters

options LogisticRegressionOptions<T>

Custom options for the logistic regression algorithm. If null, default options are used.

regularization IRegularization<T, Matrix<T>, Vector<T>>

Regularization method to prevent overfitting. If null, no regularization is applied.

Remarks

This constructor creates a new logistic regression model with the specified options and regularization. If no options are provided, default values are used. Regularization helps prevent overfitting by penalizing large coefficient values.

For Beginners: This creates a new logistic regression model with your chosen settings.

When creating a logistic regression model:

  • You can provide custom settings (options) or use the defaults
  • You can add regularization, which helps the model generalize better to new data

Regularization is like adding training wheels to prevent the model from memorizing the training data too closely, which would make it perform poorly on new, unseen data.
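
To make "penalizing large coefficient values" concrete, here is a minimal sketch of one common choice, an L2 (ridge) penalty. It is a hand-rolled illustration on plain arrays: `L2PenaltySketch`, `Penalty`, `PenaltyGradient`, and `lambda` are hypothetical names, and nothing here is taken from the `IRegularization<T, Matrix<T>, Vector<T>>` interface.

```csharp
using System;
using System.Linq;

// Illustrative only: a hand-rolled L2 (ridge) penalty, not AiDotNet's regularization API.
public static class L2PenaltySketch
{
    // Penalty added to the training loss: lambda * sum(w_i^2).
    // Larger coefficients produce a larger penalty.
    public static double Penalty(double[] weights, double lambda) =>
        lambda * weights.Sum(w => w * w);

    // Gradient of the penalty with respect to each weight: 2 * lambda * w_i.
    // Subtracting this during training pulls every coefficient toward zero.
    public static double[] PenaltyGradient(double[] weights, double lambda) =>
        weights.Select(w => 2.0 * lambda * w).ToArray();
}
```

With `lambda` set to 0 the penalty vanishes, which corresponds to the "no regularization" case described for a null `regularization` argument.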

Methods

CreateNewInstance()

Creates a new instance of the logistic regression model.

protected override IFullModel<T, Matrix<T>, Vector<T>> CreateNewInstance()

Returns

IFullModel<T, Matrix<T>, Vector<T>>

A new instance of the logistic regression model with the same configuration.

Remarks

This method creates a new instance of the logistic regression model with the same configuration as the current instance. It is used internally during serialization/deserialization to create a new instance of the model.

For Beginners: This method creates a copy of the model structure without copying the learned data.

It's like creating a new, empty notebook with the same number of pages and section dividers as your current notebook, but without copying any of the notes you've written. This is useful when you want to create a similar model or when loading a saved model from a file.

Deserialize(byte[])

Deserializes the logistic regression model from a byte array.

public override void Deserialize(byte[] modelData)

Parameters

modelData byte[]

A byte array containing the serialized model data.

Remarks

This method restores a logistic regression model from a serialized byte array, reconstructing its parameters and configuration. This allows a previously trained model to be loaded from storage or received over a network.

For Beginners: This rebuilds the model from a saved format.

Deserialization:

  • Takes a sequence of bytes that represents a model
  • Reconstructs the original model with all its learned patterns
  • Allows you to use a previously trained model without retraining

Think of it like unpacking a model that was packed up for storage or shipping, so you can use it again exactly as it was.

GetModelType()

Gets the type of regression model.

protected override ModelType GetModelType()

Returns

ModelType

The model type, in this case, LogisticRegression.

Remarks

This method returns an enumeration value indicating that this is a logistic regression model. This is used for type identification when working with different regression models in a unified manner.

For Beginners: This simply tells other parts of the program what kind of model this is.

When you have different types of models in your program:

  • Each model needs to identify itself
  • This method returns a label (LogisticRegression) that identifies this specific type
  • Other code can use this label to handle the model appropriately

It's like a garage that services different types of vehicles (cars, trucks, motorcycles): each vehicle's type tells the mechanic which procedure to follow.

Predict(Matrix<T>)

Makes predictions for new data points using the trained logistic regression model.

public override Vector<T> Predict(Matrix<T> x)

Parameters

x Matrix<T>

The feature matrix where each row is a sample and each column is a feature.

Returns

Vector<T>

A vector of predicted probabilities for the positive class.

Remarks

This method calculates the predicted probabilities for each sample in the input feature matrix. It computes the raw scores by multiplying the features by the learned coefficients and adding the intercept, then transforms these scores into probabilities using the sigmoid function.

For Beginners: This is where the model makes predictions on new data.

During prediction:

  • The model calculates a score for each example using the learned weights
  • The score is converted to a probability between 0 and 1 using the sigmoid function
  • Values closer to 1 indicate higher confidence in the positive class

For instance, if you've trained the model to detect fraudulent transactions, a probability of 0.92 would suggest the transaction is likely fraudulent, while 0.03 would suggest it's probably legitimate.
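
The computation described above (score, then sigmoid) can be sketched as follows. This is a simplified illustration, not the library's implementation: plain `double` arrays stand in for `Matrix<T>` and `Vector<T>`, and `PredictSketch`, `Sigmoid`, and `PredictProbabilities` are hypothetical names.

```csharp
using System;

// Illustrative only: plain arrays stand in for Matrix<T> and Vector<T>.
public static class PredictSketch
{
    // Sigmoid squashes any real-valued score into the open interval (0, 1).
    public static double Sigmoid(double z) => 1.0 / (1.0 + Math.Exp(-z));

    // For each row: score = intercept + sum(coefficients[j] * row[j]),
    // then probability = Sigmoid(score).
    public static double[] PredictProbabilities(
        double[][] rows, double[] coefficients, double intercept)
    {
        var probabilities = new double[rows.Length];
        for (int i = 0; i < rows.Length; i++)
        {
            double score = intercept;
            for (int j = 0; j < coefficients.Length; j++)
                score += coefficients[j] * rows[i][j];
            probabilities[i] = Sigmoid(score);
        }
        return probabilities;
    }
}
```

A score of exactly 0 maps to a probability of 0.5; positive scores move the probability toward 1 and negative scores toward 0.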

Serialize()

Serializes the logistic regression model to a byte array for storage or transmission.

public override byte[] Serialize()

Returns

byte[]

A byte array containing the serialized model data.

Remarks

This method converts the entire logistic regression model, including its parameters and configuration, into a byte array that can be stored in a file or database, or transmitted over a network. The model can later be restored using the Deserialize method.

For Beginners: This converts the model into a format that can be saved or shared.

Serialization:

  • Transforms the model into a sequence of bytes
  • Preserves all the important information about the model
  • Allows you to save the trained model to a file
  • Lets you load the model later without having to retrain it

It's like taking a snapshot of the model that you can use later or share with others.
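
The round trip between `Serialize()` and `Deserialize(byte[])` can be illustrated with a minimal sketch. The byte layout below is made up for the example and is not the format AiDotNet writes; `SerializationSketch` and its members are hypothetical, and only coefficients and an intercept are stored.

```csharp
using System;
using System.IO;

// Illustrative only: a made-up byte layout, not the format AiDotNet uses.
public static class SerializationSketch
{
    // Layout: [int count][count doubles: coefficients][double: intercept].
    public static byte[] Serialize(double[] coefficients, double intercept)
    {
        using var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(coefficients.Length);
            foreach (double c in coefficients) writer.Write(c);
            writer.Write(intercept);
        }
        // ToArray remains valid after the writer has closed the stream.
        return stream.ToArray();
    }

    // Reads the values back in the same order they were written.
    public static (double[] Coefficients, double Intercept) Deserialize(byte[] modelData)
    {
        using var reader = new BinaryReader(new MemoryStream(modelData));
        int count = reader.ReadInt32();
        var coefficients = new double[count];
        for (int i = 0; i < count; i++) coefficients[i] = reader.ReadDouble();
        double intercept = reader.ReadDouble();
        return (coefficients, intercept);
    }
}
```

With the class documented here, the equivalent round trip would be to call `Serialize()` on a trained model and pass the resulting bytes to `Deserialize(byte[])` on another `LogisticRegression<T>` instance.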

Train(Matrix<T>, Vector<T>)

Trains the logistic regression model using the provided features and target values.

public override void Train(Matrix<T> x, Vector<T> y)

Parameters

x Matrix<T>

The feature matrix where each row is a sample and each column is a feature.

y Vector<T>

The target vector containing the binary labels (0 or 1) for each sample.

Remarks

This method trains the logistic regression model using gradient ascent to maximize the likelihood of the observed data. The algorithm iteratively updates the coefficients and intercept based on the prediction errors until convergence or until the maximum number of iterations is reached. Regularization is applied if specified.

For Beginners: This is where the model learns from your data.

During training:

  • The model starts with initial guesses for how important each feature is
  • It makes predictions based on these guesses
  • It compares its predictions with the actual answers
  • It adjusts its guesses to reduce errors
  • This process repeats until the model stops improving significantly

For example, with email classification, the model might learn that the word "meeting" is a strong indicator of a legitimate email, while "click here to claim" suggests spam.
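
The training loop described above can be sketched as batch gradient ascent on the log-likelihood. This is a simplified illustration under stated assumptions, not the library's implementation: plain arrays replace `Matrix<T>` and `Vector<T>`, regularization and the convergence check are omitted, and `TrainSketch`, `Fit`, `learningRate`, and `maxIterations` are hypothetical names rather than members of `LogisticRegressionOptions<T>`.

```csharp
using System;

// Illustrative only: batch gradient ascent on the log-likelihood,
// with no regularization and a fixed number of iterations.
public static class TrainSketch
{
    public static double Sigmoid(double z) => 1.0 / (1.0 + Math.Exp(-z));

    public static (double[] Coefficients, double Intercept) Fit(
        double[][] x, double[] y, double learningRate = 0.1, int maxIterations = 1000)
    {
        int n = x.Length, d = x[0].Length;
        var w = new double[d];   // initial guess: all coefficients zero
        double b = 0.0;          // initial guess: zero intercept

        for (int iter = 0; iter < maxIterations; iter++)
        {
            var gradW = new double[d];
            double gradB = 0.0;

            for (int i = 0; i < n; i++)
            {
                double score = b;
                for (int j = 0; j < d; j++) score += w[j] * x[i][j];

                // Prediction error: actual label minus predicted probability.
                double error = y[i] - Sigmoid(score);
                for (int j = 0; j < d; j++) gradW[j] += error * x[i][j];
                gradB += error;
            }

            // Step uphill along the average log-likelihood gradient.
            for (int j = 0; j < d; j++) w[j] += learningRate * gradW[j] / n;
            b += learningRate * gradB / n;
        }
        return (w, b);
    }
}
```

A fuller version would also stop early once the updates become negligibly small, matching the "until convergence" behavior described above.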

Exceptions

ArgumentException

Thrown when the number of rows in x does not match the length of y.
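
The kind of check that leads to this exception might look like the sketch below. It is an assumption about the shape of the validation, not the library's code: plain arrays stand in for `Matrix<T>` and `Vector<T>`, the message text is invented, and `TrainValidationSketch` and `EnsureMatchingLengths` are hypothetical names.

```csharp
using System;

// Illustrative only: plain arrays stand in for Matrix<T> and Vector<T>,
// and the exception message is made up for this example.
public static class TrainValidationSketch
{
    // Every sample (row of x) needs exactly one label (entry of y).
    public static void EnsureMatchingLengths(double[][] x, double[] y)
    {
        if (x.Length != y.Length)
            throw new ArgumentException(
                $"x has {x.Length} rows but y has {y.Length} entries; they must match.");
    }
}
```

Callers can avoid the exception by confirming that the feature matrix and the label vector were built from the same set of samples before calling `Train`.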