Class PartialLeastSquaresRegression<T>

Namespace: AiDotNet.Regression

Assembly: AiDotNet.dll

Implements Partial Least Squares Regression (PLS), a technique that combines features from principal component analysis and multiple linear regression to handle situations with many correlated predictors.

public class PartialLeastSquaresRegression<T> : RegressionBase<T>, IRegression<T>, IFullModel<T, Matrix<T>, Vector<T>>, IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Matrix<T>, Vector<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>, IGradientComputable<T, Matrix<T>, Vector<T>>, IJitCompilable<T>

Type Parameters

T: The numeric data type used for calculations (e.g., float, double).

Inheritance: object

RegressionBase<T>

PartialLeastSquaresRegression<T>

Implements: IRegression<T>

IFullModel<T, Matrix<T>, Vector<T>>

IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>

IModelSerializer

ICheckpointableModel

IParameterizable<T, Matrix<T>, Vector<T>>

IFeatureAware

IFeatureImportance<T>

ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>

IGradientComputable<T, Matrix<T>, Vector<T>>

IJitCompilable<T>

Inherited Members: RegressionBase<T>.NumOps

RegressionBase<T>.Engine

RegressionBase<T>.Options

RegressionBase<T>.Regularization

RegressionBase<T>.Coefficients

RegressionBase<T>.Intercept

RegressionBase<T>.HasIntercept

RegressionBase<T>.FeatureNames

RegressionBase<T>.ExpectedParameterCount

RegressionBase<T>.SolveSystem(Matrix<T>, Vector<T>)

RegressionBase<T>.GetParameters()

RegressionBase<T>.WithParameters(Vector<T>)

RegressionBase<T>.GetActiveFeatureIndices()

RegressionBase<T>.IsFeatureUsed(int)

RegressionBase<T>.SetParameters(Vector<T>)

RegressionBase<T>.SetActiveFeatureIndices(IEnumerable<int>)

RegressionBase<T>.SetCoefficientsAndIntercept(Vector<T>, T)

RegressionBase<T>.GetFeatureImportance()

RegressionBase<T>.DeepCopy()

RegressionBase<T>.Clone()

RegressionBase<T>.ParameterCount

RegressionBase<T>.DefaultLossFunction

RegressionBase<T>.ComputeGradients(Matrix<T>, Vector<T>, ILossFunction<T>)

RegressionBase<T>.ApplyGradients(Vector<T>, T)

RegressionBase<T>.SaveModel(string)

RegressionBase<T>.LoadModel(string)

RegressionBase<T>.SaveState(Stream)

RegressionBase<T>.LoadState(Stream)

RegressionBase<T>.SupportsJitCompilation

RegressionBase<T>.ExportComputationGraph(List<ComputationNode<T>>)

object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Extension Methods: DistributedExtensions.AsDistributedForHighBandwidth<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, ICommunicationBackend<T>)

DistributedExtensions.AsDistributedForLowBandwidth<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, ICommunicationBackend<T>)

DistributedExtensions.AsDistributed<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, ICommunicationBackend<T>)

DistributedExtensions.AsDistributed<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, IShardingConfiguration<T>)

Remarks

Partial Least Squares Regression is particularly useful when dealing with many predictor variables that may be highly correlated. It works by finding linear combinations of the predictors (components) whose scores have maximal covariance with the response variable.

Unlike Principal Component Regression, which considers only the variance in the predictor variables, PLS regression considers both the variance in the predictors and their relationship with the response variable. This often leads to models with better predictive power, especially when the predictors are highly correlated.
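The contrast with Principal Component Regression can be written as two optimization problems. This is the standard textbook formulation for a single response, not something taken from the library's source; the symbols (X_a for the predictor matrix after removing the first a-1 components, w for a weight vector, t_a for the score vector) are notation assumed here for illustration.

```latex
\begin{aligned}
\text{PCR:}\quad & w_a = \arg\max_{\lVert w \rVert = 1} \operatorname{Var}(X_a w) \\
\text{PLS:}\quad & w_a = \arg\max_{\lVert w \rVert = 1} \operatorname{Cov}(X_a w,\; y)^2
  \;\;\Longrightarrow\;\; w_a = \frac{X_a^{\top} y}{\lVert X_a^{\top} y \rVert},
  \qquad t_a = X_a w_a
\end{aligned}
```

The PLS direction depends on y, which is why its components tend to be more relevant for prediction than directions chosen for predictor variance alone.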

For Beginners: Think of PLS regression as a way to find the most important patterns in your input data that are also strongly related to what you're trying to predict. It's like finding the key ingredients in a recipe that most influence the taste, rather than just the most abundant ingredients.

Constructors

PartialLeastSquaresRegression(PartialLeastSquaresRegressionOptions<T>?, IRegularization<T, Matrix<T>, Vector<T>>?)

Initializes a new instance of the PartialLeastSquaresRegression class with the specified options and regularization.

public PartialLeastSquaresRegression(PartialLeastSquaresRegressionOptions<T>? options = null, IRegularization<T, Matrix<T>, Vector<T>>? regularization = null)

Parameters

options PartialLeastSquaresRegressionOptions<T>: Configuration options for the PLS regression model. If null, default options will be used.
regularization IRegularization<T, Matrix<T>, Vector<T>>: Regularization method to prevent overfitting. If null, no regularization will be applied.

Remarks

The constructor initializes the model with either the provided options or default settings.

For Beginners: This constructor sets up the PLS regression model with your specified settings or uses default settings if none are provided. Regularization is an optional technique to prevent the model from becoming too complex and overfitting to the training data.

Methods

CalculateFeatureImportances()

Calculates the importance of each feature in the model.

protected override Vector<T> CalculateFeatureImportances()

Returns

Vector<T>: A vector containing the importance score for each feature.

Remarks

This method calculates the Variable Importance in Projection (VIP) scores, which measure the contribution of each variable to the model based on the variance explained by each PLS component and the weights of each variable in those components.

For Beginners: Feature importance tells you which input variables have the most influence on the predictions. In PLS regression, this is calculated using a measure called VIP (Variable Importance in Projection), which considers both how much each component explains the variation in the data and how much each variable contributes to those components. Higher values indicate more important variables.
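As a rough illustration of the VIP idea described above, the sketch below computes the commonly used VIP formula on plain arrays: each component's explained response variation (q_a² · t_aᵀt_a) is distributed over the features in proportion to their squared normalized weights. The class name, method name, and array layout are hypothetical, and the library's own calculation may differ in detail.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet: the textbook VIP formula on plain arrays.
public static class VipSketch
{
    // weights[j, a]  : weight of feature j in component a
    // scores[i, a]   : score of sample i on component a
    // yLoadings[a]   : regression coefficient of y on the scores of component a
    // Assumes every component has a nonzero weight vector and that the
    // components explain a nonzero amount of response variation.
    public static double[] Compute(double[,] weights, double[,] scores, double[] yLoadings)
    {
        int featureCount = weights.GetLength(0);
        int componentCount = weights.GetLength(1);
        int sampleCount = scores.GetLength(0);

        var explained = new double[componentCount];    // q_a^2 * (t_a . t_a)
        var weightNormSq = new double[componentCount]; // ||w_a||^2
        double totalExplained = 0.0;

        for (int a = 0; a < componentCount; a++)
        {
            double scoreSumSq = 0.0;
            for (int i = 0; i < sampleCount; i++)
                scoreSumSq += scores[i, a] * scores[i, a];

            explained[a] = yLoadings[a] * yLoadings[a] * scoreSumSq;
            totalExplained += explained[a];

            for (int j = 0; j < featureCount; j++)
                weightNormSq[a] += weights[j, a] * weights[j, a];
        }

        var vip = new double[featureCount];
        for (int j = 0; j < featureCount; j++)
        {
            double weighted = 0.0;
            for (int a = 0; a < componentCount; a++)
                weighted += explained[a] * weights[j, a] * weights[j, a] / weightNormSq[a];

            vip[j] = Math.Sqrt(featureCount * weighted / totalExplained);
        }
        return vip;
    }
}
```

With this formula the squared VIP scores sum to the number of features, so their mean square is 1. That is why a score above 1 is often read as "more important than average".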

CreateNewInstance()

Creates a new instance of the Partial Least Squares Regression model with the same configuration.

protected override IFullModel<T, Matrix<T>, Vector<T>> CreateNewInstance()

Returns

IFullModel<T, Matrix<T>, Vector<T>>: A new instance of the Partial Least Squares Regression model.

Remarks

This method creates a deep copy of the current Partial Least Squares Regression model, including its options, coefficients, intercept, loadings, scores, weights, and data scaling parameters. The new instance is completely independent of the original, allowing modifications without affecting the original model.

For Beginners: This method creates an exact copy of your trained model.

Think of it like making a perfect duplicate:

It copies all the configuration settings (like the number of components)
It preserves the coefficients and intercept that define your regression model
It duplicates all the internal matrices (loadings, scores, weights) that capture the patterns in your data
It maintains the scaling information (means and standard deviations) needed to process new data

Creating a copy is useful when you want to:

Create a backup before further modifying the model
Create variations of the same model for different purposes
Share the model with others while keeping your original intact

Exceptions

InvalidOperationException: Thrown when the creation fails or required components are null.

Deserialize(byte[])

Deserializes the model from a byte array.

public override void Deserialize(byte[] modelData)

Parameters

modelData byte[]: The byte array containing the serialized model data.

Remarks

This method reconstructs the model's parameters from a serialized byte array, including base class data and PLS-specific data such as loadings, scores, weights, means, and standard deviations.

For Beginners: Deserialization is the opposite of serialization - it takes the saved model data and reconstructs the model's internal state. This allows you to load a previously trained model and use it to make predictions without having to retrain it. It's like loading a saved game to continue where you left off.

GetModelMetadata()

Gets metadata about the model.

public override ModelMetadata<T> GetModelMetadata()

Returns

ModelMetadata<T>: A ModelMetadata object containing information about the model.

Remarks

This method returns metadata about the model, including its type, coefficients, loadings, scores, weights, number of components, and feature importance.

For Beginners: Model metadata provides information about the model itself, rather than the predictions it makes. This includes details about how the model is configured (like how many components it uses) and information about the importance of different features. This can help you understand which input variables are most influential in making predictions.

GetModelType()

Gets the type of the model.

protected override ModelType GetModelType()

Returns

ModelType: The model type identifier for partial least squares regression.

Remarks

This method is used for model identification and serialization purposes.

For Beginners: This method simply returns an identifier that indicates this is a partial least squares regression model. It's used internally by the library to keep track of different types of models.

Predict(Matrix<T>)

Makes predictions for the given input data.

public override Vector<T> Predict(Matrix<T> input)

Parameters

input Matrix<T>: The input features matrix where each row is an example and each column is a feature.

Returns

Vector<T>: A vector of predicted values for each input example.

Remarks

This method scales the input data using the means and standard deviations from the training data, applies the regression coefficients, and adds the intercept to produce predictions.

For Beginners: After training, this method is used to make predictions on new data. It first scales your input data the same way the training data was scaled, then applies the learned model to calculate the predicted values. This is the main purpose of building a regression model - to predict values for new examples.
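The prediction steps described above (scale, apply coefficients, add intercept) can be sketched on plain arrays as follows. This is a minimal illustration, not the library's implementation: the class and parameter names are hypothetical, and it assumes the coefficients are expressed for standardized inputs. If the stored coefficients have already been converted back to the original units, as step 5 of Train might suggest, the scaling would be folded into them instead.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet: scale-then-predict on plain arrays.
public static class PlsPredictSketch
{
    // input[i][j] : feature j of example i
    // means, stdDevs : per-feature statistics taken from the training data
    // coefficients : one coefficient per feature, assumed to apply to scaled inputs
    public static double[] Predict(
        double[][] input, double[] means, double[] stdDevs,
        double[] coefficients, double intercept)
    {
        var predictions = new double[input.Length];
        for (int i = 0; i < input.Length; i++)
        {
            double sum = intercept;
            for (int j = 0; j < coefficients.Length; j++)
            {
                // Scale with the training statistics, never with statistics of the new data.
                double scaled = (input[i][j] - means[j]) / stdDevs[j];
                sum += scaled * coefficients[j];
            }
            predictions[i] = sum;
        }
        return predictions;
    }
}
```

For example, an input row (3, 10) with means (1, 4), standard deviations (2, 3), coefficients (0.5, 2) and intercept 1 scales to (1, 2) and yields 0.5 + 4 + 1 = 5.5.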

Serialize()

Serializes the model to a byte array.

public override byte[] Serialize()

Returns

byte[]: A byte array containing the serialized model data.

Remarks

This method serializes the model's parameters, including base class data and PLS-specific data such as loadings, scores, weights, means, and standard deviations.

For Beginners: Serialization converts the model's internal state into a format that can be saved to disk or transmitted over a network. This allows you to save a trained model and load it later without having to retrain it. Think of it like saving your progress in a video game.

Train(Matrix<T>, Vector<T>)

Trains the partial least squares regression model on the provided data.

public override void Train(Matrix<T> x, Vector<T> y)

Parameters

x Matrix<T>: The input features matrix where each row is a training example and each column is a feature.
y Vector<T>: The target values vector corresponding to each training example.

Remarks

This method performs the following steps:

1. Validates the input data
2. Centers and scales the data
3. Extracts the specified number of components using the NIPALS algorithm
4. Calculates the regression coefficients
5. Adjusts the coefficients for the scaling
6. Calculates the intercept
7. Applies regularization to the model matrices

For Beginners: Training is the process where the model learns from your data. The PLS algorithm first centers and scales your data (makes all variables have similar ranges), then finds the most important patterns (components) that explain both the variation in your input features and their relationship with the target variable. These components are then used to build a regression model that can predict the target variable from new input features.
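To make steps 2 through 6 more concrete, here is a minimal single-response NIPALS sketch on plain arrays. It is an illustration under stated assumptions rather than the library's code: the class and method names are hypothetical, it uses the sample standard deviation (n - 1) for scaling, it skips input validation and regularization (steps 1 and 7), and it assumes no feature is constant and that each requested component has a nonzero weight vector.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet: textbook PLS1 via NIPALS.
public static class NipalsPls1Sketch
{
    public static (double[] Coefficients, double Intercept) Fit(double[][] x, double[] y, int components)
    {
        int n = x.Length, p = x[0].Length;

        // Step 2: center and scale each column of X, and center y.
        var mean = new double[p];
        var std = new double[p];
        var xs = new double[n][];
        for (int i = 0; i < n; i++) xs[i] = new double[p];
        for (int j = 0; j < p; j++)
        {
            for (int i = 0; i < n; i++) mean[j] += x[i][j] / n;
            for (int i = 0; i < n; i++) std[j] += Math.Pow(x[i][j] - mean[j], 2) / (n - 1);
            std[j] = Math.Sqrt(std[j]);
            for (int i = 0; i < n; i++) xs[i][j] = (x[i][j] - mean[j]) / std[j];
        }
        double yMean = 0.0;
        foreach (double v in y) yMean += v / n;
        var yc = new double[n];
        for (int i = 0; i < n; i++) yc[i] = y[i] - yMean;

        var beta = new double[p];                  // coefficients for scaled X
        var rotations = new double[components][];  // map original scaled X to scores
        var loadings = new double[components][];   // X loadings per component

        // Step 3: extract components one at a time.
        for (int a = 0; a < components; a++)
        {
            // Weight vector: normalized X^T y on the current (deflated) data.
            var w = new double[p];
            double norm = 0.0;
            for (int j = 0; j < p; j++)
            {
                for (int i = 0; i < n; i++) w[j] += xs[i][j] * yc[i];
                norm += w[j] * w[j];
            }
            norm = Math.Sqrt(norm);
            for (int j = 0; j < p; j++) w[j] /= norm;

            // Scores t = X w and their squared length.
            var t = new double[n];
            double tt = 0.0;
            for (int i = 0; i < n; i++)
            {
                for (int j = 0; j < p; j++) t[i] += xs[i][j] * w[j];
                tt += t[i] * t[i];
            }

            // X loadings (X^T t / t.t) and y loading (y.t / t.t).
            var load = new double[p];
            double q = 0.0;
            for (int i = 0; i < n; i++)
            {
                q += yc[i] * t[i] / tt;
                for (int j = 0; j < p; j++) load[j] += xs[i][j] * t[i] / tt;
            }

            // Rotation r_a = w_a - sum_{b<a} (p_b . w_a) r_b, so that t_a = X_scaled r_a.
            var r = (double[])w.Clone();
            for (int b = 0; b < a; b++)
            {
                double dot = 0.0;
                for (int j = 0; j < p; j++) dot += loadings[b][j] * w[j];
                for (int j = 0; j < p; j++) r[j] -= dot * rotations[b][j];
            }
            rotations[a] = r;
            loadings[a] = load;

            // Step 4: accumulate regression coefficients on the scaled data.
            for (int j = 0; j < p; j++) beta[j] += r[j] * q;

            // Deflate X and y before extracting the next component.
            for (int i = 0; i < n; i++)
            {
                for (int j = 0; j < p; j++) xs[i][j] -= t[i] * load[j];
                yc[i] -= q * t[i];
            }
        }

        // Steps 5 and 6: convert to original units and compute the intercept.
        var coefficients = new double[p];
        double intercept = yMean;
        for (int j = 0; j < p; j++)
        {
            coefficients[j] = beta[j] / std[j];
            intercept -= coefficients[j] * mean[j];
        }
        return (coefficients, intercept);
    }
}
```

A useful sanity check for a sketch like this: when the number of components equals the number of (linearly independent) features, PLS reproduces the ordinary least squares fit, so noise-free data generated from a linear rule should give back that rule's coefficients and intercept up to rounding error.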
