Class StepwiseRegression<T>

Namespace
AiDotNet.Regression
Assembly
AiDotNet.dll

Implements stepwise regression, which automatically selects the most relevant features for the model. This approach builds a model by adding or removing features based on their statistical significance.

public class StepwiseRegression<T> : RegressionBase<T>, IRegression<T>, IFullModel<T, Matrix<T>, Vector<T>>, IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Matrix<T>, Vector<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>, IGradientComputable<T, Matrix<T>, Vector<T>>, IJitCompilable<T>

Type Parameters

T

The numeric type used for calculations (typically float or double).

Inheritance
StepwiseRegression<T>
Implements
IFullModel<T, Matrix<T>, Vector<T>>
IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>
IParameterizable<T, Matrix<T>, Vector<T>>
ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>
IGradientComputable<T, Matrix<T>, Vector<T>>

Remarks

Stepwise regression helps solve the feature selection problem by iteratively building a model in one of two ways:

  • Forward selection: Starting with no features and adding the most significant ones
  • Backward elimination: Starting with all features and removing the least significant ones

At each step, the algorithm evaluates the impact of adding or removing features based on a fitness metric such as adjusted R-squared, AIC, BIC, or other statistical criteria.
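For reference, the criteria named above have the following standard textbook definitions. This page does not state whether the library computes them in exactly this form, so treat the block as background rather than a description of the implementation. Here n is the number of observations, p the number of selected features (excluding the intercept), k the number of estimated parameters, and \hat{L} the maximized likelihood of the fitted model.

```latex
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
\qquad
\mathrm{AIC} = 2k - 2\ln\hat{L}
\qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}
```

A higher adjusted R-squared indicates a better model, while lower AIC and BIC values are preferred. All three penalize extra features, which is what lets a stepwise search decide that an additional feature is not worth keeping.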

For Beginners: Stepwise regression is like a smart shopping assistant that helps you pick only the most useful ingredients for a recipe.

Think of it like this:

  • You have many potential ingredients (features) that might affect the outcome
  • Using every ingredient could make the recipe complicated or less tasty
  • Stepwise regression tests each ingredient to see how much it improves the recipe
  • It keeps only the ingredients that make a significant difference to the final result

For example, when predicting house prices, you might have data on square footage, number of bedrooms, location, age, etc. Stepwise regression would determine which of these features are most important for accurate predictions and discard the rest.

Constructors

StepwiseRegression(StepwiseRegressionOptions<T>?, PredictionStatsOptions?, IFitnessCalculator<T, Matrix<T>, Vector<T>>?, IRegularization<T, Matrix<T>, Vector<T>>?, IModelEvaluator<T, Matrix<T>, Vector<T>>?)

Creates a new stepwise regression model.

public StepwiseRegression(StepwiseRegressionOptions<T>? options = null, PredictionStatsOptions? predictionOptions = null, IFitnessCalculator<T, Matrix<T>, Vector<T>>? fitnessCalculator = null, IRegularization<T, Matrix<T>, Vector<T>>? regularization = null, IModelEvaluator<T, Matrix<T>, Vector<T>>? modelEvaluator = null)

Parameters

options StepwiseRegressionOptions<T>

Optional configuration settings for the stepwise regression model. These settings control aspects like:

  • The stepwise method (forward selection or backward elimination)
  • The maximum and minimum number of features to include
  • The minimum improvement required to add or remove a feature

If not provided, default options will be used.

predictionOptions PredictionStatsOptions

Optional settings for prediction statistics calculation.

fitnessCalculator IFitnessCalculator<T, Matrix<T>, Vector<T>>

Optional calculator for evaluating model fitness during feature selection. If not provided, adjusted R-squared will be used as the fitness metric.

regularization IRegularization<T, Matrix<T>, Vector<T>>

Optional regularization method to prevent overfitting. If not provided, no regularization will be applied.

modelEvaluator IModelEvaluator<T, Matrix<T>, Vector<T>>

Optional evaluator for assessing model performance. If not provided, the default model evaluator will be used.

Remarks

This constructor creates a new stepwise regression model with the specified configuration options, fitness calculator, regularization method, and model evaluator. If these components are not provided, default implementations are used.

For Beginners: This method sets up your feature selection process.

Think of it like preparing for a cooking competition:

  • You decide your strategy (forward or backward selection)
  • You set limits on how many ingredients to use
  • You choose how you'll judge which ingredients to keep
  • You set up safety measures to prevent over-complicating the recipe

After setting up with these options, the model will be ready to train and discover which features are most important for your predictions.
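A minimal usage sketch, relying only on the constructor, Train, and Predict signatures shown on this page. The helper name FitAndPredict is hypothetical, and the namespace that provides Matrix<T> and Vector<T> is an assumption, since this page does not state it.

```csharp
using AiDotNet.LinearAlgebra; // assumed namespace for Matrix<T> and Vector<T>
using AiDotNet.Regression;

public static class StepwiseUsageSketch
{
    // Hypothetical helper: trains on (x, y) with default settings,
    // then predicts for the rows of newX.
    public static Vector<double> FitAndPredict(
        Matrix<double> x, Vector<double> y, Matrix<double> newX)
    {
        // Every constructor argument is optional. Omitting them uses the
        // documented defaults (for example, adjusted R-squared as the
        // fitness metric and no regularization).
        var model = new StepwiseRegression<double>();

        model.Train(x, y);          // runs the feature selection and the final fit
        return model.Predict(newX); // uses only the selected features
    }
}
```

To change the selection strategy or feature limits, pass a StepwiseRegressionOptions<T> instance as the first argument. Its property names are not listed on this page, so they are omitted here.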

Methods

CreateNewInstance()

Creates a new instance of the Stepwise Regression model with the same configuration.

protected override IFullModel<T, Matrix<T>, Vector<T>> CreateNewInstance()

Returns

IFullModel<T, Matrix<T>, Vector<T>>

A new instance of the Stepwise Regression model.

Remarks

This method creates a deep copy of the current Stepwise Regression model, including its coefficients, intercept, configuration options, selected features, fitness calculator, and model evaluator. The new instance is completely independent of the original, allowing modifications without affecting the original model.

For Beginners: This method creates an exact copy of the current regression model.

The copy includes:

  • The same coefficients (the importance values for each feature)
  • The same intercept (the starting point value)
  • The same list of selected features (the ingredients that were chosen as important)
  • The same configuration settings (like whether to use forward or backward selection)
  • The same fitness calculator (the judge that evaluates model quality)
  • The same model evaluator (the measurement tools that assess performance)

This is useful when you want to:

  • Create a backup before further training or modification
  • Create variations of the same model for different purposes
  • Share the model while keeping your original intact

Exceptions

InvalidOperationException

Thrown when the creation fails or required components are null.

Deserialize(byte[])

Deserializes the stepwise regression model from a byte array.

public override void Deserialize(byte[] data)

Parameters

data byte[]

The byte array containing the serialized model data.

Remarks

This method reconstructs the model from a byte array created by the Serialize method. It restores the model's coefficients, selected features, and configuration options, allowing a previously saved model to be loaded and used for predictions.

For Beginners: This method rebuilds a saved model from its byte array.

Think of it like opening a saved document:

  • It takes the byte array created by the Serialize method
  • It rebuilds all the settings, coefficients, and the list of selected features
  • The model is then ready to use for making predictions

This allows you to:

  • Use a previously trained model without having to train it again
  • Load models that others have shared with you
  • Use the same model across different applications
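As a sketch of loading a previously saved model, the helper below reads bytes from disk with the standard File.ReadAllBytes call and passes them to Deserialize. The helper name LoadFromFile and the choice of double for T are illustrative assumptions. The file is assumed to contain the output of an earlier Serialize call.

```csharp
using System.IO;
using AiDotNet.Regression;

public static class StepwiseLoadSketch
{
    // Hypothetical helper: restores a model from a file that holds
    // the byte array produced by Serialize().
    public static StepwiseRegression<double> LoadFromFile(string path)
    {
        byte[] data = File.ReadAllBytes(path);

        // Start from a default-constructed model, then restore its state.
        var model = new StepwiseRegression<double>();
        model.Deserialize(data);
        return model;
    }
}
```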

GetModelType()

Returns the type identifier for this regression model.

protected override ModelType GetModelType()

Returns

ModelType

The model type identifier for stepwise regression.

Remarks

This method returns the enum value that identifies this model as a stepwise regression model. This is used for model identification in serialization/deserialization and for logging purposes.

For Beginners: This method simply tells the system what kind of model this is.

It's like a name tag for the model that says "I am a stepwise regression model." This is useful when:

  • Saving the model to a file
  • Loading a model from a file
  • Logging information about the model

You generally won't need to call this method directly in your code.

Predict(Matrix<T>)

Makes predictions using only the selected features from the input matrix.

public override Vector<T> Predict(Matrix<T> input)

Parameters

input Matrix<T>

The input feature matrix to make predictions on.

Returns

Vector<T>

A vector of predicted values.

Remarks

This method filters the input matrix to only include the selected features before making predictions. This is necessary because stepwise regression selects a subset of features during training.
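The filtering step can be illustrated with plain arrays. This is a self-contained sketch of the idea, not the library's code: FeatureFilterSketch, SelectColumns, and the array-based Predict are hypothetical names, and the model is assumed to be a linear combination of the selected columns plus an intercept.

```csharp
using System.Collections.Generic;

public static class FeatureFilterSketch
{
    // Returns a copy of `input` containing only the columns listed in
    // `selected`, in the order given.
    public static double[][] SelectColumns(double[][] input, IReadOnlyList<int> selected)
    {
        var result = new double[input.Length][];
        for (int r = 0; r < input.Length; r++)
        {
            result[r] = new double[selected.Count];
            for (int j = 0; j < selected.Count; j++)
                result[r][j] = input[r][selected[j]];
        }
        return result;
    }

    // Filters the input first, then applies intercept + sum(coef[j] * x[j])
    // to each filtered row.
    public static double[] Predict(double[][] input, IReadOnlyList<int> selected,
                                   IReadOnlyList<double> coefficients, double intercept)
    {
        double[][] filtered = SelectColumns(input, selected);
        var predictions = new double[filtered.Length];
        for (int r = 0; r < filtered.Length; r++)
        {
            double sum = intercept;
            for (int j = 0; j < selected.Count; j++)
                sum += coefficients[j] * filtered[r][j];
            predictions[r] = sum;
        }
        return predictions;
    }
}
```

The practical consequence is that the input passed to Predict should have the same column layout as the training matrix, because the selected features are identified by their original column positions.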

Serialize()

Serializes the stepwise regression model to a byte array for storage or transmission.

public override byte[] Serialize()

Returns

byte[]

A byte array containing the serialized model data.

Remarks

This method converts the model, including its coefficients, selected features, and configuration options, into a byte array. This enables the model to be saved to a file, stored in a database, or transmitted over a network.

For Beginners: This method converts the model into a byte array so you can store it and use it later.

Think of it like taking a snapshot of the model:

  • It captures all the important values, settings, and the list of selected features
  • It converts them into a format that can be easily stored
  • The resulting byte array can be saved to a file or database

This is useful when you want to:

  • Train the model once and use it many times
  • Share the model with others
  • Use the model in a different application
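A short sketch of persisting a trained model to disk, using Serialize together with the standard File.WriteAllBytes call. The helper name SaveToFile and the choice of double for T are illustrative assumptions.

```csharp
using System.IO;
using AiDotNet.Regression;

public static class StepwiseSaveSketch
{
    // Hypothetical helper: writes the serialized model to the given path.
    public static void SaveToFile(StepwiseRegression<double> model, string path)
    {
        // Serialize() returns the coefficients, selected features, and
        // configuration options as a single byte array.
        byte[] data = model.Serialize();
        File.WriteAllBytes(path, data);
    }
}
```

The same byte array could equally be stored in a database column or sent over a network. Writing it to a file is only the simplest case.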

Train(Matrix<T>, Vector<T>)

Trains the stepwise regression model using the provided input features and target values.

public override void Train(Matrix<T> x, Vector<T> y)

Parameters

x Matrix<T>

The input feature matrix, where rows represent observations and columns represent features.

y Vector<T>

The target values vector, containing the actual output values that the model should learn to predict.

Remarks

This method implements the stepwise regression algorithm for feature selection. It:

  1. Validates the input data
  2. Performs either forward selection or backward elimination based on the specified method
  3. Trains a final multiple regression model using only the selected features

The result is a model that uses a subset of the original features, potentially improving both interpretability and predictive performance.

For Beginners: This method discovers which features are most important and builds your model.

The training process works like this:

  1. If using forward selection:

    • Start with an empty recipe (no features)
    • Try adding each available ingredient, one at a time
    • Keep the ingredient that improves your recipe the most
    • Repeat until adding more ingredients doesn't help much
  2. If using backward elimination:

    • Start with all ingredients in your recipe (all features)
    • Try removing each ingredient, one at a time
    • Remove the ingredient that hurts your recipe the least
    • Repeat until removing more ingredients would harm the recipe too much
  3. Finally, create a model using only the best ingredients (selected features)

This process helps you create a simpler, more efficient model that focuses only on the most important factors affecting your predictions.
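The forward selection loop described above can be sketched in a self-contained form. This is an illustration under stated assumptions, not the library's implementation: it uses plain arrays instead of Matrix<T>, ordinary least squares via the normal equations, adjusted R-squared as the fitness metric, and a hypothetical minImprovement threshold as the stopping rule. All names are invented for the example. It assumes more observations than features and a non-constant target.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ForwardSelectionSketch
{
    // Fits y on the given columns (plus an intercept) by least squares
    // and returns the adjusted R-squared of that fit.
    public static double AdjustedR2(double[][] x, double[] y, IReadOnlyList<int> cols)
    {
        int n = y.Length, p = cols.Count, d = p + 1;
        // Augmented normal equations [Z^T Z | Z^T y], where Z has a
        // leading column of ones for the intercept.
        var a = new double[d, d + 1];
        for (int r = 0; r < n; r++)
        {
            var z = new double[d];
            z[0] = 1.0;
            for (int j = 0; j < p; j++) z[j + 1] = x[r][cols[j]];
            for (int i = 0; i < d; i++)
            {
                for (int j = 0; j < d; j++) a[i, j] += z[i] * z[j];
                a[i, d] += z[i] * y[r];
            }
        }
        // Gauss-Jordan elimination with partial pivoting.
        for (int c = 0; c < d; c++)
        {
            int piv = c;
            for (int r = c + 1; r < d; r++)
                if (Math.Abs(a[r, c]) > Math.Abs(a[piv, c])) piv = r;
            if (Math.Abs(a[piv, c]) < 1e-12) return double.NegativeInfinity; // singular
            for (int j = 0; j <= d; j++) (a[c, j], a[piv, j]) = (a[piv, j], a[c, j]);
            for (int r = 0; r < d; r++)
            {
                if (r == c) continue;
                double f = a[r, c] / a[c, c];
                for (int j = c; j <= d; j++) a[r, j] -= f * a[c, j];
            }
        }
        var beta = new double[d];
        for (int i = 0; i < d; i++) beta[i] = a[i, d] / a[i, i];

        double mean = y.Average(), ssRes = 0.0, ssTot = 0.0;
        for (int r = 0; r < n; r++)
        {
            double pred = beta[0];
            for (int j = 0; j < p; j++) pred += beta[j + 1] * x[r][cols[j]];
            ssRes += (y[r] - pred) * (y[r] - pred);
            ssTot += (y[r] - mean) * (y[r] - mean);
        }
        double r2 = 1.0 - ssRes / ssTot;
        return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1);
    }

    // Greedy forward selection: repeatedly add the column that raises the
    // score the most, and stop when the gain falls below minImprovement.
    public static List<int> Select(double[][] x, double[] y, double minImprovement = 1e-3)
    {
        var selected = new List<int>();
        var remaining = Enumerable.Range(0, x[0].Length).ToList();
        double best = AdjustedR2(x, y, selected);
        while (remaining.Count > 0)
        {
            int bestCol = -1;
            double bestScore = best;
            foreach (int c in remaining)
            {
                double s = AdjustedR2(x, y, selected.Append(c).ToList());
                if (s > bestScore) { bestScore = s; bestCol = c; }
            }
            if (bestCol < 0 || bestScore - best < minImprovement) break;
            selected.Add(bestCol);
            remaining.Remove(bestCol); // removes by value, not by index
            best = bestScore;
        }
        return selected;
    }
}
```

Backward elimination mirrors this loop: start with every column selected, try removing each one in turn, drop the column whose removal costs the least, and stop when any further removal would lower the score by more than the threshold.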

Exceptions

NotSupportedException

Thrown when an unsupported stepwise method is specified in the options.