Class StepwiseRegression<T>
- Namespace
- AiDotNet.Regression
- Assembly
- AiDotNet.dll
Implements stepwise regression, which automatically selects the most relevant features for the model. This approach builds a model by adding or removing features based on their statistical significance.
public class StepwiseRegression<T> : RegressionBase<T>, IRegression<T>, IFullModel<T, Matrix<T>, Vector<T>>, IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Matrix<T>, Vector<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>, IGradientComputable<T, Matrix<T>, Vector<T>>, IJitCompilable<T>
Type Parameters
T
The numeric type used for calculations (typically float or double).
- Inheritance
- StepwiseRegression<T>
- Implements
- IRegression<T>
Remarks
Stepwise regression helps solve the feature selection problem by iteratively building a model, either by:
- Forward selection: Starting with no features and adding the most significant ones
- Backward elimination: Starting with all features and removing the least significant ones
At each step, the algorithm evaluates the impact of adding or removing features based on a fitness metric such as adjusted R-squared, AIC, BIC, or other statistical criteria.
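As an illustration of one such fitness metric, the following is a minimal standalone sketch of adjusted R-squared, computed as 1 - (1 - R²)(n - 1)/(n - p - 1) for n observations and p selected features. The class and method names (`FitnessSketch`, `AdjustedRSquared`) are hypothetical and are not part of AiDotNet; the library's own fitness calculators may differ in detail.

```csharp
using System;
using System.Linq;

// Illustrative only: not the AiDotNet implementation.
public static class FitnessSketch
{
    // Adjusted R-squared for a fit that uses `featureCount` features
    // (the intercept is not counted). Requires n > featureCount + 1.
    // A constant `actual` vector (zero total variance) is not handled here.
    public static double AdjustedRSquared(double[] actual, double[] predicted, int featureCount)
    {
        int n = actual.Length;
        if (predicted.Length != n)
            throw new ArgumentException("actual and predicted must have the same length.");
        if (n <= featureCount + 1)
            throw new ArgumentException("Need more observations than featureCount + 1.");

        double mean = actual.Average();
        double ssRes = 0.0; // residual sum of squares
        double ssTot = 0.0; // total sum of squares
        for (int i = 0; i < n; i++)
        {
            ssRes += (actual[i] - predicted[i]) * (actual[i] - predicted[i]);
            ssTot += (actual[i] - mean) * (actual[i] - mean);
        }

        double r2 = 1.0 - ssRes / ssTot;
        // Penalize R-squared for every additional feature in the model.
        return 1.0 - (1.0 - r2) * (n - 1) / (n - featureCount - 1);
    }
}
```

Because the penalty grows with the feature count, the same predictions score lower when they are attributed to a larger feature set, which is what lets the metric reject features that add little.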
For Beginners: Stepwise regression is like a smart shopping assistant that helps you pick only the most useful ingredients for a recipe.
Think of it like this:
- You have many potential ingredients (features) that might affect the outcome
- Using every ingredient could make the recipe complicated or less tasty
- Stepwise regression tests each ingredient to see how much it improves the recipe
- It keeps only the ingredients that make a significant difference to the final result
For example, when predicting house prices, you might have data on square footage, number of bedrooms, location, age, etc. Stepwise regression would determine which of these features are most important for accurate predictions and discard the rest.
Constructors
StepwiseRegression(StepwiseRegressionOptions<T>?, PredictionStatsOptions?, IFitnessCalculator<T, Matrix<T>, Vector<T>>?, IRegularization<T, Matrix<T>, Vector<T>>?, IModelEvaluator<T, Matrix<T>, Vector<T>>?)
Creates a new stepwise regression model.
public StepwiseRegression(StepwiseRegressionOptions<T>? options = null, PredictionStatsOptions? predictionOptions = null, IFitnessCalculator<T, Matrix<T>, Vector<T>>? fitnessCalculator = null, IRegularization<T, Matrix<T>, Vector<T>>? regularization = null, IModelEvaluator<T, Matrix<T>, Vector<T>>? modelEvaluator = null)
Parameters
options StepwiseRegressionOptions<T>
Optional configuration settings for the stepwise regression model. These settings control aspects like:
- The stepwise method (forward selection or backward elimination)
- The maximum and minimum number of features to include
- The minimum improvement required to add or remove a feature
If not provided, default options will be used.
predictionOptions PredictionStatsOptions
Optional settings for prediction statistics calculation.
fitnessCalculator IFitnessCalculator<T, Matrix<T>, Vector<T>>
Optional calculator for evaluating model fitness during feature selection. If not provided, adjusted R-squared will be used as the fitness metric.
regularization IRegularization<T, Matrix<T>, Vector<T>>
Optional regularization method to prevent overfitting. If not provided, no regularization will be applied.
modelEvaluator IModelEvaluator<T, Matrix<T>, Vector<T>>
Optional evaluator for assessing model performance. If not provided, the default model evaluator will be used.
Remarks
This constructor creates a new stepwise regression model with the specified configuration options, fitness calculator, regularization method, and model evaluator. If these components are not provided, default implementations are used.
For Beginners: This method sets up your feature selection process.
Think of it like preparing for a cooking competition:
- You decide your strategy (forward or backward selection)
- You set limits on how many ingredients to use
- You choose how you'll judge which ingredients to keep
- You set up safety measures to prevent over-complicating the recipe
After setting up with these options, the model will be ready to train and discover which features are most important for your predictions.
Methods
CreateNewInstance()
Creates a new instance of the Stepwise Regression model with the same configuration.
protected override IFullModel<T, Matrix<T>, Vector<T>> CreateNewInstance()
Returns
- IFullModel<T, Matrix<T>, Vector<T>>
A new instance of the Stepwise Regression model.
Remarks
This method creates a deep copy of the current Stepwise Regression model, including its coefficients, intercept, configuration options, selected features, fitness calculator, and model evaluator. The new instance is completely independent of the original, allowing modifications without affecting the original model.
For Beginners: This method creates an exact copy of the current regression model.
The copy includes:
- The same coefficients (the importance values for each feature)
- The same intercept (the starting point value)
- The same list of selected features (the ingredients that were chosen as important)
- The same configuration settings (like whether to use forward or backward selection)
- The same fitness calculator (the judge that evaluates model quality)
- The same model evaluator (the measurement tools that assess performance)
This is useful when you want to:
- Create a backup before further training or modification
- Create variations of the same model for different purposes
- Share the model while keeping your original intact
Exceptions
- InvalidOperationException
Thrown when the creation fails or required components are null.
Deserialize(byte[])
Deserializes the stepwise regression model from a byte array.
public override void Deserialize(byte[] data)
Parameters
data byte[]
The byte array containing the serialized model data.
Remarks
This method reconstructs the model from a byte array created by the Serialize method. It restores the model's coefficients, selected features, and configuration options, allowing a previously saved model to be loaded and used for predictions.
For Beginners: This method rebuilds a saved model from its stored bytes.
Think of it like opening a saved document:
- It takes the byte array created by the Serialize method
- It rebuilds all the settings, coefficients, and the list of selected features
- The model is then ready to use for making predictions
This allows you to:
- Use a previously trained model without having to train it again
- Load models that others have shared with you
- Use the same model across different applications
GetModelType()
Returns the type identifier for this regression model.
protected override ModelType GetModelType()
Returns
- ModelType
The model type identifier for stepwise regression.
Remarks
This method returns the enum value that identifies this model as a stepwise regression model. This is used for model identification in serialization/deserialization and for logging purposes.
For Beginners: This method simply tells the system what kind of model this is.
It's like a name tag for the model that says "I am a stepwise regression model." This is useful when:
- Saving the model to a file
- Loading a model from a file
- Logging information about the model
You generally won't need to call this method directly in your code.
Predict(Matrix<T>)
Makes predictions using only the selected features from the input matrix.
public override Vector<T> Predict(Matrix<T> input)
Parameters
input Matrix<T>
The input feature matrix to make predictions on.
Returns
- Vector<T>
A vector of predicted values.
Remarks
This method filters the input matrix to only include the selected features before making predictions. This is necessary because stepwise regression selects a subset of features during training.
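The filtering step can be pictured with the standalone sketch below. It uses plain `double[][]` arrays rather than the library's `Matrix<T>`, and every name in it (`PredictSketch`, `SelectColumns`, the parameter names) is hypothetical. It only illustrates the idea of dropping unselected columns before applying a linear model; it is not the library's code.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: not the AiDotNet implementation.
public static class PredictSketch
{
    // Keeps only the columns listed in `selected`, in that order.
    public static double[][] SelectColumns(double[][] input, IReadOnlyList<int> selected)
    {
        var result = new double[input.Length][];
        for (int r = 0; r < input.Length; r++)
        {
            result[r] = new double[selected.Count];
            for (int k = 0; k < selected.Count; k++)
                result[r][k] = input[r][selected[k]];
        }
        return result;
    }

    // Linear prediction on the filtered columns: intercept + sum(coefficient * value).
    public static double[] Predict(double[][] input, IReadOnlyList<int> selected,
                                   double[] coefficients, double intercept)
    {
        if (coefficients.Length != selected.Count)
            throw new ArgumentException("One coefficient is required per selected feature.");

        double[][] filtered = SelectColumns(input, selected);
        var predictions = new double[filtered.Length];
        for (int r = 0; r < filtered.Length; r++)
        {
            double sum = intercept;
            for (int k = 0; k < coefficients.Length; k++)
                sum += coefficients[k] * filtered[r][k];
            predictions[r] = sum;
        }
        return predictions;
    }
}
```

In this sketch the caller still passes the full-width input; the coefficients line up with the selected columns, not with the original column positions.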
Serialize()
Serializes the stepwise regression model to a byte array for storage or transmission.
public override byte[] Serialize()
Returns
- byte[]
A byte array containing the serialized model data.
Remarks
This method converts the model, including its coefficients, selected features, and configuration options, into a byte array. This enables the model to be saved to a file, stored in a database, or transmitted over a network.
For Beginners: This method converts the model into bytes that you can store and use later.
Think of it like taking a snapshot of the model:
- It captures all the important values, settings, and the list of selected features
- It converts them into a format that can be easily stored
- The resulting byte array can be saved to a file or database
This is useful when you want to:
- Train the model once and use it many times
- Share the model with others
- Use the model in a different application
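The byte layout that Serialize produces is not described on this page. Purely to illustrate the round trip between Serialize and Deserialize, here is a standalone sketch with a hypothetical `TinyModel` class and an invented layout (intercept, then coefficients, then selected feature indices). It should not be read as the format AiDotNet uses.

```csharp
using System;
using System.IO;

// Illustrative only: a hypothetical model with an invented byte layout.
public sealed class TinyModel
{
    public double Intercept;
    public double[] Coefficients = Array.Empty<double>();
    public int[] SelectedFeatures = Array.Empty<int>();

    // Writes the intercept, then each array prefixed by its length.
    public byte[] Serialize()
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(Intercept);
            writer.Write(Coefficients.Length);
            foreach (double c in Coefficients) writer.Write(c);
            writer.Write(SelectedFeatures.Length);
            foreach (int f in SelectedFeatures) writer.Write(f);
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Reads the fields back in exactly the order Serialize wrote them.
    public void Deserialize(byte[] data)
    {
        using (var stream = new MemoryStream(data))
        using (var reader = new BinaryReader(stream))
        {
            Intercept = reader.ReadDouble();
            Coefficients = new double[reader.ReadInt32()];
            for (int i = 0; i < Coefficients.Length; i++)
                Coefficients[i] = reader.ReadDouble();
            SelectedFeatures = new int[reader.ReadInt32()];
            for (int i = 0; i < SelectedFeatures.Length; i++)
                SelectedFeatures[i] = reader.ReadInt32();
        }
    }
}
```

The point of the sketch is that the selected feature indices travel with the coefficients, so a restored model can filter its input the same way the trained one did.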
Train(Matrix<T>, Vector<T>)
Trains the stepwise regression model using the provided input features and target values.
public override void Train(Matrix<T> x, Vector<T> y)
Parameters
x Matrix<T>
The input feature matrix, where rows represent observations and columns represent features.
y Vector<T>
The target values vector, containing the actual output values that the model should learn to predict.
Remarks
This method implements the stepwise regression algorithm for feature selection. It:
1. Validates the input data
2. Performs either forward selection or backward elimination based on the specified method
3. Trains a final multiple regression model using only the selected features
The result is a model that uses a subset of the original features, potentially improving both interpretability and predictive performance.
For Beginners: This method discovers which features are most important and builds your model.
The training process works like this:
If using forward selection:
- Start with an empty recipe (no features)
- Try adding each available ingredient, one at a time
- Keep the ingredient that improves your recipe the most
- Repeat until adding more ingredients doesn't help much
If using backward elimination:
- Start with all ingredients in your recipe (all features)
- Try removing each ingredient, one at a time
- Remove the ingredient that hurts your recipe the least
- Repeat until removing more ingredients would harm the recipe too much
Finally, create a model using only the best ingredients (selected features).
This process helps you create a simpler, more efficient model that focuses only on the most important factors affecting your predictions.
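The forward selection loop described above can be sketched as follows. This is a minimal standalone illustration and not the library's implementation: it works on plain `double[][]` arrays, fits ordinary least squares through the normal equations, scores each candidate with adjusted R-squared, and stops when the best candidate improves the score by less than `minImprovement`. All names (`ForwardSelectionSketch`, `Select`, `Score`, `maxFeatures`, `minImprovement`) are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: not the AiDotNet implementation.
public static class ForwardSelectionSketch
{
    // Fits y ~ intercept + x[:, cols] via the normal equations and returns
    // the adjusted R-squared of that fit (negative infinity if it cannot be fitted).
    // A constant y (zero total variance) is not handled here.
    static double Score(double[][] x, double[] y, List<int> cols)
    {
        int n = y.Length, p = cols.Count, m = p + 1;
        if (n <= p + 1) return double.NegativeInfinity;
        var a = new double[m, m + 1]; // augmented matrix [X'X | X'y]
        for (int r = 0; r < n; r++)
        {
            var row = new double[m];
            row[0] = 1.0; // intercept column
            for (int k = 0; k < p; k++) row[k + 1] = x[r][cols[k]];
            for (int i = 0; i < m; i++)
            {
                for (int j = 0; j < m; j++) a[i, j] += row[i] * row[j];
                a[i, m] += row[i] * y[r];
            }
        }
        // Gauss-Jordan elimination with partial pivoting.
        for (int c = 0; c < m; c++)
        {
            int pivot = c;
            for (int r = c + 1; r < m; r++)
                if (Math.Abs(a[r, c]) > Math.Abs(a[pivot, c])) pivot = r;
            if (Math.Abs(a[pivot, c]) < 1e-12) return double.NegativeInfinity; // collinear features
            for (int j = 0; j <= m; j++) (a[c, j], a[pivot, j]) = (a[pivot, j], a[c, j]);
            for (int r = 0; r < m; r++)
            {
                if (r == c) continue;
                double f = a[r, c] / a[c, c];
                for (int j = c; j <= m; j++) a[r, j] -= f * a[c, j];
            }
        }
        // After elimination, coefficient i equals a[i, m] / a[i, i].
        double mean = y.Average(), ssRes = 0.0, ssTot = 0.0;
        for (int r = 0; r < n; r++)
        {
            double pred = a[0, m] / a[0, 0];
            for (int k = 0; k < p; k++)
                pred += a[k + 1, m] / a[k + 1, k + 1] * x[r][cols[k]];
            ssRes += (y[r] - pred) * (y[r] - pred);
            ssTot += (y[r] - mean) * (y[r] - mean);
        }
        double r2 = 1.0 - ssRes / ssTot;
        return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1);
    }

    // Greedily adds the feature that raises the score the most, until the
    // gain drops below minImprovement or maxFeatures is reached.
    public static List<int> Select(double[][] x, double[] y, int maxFeatures, double minImprovement)
    {
        var selected = new List<int>();
        var remaining = Enumerable.Range(0, x[0].Length).ToList();
        double current = Score(x, y, selected); // intercept-only baseline
        while (remaining.Count > 0 && selected.Count < maxFeatures)
        {
            int best = -1;
            double bestScore = double.NegativeInfinity;
            foreach (int f in remaining)
            {
                double s = Score(x, y, selected.Append(f).ToList());
                if (s > bestScore) { bestScore = s; best = f; }
            }
            if (best < 0 || bestScore - current < minImprovement) break;
            selected.Add(best);
            remaining.Remove(best);
            current = bestScore;
        }
        return selected;
    }
}
```

Backward elimination follows the same pattern in reverse: start with every column selected and, at each step, drop the feature whose removal costs the least score. Solving the normal equations directly keeps the sketch short; a production implementation would more likely use a numerically stabler decomposition.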
Exceptions
- NotSupportedException
Thrown when an unsupported stepwise method is specified in the options.