Class RegressionBase<T>
- Namespace
- AiDotNet.Regression
- Assembly
- AiDotNet.dll
Provides a base implementation for regression algorithms that model the relationship between a dependent variable and one or more independent variables.
public abstract class RegressionBase<T> : IRegression<T>, IFullModel<T, Matrix<T>, Vector<T>>, IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Matrix<T>, Vector<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>, IGradientComputable<T, Matrix<T>, Vector<T>>, IJitCompilable<T>
Type Parameters
- T
The numeric data type used for calculations (e.g., float, double).
- Inheritance
RegressionBase<T>
- Implements
IRegression<T>
Remarks
This abstract class implements common functionality for regression models, including prediction, serialization/deserialization, and solving linear systems. Specific regression algorithms should inherit from this class and implement the Train method.
The class supports various options like regularization to prevent overfitting and different decomposition methods for solving linear systems.
For Beginners: Regression is a statistical method for modeling the relationship between variables. This base class provides the foundation for different regression techniques, handling common operations like making predictions and saving/loading models. Think of it as a template that specific regression algorithms can customize while reusing the shared functionality.
Constructors
RegressionBase(RegressionOptions<T>?, IRegularization<T, Matrix<T>, Vector<T>>?, ILossFunction<T>?)
Initializes a new instance of the RegressionBase class with the specified options, regularization, and loss function.
protected RegressionBase(RegressionOptions<T>? options = null, IRegularization<T, Matrix<T>, Vector<T>>? regularization = null, ILossFunction<T>? lossFunction = null)
Parameters
- options RegressionOptions<T>
Configuration options for the regression model. If null, default options will be used.
- regularization IRegularization<T, Matrix<T>, Vector<T>>
Regularization method to prevent overfitting. If null, no regularization will be applied.
- lossFunction ILossFunction<T>
Loss function for gradient computation. If null, defaults to Mean Squared Error.
Remarks
The constructor initializes the model with either the provided options or default settings.
For Beginners: This constructor sets up the regression model with your specified settings or uses default settings if none are provided. Regularization is an optional technique to prevent the model from becoming too complex and overfitting to the training data. The loss function determines how prediction errors are measured during training.
Properties
Coefficients
Gets or sets the coefficients (weights) of the regression model.
public Vector<T> Coefficients { get; protected set; }
Property Value
- Vector<T>
A vector of coefficients, one for each feature.
DefaultLossFunction
Gets the default loss function used by this model for gradient computation.
public virtual ILossFunction<T> DefaultLossFunction { get; }
Property Value
- ILossFunction<T>
Remarks
This loss function is used when calling ComputeGradients(TInput, TOutput, ILossFunction<T>?) without explicitly providing a loss function. It represents the model's primary training objective.
For Beginners: The loss function tells the model "what counts as a mistake". For example:
- For regression (predicting numbers): Mean Squared Error measures how far predictions are from actual values
- For classification (predicting categories): Cross Entropy measures how confident the model is in the right category
This property provides a sensible default so you don't have to specify the loss function every time, but you can still override it if needed for special cases.
Distributed Training: In distributed training, all workers use the same loss function to ensure consistent gradient computation. The default loss function is automatically used when workers compute local gradients.
Exceptions
- InvalidOperationException
Thrown if accessed before the model has been configured with a loss function.
Engine
Gets the global execution engine for vector operations.
protected IEngine Engine { get; }
Property Value
- IEngine
Remarks
This property provides access to the execution engine (CPU or GPU) for performing vectorized operations. The engine is determined by the global AiDotNetEngine configuration and allows automatic fallback from GPU to CPU when GPU is not available.
For Beginners: This gives access to either CPU or GPU processing for faster computations. The system automatically chooses the best available option and falls back to CPU if GPU acceleration is not available.
ExpectedParameterCount
Gets the expected number of parameters (coefficients plus intercept if used).
protected int ExpectedParameterCount { get; }
Property Value
- int
The total number of parameters, which equals the number of coefficients plus 1 if an intercept is used, or just the number of coefficients otherwise.
FeatureNames
Gets or sets the feature names.
public string[]? FeatureNames { get; set; }
Property Value
- string[]
An array of feature names. If not set, feature indices will be used as names.
HasIntercept
Gets a value indicating whether the model includes an intercept term.
public bool HasIntercept { get; }
Property Value
- bool
True if the model includes an intercept; otherwise, false.
Intercept
Gets or sets the intercept (bias) term of the regression model.
public T Intercept { get; protected set; }
Property Value
- T
The intercept value.
NumOps
Gets the numeric operations for the specified type T.
protected INumericOperations<T> NumOps { get; }
Property Value
- INumericOperations<T>
An object that provides mathematical operations for type T.
Options
Gets the regression options.
protected RegressionOptions<T> Options { get; }
Property Value
- RegressionOptions<T>
Configuration options for the regression model.
ParameterCount
Gets the number of parameters in the model.
public virtual int ParameterCount { get; }
Property Value
- int
Remarks
This property returns the total count of trainable parameters in the model. It's useful for understanding model complexity and memory requirements.
Regularization
Gets the regularization method used to prevent overfitting.
protected IRegularization<T, Matrix<T>, Vector<T>> Regularization { get; }
Property Value
- IRegularization<T, Matrix<T>, Vector<T>>
An object that implements regularization for the regression model.
SupportsJitCompilation
Gets whether this model currently supports JIT compilation.
public virtual bool SupportsJitCompilation { get; }
Property Value
- bool
True if the model can be JIT compiled, false otherwise.
Remarks
Some models may not support JIT compilation due to:
- Dynamic graph structure (changes based on input)
- Lack of computation graph representation
- Use of operations not yet supported by the JIT compiler
For Beginners: This tells you whether this specific model can benefit from JIT compilation.
Models return false if they:
- Use layer-based architecture without graph export (e.g., current neural networks)
- Have control flow that changes based on input data
- Use operations the JIT compiler doesn't understand yet
In these cases, the model will still work normally, just without JIT acceleration.
Methods
ApplyGradients(Vector<T>, T)
Applies pre-computed gradients to update the model parameters.
public virtual void ApplyGradients(Vector<T> gradients, T learningRate)
Parameters
- gradients Vector<T>
The gradient vector to apply.
- learningRate T
The learning rate for the update.
Remarks
Updates parameters using: θ = θ - learningRate * gradients
For Beginners: After computing gradients (seeing which direction to move), this method actually moves the model in that direction. The learning rate controls how big a step to take.
Distributed Training: In DDP/ZeRO-2, this applies the synchronized (averaged) gradients after communication across workers. Each worker applies the same averaged gradients to keep parameters consistent.
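The update rule above can be sketched on plain arrays. This is a minimal illustration and not AiDotNet's implementation: `GradientStepSketch` and `Step` are hypothetical names, `double[]` stands in for `Vector<T>`, and the sketch returns a new array whereas the documented method updates the model's own parameters.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class GradientStepSketch
{
    // Returns theta - learningRate * gradients, computed element by element.
    public static double[] Step(double[] theta, double[] gradients, double learningRate)
    {
        if (theta.Length != gradients.Length)
            throw new ArgumentException("Parameter and gradient lengths must match.");

        var updated = new double[theta.Length];
        for (int i = 0; i < theta.Length; i++)
            updated[i] = theta[i] - learningRate * gradients[i];
        return updated;
    }
}
```

A smaller learning rate shrinks every component of the step by the same factor, which is why it is usually the first value to tune when training diverges.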
CalculateFeatureImportances()
Calculates the importance of each feature in the model.
protected virtual Vector<T> CalculateFeatureImportances()
Returns
- Vector<T>
A vector of feature importances.
Remarks
This method calculates feature importances based on the absolute values of the coefficients. Derived classes may override this method to provide more sophisticated feature importance calculations.
For Beginners: Feature importance tells you which input variables have the most influence on the predictions. In basic regression models, this is calculated from the absolute values of the coefficients - larger coefficients (ignoring sign) indicate more important features.
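As a rough sketch of the coefficient-magnitude idea described above, assuming plain `double[]` in place of `Vector<T>` (the names `ImportanceSketch` and `FromCoefficients` are hypothetical and not part of the library):

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of AiDotNet.
public static class ImportanceSketch
{
    // One importance score per feature: the absolute value of its coefficient.
    public static double[] FromCoefficients(double[] coefficients)
        => coefficients.Select(c => Math.Abs(c)).ToArray();
}
```

For coefficients `{ -2.0, 0.5, 0.0 }` this yields `{ 2.0, 0.5, 0.0 }`, so the first feature ranks highest even though its coefficient is negative.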
Clone()
Creates a clone of the regression model.
public virtual IFullModel<T, Matrix<T>, Vector<T>> Clone()
Returns
- IFullModel<T, Matrix<T>, Vector<T>>
A new instance of the model with the same parameters and options.
Remarks
This method creates a new instance of the regression model with the same parameters and configuration options as the current instance. Derived classes should override this method to provide proper cloning behavior specific to their implementation.
For Beginners: This method creates an exact independent copy of your model.
Cloning a model means creating a new model that's exactly the same as the original, including all its learned parameters and settings. However, the clone is independent - changes to one model won't affect the other.
Think of it like photocopying a document - the copy has all the same information, but you can mark up the copy without changing the original.
Note: Specific regression algorithms will customize this method to ensure all their unique properties are properly copied.
ComputeGradients(Matrix<T>, Vector<T>, ILossFunction<T>?)
Computes gradients of the loss function with respect to model parameters for the given data, WITHOUT updating the model parameters.
public virtual Vector<T> ComputeGradients(Matrix<T> input, Vector<T> target, ILossFunction<T>? lossFunction = null)
Parameters
- input Matrix<T>
The input data.
- target Vector<T>
The target/expected output.
- lossFunction ILossFunction<T>
The loss function to use for gradient computation. If null, uses the model's default loss function.
Returns
- Vector<T>
A vector containing gradients with respect to all model parameters.
Remarks
This method performs a forward pass, computes the loss, and back-propagates to compute gradients, but does NOT update the model's parameters. The parameters remain unchanged after this call.
Distributed Training: In DDP/ZeRO-2, each worker calls this to compute local gradients on its data batch. These gradients are then synchronized (averaged) across workers before applying updates. This ensures all workers compute the same parameter updates despite having different data.
For Meta-Learning: After adapting a model on a support set, you can use this method to compute gradients on the query set. These gradients become the meta-gradients for updating the meta-parameters.
For Beginners: Think of this as "dry run" training:
- The model sees what direction it should move (the gradients)
- But it doesn't actually move (parameters stay the same)
- You get to decide what to do with this information (average with others, inspect, modify, etc.)
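To make the "compute but do not update" behavior concrete, here is a minimal sketch for a linear model under Mean Squared Error. It is an assumption-laden illustration: `MseGradientSketch` is a hypothetical name, plain arrays replace `Matrix<T>` and `Vector<T>`, and the 2/n scaling and the coefficients-then-intercept ordering are choices made for this example and may differ from what the library produces.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class MseGradientSketch
{
    // Gradient of (1/n) * sum((prediction - y)^2) for prediction = x . w + b.
    // Returns one entry per coefficient, followed by the intercept gradient.
    // The arguments are only read; w and b are left unchanged.
    public static double[] Compute(double[][] x, double[] y, double[] w, double b)
    {
        int n = x.Length;
        var grad = new double[w.Length + 1];
        for (int i = 0; i < n; i++)
        {
            double prediction = b;
            for (int j = 0; j < w.Length; j++)
                prediction += x[i][j] * w[j];

            double residual = prediction - y[i];
            for (int j = 0; j < w.Length; j++)
                grad[j] += 2.0 * residual * x[i][j] / n;
            grad[w.Length] += 2.0 * residual / n;
        }
        return grad;
    }
}
```

Because the function has no side effects, the caller is free to average the result with gradients from other workers before applying any update.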
Exceptions
- InvalidOperationException
If lossFunction is null and the model has no default loss function.
CreateNewInstance()
Creates a new instance of the same type as this regression model.
protected abstract IFullModel<T, Matrix<T>, Vector<T>> CreateNewInstance()
Returns
- IFullModel<T, Matrix<T>, Vector<T>>
A new instance of the same regression model type.
Remarks
For Beginners: This creates a blank version of the same type of regression model.
It's used internally by methods like DeepCopy and Clone to create the right type of model before copying the data into it.
DeepCopy()
Creates a deep copy of the regression model.
public virtual IFullModel<T, Matrix<T>, Vector<T>> DeepCopy()
Returns
- IFullModel<T, Matrix<T>, Vector<T>>
A new instance of the model with the same parameters and options.
Remarks
This method creates a new instance of the regression model with the same parameters and configuration options as the current instance.
For Beginners: This method creates an exact independent copy of your model.
The copy has the same:
- Coefficients (weights for each feature)
- Intercept (base prediction value)
- Configuration options (like regularization settings)
But it's completely separate from the original model - changes to one won't affect the other.
This is useful when you want to:
- Experiment with modifying a model without affecting the original
- Create multiple similar models to use in different contexts
- Save a "checkpoint" of your model before making changes
Deserialize(byte[])
Deserializes the model from a byte array.
public virtual void Deserialize(byte[] modelData)
Parameters
- modelData byte[]
The byte array containing the serialized model data.
Remarks
This method reconstructs the model's parameters from a serialized byte array, including coefficients, intercept, and regularization options.
For Beginners: Deserialization is the opposite of serialization - it takes the saved model data and reconstructs the model's internal state. This allows you to load a previously trained model and use it to make predictions without having to retrain it. It's like loading a saved game to continue where you left off.
Exceptions
- InvalidOperationException
Thrown when deserialization fails.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the model's computation graph for JIT compilation.
public virtual ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
- inputNodes List<ComputationNode<T>>
List to populate with input computation nodes (parameters).
Returns
- ComputationNode<T>
The output computation node representing the model's prediction.
Remarks
This method should construct a computation graph representing the model's forward pass. The graph should use placeholder input nodes that will be filled with actual data during execution.
For Beginners: This method creates a "recipe" of your model's calculations that the JIT compiler can optimize.
The method should:
- Create placeholder nodes for inputs (features, parameters)
- Build the computation graph using TensorOperations
- Return the final output node
- Add all input nodes to the inputNodes list (in order)
Example for a simple linear model (y = Wx + b):
public ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
{
    // Create placeholder inputs
    var x = TensorOperations<T>.Variable(new Tensor<T>(InputShape), "x");
    var W = TensorOperations<T>.Variable(Weights, "W");
    var b = TensorOperations<T>.Variable(Bias, "b");

    // Add inputs in order
    inputNodes.Add(x);
    inputNodes.Add(W);
    inputNodes.Add(b);

    // Build graph: y = Wx + b
    var matmul = TensorOperations<T>.MatMul(x, W);
    var output = TensorOperations<T>.Add(matmul, b);
    return output;
}
The JIT compiler will then:
- Optimize the graph (fuse operations, eliminate dead code)
- Compile it to fast native code
- Cache the compiled version for reuse
GetActiveFeatureIndices()
Gets the indices of features that are actively used in the model.
public virtual IEnumerable<int> GetActiveFeatureIndices()
Returns
- IEnumerable<int>
An enumerable collection of indices for features with non-zero coefficients.
Remarks
This method identifies which features are actually contributing to the model's predictions by returning the indices of all features with non-zero coefficients.
For Beginners: This method tells you which input features actually matter in the model.
Not all features necessarily contribute to predictions. Some might have coefficients of zero, meaning they're effectively ignored by the model. This method returns the positions (indices) of features that do have an effect on predictions.
For example, if your model has 10 features but only features at positions 2, 5, and 7 have non-zero coefficients, this method would return [2, 5, 7].
This is useful for:
- Feature selection (identifying which features are most important)
- Model simplification (removing unused features)
- Understanding which inputs actually affect the prediction
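The example above (features 2, 5, and 7 out of 10) can be reproduced with a short sketch. The name `ActiveFeatureSketch` is hypothetical, `double[]` stands in for the coefficient vector, and the exact comparison against zero is an assumption; the library may apply its own numeric comparison for type T.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of AiDotNet.
public static class ActiveFeatureSketch
{
    // Yields the index of every coefficient that is not exactly zero.
    public static IEnumerable<int> ActiveIndices(double[] coefficients)
        => Enumerable.Range(0, coefficients.Length)
                     .Where(i => coefficients[i] != 0.0);
}
```

Checking a single index, as IsFeatureUsed(int) does, amounts to the same test applied to one position.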
GetFeatureImportance()
Gets the feature importance scores as a dictionary.
public virtual Dictionary<string, T> GetFeatureImportance()
Returns
- Dictionary<string, T>
A dictionary mapping feature names to their importance scores.
Remarks
This method returns feature importance scores based on the absolute values of coefficients. If feature names are not available, it uses indices as names (e.g., "Feature_0", "Feature_1").
For Beginners: This method tells you which features are most important.
It returns a dictionary where:
- Keys are feature names (or "Feature_0", "Feature_1", etc. if names aren't set)
- Values are importance scores (higher means more important)
In regression models, importance is typically based on the absolute value of coefficients.
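A minimal sketch of how such a dictionary could be assembled, assuming plain arrays and the "Feature_i" fallback naming described above (`NamedImportanceSketch` and `Build` are hypothetical names, not library members):

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of AiDotNet.
public static class NamedImportanceSketch
{
    // Maps each feature name to |coefficient|. When no name is available
    // for an index, the key falls back to "Feature_<index>".
    public static Dictionary<string, double> Build(double[] coefficients, string[]? featureNames)
    {
        var result = new Dictionary<string, double>();
        for (int i = 0; i < coefficients.Length; i++)
        {
            string name = featureNames != null && i < featureNames.Length
                ? featureNames[i]
                : $"Feature_{i}";
            result[name] = Math.Abs(coefficients[i]);
        }
        return result;
    }
}
```

Calling `Build(new[] { -3.0, 1.5 }, null)` gives the keys "Feature_0" and "Feature_1"; passing `new[] { "age", "income" }` as the second argument gives "age" and "income" instead.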
GetModelMetadata()
Gets metadata about the model.
public virtual ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
A ModelMetadata object containing information about the model.
Remarks
This method returns metadata about the model, including its type, feature count, complexity, description, and additional information like coefficient norm and feature importances.
For Beginners: Model metadata provides information about the model itself, rather than the predictions it makes. This includes details about the model's structure (like how many features it uses) and characteristics (like which features are most important). This information can be useful for understanding and comparing different models.
GetModelType()
Gets the type of the model.
protected abstract ModelType GetModelType()
Returns
- ModelType
The model type identifier.
Remarks
This abstract method must be implemented by derived classes to specify the model type.
For Beginners: This method simply returns an identifier that indicates what type of regression model this is (e.g., linear regression, ridge regression). It's used internally by the library to keep track of different types of models.
GetParameters()
Gets all model parameters (coefficients and intercept) as a single vector.
public virtual Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all model parameters.
Remarks
This method returns a vector containing all model parameters (coefficients followed by intercept) for use with optimization algorithms or model comparison.
For Beginners: This method packages all the model's parameters into a single collection.
Think of the parameters as the "recipe" for your model's predictions:
- The coefficients represent how much each feature contributes to the prediction
- The intercept is the baseline prediction when all features are zero
Getting all parameters at once allows tools to optimize the model or compare different models. For example, an optimization algorithm might try different combinations of parameters to find the ones that give the most accurate predictions.
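The "coefficients followed by intercept" layout can be sketched as follows. `ParameterPackingSketch` is a hypothetical name and `double[]` replaces `Vector<T>`; omitting the intercept slot when the model has no intercept is an assumption based on the description of ExpectedParameterCount.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class ParameterPackingSketch
{
    // Packs the coefficients first and, if the model has an intercept,
    // appends the intercept as the last element.
    public static double[] Pack(double[] coefficients, double intercept, bool hasIntercept)
    {
        int count = coefficients.Length + (hasIntercept ? 1 : 0);
        var parameters = new double[count];
        Array.Copy(coefficients, parameters, coefficients.Length);
        if (hasIntercept)
            parameters[count - 1] = intercept;
        return parameters;
    }
}
```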
IsFeatureUsed(int)
Determines whether a specific feature is used in the model.
public virtual bool IsFeatureUsed(int featureIndex)
Parameters
- featureIndex int
The zero-based index of the feature to check.
Returns
- bool
True if the feature has a non-zero coefficient; otherwise, false.
Remarks
This method checks whether a specific feature is actively contributing to the model's predictions by verifying if its corresponding coefficient is non-zero.
For Beginners: This method checks if a specific input feature affects the model's predictions.
You provide the position (index) of a feature, and the method tells you whether that feature is actually used in making predictions. A feature is considered "used" if its coefficient is not zero.
For example, if feature #3 has a coefficient of 0, this method would return false because that feature doesn't affect the model's output.
This is useful when you want to check a specific feature's importance rather than getting all important features at once.
Exceptions
- ArgumentOutOfRangeException
Thrown when the feature index is outside the valid range.
LoadModel(string)
Loads a regression model from a file.
public virtual void LoadModel(string filePath)
Parameters
- filePath string
The path to the file containing the saved model.
Remarks
This method loads the complete state of the regression model from a file, including coefficients, intercept, and all configuration options.
For Beginners: This loads a previously trained model from a file.
It's like loading a saved recipe:
- It restores all the model's learned parameters
- It restores the configuration settings
- The model is immediately ready to make predictions
This allows you to:
- Reuse models without retraining
- Share models with others
- Deploy models to production environments
LoadState(Stream)
Loads the model's state from a stream.
public virtual void LoadState(Stream stream)
Parameters
- stream Stream
The stream to read the model state from.
Predict(Matrix<T>)
Makes predictions for the given input data.
public virtual Vector<T> Predict(Matrix<T> input)
Parameters
- input Matrix<T>
The input features matrix where each row is an example and each column is a feature.
Returns
- Vector<T>
A vector of predicted values for each input example.
Remarks
This method calculates predictions by multiplying the input features by the model coefficients and adding the intercept if one is used.
For Beginners: After training, this method is used to make predictions on new data. It applies the learned coefficients to the input features and adds the intercept (if used) to produce the final prediction. For linear regression, this is simply the dot product of the features and coefficients plus the intercept.
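The computation described above (one dot product per row, plus the intercept if used) looks roughly like this on plain arrays. `PredictSketch` is a hypothetical name and the jagged array stands in for `Matrix<T>`; this is an illustration of the arithmetic only.

```csharp
// Hypothetical helper, not part of AiDotNet.
public static class PredictSketch
{
    // For each row: sum of (feature * coefficient), plus the intercept when used.
    public static double[] Predict(double[][] rows, double[] coefficients,
                                   double intercept, bool hasIntercept)
    {
        var predictions = new double[rows.Length];
        for (int i = 0; i < rows.Length; i++)
        {
            double sum = hasIntercept ? intercept : 0.0;
            for (int j = 0; j < coefficients.Length; j++)
                sum += rows[i][j] * coefficients[j];
            predictions[i] = sum;
        }
        return predictions;
    }
}
```

With rows `{1, 2}` and `{3, 4}`, coefficients `{0.5, -1}`, and intercept 2, the predictions are 0.5 and -0.5.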
SaveModel(string)
Saves the regression model to a file.
public virtual void SaveModel(string filePath)
Parameters
- filePath string
The path where the model should be saved.
Remarks
This method saves the complete state of the regression model, including coefficients, intercept, and all configuration options, to a file.
For Beginners: This saves your trained model to a file so you can use it later.
Think of it like saving a recipe:
- It captures all the model's learned parameters (coefficients and intercept)
- It saves the configuration settings used to train the model
- You can load it later to make predictions without retraining
This is useful for:
- Deploying models to production
- Sharing models with others
- Avoiding the need to retrain on the same data
SaveState(Stream)
Saves the model's current state to a stream.
public virtual void SaveState(Stream stream)
Parameters
- stream Stream
The stream to write the model state to.
Serialize()
Serializes the model to a byte array.
public virtual byte[] Serialize()
Returns
- byte[]
A byte array containing the serialized model data.
Remarks
This method serializes the model's parameters, including coefficients, intercept, and regularization options, to a JSON format and then converts it to a byte array.
For Beginners: Serialization converts the model's internal state into a format that can be saved to disk or transmitted over a network. This allows you to save a trained model and load it later without having to retrain it. Think of it like saving your progress in a video game.
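The JSON-then-bytes round trip can be illustrated with System.Text.Json. Everything here is an assumption for demonstration purposes: `ModelStateSketch` and `SerializationSketch` are hypothetical types, the UTF-8 encoding is a guess, and the library's actual JSON layout and serializer are not documented on this page.

```csharp
using System;
using System.Text;
using System.Text.Json;

// Hypothetical stand-in for the state a regression model would persist.
public sealed class ModelStateSketch
{
    public double[] Coefficients { get; set; } = Array.Empty<double>();
    public double Intercept { get; set; }
}

// Hypothetical helper, not part of AiDotNet.
public static class SerializationSketch
{
    // Serialize to a JSON string, then encode that string as UTF-8 bytes.
    public static byte[] ToBytes(ModelStateSketch state)
        => Encoding.UTF8.GetBytes(JsonSerializer.Serialize(state));

    // Decode the bytes back to JSON and rebuild the state object.
    public static ModelStateSketch FromBytes(byte[] data)
        => JsonSerializer.Deserialize<ModelStateSketch>(Encoding.UTF8.GetString(data))
           ?? throw new InvalidOperationException("Deserialization failed.");
}
```

A state written by `ToBytes` and read back by `FromBytes` carries the same coefficients and intercept, which is the property Serialize and Deserialize are meant to provide for a trained model.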
SetActiveFeatureIndices(IEnumerable<int>)
Sets the active feature indices for this model.
public virtual void SetActiveFeatureIndices(IEnumerable<int> featureIndices)
Parameters
- featureIndices IEnumerable<int>
The indices of features to activate.
Remarks
This method leaves the coefficients of the specified features unchanged and sets all other coefficients to zero, effectively activating only the specified features.
For Beginners: This method selectively activates only certain features.
You provide a list of feature positions (indices), and the method will:
- Keep the coefficients for those features
- Set all other feature coefficients to zero
This is useful for feature selection, where you want to use only a subset of available features.
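A small sketch of the masking behavior described above, assuming plain arrays (`FeatureMaskSketch` and `KeepOnly` are hypothetical names). Unlike the documented method, which modifies the model, this version returns a new array so the effect is easy to inspect.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of AiDotNet.
public static class FeatureMaskSketch
{
    // Keeps coefficients at the listed indices and zeroes every other one.
    public static double[] KeepOnly(double[] coefficients, IEnumerable<int> activeIndices)
    {
        var active = new HashSet<int>(activeIndices);
        var masked = new double[coefficients.Length];
        for (int i = 0; i < coefficients.Length; i++)
            masked[i] = active.Contains(i) ? coefficients[i] : 0.0;
        return masked;
    }
}
```

For coefficients `{1, 2, 3, 4}` and active indices `{0, 2}`, the result is `{1, 0, 3, 0}`.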
SetCoefficientsAndIntercept(Vector<T>, T)
Sets the coefficients and intercept directly for deserialization purposes.
public virtual void SetCoefficientsAndIntercept(Vector<T> coefficients, T intercept)
Parameters
- coefficients Vector<T>
The coefficient vector to set.
- intercept T
The intercept value to set.
SetParameters(Vector<T>)
Sets the parameters for this model.
public virtual void SetParameters(Vector<T> parameters)
Parameters
- parameters Vector<T>
A vector containing all model parameters (coefficients and intercept).
Remarks
This method updates the model's parameters in-place. The parameters vector should contain coefficients followed by the intercept (if the model uses one).
For Beginners: This method updates the model's parameters directly.
Unlike WithParameters(), which creates a new model, this method modifies the current model. The parameters include the coefficients (how much each feature affects the prediction) and the intercept (the baseline value).
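The length check and the split into coefficients and intercept might look like the sketch below. It assumes the layout described above (coefficients first, intercept last when present); `ParameterUnpackingSketch` is a hypothetical name and the error message is illustrative only.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class ParameterUnpackingSketch
{
    // Splits a flat parameter array into coefficients and intercept.
    // Throws ArgumentException when the length does not match
    // featureCount (+ 1 if an intercept is used).
    public static (double[] Coefficients, double Intercept) Unpack(
        double[] parameters, int featureCount, bool hasIntercept)
    {
        int expected = featureCount + (hasIntercept ? 1 : 0);
        if (parameters.Length != expected)
            throw new ArgumentException(
                $"Expected {expected} parameters but got {parameters.Length}.");

        var coefficients = new double[featureCount];
        Array.Copy(parameters, coefficients, featureCount);
        double intercept = hasIntercept ? parameters[featureCount] : 0.0;
        return (coefficients, intercept);
    }
}
```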
Exceptions
- ArgumentException
Thrown when the parameters vector has an incorrect length.
SolveSystem(Matrix<T>, Vector<T>)
Solves a linear system of equations using the specified decomposition method.
protected Vector<T> SolveSystem(Matrix<T> a, Vector<T> b)
Parameters
- a Matrix<T>
The coefficient matrix.
- b Vector<T>
The right-hand side vector.
Returns
- Vector<T>
The solution vector.
Remarks
This method solves the linear system Ax = b using either the specified decomposition method or the normal equation as a fallback.
For Beginners: Many regression problems involve solving a system of linear equations. This method provides a way to solve such systems using various mathematical techniques. The choice of technique can affect the accuracy and efficiency of the solution, especially for large or ill-conditioned systems.
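For readers who want to see what solving Ax = b involves, here is a self-contained sketch using Gaussian elimination with partial pivoting. This is a swapped-in textbook technique chosen for brevity, not one of the library's decomposition methods; `LinearSolveSketch` is a hypothetical name, and the 1e-12 singularity threshold is an arbitrary choice for this example.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class LinearSolveSketch
{
    // Solves a x = b for a square matrix a. Works on copies of the inputs.
    public static double[] Solve(double[,] a, double[] b)
    {
        int n = b.Length;
        var m = (double[,])a.Clone();
        var rhs = (double[])b.Clone();

        for (int col = 0; col < n; col++)
        {
            // Partial pivoting: pick the row with the largest entry in this column.
            int pivot = col;
            for (int row = col + 1; row < n; row++)
                if (Math.Abs(m[row, col]) > Math.Abs(m[pivot, col])) pivot = row;
            if (Math.Abs(m[pivot, col]) < 1e-12)
                throw new InvalidOperationException("Matrix is singular or nearly singular.");

            if (pivot != col)
            {
                for (int k = 0; k < n; k++)
                    (m[col, k], m[pivot, k]) = (m[pivot, k], m[col, k]);
                (rhs[col], rhs[pivot]) = (rhs[pivot], rhs[col]);
            }

            // Eliminate this column from every row below the pivot row.
            for (int row = col + 1; row < n; row++)
            {
                double factor = m[row, col] / m[col, col];
                for (int k = col; k < n; k++)
                    m[row, k] -= factor * m[col, k];
                rhs[row] -= factor * rhs[col];
            }
        }

        // Back substitution on the resulting upper-triangular system.
        var x = new double[n];
        for (int row = n - 1; row >= 0; row--)
        {
            double sum = rhs[row];
            for (int k = row + 1; k < n; k++)
                sum -= m[row, k] * x[k];
            x[row] = sum / m[row, row];
        }
        return x;
    }
}
```

For the system 2x + y = 3 and x + 3y = 5, the solution is x = 0.8 and y = 1.4. Decomposition-based solvers are generally preferred for large or ill-conditioned systems, which is why the documented method lets the decomposition be configured.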
Train(Matrix<T>, Vector<T>)
Trains the regression model on the provided data.
public abstract void Train(Matrix<T> x, Vector<T> y)
Parameters
- x Matrix<T>
The input features matrix where each row is a training example and each column is a feature.
- y Vector<T>
The target values vector corresponding to each training example.
Remarks
This abstract method must be implemented by derived classes to train the regression model.
For Beginners: Training is the process where the model learns from your data. Different regression algorithms implement this method differently, but they all aim to find the best coefficients (weights) that minimize the prediction error on the training data.
WithParameters(Vector<T>)
Creates a new instance of the model with specified parameters.
public virtual IFullModel<T, Matrix<T>, Vector<T>> WithParameters(Vector<T> parameters)
Parameters
- parameters Vector<T>
A vector containing all model parameters (coefficients and intercept).
Returns
- IFullModel<T, Matrix<T>, Vector<T>>
A new model instance with the specified parameters.
Remarks
This method creates a new model with the same options but different parameter values. The parameters vector should contain coefficients followed by the intercept (if the model uses one).
For Beginners: This method creates a new model using a specific set of parameters.
It's like creating a new recipe based on an existing one, but with different ingredient amounts. You provide all the parameters (coefficients and intercept) in a single collection, and the method:
- Creates a new model
- Sets its parameters to the values you provided
- Returns this new model ready to use for predictions
This is useful for:
- Testing how different parameter values affect predictions
- Using optimization algorithms that try different parameter sets
- Creating ensemble models that combine multiple parameter variations
Exceptions
- ArgumentException
Thrown when the parameters vector has an incorrect length.