Class SimpleRegression<T>
- Namespace
- AiDotNet.Regression
- Assembly
- AiDotNet.dll
Implements simple linear regression, which predicts a single output value based on a single input feature. This is the most basic form of regression that finds the best-fitting straight line through a set of points.
public class SimpleRegression<T> : RegressionBase<T>, IRegression<T>, IFullModel<T, Matrix<T>, Vector<T>>, IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Matrix<T>, Vector<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>, IGradientComputable<T, Matrix<T>, Vector<T>>, IJitCompilable<T>
Type Parameters
T
The numeric type used for calculations (typically float or double).
- Inheritance
- RegressionBase<T>
- SimpleRegression<T>
- Implements
- IRegression<T>
Remarks
Simple linear regression models the relationship between two variables by fitting a linear equation:
y = mx + b
where:
- y is the predicted output value
- x is the input feature value
- m is the slope (coefficient)
- b is the y-intercept (where the line crosses the y-axis)
For Beginners: Simple linear regression is like drawing the best straight line through a set of points.
Think of it like this:
- You have data points on a graph (like house sizes and their prices)
- You want to find the line that best represents the relationship
- This line helps you predict new values (like the price of a house based on its size)
For example, if you plot people's heights and weights, simple regression would find the line that shows how weight typically increases with height, allowing you to estimate someone's weight if you only know their height.
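Once a slope and an intercept are known, making a prediction is a single evaluation of y = mx + b. The sketch below illustrates this with plain doubles. The helper name PredictLine and the slope and intercept values are hypothetical, chosen only for illustration, and are not part of AiDotNet or learned from real data.

```csharp
using System;

// Hypothetical slope and intercept, chosen only for illustration
// (weight in kg as a function of height in cm); not learned from real data.
double slope = 0.5;
double intercept = -20.0;

// Evaluate the line y = m * x + b for a single input value.
static double PredictLine(double m, double b, double x) => m * x + b;

double height = 180.0;
double weight = PredictLine(slope, intercept, height);
Console.WriteLine($"Estimated weight at {height} cm: {weight} kg");
```

A trained model would supply the slope and intercept itself; here they are fixed by hand to keep the example self-contained.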
Constructors
SimpleRegression(RegressionOptions<T>?, IRegularization<T, Matrix<T>, Vector<T>>?)
Creates a new simple regression model.
public SimpleRegression(RegressionOptions<T>? options = null, IRegularization<T, Matrix<T>, Vector<T>>? regularization = null)
Parameters
options RegressionOptions<T>
Optional configuration settings for the regression model. These settings control aspects like:
- Whether to include an intercept term (the "b" in y = mx + b)
- How to handle numerical precision
If not provided, default options will be used.
regularization IRegularization<T, Matrix<T>, Vector<T>>
Optional regularization method to prevent overfitting. Regularization is a technique that helps the model perform better on new, unseen data by preventing it from fitting the training data too closely. If not provided, no regularization will be applied.
Remarks
This constructor creates a new simple regression model with the specified configuration options and regularization method. If options are not provided, default values are used. Regularization helps prevent overfitting by adding penalties for model complexity.
For Beginners: This method sets up a new simple regression model.
Think of it like setting up a new tool:
- You can use the default settings (by not specifying options)
- Or you can customize how it works (by providing specific options)
- You can also add regularization, which acts like a safeguard to prevent the model from memorizing the data instead of learning the general pattern
For example, when setting up a simple regression to predict house prices based on size, you might want to include an intercept (base price) or use regularization if you have limited data samples.
Methods
CreateNewInstance()
Creates a new instance of the simple regression model with the same options.
protected override IFullModel<T, Matrix<T>, Vector<T>> CreateNewInstance()
Returns
- IFullModel<T, Matrix<T>, Vector<T>>
A new instance of the simple regression model with the same configuration but no trained parameters.
Remarks
This method creates a new instance of the simple regression model with the same configuration options and regularization method as the current instance, but without copying the trained parameters.
For Beginners: This method creates a fresh copy of the model configuration without any learned parameters.
Think of it like getting a clean notepad with the same paper type and line spacing, but without any writing on it yet. The new model has the same settings (like whether to include an intercept term), but hasn't learned any coefficients from data.
This is primarily used internally by the framework when doing things like:
- Cross-validation (testing the model on different data splits)
- Building model ensembles
- Creating copies of models for experimentation
GetModelType()
Returns the type identifier for this regression model.
protected override ModelType GetModelType()
Returns
- ModelType
The model type identifier for simple regression.
Remarks
This method is used internally for model identification and serialization purposes. It returns an enum value that identifies this model as a simple regression model.
For Beginners: This method simply tells the system what kind of model this is.
It's like a name tag for the model that says "I am a simple regression model." This is useful when:
- Saving the model to a file
- Loading a model from a file
- Logging information about the model
You generally won't need to call this method directly in your code.
Train(Matrix<T>, Vector<T>)
Trains the simple regression model using the provided input feature and target values.
public override void Train(Matrix<T> x, Vector<T> y)
Parameters
x Matrix<T>
The input feature matrix, which must have exactly one column. Each row represents one data sample. For example, if predicting house prices based on square footage, this would be a single column of square footage values.
y Vector<T>
The target values vector, containing the actual output values that the model should learn to predict. Each element corresponds to a row in the input matrix.
Remarks
This method finds the best-fitting line by minimizing the sum of squared differences between the predicted values and the actual values in the training data. It computes the coefficient (slope) and intercept values that define the regression line.
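The minimization described above can be written out explicitly. For n training pairs (x_i, y_i), the standard ordinary least squares problem with an intercept and no regularization has a well-known closed-form solution, shown below. The source does not state how Train computes its result, so this is the textbook formulation rather than a description of the library's code. Disabling the intercept or adding regularization would change these formulas.

```latex
\begin{aligned}
(m, b) &= \arg\min_{m,\,b} \sum_{i=1}^{n} \bigl(y_i - (m x_i + b)\bigr)^2, \\
m &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b = \bar{y} - m\,\bar{x},
\end{aligned}
```

Here \bar{x} and \bar{y} are the sample means of the inputs and targets. The slope is defined only when the x values are not all identical, so that the denominator is nonzero.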
For Beginners: This method teaches the model to make predictions based on your data.
The training process works like this:
- The model looks at all your data points (like house sizes and prices)
- Conceptually, it compares straight lines to find the one that fits the points best
- "Best fit" means the line with the smallest total squared vertical distance between the line and the points
- Once found, this line gives you the formula to predict new values
For example, after training on house data, the model might learn that:
price = $100,000 + ($100 × square_footage)
This means a house has a base price of $100,000 plus $100 for each square foot.
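To make the fitting step concrete, here is a minimal sketch of a closed-form least squares fit on plain arrays. It is not AiDotNet's implementation: FitLine is a hypothetical helper, it ignores options and regularization, and the sample points are made up so that they lie exactly on the line from the example above.

```csharp
using System;

// Fit y = slope * x + intercept by ordinary least squares (closed form).
// FitLine is a hypothetical helper for illustration, not part of AiDotNet.
static (double Slope, double Intercept) FitLine(double[] x, double[] y)
{
    if (x.Length != y.Length || x.Length < 2)
        throw new ArgumentException("x and y need the same length and at least two samples.");

    // Sample means of the inputs and targets.
    double meanX = 0.0, meanY = 0.0;
    for (int i = 0; i < x.Length; i++) { meanX += x[i]; meanY += y[i]; }
    meanX /= x.Length;
    meanY /= y.Length;

    // Sums of centered cross-products and centered squares.
    double sxy = 0.0, sxx = 0.0;
    for (int i = 0; i < x.Length; i++)
    {
        double dx = x[i] - meanX;
        sxy += dx * (y[i] - meanY);
        sxx += dx * dx;
    }
    if (sxx == 0.0)
        throw new ArgumentException("All x values are identical, so the slope is undefined.");

    double slope = sxy / sxx;
    return (slope, meanY - slope * meanX);
}

// Made-up sample points that lie exactly on price = 100000 + 100 * squareFootage.
double[] squareFootage = { 1000.0, 1500.0, 2000.0 };
double[] price = { 200000.0, 250000.0, 300000.0 };

var (m, b) = FitLine(squareFootage, price);
Console.WriteLine($"price = {b} + ({m} * square_footage)");
```

Because the three points sit exactly on one line, the fit recovers a slope of 100 and an intercept of 100000. With noisy real data the fitted line would pass near the points rather than through them.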
Exceptions
- ArgumentException
Thrown when the input matrix has more than one column. Simple regression only works with a single input feature.
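The sketch below shows the kind of shape check that produces this exception, using a plain two-dimensional array in place of Matrix<T>. EnsureSingleColumn is a hypothetical helper. The library's actual check and message may differ, and rejecting every column count other than one is an assumption based on the "exactly one column" requirement stated for x.

```csharp
using System;

// Hypothetical guard, for illustration only; AiDotNet's actual check and message may differ.
static void EnsureSingleColumn(double[,] x)
{
    int columns = x.GetLength(1);
    if (columns != 1)
        throw new ArgumentException(
            $"Simple regression expects exactly one input column, but got {columns}.", nameof(x));
}

double[,] oneFeature = { { 1000.0 }, { 1500.0 }, { 2000.0 } };   // 3 rows, 1 column
double[,] twoFeatures = { { 1000.0, 3.0 }, { 1500.0, 4.0 } };    // 2 rows, 2 columns

EnsureSingleColumn(oneFeature);          // passes silently

try
{
    EnsureSingleColumn(twoFeatures);     // throws ArgumentException
}
catch (ArgumentException ex)
{
    Console.WriteLine($"Rejected: {ex.Message}");
}
```

Callers with several candidate features would need to select one column before training, or use a regression model that supports multiple features.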