Class ARIMAModel<T>
- Namespace
- AiDotNet.TimeSeries
- Assembly
- AiDotNet.dll
Implements an ARIMA (AutoRegressive Integrated Moving Average) model for time series forecasting.
public class ARIMAModel<T> : TimeSeriesModelBase<T>, ITimeSeriesModel<T>, IFullModel<T, Matrix<T>, Vector<T>>, IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Matrix<T>, Vector<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>, IGradientComputable<T, Matrix<T>, Vector<T>>, IJitCompilable<T>
Type Parameters
- T
The numeric type used for calculations (e.g., float, double, decimal).
- Inheritance
-
ARIMAModel<T>
Remarks
ARIMA models are widely used for time series forecasting. The model combines three components:
- AR (AutoRegressive): Uses the dependent relationship between an observation and a number of lagged observations
- I (Integrated): Uses differencing of observations to make the time series stationary
- MA (Moving Average): Uses the dependency between an observation and residual errors from a moving average model
For Beginners: ARIMA is a popular technique for analyzing and forecasting time series data (data collected over time, like stock prices, temperature readings, or monthly sales figures).
Think of ARIMA as combining three different approaches:
- AutoRegressive (AR): Looks at past values to predict future values. For example, today's temperature might be related to yesterday's temperature.
- Integrated (I): Transforms the data to make it easier to analyze by removing trends. For example, instead of looking at temperatures directly, we might look at how they change from day to day.
- Moving Average (MA): Looks at past prediction errors to improve future predictions. For example, if we consistently underestimate temperature, we can adjust for that.
The model has three key parameters (p, d, q):
- p: How many past values to look at (AR component)
- d: How many times to difference the data (I component)
- q: How many past prediction errors to consider (MA component)
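To make the d parameter concrete, here is a minimal sketch of differencing in C#. The Difference helper is hypothetical and written only for illustration; it is not part of AiDotNet, and the library may difference data differently internally.

```csharp
using System;

// Hypothetical helper (not an AiDotNet API): difference a series d times.
// Each pass replaces the series with the change between consecutive values,
// so the result is one element shorter per pass.
static double[] Difference(double[] series, int d)
{
    double[] current = series;
    for (int pass = 0; pass < d; pass++)
    {
        if (current.Length == 0) break; // nothing left to difference
        var next = new double[current.Length - 1];
        for (int i = 1; i < current.Length; i++)
            next[i - 1] = current[i] - current[i - 1];
        current = next;
    }
    return current;
}

double[] sales = { 100, 103, 107, 112, 118 };
Console.WriteLine(string.Join(", ", Difference(sales, 1))); // 3, 4, 5, 6
Console.WriteLine(string.Join(", ", Difference(sales, 2))); // 1, 1, 1
```

In this example the first difference still trends upward, while the second difference is constant, which is the kind of stable series the AR and MA components are meant to work with.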
Constructors
ARIMAModel(ARIMAOptions<T>?)
Creates a new ARIMA model with the specified options.
public ARIMAModel(ARIMAOptions<T>? options = null)
Parameters
- options ARIMAOptions<T>
Options for the ARIMA model, including p, d, and q parameters. If null, default options are used.
Remarks
For Beginners: This constructor creates a new ARIMA model. You can customize the model by providing options:
- p: How many past values to consider (AR order)
- d: How many times to difference the data to remove trends
- q: How many past prediction errors to consider (MA order)
If you don't provide options, default values will be used, but it's usually best to choose values that make sense for your specific data.
Methods
ComputeAnomalyScores(Vector<T>)
Computes anomaly scores for each point in a time series.
public Vector<T> ComputeAnomalyScores(Vector<T> timeSeries)
Parameters
- timeSeries Vector<T>
The time series data to analyze.
Returns
- Vector<T>
A vector of anomaly scores (absolute prediction errors) for each point.
Remarks
The anomaly score is the absolute difference between the actual value and the predicted value. Higher scores indicate more anomalous points. The first few points (up to the lag order) will have a score of zero since there isn't enough history to make predictions.
For Beginners: Instead of just saying "anomaly or not", this method tells you exactly how unusual each point is. A score of 0 means the value matches the prediction perfectly. Higher scores mean the value was more unexpected.
You can use these scores to:
- Rank anomalies by severity (higher score = more unusual)
- Set your own custom threshold
- Visualize the anomaly intensity over time
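The scoring rule described above (absolute prediction error, with zeros where there is not enough history) can be sketched as follows. This is an illustration under simplifying assumptions, not the library's implementation: the AnomalyScores helper is hypothetical, it uses only an AR component, and the coefficients are supplied by hand rather than learned.

```csharp
using System;

// Hypothetical helper (not an AiDotNet API): one-step-ahead AR predictions
// compared against the actual values. The first p scores stay at zero because
// fewer than p earlier observations are available.
static double[] AnomalyScores(double[] series, double constant, double[] arCoefficients)
{
    int p = arCoefficients.Length;
    var scores = new double[series.Length];
    for (int t = p; t < series.Length; t++)
    {
        double predicted = constant;
        for (int lag = 1; lag <= p; lag++)
            predicted += arCoefficients[lag - 1] * series[t - lag];
        scores[t] = Math.Abs(series[t] - predicted);
    }
    return scores;
}

double[] readings = { 10, 10, 10, 25, 10 };
double[] readingScores = AnomalyScores(readings, constant: 0.0, arCoefficients: new[] { 1.0 });
Console.WriteLine(string.Join(", ", readingScores)); // 0, 0, 0, 15, 15
```

Note that the point after the spike also receives a high score in this sketch, because the spike itself becomes part of the history used for the next prediction.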
Exceptions
- InvalidOperationException
Thrown when the model hasn't been trained yet.
CreateInstance()
Creates a new instance of the ARIMA model with the same options.
protected override IFullModel<T, Matrix<T>, Vector<T>> CreateInstance()
Returns
- IFullModel<T, Matrix<T>, Vector<T>>
A new instance of the ARIMA model.
Remarks
For Beginners: This method creates a fresh copy of the model with the same settings.
The new copy:
- Has the same p, d, and q parameters as the original model
- Has the same configuration options
- Is untrained (doesn't have coefficients yet)
This is useful when you want to:
- Train multiple versions of the same model on different data
- Create ensemble models that combine predictions from multiple similar models
- Reset a model to start fresh while keeping the same structure
DeserializeCore(BinaryReader)
Deserializes the model's state from a binary stream.
protected override void DeserializeCore(BinaryReader reader)
Parameters
- reader BinaryReader
The binary reader to read from.
Remarks
For Beginners: This protected method loads a previously saved model from a file or stream.
Deserialization allows you to:
- Load a previously trained model
- Use it immediately without retraining
- Apply the exact same model to new data
The method loads all the parameters that were saved during serialization: the p, d, q values, the constant term, and the AR and MA coefficients.
DetectAnomalies(Vector<T>)
Detects anomalies in a time series by comparing predictions to actual values.
public bool[] DetectAnomalies(Vector<T> timeSeries)
Parameters
- timeSeries Vector<T>
The time series data to analyze for anomalies.
Returns
- bool[]
A boolean array where true indicates an anomaly at that position.
Remarks
This method uses the ARIMA model to predict each point in the time series based on previous values, then flags points where the prediction error exceeds the anomaly threshold computed during training.
For Beginners: This method goes through your time series and identifies points that are "unusual" compared to what the model would expect. A point is considered an anomaly if the difference between the actual value and the predicted value is larger than the threshold learned during training.
Example use case: If you have daily sales data, this method can identify days where sales were abnormally high or low compared to the typical pattern.
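The comparison against a threshold can be sketched as below. The FlagAnomalies helper, the sample numbers, and the choice of a strict "greater than" comparison are all assumptions made for illustration; the actual predictions and threshold come from the trained model.

```csharp
using System;

// Hypothetical helper (not an AiDotNet API): flag positions where the absolute
// prediction error is larger than the threshold.
static bool[] FlagAnomalies(double[] actual, double[] predicted, double threshold)
{
    if (actual.Length != predicted.Length)
        throw new ArgumentException("Both arrays must have the same length.");

    var flags = new bool[actual.Length];
    for (int i = 0; i < actual.Length; i++)
        flags[i] = Math.Abs(actual[i] - predicted[i]) > threshold;
    return flags;
}

double[] actualSales = { 100, 102, 160, 101 };
double[] expectedSales = { 101, 101, 103, 104 };
bool[] flags = FlagAnomalies(actualSales, expectedSales, threshold: 10.0);
Console.WriteLine(string.Join(", ", flags)); // False, False, True, False
```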
Exceptions
- InvalidOperationException
Thrown when the model hasn't been trained yet or when anomaly detection wasn't enabled during training.
DetectAnomaliesDetailed(Vector<T>)
Detects anomalies and returns detailed information about each detected anomaly.
public List<(int Index, T Actual, T Predicted, T Score)> DetectAnomaliesDetailed(Vector<T> timeSeries)
Parameters
- timeSeries Vector<T>
The time series data to analyze.
Returns
- List<(int Index, T Actual, T Predicted, T Score)>
A list of tuples containing (index, actual value, predicted value, score) for each anomaly.
Remarks
For Beginners: This method not only tells you which points are anomalies, but also provides additional context:
- Index: The position of the anomaly in the time series
- Actual: What the value actually was
- Predicted: What the model expected the value to be
- Score: How far off the prediction was
This extra information helps you understand why each point was flagged as an anomaly.
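One way to use such a list is to rank the anomalies by severity. The sketch below builds a list with the same tuple shape as the return type, using double for T. The DescribeAnomalies helper and the input values are hypothetical; in practice the list would come from DetectAnomaliesDetailed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not an AiDotNet API): collect details for each point
// whose absolute prediction error is larger than the threshold.
static List<(int Index, double Actual, double Predicted, double Score)> DescribeAnomalies(
    double[] actual, double[] predicted, double threshold)
{
    if (actual.Length != predicted.Length)
        throw new ArgumentException("Both arrays must have the same length.");

    var anomalies = new List<(int Index, double Actual, double Predicted, double Score)>();
    for (int i = 0; i < actual.Length; i++)
    {
        double score = Math.Abs(actual[i] - predicted[i]);
        if (score > threshold)
            anomalies.Add((i, actual[i], predicted[i], score));
    }
    return anomalies;
}

var found = DescribeAnomalies(
    new double[] { 100, 140, 101, 60 },
    new double[] { 101, 102, 103, 100 },
    threshold: 10.0);

// Most severe anomalies first.
foreach (var a in found.OrderByDescending(x => x.Score))
    Console.WriteLine($"index {a.Index}: actual {a.Actual}, predicted {a.Predicted}, score {a.Score}");
// index 3: actual 60, predicted 100, score 40
// index 1: actual 140, predicted 102, score 38
```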
EvaluateModel(Matrix<T>, Vector<T>)
Evaluates the model's performance on test data.
public override Dictionary<string, T> EvaluateModel(Matrix<T> xTest, Vector<T> yTest)
Parameters
- xTest Matrix<T>
Feature matrix for testing.
- yTest Vector<T>
Actual target values for testing.
Returns
- Dictionary<string, T>
A dictionary of evaluation metrics (MSE, RMSE, MAE).
Remarks
For Beginners: This method measures how well the model performs by comparing its predictions against actual values from a test dataset.
It calculates several common error metrics:
- MSE (Mean Squared Error): The average of squared differences between predictions and actual values
- RMSE (Root Mean Squared Error): The square root of MSE, which is in the same units as the original data
- MAE (Mean Absolute Error): The average of absolute differences between predictions and actual values
Lower values for all these metrics indicate better performance.
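The three metrics can be computed by hand as in the following sketch. The Evaluate helper is hypothetical, and the dictionary keys "MSE", "RMSE", and "MAE" are assumed from the description above; the exact key strings returned by EvaluateModel may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not an AiDotNet API): compute MSE, RMSE and MAE from
// paired predictions and actual values.
static Dictionary<string, double> Evaluate(double[] predicted, double[] actual)
{
    if (predicted.Length != actual.Length || actual.Length == 0)
        throw new ArgumentException("Arrays must be non-empty and of equal length.");

    double sumSquared = 0.0, sumAbsolute = 0.0;
    for (int i = 0; i < actual.Length; i++)
    {
        double error = predicted[i] - actual[i];
        sumSquared += error * error;
        sumAbsolute += Math.Abs(error);
    }

    double mse = sumSquared / actual.Length;
    return new Dictionary<string, double>
    {
        ["MSE"] = mse,
        ["RMSE"] = Math.Sqrt(mse),          // same units as the original data
        ["MAE"] = sumAbsolute / actual.Length
    };
}

var metrics = Evaluate(new double[] { 3, 4, 6, 8 }, new double[] { 1, 4, 8, 8 });
double meanSquared = metrics["MSE"];
double meanAbsolute = metrics["MAE"];
Console.WriteLine($"MSE = {meanSquared}, MAE = {meanAbsolute}"); // MSE = 2, MAE = 1
```

Because MSE squares each error, the two errors of size 2 weigh more heavily there than in MAE, which is why MSE is more sensitive to occasional large misses.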
GetAnomalyThreshold()
Gets the current anomaly detection threshold.
public T GetAnomalyThreshold()
Returns
- T
The anomaly threshold computed during training.
Remarks
For Beginners: This returns the cutoff value currently used to decide whether a prediction error is large enough to count as an anomaly. Points whose prediction error exceeds this threshold are flagged as anomalies.
GetModelMetadata()
Gets metadata about the model, including its type, parameters, and configuration.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
A ModelMetadata<T> object containing information about the model.
Remarks
For Beginners: This method provides a summary of your model's settings and what it has learned.
The metadata includes:
- The type of model (ARIMA)
- The p, d, and q parameters that define the model structure
- The AR and MA coefficients that were learned during training
- The constant term that serves as the baseline prediction
This information is useful for:
- Documenting your model for future reference
- Comparing different models to see which performs best
- Understanding what patterns the model has identified in your data
Predict(Matrix<T>)
Makes predictions using the trained ARIMA model.
public override Vector<T> Predict(Matrix<T> input)
Parameters
- input Matrix<T>
Input matrix for prediction (typically just time indices for future periods).
Returns
- Vector<T>
A vector of predicted values.
Remarks
For Beginners: This method uses the trained ARIMA model to forecast future values.
The prediction process:
- Starts with the constant term as a base value
- Adds the effects of past observations (AR component)
- Adds the effects of past prediction errors (MA component)
- For each prediction, updates the history used for the next prediction
Note: For pure time series forecasting, the input parameter might just indicate how many future periods to predict.
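The prediction steps listed above can be sketched as a recursive forecast. This is a simplified illustration, not the library's code: the Forecast helper and its parameters are hypothetical, the coefficients are supplied by hand, future errors are assumed to be zero, and any differencing would still need to be undone afterwards.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not an AiDotNet API): multi-step forecast from an
// ARMA-style recursion. ar[0] and ma[0] apply to the most recent value and error.
static double[] Forecast(
    double[] history, double[] pastErrors,
    double constant, double[] ar, double[] ma, int steps)
{
    var values = new List<double>(history);
    var errors = new List<double>(pastErrors);
    var forecasts = new double[steps];

    for (int s = 0; s < steps; s++)
    {
        double prediction = constant;                       // base value
        for (int i = 0; i < ar.Length && i < values.Count; i++)
            prediction += ar[i] * values[values.Count - 1 - i];   // AR component
        for (int j = 0; j < ma.Length && j < errors.Count; j++)
            prediction += ma[j] * errors[errors.Count - 1 - j];   // MA component

        forecasts[s] = prediction;
        values.Add(prediction); // the forecast becomes history for the next step
        errors.Add(0.0);        // assumption: the error of a future step is unknown, so use 0
    }
    return forecasts;
}

double[] forecast = Forecast(
    history: new double[] { 10, 12 },
    pastErrors: new double[] { 4 },
    constant: 1.0,
    ar: new[] { 0.5 },
    ma: new[] { 0.25 },
    steps: 2);
Console.WriteLine(string.Join(", ", forecast)); // 8, 5
```

The first forecast is 1 + 0.5 * 12 + 0.25 * 4 = 8; the second uses that forecast as history and a zero error, giving 1 + 0.5 * 8 = 5.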
PredictSingle(Vector<T>)
Predicts a single value based on the input vector.
public override T PredictSingle(Vector<T> input)
Parameters
- input Vector<T>
Input vector containing features for prediction.
Returns
- T
The predicted value.
Remarks
For Beginners: This method generates a single prediction based on your input data.
The prediction process:
- Starts with the constant term as the baseline value
- Adds the influence of past observations (AR component)
- Adds the influence of past prediction errors (MA component)
This is useful when you need just one prediction rather than a whole series. For example, if you want to predict tomorrow's temperature specifically, rather than temperatures for the next week.
SerializeCore(BinaryWriter)
Serializes the model's state to a binary stream.
protected override void SerializeCore(BinaryWriter writer)
Parameters
- writer BinaryWriter
The binary writer to write to.
Remarks
For Beginners: This protected method saves the model's internal state to a file or stream.
Serialization allows you to:
- Save a trained model to disk
- Load it later without having to retrain
- Share the model with others
The method saves all the essential parameters: the p, d, q values, the constant term, and the AR and MA coefficients.
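A round trip through BinaryWriter and BinaryReader for the parameters named above could look like the following sketch. The WriteState and ReadState helpers and the field order are assumptions made for illustration; the actual binary layout used by SerializeCore and DeserializeCore is not documented here and may differ.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helpers (not AiDotNet APIs). Assumed layout: p, d, q, constant,
// then each coefficient array prefixed with its length.
static void WriteState(BinaryWriter writer, int p, int d, int q,
                       double constant, double[] ar, double[] ma)
{
    writer.Write(p);
    writer.Write(d);
    writer.Write(q);
    writer.Write(constant);
    writer.Write(ar.Length);
    foreach (double c in ar) writer.Write(c);
    writer.Write(ma.Length);
    foreach (double c in ma) writer.Write(c);
}

static (int P, int D, int Q, double Constant, double[] Ar, double[] Ma) ReadState(BinaryReader reader)
{
    // Values must be read in exactly the order they were written.
    int p = reader.ReadInt32();
    int d = reader.ReadInt32();
    int q = reader.ReadInt32();
    double constant = reader.ReadDouble();
    var ar = new double[reader.ReadInt32()];
    for (int i = 0; i < ar.Length; i++) ar[i] = reader.ReadDouble();
    var ma = new double[reader.ReadInt32()];
    for (int i = 0; i < ma.Length; i++) ma[i] = reader.ReadDouble();
    return (p, d, q, constant, ar, ma);
}

using var stream = new MemoryStream();
using (var stateWriter = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
{
    WriteState(stateWriter, 1, 1, 1, 0.5, new[] { 0.7 }, new[] { -0.2 });
}

stream.Position = 0; // rewind before reading
using var stateReader = new BinaryReader(stream);
var state = ReadState(stateReader);
Console.WriteLine($"p={state.P}, d={state.D}, q={state.Q}, AR count={state.Ar.Length}, MA count={state.Ma.Length}");
// p=1, d=1, q=1, AR count=1, MA count=1
```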
SetAnomalyThreshold(T)
Sets a custom anomaly detection threshold.
public void SetAnomalyThreshold(T threshold)
Parameters
- threshold T
The new threshold value.
Remarks
For Beginners: If the automatic threshold is flagging too many or too few anomalies, you can set your own. A higher threshold means fewer anomalies will be detected (only more extreme values). A lower threshold means more anomalies will be detected.
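One possible way to choose a custom threshold is to take a percentile of the anomaly scores, so that only the most extreme fraction of points is flagged. The sketch below is an assumption about how a caller might pick a value, not something the library prescribes; PercentileThreshold is a hypothetical helper using the nearest-rank method.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not an AiDotNet API): nearest-rank percentile of the scores.
static double PercentileThreshold(double[] scores, int percent)
{
    if (scores.Length == 0)
        throw new ArgumentException("At least one score is required.", nameof(scores));
    if (percent < 1 || percent > 100)
        throw new ArgumentOutOfRangeException(nameof(percent));

    double[] sorted = scores.OrderBy(s => s).ToArray();
    int rank = (percent * sorted.Length + 99) / 100; // integer ceiling of percent% of n
    return sorted[rank - 1];
}

double[] observedScores = { 5, 10, 2, 90, 7, 4, 11, 3, 6, 8 };
double threshold = PercentileThreshold(observedScores, 90);
int flagged = observedScores.Count(s => s > threshold);
Console.WriteLine($"threshold = {threshold}, flagged = {flagged}"); // threshold = 11, flagged = 1
```

A value chosen this way could then be passed to SetAnomalyThreshold. Raising the percentile raises the threshold and flags fewer points; lowering it flags more.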
TrainCore(Matrix<T>, Vector<T>)
Core implementation of the training logic for the ARIMA model.
protected override void TrainCore(Matrix<T> x, Vector<T> y)
Parameters
- x Matrix<T>
Feature matrix (typically just time indices for ARIMA models).
- y Vector<T>
Target vector (the time series values to be modeled).
Remarks
For Beginners: This method contains the core implementation of the training process. It:
- Differences the data to remove trends (the "I" in ARIMA)
- Estimates the AR coefficients that capture how past values affect future values
- Calculates residuals and uses them to estimate the MA coefficients
- Estimates the constant term that serves as the baseline prediction
This method is the mechanism behind the public Train method: it follows the same process and does the actual work of fitting the model to your data.
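To give a feel for the training steps, the sketch below fits a much simpler model than the one described: a single AR coefficient and a constant on once-differenced data, using ordinary least squares. It is an illustration only. The FitDifferencedAr1 helper is hypothetical, the MA estimation step is omitted, and the estimation method AiDotNet actually uses is not stated here.

```csharp
using System;

// Hypothetical helper (not an AiDotNet API): fit diff[t] = constant + phi * diff[t - 1]
// by least squares, where diff is the first difference of the series.
static (double Constant, double Phi, double[] Residuals) FitDifferencedAr1(double[] series)
{
    if (series.Length < 4)
        throw new ArgumentException("At least four observations are required.", nameof(series));

    // Step 1: difference once (d = 1) to remove a trend.
    var diff = new double[series.Length - 1];
    for (int t = 1; t < series.Length; t++)
        diff[t - 1] = series[t] - series[t - 1];

    // Step 2: estimate the AR coefficient and constant from (lagged, current) pairs.
    int n = diff.Length - 1;
    double meanLagged = 0.0, meanCurrent = 0.0;
    for (int t = 1; t < diff.Length; t++)
    {
        meanLagged += diff[t - 1];
        meanCurrent += diff[t];
    }
    meanLagged /= n;
    meanCurrent /= n;

    double covariance = 0.0, variance = 0.0;
    for (int t = 1; t < diff.Length; t++)
    {
        double centered = diff[t - 1] - meanLagged;
        covariance += centered * (diff[t] - meanCurrent);
        variance += centered * centered;
    }
    double phi = variance == 0.0 ? 0.0 : covariance / variance;
    double constant = meanCurrent - phi * meanLagged;

    // Step 3: residuals, which a full ARIMA fit would use to estimate MA coefficients.
    var residuals = new double[n];
    for (int t = 1; t < diff.Length; t++)
        residuals[t - 1] = diff[t] - (constant + phi * diff[t - 1]);

    return (constant, phi, residuals);
}

// The differences of this series are 4, 3, 2.5, 2.25, which follow
// diff[t] = 1 + 0.5 * diff[t - 1] exactly.
var fit = FitDifferencedAr1(new double[] { 0, 4, 7, 9.5, 11.75 });
Console.WriteLine($"phi = {fit.Phi:F3}, constant = {fit.Constant:F3}");
// phi is approximately 0.5 and constant approximately 1 for this series
```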