Class TimeSeriesIsolationForest<T>
- Namespace
- AiDotNet.TimeSeries.AnomalyDetection
- Assembly
- AiDotNet.dll
Implements Isolation Forest for time series anomaly detection.
public class TimeSeriesIsolationForest<T> : TimeSeriesModelBase<T>, ITimeSeriesModel<T>, IFullModel<T, Matrix<T>, Vector<T>>, IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Matrix<T>, Vector<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>, IGradientComputable<T, Matrix<T>, Vector<T>>, IJitCompilable<T>
Type Parameters
- T
The numeric type used for calculations (e.g., float, double).
- Inheritance
- TimeSeriesModelBase<T>
TimeSeriesIsolationForest<T>
Remarks
The Time Series Anomaly Detection Challenge: Traditional anomaly detection treats each data point independently. For time series, temporal context also matters: a value might be normal on its own but anomalous given what came before or the time of day.
How Time Series Isolation Forest Works:
1. Feature Engineering: Transform the raw time series into feature vectors that include:
- Lag features (past values)
- Rolling statistics (mean, std, min, max over recent windows)
- Trend indicators (derivative, acceleration)
- Seasonal residuals (deviation from expected seasonal pattern)
2. Isolation Forest: For each feature vector:
- Randomly select a feature and a split value
- Recursively partition until the point is isolated
- Count the path length to isolation
- Anomalies have shorter paths (easier to isolate)
3. Anomaly Scoring: Compute the anomaly score from the average path length across all trees
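The isolation and scoring steps above can be sketched as follows. This is a minimal, self-contained illustration of the commonly used Isolation Forest scoring rule, score = 2^(-E[h] / c(n)), where E[h] is the average path length and c(n) is a normalising constant. It is not the library's implementation, and the names `IsolationSketch`, `AveragePathLength`, `PathLength` and `Score` are hypothetical. For brevity the sketch traces only the path taken by the query point instead of building and storing whole trees.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative sketch only: not the AiDotNet implementation.
public static class IsolationSketch
{
    // c(n): average path length of an unsuccessful binary search tree
    // lookup over n points, used to normalise raw path lengths.
    public static double AveragePathLength(int n)
    {
        if (n <= 1) return 0.0;
        if (n == 2) return 1.0;
        const double EulerGamma = 0.5772156649;
        double harmonic = Math.Log(n - 1) + EulerGamma; // approximates H(n - 1)
        return 2.0 * harmonic - 2.0 * (n - 1) / n;
    }

    // Number of random splits needed to isolate `point` within `sample`.
    public static double PathLength(
        double[] point, List<double[]> sample, Random rng, int depth, int maxDepth)
    {
        if (sample.Count <= 1 || depth >= maxDepth)
            return depth + AveragePathLength(sample.Count);

        // Randomly select a feature and a split value inside its range.
        int feature = rng.Next(point.Length);
        double min = sample.Min(r => r[feature]);
        double max = sample.Max(r => r[feature]);
        if (min == max)
            return depth + AveragePathLength(sample.Count);
        double split = min + rng.NextDouble() * (max - min);

        // Keep only the rows on the same side of the split as the point.
        bool goLeft = point[feature] < split;
        var side = sample.Where(r => (r[feature] < split) == goLeft).ToList();
        return PathLength(point, side, rng, depth + 1, maxDepth);
    }

    // Score between 0 and 1; values near 1 indicate anomalies.
    public static double Score(double[] point, List<double[]> sample, int trees, int seed)
    {
        if (sample.Count < 2)
            throw new ArgumentException("At least two sample rows are required.");

        var rng = new Random(seed);
        int maxDepth = (int)Math.Ceiling(Math.Log(sample.Count, 2.0));
        double meanPath = Enumerable.Range(0, trees)
            .Average(_ => PathLength(point, sample, rng, 0, maxDepth));
        return Math.Pow(2.0, -meanPath / AveragePathLength(sample.Count));
    }
}
```

With a sample of tightly clustered rows plus one distant row, the distant row is usually separated by the first split, so its average path is short and its score is higher than that of a row inside the cluster.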
For Beginners: Imagine you're trying to describe where someone lives. For most people, you need many questions: "Which continent? Which country? Which city?..." But if someone lives on a tiny island, you can identify them quickly: "Do you live on that island? Yes."
Isolation Forest uses this idea: anomalies are "easy to describe" (short paths), while normal points need more questions to distinguish them.
For time series, we add context: "Is this value unusual compared to yesterday? Is it unusual for this time of day? Is it unusual given the recent trend?"
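The contextual questions above map onto engineered features such as lags, rolling statistics and differences. The following is a minimal sketch under stated assumptions: the names `ContextFeatures`, `Build` and `window` are hypothetical, and the feature set and layout used by the library may differ.

```csharp
using System;
using System.Linq;

// Illustrative sketch only: the feature layout is an assumption.
public static class ContextFeatures
{
    // Returns one feature row per time step t >= window:
    // [value, lag-1 value, rolling mean, rolling std, first difference].
    public static double[][] Build(double[] series, int window)
    {
        if (window < 1 || series.Length <= window)
            throw new ArgumentException("Series must be longer than the window.");

        var rows = new double[series.Length - window][];
        for (int t = window; t < series.Length; t++)
        {
            // The `window` values immediately before t (the recent past).
            double[] recent = series.Skip(t - window).Take(window).ToArray();
            double mean = recent.Average();
            // Population standard deviation of the recent values.
            double std = Math.Sqrt(recent.Average(v => (v - mean) * (v - mean)));

            rows[t - window] = new[]
            {
                series[t],                 // current value
                series[t - 1],             // previous value ("compared to yesterday")
                mean,                      // recent level
                std,                       // recent variability
                series[t] - series[t - 1]  // first difference ("recent trend")
            };
        }
        return rows;
    }
}
```

Each row can then be scored by an isolation forest, so that a value is judged against its recent history and not only against the overall distribution.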
Constructors
TimeSeriesIsolationForest(TimeSeriesIsolationForestOptions<T>?)
Initializes a new instance of the Time Series Isolation Forest.
public TimeSeriesIsolationForest(TimeSeriesIsolationForestOptions<T>? options = null)
Parameters
- options TimeSeriesIsolationForestOptions<T>
Configuration options. Uses defaults if null.
Methods
CreateInstance()
Creates a new instance of the derived model class.
protected override IFullModel<T, Matrix<T>, Vector<T>> CreateInstance()
Returns
- IFullModel<T, Matrix<T>, Vector<T>>
A new instance of the same model type.
Remarks
This override implements the base class's abstract factory method. Clone and DeepCopy call it so that copies are instantiated as the correct derived type.
For Beginners: This method creates a new, empty instance of the specific model type. It's used during cloning and deep copying to ensure that the copy is of the same specific type as the original.
For example, cloning an ARIMA model creates a new ARIMA model, and cloning a TBATS model creates a new TBATS model. In the same way, this override creates a new TimeSeriesIsolationForest<T>.
DeserializeCore(BinaryReader)
Deserializes model-specific data from the binary reader.
protected override void DeserializeCore(BinaryReader reader)
Parameters
- reader BinaryReader
The binary reader to read from.
Remarks
This override implements the base class's abstract method to load the model's own parameters and state.
For Beginners: This method loads the specific details that make each type of time series model unique. It is the counterpart to SerializeCore and must read the data in exactly the same order and format in which it was written, reconstructing the specialized parts of the model.
This separation allows the base class to handle common deserialization tasks while each model type handles its specialized data.
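The ordering requirement can be illustrated with a small round trip. This is a sketch only: `ForestState` and its fields are hypothetical stand-ins, not the state this model actually stores. Only `BinaryWriter` and `BinaryReader` from the .NET base class library are used.

```csharp
using System;
using System.IO;

// Illustrative sketch only: the fields are hypothetical, not the model's real state.
public sealed class ForestState
{
    public int TreeCount;
    public int SampleSize;
    public double Threshold;

    public void Write(BinaryWriter writer)
    {
        writer.Write(TreeCount);   // int, written first
        writer.Write(SampleSize);  // int, written second
        writer.Write(Threshold);   // double, written third
    }

    public static ForestState Read(BinaryReader reader)
    {
        // Read back in exactly the order and with the types used in Write.
        var state = new ForestState();
        state.TreeCount = reader.ReadInt32();
        state.SampleSize = reader.ReadInt32();
        state.Threshold = reader.ReadDouble();
        return state;
    }

    // Writes the state to memory and reads it back.
    public static ForestState RoundTrip(ForestState original)
    {
        using var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream, System.Text.Encoding.UTF8, leaveOpen: true))
        {
            original.Write(writer);
        }
        stream.Position = 0;
        using var reader = new BinaryReader(stream);
        return Read(reader);
    }
}
```

Swapping two reads, or reading a field with a different type than it was written with, would silently produce wrong values, which is why the two methods have to be kept in step.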
DetectAnomalies(Vector<T>)
Detects anomalies in the time series and returns anomaly scores.
public Vector<T> DetectAnomalies(Vector<T> timeSeries)
Parameters
- timeSeries Vector<T>
The time series to analyze.
Returns
- Vector<T>
Anomaly scores for each point (higher = more anomalous).
GetAnomalyIndices(Vector<T>)
Gets the indices of detected anomalies.
public List<int> GetAnomalyIndices(Vector<T> timeSeries)
Parameters
- timeSeries Vector<T>
The time series to analyze.
Returns
- List<int>
The indices of the points detected as anomalies.
GetAnomalyLabels(Vector<T>)
Returns binary anomaly labels (true = anomaly).
public bool[] GetAnomalyLabels(Vector<T> timeSeries)
Parameters
- timeSeries Vector<T>
The time series to analyze.
Returns
- bool[]
Boolean array indicating which points are anomalies.
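DetectAnomalies, GetAnomalyLabels and GetAnomalyIndices expose the same result as scores, per-point flags and positions. This page does not say how the cut-off between normal and anomalous is chosen, so the sketch below assumes a fixed score threshold purely for illustration. `ScoreThresholding`, `ToLabels` and `ToIndices` are hypothetical helpers, not part of the library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative sketch only: a fixed threshold is an assumption.
public static class ScoreThresholding
{
    // true where the score exceeds the threshold (higher = more anomalous).
    public static bool[] ToLabels(double[] scores, double threshold) =>
        scores.Select(s => s > threshold).ToArray();

    // Positions of the points flagged as anomalies.
    public static List<int> ToIndices(bool[] labels) =>
        Enumerable.Range(0, labels.Length).Where(i => labels[i]).ToList();
}
```

For example, with scores { 0.42, 0.45, 0.91, 0.40 } and a threshold of 0.6, only the third point is flagged, so the labels are { false, false, true, false } and the index list contains 2.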
GetModelMetadata()
Gets metadata about the time series model.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
A ModelMetadata<T> object containing information about the model.
Remarks
This method provides comprehensive metadata about the model, including its type, configuration options, training status, evaluation metrics, and information about which features/lags are most important.
For Beginners: This method provides important information about the model that can help you understand its characteristics and performance.
The metadata includes:
- The type of model (e.g., ARIMA, TBATS, Neural Network)
- Configuration details (e.g., lag order, seasonality period)
- Whether the model has been trained
- Performance metrics from the last evaluation
- Information about which features (time periods) are most influential
This information is useful for documentation, model comparison, and debugging. It's like a complete summary of everything important about the model.
PredictSingle(Vector<T>)
Generates a prediction for a single input vector.
public override T PredictSingle(Vector<T> input)
Parameters
- input Vector<T>
The input feature vector.
Returns
- T
The predicted value.
Remarks
This override implements the base class's abstract method, generating a prediction for a single input vector using the model-specific algorithm.
For Beginners: This method takes a single row of input data (representing one time point) and calculates what the model predicts will happen at that point. Each type of time series model will have its own way of calculating this prediction based on the patterns it learned during training.
SerializeCore(BinaryWriter)
Serializes model-specific data to the binary writer.
protected override void SerializeCore(BinaryWriter writer)
Parameters
- writer BinaryWriter
The binary writer to write to.
Remarks
This override implements the base class's abstract method to save the model's own parameters and state.
For Beginners: This method is responsible for saving the specific details that make each type of time series model unique. Different models have different internal structures and parameters that need to be saved separately from the common elements.
For example:
- An ARIMA model would save its AR, I, and MA coefficients
- A TBATS model would save its level, trend, and seasonal components
- A neural network model would save its weights and biases
This separation allows the base class to handle common serialization tasks while each model type handles its specialized data.
TrainCore(Matrix<T>, Vector<T>)
Trains the isolation forest on the time series data.
protected override void TrainCore(Matrix<T> x, Vector<T> y)
Parameters
- x Matrix<T>
- y Vector<T>