Class IncrementalPCA<T>
- Namespace
- AiDotNet.Preprocessing.DimensionalityReduction
- Assembly
- AiDotNet.dll
Incremental Principal Component Analysis for large datasets.
public class IncrementalPCA<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>
Type Parameters
- T: The numeric type for calculations (e.g., float, double).
- Inheritance
- TransformerBase<T, Matrix<T>, Matrix<T>>
IncrementalPCA<T>
- Implements
- IDataTransformer<T, Matrix<T>, Matrix<T>>
Remarks
IncrementalPCA processes data in batches, making it suitable for datasets too large to fit in memory. It produces similar results to standard PCA but with lower memory requirements.
The algorithm updates the covariance matrix incrementally as each batch is processed, then computes principal components from the final estimate.
For Beginners: Regular PCA needs all data in memory at once. Incremental PCA processes data in chunks:
- You feed it data in batches (e.g., 1000 rows at a time).
- It updates its understanding of the data with each batch.
- It produces principal components similar to those of regular PCA.
- It uses much less memory for large datasets.
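The batch-by-batch covariance update described above can be sketched as follows. This is a minimal illustration and not the library's implementation: `RunningCovariance` and its members are hypothetical names, plain `double[][]` batches stand in for `Matrix<T>`, and the update rule used is the standard running mean and co-moment recurrence, which may differ from what IncrementalPCA<T> does internally.

```csharp
using System;

// Hypothetical helper for illustration only; not part of AiDotNet.
public sealed class RunningCovariance
{
    private readonly int _nFeatures;
    private readonly double[] _mean;
    private readonly double[,] _coMoment; // running sum of co-deviations

    public int Count { get; private set; }

    public RunningCovariance(int nFeatures)
    {
        if (nFeatures <= 0) throw new ArgumentOutOfRangeException(nameof(nFeatures));
        _nFeatures = nFeatures;
        _mean = new double[nFeatures];
        _coMoment = new double[nFeatures, nFeatures];
    }

    // Folds one batch into the running statistics; the batch can be discarded afterwards.
    public void Update(double[][] batch)
    {
        var delta = new double[_nFeatures];
        foreach (double[] row in batch)
        {
            if (row.Length != _nFeatures)
                throw new ArgumentException("Row length does not match the feature count.");

            Count++;
            for (int i = 0; i < _nFeatures; i++)
            {
                delta[i] = row[i] - _mean[i]; // deviation from the mean before this row
                _mean[i] += delta[i] / Count; // mean now includes this row
            }
            for (int i = 0; i < _nFeatures; i++)
                for (int j = 0; j < _nFeatures; j++)
                    _coMoment[i, j] += delta[i] * (row[j] - _mean[j]);
        }
    }

    public double[] Mean() => (double[])_mean.Clone();

    // Sample covariance (divides by Count - 1) of everything seen so far.
    public double[,] Covariance()
    {
        if (Count < 2) throw new InvalidOperationException("Need at least two samples.");
        var cov = new double[_nFeatures, _nFeatures];
        for (int i = 0; i < _nFeatures; i++)
            for (int j = 0; j < _nFeatures; j++)
                cov[i, j] = _coMoment[i, j] / (Count - 1);
        return cov;
    }
}
```

Only the mean vector and one feature-by-feature matrix are kept between batches, so memory use depends on the number of features and not on the number of rows. Principal components would then be taken as the eigenvectors of the final covariance estimate with the largest eigenvalues.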
Constructors
IncrementalPCA(int, int, bool, int[]?)
Creates a new instance of IncrementalPCA<T>.
public IncrementalPCA(int nComponents = 2, int batchSize = 100, bool whiten = false, int[]? columnIndices = null)
Parameters
- nComponents (int): Number of components to keep. Defaults to 2.
- batchSize (int): Batch size for incremental updates. Defaults to 100.
- whiten (bool): If true, scale components to unit variance. Defaults to false.
- columnIndices (int[]): The column indices to use, or null for all columns.
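A short construction example, using only the constructor signature documented above. It assumes the AiDotNet assembly is referenced and that `double` is an accepted type argument; the variable names and the chosen values are illustrative.

```csharp
using AiDotNet.Preprocessing.DimensionalityReduction;

// Keep 3 components, use 500 rows per incremental update, and whiten the output.
var pca = new IncrementalPCA<double>(nComponents: 3, batchSize: 500, whiten: true);

// Restrict the analysis to columns 0, 2 and 5; the other parameters keep their defaults.
var subset = new IncrementalPCA<double>(columnIndices: new[] { 0, 2, 5 });
```

Named arguments make it possible to set `columnIndices` without repeating the defaults for the parameters that precede it.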
Properties
BatchSize
Gets the batch size for incremental updates.
public int BatchSize { get; }
Property Value
- int
Components
Gets the principal components.
public double[,]? Components { get; }
Property Value
- double[,]
ExplainedVariance
Gets the explained variance for each component.
public double[]? ExplainedVariance { get; }
Property Value
- double[]
ExplainedVarianceRatio
Gets the explained variance ratio for each component.
public double[]? ExplainedVarianceRatio { get; }
Property Value
- double[]
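By the usual PCA convention, the explained variance ratio of a component is its variance divided by the total variance of the data. The sketch below assumes that convention. `VarianceRatioSketch` and its parameters are hypothetical, and this page does not state whether IncrementalPCA<T> divides by the total variance of all features or only by the sum over the kept components.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class VarianceRatioSketch
{
    // Divides each component's variance by the supplied total variance.
    public static double[] Ratios(double[] explainedVariance, double totalVariance)
    {
        if (totalVariance <= 0)
            throw new ArgumentOutOfRangeException(nameof(totalVariance));
        return explainedVariance.Select(v => v / totalVariance).ToArray();
    }
}
```

When the total covers all features, the ratios of the kept components sum to at most 1, and the shortfall is the share of variance lost by dropping the remaining components.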
Mean
Gets the mean of each feature.
public double[]? Mean { get; }
Property Value
- double[]
NComponents
Gets the number of components to keep.
public int NComponents { get; }
Property Value
- int
NSamplesSeen
Gets the number of samples seen during fitting.
public int NSamplesSeen { get; }
Property Value
- int
SupportsInverseTransform
Gets whether this transformer supports inverse transformation.
public override bool SupportsInverseTransform { get; }
Property Value
- bool
Whiten
Gets whether whitening is applied.
public bool Whiten { get; }
Property Value
- bool
Methods
FitCore(Matrix<T>)
Fits IncrementalPCA by processing data in batches.
protected override void FitCore(Matrix<T> data)
Parameters
- data (Matrix<T>): The training data matrix.
GetFeatureNamesOut(string[]?)
Gets the output feature names after transformation.
public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)
Parameters
- inputFeatureNames (string[])
Returns
- string[]
InverseTransformCore(Matrix<T>)
Transforms data back to original space.
protected override Matrix<T> InverseTransformCore(Matrix<T> data)
Parameters
- data (Matrix<T>): The transformed data.
Returns
- Matrix<T>
Data in original feature space.
PartialFit(Matrix<T>)
Partially fits the model with a new batch of data.
public void PartialFit(Matrix<T> batch)
Parameters
- batch (Matrix<T>): A batch of training data.
TransformCore(Matrix<T>)
Transforms the data by projecting onto principal components.
protected override Matrix<T> TransformCore(Matrix<T> data)
Parameters
- data (Matrix<T>): The data to transform.
Returns
- Matrix<T>
The transformed data.
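The projection described here, and the reverse mapping described under InverseTransformCore, can be sketched for a single row as follows. This is an illustration under stated assumptions and not the library's code: `PcaProjectionSketch` is a hypothetical helper, the components array is assumed to be laid out as [nComponents, nFeatures] with orthonormal rows, and whitening is assumed to divide each score by the square root of that component's explained variance.

```csharp
using System;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class PcaProjectionSketch
{
    // Centers the row, then takes its dot product with each component.
    // Pass explainedVariance to whiten the scores (assumed rule: divide by sqrt(variance)).
    public static double[] Project(
        double[] x, double[] mean, double[,] components, double[]? explainedVariance = null)
    {
        int k = components.GetLength(0);
        int d = components.GetLength(1);
        var z = new double[k];
        for (int c = 0; c < k; c++)
        {
            double sum = 0.0;
            for (int j = 0; j < d; j++)
                sum += (x[j] - mean[j]) * components[c, j];
            z[c] = explainedVariance is null ? sum : sum / Math.Sqrt(explainedVariance[c]);
        }
        return z;
    }

    // Undoes the whitening (if any), expands the scores back along each component,
    // and adds the mean back.
    public static double[] Reconstruct(
        double[] z, double[] mean, double[,] components, double[]? explainedVariance = null)
    {
        int k = components.GetLength(0);
        int d = components.GetLength(1);
        var x = (double[])mean.Clone();
        for (int c = 0; c < k; c++)
        {
            double score = explainedVariance is null ? z[c] : z[c] * Math.Sqrt(explainedVariance[c]);
            for (int j = 0; j < d; j++)
                x[j] += score * components[c, j];
        }
        return x;
    }
}
```

When fewer components are kept than there are features, the reconstruction is an approximation: whatever part of the row lay along the dropped directions is replaced by the mean.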