Class IncrementalPCA<T>

Namespace
AiDotNet.Preprocessing.DimensionalityReduction
Assembly
AiDotNet.dll

Incremental Principal Component Analysis for large datasets.

public class IncrementalPCA<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>

Type Parameters

T

The numeric type for calculations (e.g., float, double).

Inheritance
TransformerBase<T, Matrix<T>, Matrix<T>>
IncrementalPCA<T>
Implements
IDataTransformer<T, Matrix<T>, Matrix<T>>
Remarks

IncrementalPCA processes data in batches, making it suitable for datasets too large to fit in memory. It produces similar results to standard PCA but with lower memory requirements.

The algorithm updates the covariance matrix incrementally as each batch is processed, then computes principal components from the final estimate.

For Beginners: Regular PCA needs all data in memory at once. Incremental PCA processes data in chunks:

- You feed it data in batches (e.g., 1000 rows at a time).
- It updates its estimate of the data's structure with each batch.
- It produces principal components similar to those of regular PCA.
- It uses much less memory on large datasets.
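The batch-wise covariance update described in these remarks can be sketched as follows. This is a minimal illustration, not AiDotNet's implementation: the `RunningCovariance` class and its members are hypothetical, and it uses plain `double[,]` arrays instead of `Matrix<T>`. It keeps a running mean and scatter matrix and merges each batch with the standard pairwise-merge formula, so only one batch needs to be in memory at a time.

```csharp
using System;

// Hypothetical helper for illustration; not part of AiDotNet.
public sealed class RunningCovariance
{
    private readonly int _d;
    private readonly double[] _mean;
    private readonly double[,] _scatter; // sum of outer products of centered rows

    public int Count { get; private set; }

    public RunningCovariance(int nFeatures)
    {
        _d = nFeatures;
        _mean = new double[nFeatures];
        _scatter = new double[nFeatures, nFeatures];
    }

    public double[] Mean => (double[])_mean.Clone();

    // Merges one batch (rows = samples, columns = features) into the running statistics.
    public void Update(double[,] batch)
    {
        int m = batch.GetLength(0);
        if (m == 0) return;
        if (batch.GetLength(1) != _d)
            throw new ArgumentException("Batch has the wrong number of features.");

        // Mean of the incoming batch.
        var bMean = new double[_d];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < _d; j++)
                bMean[j] += batch[i, j];
        for (int j = 0; j < _d; j++)
            bMean[j] /= m;

        // Scatter of the batch around its own mean.
        var bScatter = new double[_d, _d];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < _d; j++)
                for (int k = 0; k < _d; k++)
                    bScatter[j, k] += (batch[i, j] - bMean[j]) * (batch[i, k] - bMean[k]);

        int n = Count;
        int total = n + m;
        double w = (double)n * m / total;

        // Combine scatters; the correction term accounts for the shift between the two means.
        for (int j = 0; j < _d; j++)
            for (int k = 0; k < _d; k++)
                _scatter[j, k] += bScatter[j, k]
                    + w * (bMean[j] - _mean[j]) * (bMean[k] - _mean[k]);

        // Update the running mean only after the old mean has been used above.
        for (int j = 0; j < _d; j++)
            _mean[j] += (bMean[j] - _mean[j]) * m / total;

        Count = total;
    }

    // Sample covariance (divides by Count - 1).
    public double[,] Covariance()
    {
        if (Count < 2)
            throw new InvalidOperationException("At least two samples are required.");
        var cov = new double[_d, _d];
        for (int j = 0; j < _d; j++)
            for (int k = 0; k < _d; k++)
                cov[j, k] = _scatter[j, k] / (Count - 1);
        return cov;
    }
}
```

Feeding the same rows in one batch or in several yields the same mean and covariance up to floating-point rounding. Principal components would then come from an eigendecomposition of the final covariance estimate, which is omitted here.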

Constructors

IncrementalPCA(int, int, bool, int[]?)

Creates a new instance of IncrementalPCA<T>.

public IncrementalPCA(int nComponents = 2, int batchSize = 100, bool whiten = false, int[]? columnIndices = null)

Parameters

nComponents int

Number of components to keep. Defaults to 2.

batchSize int

Batch size for incremental updates. Defaults to 100.

whiten bool

If true, scale components to unit variance. Defaults to false.

columnIndices int[]

The column indices to use, or null for all columns.

Properties

BatchSize

Gets the batch size for incremental updates.

public int BatchSize { get; }

Property Value

int

Components

Gets the principal components.

public double[,]? Components { get; }

Property Value

double[,]

ExplainedVariance

Gets the explained variance for each component.

public double[]? ExplainedVariance { get; }

Property Value

double[]

ExplainedVarianceRatio

Gets the explained variance ratio for each component.

public double[]? ExplainedVarianceRatio { get; }

Property Value

double[]
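As an illustration of how such a ratio is conventionally derived, the sketch below divides each component's explained variance by a total variance. The `VarianceRatioSketch.Ratios` helper is hypothetical, and whether AiDotNet takes the total over all features or only over the retained components is not stated here, so the total is passed in explicitly.

```csharp
using System;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class VarianceRatioSketch
{
    // Returns explainedVariance[i] / totalVariance for every component.
    public static double[] Ratios(double[] explainedVariance, double totalVariance)
    {
        if (totalVariance <= 0.0)
            throw new ArgumentOutOfRangeException(nameof(totalVariance), "Total variance must be positive.");

        var ratios = new double[explainedVariance.Length];
        for (int i = 0; i < ratios.Length; i++)
            ratios[i] = explainedVariance[i] / totalVariance;
        return ratios;
    }
}
```

For example, variances of 3 and 1 against a total variance of 5 give ratios of 0.6 and 0.2, so the two components together account for 80% of the variance.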

Mean

Gets the mean of each feature.

public double[]? Mean { get; }

Property Value

double[]

NComponents

Gets the number of components to keep.

public int NComponents { get; }

Property Value

int

NSamplesSeen

Gets the number of samples seen during fitting.

public int NSamplesSeen { get; }

Property Value

int

SupportsInverseTransform

Gets whether this transformer supports inverse transformation.

public override bool SupportsInverseTransform { get; }

Property Value

bool

Whiten

Gets whether whitening is applied.

public bool Whiten { get; }

Property Value

bool
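To make the effect of whitening concrete, the sketch below divides each projected column by the square root of its explained variance, which gives every output column unit variance. The `WhiteningSketch.Whiten` helper and its `epsilon` guard are assumptions for illustration only; the library's own scaling may differ in detail.

```csharp
using System;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class WhiteningSketch
{
    // scores: [nSamples, nComponents]; explainedVariance: [nComponents].
    public static double[,] Whiten(double[,] scores, double[] explainedVariance, double epsilon = 1e-12)
    {
        int n = scores.GetLength(0), k = scores.GetLength(1);
        if (explainedVariance.Length != k)
            throw new ArgumentException("One variance is required per component.");

        var result = new double[n, k];
        for (int c = 0; c < k; c++)
        {
            // Clamp tiny variances so the division stays finite.
            double scale = Math.Sqrt(Math.Max(explainedVariance[c], epsilon));
            for (int i = 0; i < n; i++)
                result[i, c] = scores[i, c] / scale;
        }
        return result;
    }
}
```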

Methods

FitCore(Matrix<T>)

Fits IncrementalPCA by processing data in batches.

protected override void FitCore(Matrix<T> data)

Parameters

data Matrix<T>

The training data matrix.

GetFeatureNamesOut(string[]?)

Gets the output feature names after transformation.

public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)

Parameters

inputFeatureNames string[]

The names of the input features, or null if they are not available.

Returns

string[]

InverseTransformCore(Matrix<T>)

Transforms data back to original space.

protected override Matrix<T> InverseTransformCore(Matrix<T> data)

Parameters

data Matrix<T>

The transformed data.

Returns

Matrix<T>

Data in original feature space.
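Mapping back to the original space is conventionally the reverse of the projection: multiply the scores by the components and add the mean back. The sketch below assumes components are stored as [nComponents, nFeatures] and ignores whitening. Both are assumptions, and `PcaReconstruction.Reconstruct` is a hypothetical helper, not the library's code.

```csharp
using System;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class PcaReconstruction
{
    // scores: [nSamples, nComponents]; components: [nComponents, nFeatures] (assumed layout).
    public static double[,] Reconstruct(double[,] scores, double[] mean, double[,] components)
    {
        int n = scores.GetLength(0), k = scores.GetLength(1);
        int d = components.GetLength(1);
        if (components.GetLength(0) != k || mean.Length != d)
            throw new ArgumentException("Shapes of scores, mean, and components do not match.");

        var data = new double[n, d];
        for (int i = 0; i < n; i++)
        {
            for (int j = 0; j < d; j++)
            {
                double value = mean[j]; // start from the feature mean
                for (int c = 0; c < k; c++)
                    value += scores[i, c] * components[c, j];
                data[i, j] = value;
            }
        }
        return data;
    }
}
```

When fewer components are kept than there are features, the result is an approximation: variation along the discarded directions is replaced by the mean.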

PartialFit(Matrix<T>)

Partially fits the model with a new batch of data.

public void PartialFit(Matrix<T> batch)

Parameters

batch Matrix<T>

A batch of training data.
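A caller streaming data into PartialFit needs some way to cut rows into batches. The sketch below shows one way to do that with plain `double[,]` arrays. `BatchingSketch.RowBatches` is a hypothetical helper, and converting each batch to `Matrix<T>` before calling PartialFit is left out because the `Matrix<T>` construction API is not covered on this page.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class BatchingSketch
{
    // Lazily yields consecutive row blocks of at most batchSize rows.
    // Because this is an iterator, the argument check runs on first enumeration.
    public static IEnumerable<double[,]> RowBatches(double[,] data, int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        int rows = data.GetLength(0), cols = data.GetLength(1);
        for (int start = 0; start < rows; start += batchSize)
        {
            // The last batch may be smaller than batchSize.
            int count = Math.Min(batchSize, rows - start);
            var batch = new double[count, cols];
            for (int i = 0; i < count; i++)
                for (int j = 0; j < cols; j++)
                    batch[i, j] = data[start + i, j];
            yield return batch;
        }
    }
}
```

Each yielded block would then be wrapped in a `Matrix<T>` and passed to PartialFit in turn. In a real out-of-core setting the batches would more likely be read from disk or a stream than sliced from an in-memory array.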

TransformCore(Matrix<T>)

Transforms the data by projecting onto principal components.

protected override Matrix<T> TransformCore(Matrix<T> data)

Parameters

data Matrix<T>

The data to transform.

Returns

Matrix<T>

The transformed data.
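Projecting onto principal components conventionally means centering each row with the fitted mean and taking its dot product with every component. The sketch below illustrates that with plain arrays. The [nComponents, nFeatures] layout of the components is an assumption, and `PcaProjection.Project` is a hypothetical helper, not AiDotNet's implementation.

```csharp
using System;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class PcaProjection
{
    // data: [nSamples, nFeatures]; components: [nComponents, nFeatures] (assumed layout).
    public static double[,] Project(double[,] data, double[] mean, double[,] components)
    {
        int n = data.GetLength(0), d = data.GetLength(1);
        int k = components.GetLength(0);
        if (mean.Length != d || components.GetLength(1) != d)
            throw new ArgumentException("Feature counts do not match.");

        var scores = new double[n, k];
        for (int i = 0; i < n; i++)
        {
            for (int c = 0; c < k; c++)
            {
                double sum = 0.0;
                for (int j = 0; j < d; j++)
                    sum += (data[i, j] - mean[j]) * components[c, j]; // center, then project
                scores[i, c] = sum;
            }
        }
        return scores;
    }
}
```

The output has one column per component, so a matrix with many features is reduced to nComponents columns.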