Class MiniBatchSparsePCA<T>

Namespace
AiDotNet.Preprocessing.DimensionalityReduction
Assembly
AiDotNet.dll

Mini-batch Sparse PCA using online dictionary learning.

public class MiniBatchSparsePCA<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>

Type Parameters

T

The numeric type for calculations (e.g., float, double).

Inheritance
TransformerBase<T, Matrix<T>, Matrix<T>>
MiniBatchSparsePCA<T>
Implements
IDataTransformer<T, Matrix<T>, Matrix<T>>

Remarks

MiniBatchSparsePCA is a faster, more memory-efficient variant of SparsePCA that updates its components from small mini-batches instead of from the full dataset at once. This makes it a better fit for large datasets where full-batch SparsePCA would be too slow or too memory-intensive.

The algorithm uses online dictionary learning with mini-batches, updating the components incrementally as it processes each batch.

For Beginners: Think of this as SparsePCA on a budget:
- SparsePCA looks at ALL your data at once (memory intensive)
- MiniBatchSparsePCA looks at small pieces at a time (memory efficient)
- Results are similar, but mini-batch is faster for large datasets
- Trade-off: Slightly less accurate but much more scalable
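
To make the mini-batch idea concrete, the sketch below shows one plausible way to split sample indices into batches, with optional seeded shuffling. It only illustrates the roles of batchSize, shuffle and randomState; the names MiniBatchSketch and MakeBatches are hypothetical, and the class's actual batching logic is not documented here.

```csharp
using System;
using System.Collections.Generic;

public static class MiniBatchSketch
{
    // Splits the sample indices 0..nSamples-1 into consecutive batches.
    // Hypothetical helper, not part of AiDotNet.
    public static List<int[]> MakeBatches(int nSamples, int batchSize, bool shuffle, int? seed)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var indices = new int[nSamples];
        for (int i = 0; i < nSamples; i++) indices[i] = i;

        if (shuffle)
        {
            // A fixed seed makes the visiting order reproducible.
            var rng = seed.HasValue ? new Random(seed.Value) : new Random();
            // Fisher-Yates shuffle.
            for (int i = nSamples - 1; i > 0; i--)
            {
                int j = rng.Next(i + 1);
                (indices[i], indices[j]) = (indices[j], indices[i]);
            }
        }

        var batches = new List<int[]>();
        for (int start = 0; start < nSamples; start += batchSize)
        {
            // The last batch may be smaller than batchSize.
            int length = Math.Min(batchSize, nSamples - start);
            var batch = new int[length];
            Array.Copy(indices, start, batch, 0, length);
            batches.Add(batch);
        }
        return batches;
    }
}
```

With 120 samples and a batch size of 50, this yields three batches of 50, 50 and 20 indices; an incremental learner would update its components once per batch.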

Constructors

MiniBatchSparsePCA(int, double, double, int, int, double, bool, int?, int[]?)

Creates a new instance of MiniBatchSparsePCA<T>.

public MiniBatchSparsePCA(int nComponents = 2, double alpha = 1, double ridge = 0.01, int batchSize = 50, int nIter = 100, double tol = 1E-06, bool shuffle = true, int? randomState = null, int[]? columnIndices = null)

Parameters

nComponents int

Number of sparse components. Defaults to 2.

alpha double

Sparsity regularization parameter. Defaults to 1.0.

ridge double

Ridge regularization for stability. Defaults to 0.01.

batchSize int

Size of mini-batches. Defaults to 50.

nIter int

Number of iterations over the full dataset. Defaults to 100.

tol double

Convergence tolerance. Defaults to 1e-6.

shuffle bool

Whether to shuffle data before each iteration. Defaults to true.

randomState int?

Random seed for reproducibility. Defaults to null.

columnIndices int[]

The column indices to use, or null for all columns. Defaults to null.
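
As a usage sketch, the constructor can be called with named arguments so that only the non-default values are spelled out. This assumes the AiDotNet package is referenced by the project; ConstructorExample and Create are hypothetical names used only for illustration.

```csharp
using System;
using AiDotNet.Preprocessing.DimensionalityReduction;

public static class ConstructorExample
{
    // Hypothetical wrapper; only the constructor and its parameter names
    // come from the documented signature.
    public static MiniBatchSparsePCA<double> Create()
    {
        // ridge, nIter, tol, shuffle and columnIndices keep their documented defaults.
        return new MiniBatchSparsePCA<double>(
            nComponents: 3,
            alpha: 0.5,
            batchSize: 100,
            randomState: 42);
    }
}
```

The configured values should then be readable through the NComponents, Alpha and BatchSize properties described below.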

Properties

Alpha

Gets the sparsity regularization parameter.

public double Alpha { get; }

Property Value

double
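
This page does not state how Alpha enters the optimization. In sparse coding methods, an L1 penalty of this kind is commonly applied through the soft-thresholding operator, sketched below under that assumption; SparsitySketch and SoftThreshold are hypothetical names, not AiDotNet members.

```csharp
using System;

public static class SparsitySketch
{
    // Soft-thresholding: shrinks x toward zero by alpha and clamps
    // anything with magnitude below alpha to exactly zero.
    // Assumed illustration of an L1 penalty, not the library's code.
    public static double SoftThreshold(double x, double alpha)
    {
        return Math.Sign(x) * Math.Max(Math.Abs(x) - alpha, 0.0);
    }
}
```

Under this reading, a larger Alpha zeroes out more coefficients and therefore produces sparser components.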

BatchSize

Gets the batch size.

public int BatchSize { get; }

Property Value

int

Components

Gets the sparse components (each row is a component).

public double[,]? Components { get; }

Property Value

double[,]
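
Because each row of Components is one component, sparsity can be inspected row by row. The following is a minimal sketch that works on any double[,] laid out this way; ComponentInspection and ZeroFractionPerComponent are hypothetical helper names.

```csharp
using System;

public static class ComponentInspection
{
    // Returns, for each row (component), the fraction of entries whose
    // absolute value is at most eps. Hypothetical helper, not part of AiDotNet.
    public static double[] ZeroFractionPerComponent(double[,] components, double eps = 0.0)
    {
        int nComponents = components.GetLength(0);
        int nFeatures = components.GetLength(1);
        var fractions = new double[nComponents];

        for (int k = 0; k < nComponents; k++)
        {
            int zeros = 0;
            for (int j = 0; j < nFeatures; j++)
            {
                if (Math.Abs(components[k, j]) <= eps) zeros++;
            }
            fractions[k] = nFeatures == 0 ? 0.0 : (double)zeros / nFeatures;
        }
        return fractions;
    }
}
```

Since the property is declared as double[,]?, callers should check it for null before passing it to a helper like this.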

Mean

Gets the mean of each feature.

public double[]? Mean { get; }

Property Value

double[]

NComponents

Gets the number of components.

public int NComponents { get; }

Property Value

int

NSamplesSeen

Gets the number of samples seen during fitting.

public int NSamplesSeen { get; }

Property Value

int

Ridge

Gets the ridge regularization parameter.

public double Ridge { get; }

Property Value

double

SupportsInverseTransform

Gets whether this transformer supports inverse transformation.

public override bool SupportsInverseTransform { get; }

Property Value

bool

Methods

FitCore(Matrix<T>)

Fits Mini-batch Sparse PCA using online dictionary learning.

protected override void FitCore(Matrix<T> data)

Parameters

data Matrix<T>

GetFeatureNamesOut(string[]?)

Gets the output feature names after transformation.

public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)

Parameters

inputFeatureNames string[]

Returns

string[]

InverseTransformCore(Matrix<T>)

Transforms data back to original space.

protected override Matrix<T> InverseTransformCore(Matrix<T> data)

Parameters

data Matrix<T>

Returns

Matrix<T>
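
One plausible reading of "back to original space" is a linear reconstruction: multiply the reduced data by the component rows and add the feature means back. The sketch below illustrates that reading on plain arrays; it is an assumption about the method's behavior, and ReconstructionSketch and Reconstruct are hypothetical names.

```csharp
using System;

public static class ReconstructionSketch
{
    // xHat[i, j] = mean[j] + sum_k z[i, k] * components[k, j]
    // Assumed illustration, not the library's implementation.
    public static double[,] Reconstruct(double[,] z, double[,] components, double[] mean)
    {
        int nSamples = z.GetLength(0);
        int nComponents = components.GetLength(0);
        int nFeatures = components.GetLength(1);
        if (z.GetLength(1) != nComponents || mean.Length != nFeatures)
            throw new ArgumentException("Dimensions do not match.");

        var xHat = new double[nSamples, nFeatures];
        for (int i = 0; i < nSamples; i++)
        {
            for (int j = 0; j < nFeatures; j++)
            {
                double sum = mean[j];
                for (int k = 0; k < nComponents; k++)
                {
                    sum += z[i, k] * components[k, j];
                }
                xHat[i, j] = sum;
            }
        }
        return xHat;
    }
}
```

The reconstruction is generally approximate, because information outside the retained components is lost in the forward transform.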

TransformCore(Matrix<T>)

Transforms the data by projecting onto sparse components.

protected override Matrix<T> TransformCore(Matrix<T> data)

Parameters

data Matrix<T>

Returns

Matrix<T>
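
The simplest form of "projecting onto sparse components" is to center each sample with the feature means and take its dot product with every component row. The sketch below shows that on plain arrays. It is an assumption rather than the documented implementation, which may instead solve a ridge-regularized least-squares problem using Ridge; ProjectionSketch and Project are hypothetical names.

```csharp
using System;

public static class ProjectionSketch
{
    // z[i, k] = sum_j (x[i, j] - mean[j]) * components[k, j]
    // Assumed illustration, not the library's implementation.
    public static double[,] Project(double[,] x, double[] mean, double[,] components)
    {
        int nSamples = x.GetLength(0);
        int nFeatures = x.GetLength(1);
        int nComponents = components.GetLength(0);
        if (mean.Length != nFeatures || components.GetLength(1) != nFeatures)
            throw new ArgumentException("Feature dimensions do not match.");

        var z = new double[nSamples, nComponents];
        for (int i = 0; i < nSamples; i++)
        {
            for (int k = 0; k < nComponents; k++)
            {
                double sum = 0.0;
                for (int j = 0; j < nFeatures; j++)
                {
                    sum += (x[i, j] - mean[j]) * components[k, j];
                }
                z[i, k] = sum;
            }
        }
        return z;
    }
}
```

The output has one row per input sample and one column per component, which matches the reduced shape a dimensionality-reduction transform is expected to return.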