Class MiniBatchSparsePCA<T>
- Namespace: AiDotNet.Preprocessing.DimensionalityReduction
- Assembly: AiDotNet.dll
Mini-batch Sparse PCA using online dictionary learning.
public class MiniBatchSparsePCA<T> : TransformerBase<T, Matrix<T>, Matrix<T>>, IDataTransformer<T, Matrix<T>, Matrix<T>>
Type Parameters
T: The numeric type for calculations (e.g., float, double).
- Inheritance: TransformerBase<T, Matrix<T>, Matrix<T>> → MiniBatchSparsePCA<T>
- Implements: IDataTransformer<T, Matrix<T>, Matrix<T>>
Remarks
MiniBatchSparsePCA is a faster, memory-efficient version of SparsePCA that processes data in mini-batches instead of using the full dataset. This makes it suitable for large datasets that don't fit in memory.
The algorithm uses online dictionary learning with mini-batches, updating the components incrementally as it processes each batch.
For Beginners: Think of this as SparsePCA on a budget:
- SparsePCA looks at all of your data at once (memory intensive).
- MiniBatchSparsePCA looks at small pieces at a time (memory efficient).
- Results are similar, but the mini-batch version is faster on large datasets.
- Trade-off: slightly less accurate, but much more scalable.
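The mini-batch idea described above can be sketched without the library. The example below only shows how a dataset's row indices might be shuffled and cut into batches before each pass. `MiniBatchSketch` and `MakeBatches` are hypothetical names introduced for illustration; this is not the AiDotNet implementation, and the dictionary update that would run on each batch is omitted.

```csharp
using System;
using System.Collections.Generic;

public static class MiniBatchSketch
{
    // Hypothetical helper: splits row indices 0..nSamples-1 into batches.
    // The last batch may be smaller than batchSize.
    public static List<int[]> MakeBatches(int nSamples, int batchSize, bool shuffle, int? seed)
    {
        if (nSamples < 0) throw new ArgumentOutOfRangeException(nameof(nSamples));
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var indices = new int[nSamples];
        for (int i = 0; i < nSamples; i++) indices[i] = i;

        if (shuffle)
        {
            // Fisher-Yates shuffle; a fixed seed makes the order reproducible.
            var rng = seed.HasValue ? new Random(seed.Value) : new Random();
            for (int i = nSamples - 1; i > 0; i--)
            {
                int j = rng.Next(i + 1);
                (indices[i], indices[j]) = (indices[j], indices[i]);
            }
        }

        var batches = new List<int[]>();
        for (int start = 0; start < nSamples; start += batchSize)
        {
            int length = Math.Min(batchSize, nSamples - start);
            var batch = new int[length];
            Array.Copy(indices, start, batch, 0, length);
            batches.Add(batch);
        }
        return batches;
    }
}
```

With 120 samples and a batch size of 50, this yields three batches of 50, 50, and 20 rows, so only one batch of rows needs to be processed at a time.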
Constructors
MiniBatchSparsePCA(int, double, double, int, int, double, bool, int?, int[]?)
Creates a new instance of MiniBatchSparsePCA<T>.
public MiniBatchSparsePCA(int nComponents = 2, double alpha = 1, double ridge = 0.01, int batchSize = 50, int nIter = 100, double tol = 1E-06, bool shuffle = true, int? randomState = null, int[]? columnIndices = null)
Parameters
nComponents (int): Number of sparse components. Defaults to 2.
alpha (double): Sparsity regularization parameter. Defaults to 1.0.
ridge (double): Ridge regularization for stability. Defaults to 0.01.
batchSize (int): Size of mini-batches. Defaults to 50.
nIter (int): Number of iterations over the full dataset. Defaults to 100.
tol (double): Convergence tolerance. Defaults to 1e-6.
shuffle (bool): Whether to shuffle data before each iteration. Defaults to true.
randomState (int?): Random seed for reproducibility. Defaults to null.
columnIndices (int[]): The column indices to use, or null for all columns. Defaults to null.
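A minimal construction sketch, based only on the constructor signature documented above. Named arguments let you override a few parameters and keep the defaults for the rest. The argument values are arbitrary, and the assumption that `NComponents`, `Alpha`, and `BatchSize` simply echo the constructor arguments is not confirmed by this page.

```csharp
using System;
using AiDotNet.Preprocessing.DimensionalityReduction;

// Override a few parameters by name; ridge, nIter, tol, shuffle,
// and columnIndices keep their documented defaults.
var pca = new MiniBatchSparsePCA<double>(
    nComponents: 3,
    alpha: 0.5,
    batchSize: 100,
    randomState: 42);

// Assumption: these read-only properties expose the constructor arguments.
Console.WriteLine($"{pca.NComponents} components, alpha={pca.Alpha}, batch size={pca.BatchSize}");
```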
Properties
Alpha
Gets the sparsity regularization parameter.
public double Alpha { get; }
Property Value
- double
BatchSize
Gets the batch size.
public int BatchSize { get; }
Property Value
- int
Components
Gets the sparse components (each row is a component).
public double[,]? Components { get; }
Property Value
- double[,]
Mean
Gets the mean of each feature.
public double[]? Mean { get; }
Property Value
- double[]
NComponents
Gets the number of components.
public int NComponents { get; }
Property Value
- int
NSamplesSeen
Gets the number of samples seen during fitting.
public int NSamplesSeen { get; }
Property Value
- int
Ridge
Gets the ridge regularization parameter.
public double Ridge { get; }
Property Value
- double
SupportsInverseTransform
Gets whether this transformer supports inverse transformation.
public override bool SupportsInverseTransform { get; }
Property Value
- bool
Methods
FitCore(Matrix<T>)
Fits Mini-batch Sparse PCA using online dictionary learning.
protected override void FitCore(Matrix<T> data)
Parameters
data (Matrix<T>)
GetFeatureNamesOut(string[]?)
Gets the output feature names after transformation.
public override string[] GetFeatureNamesOut(string[]? inputFeatureNames = null)
Parameters
inputFeatureNames (string[])
Returns
- string[]
InverseTransformCore(Matrix<T>)
Transforms data back to the original feature space.
protected override Matrix<T> InverseTransformCore(Matrix<T> data)
Parameters
data (Matrix<T>)
Returns
- Matrix<T>
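As a rough illustration of what mapping back to the original space usually means for a component-based model, the sketch below multiplies the reduced codes by the component matrix (one component per row, as in `Components`) and adds back the per-feature mean (as in `Mean`). This is an assumption about the general technique, not a description of how `InverseTransformCore` is implemented; `ReconstructionSketch` and `Reconstruct` are hypothetical names.

```csharp
using System;

public static class ReconstructionSketch
{
    // codes:      n x k matrix of transformed samples
    // components: k x d matrix, each row is one component
    // mean:       length-d vector of per-feature means
    public static double[,] Reconstruct(double[,] codes, double[,] components, double[] mean)
    {
        int n = codes.GetLength(0);
        int k = codes.GetLength(1);
        int d = components.GetLength(1);

        if (components.GetLength(0) != k)
            throw new ArgumentException("codes and components disagree on the number of components.");
        if (mean.Length != d)
            throw new ArgumentException("mean length must match the number of features.");

        var result = new double[n, d];
        for (int i = 0; i < n; i++)
        {
            for (int j = 0; j < d; j++)
            {
                // Start from the feature mean, then add each component's contribution.
                double sum = mean[j];
                for (int c = 0; c < k; c++)
                    sum += codes[i, c] * components[c, j];
                result[i, j] = sum;
            }
        }
        return result;
    }
}
```

Because the reduced representation has fewer dimensions than the input, a reconstruction of this kind is generally an approximation of the original data rather than an exact copy.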
TransformCore(Matrix<T>)
Transforms the data by projecting onto sparse components.
protected override Matrix<T> TransformCore(Matrix<T> data)
Parameters
data (Matrix<T>)
Returns
- Matrix<T>