Class LowRankFactorizationCompression<T>
- Namespace
- AiDotNet.ModelCompression
- Assembly
- AiDotNet.dll
Implements Low-Rank Factorization compression using SVD-like decomposition.
public class LowRankFactorizationCompression<T> : ModelCompressionBase<T>, IModelCompressionStrategy<T>
Type Parameters
T
The numeric type used for calculations (e.g., float, double).
- Inheritance
- object → ModelCompressionBase<T> → LowRankFactorizationCompression<T>
- Implements
- IModelCompressionStrategy<T>
- Inherited Members
Remarks
Low-Rank Factorization approximates weight matrices by decomposing them into products of smaller matrices. This is based on the observation that many neural network weight matrices are approximately low-rank, meaning they can be represented with fewer parameters.
For Beginners: Low-Rank Factorization is like summarizing a book.
The concept:
- A weight matrix might be 1000×1000 = 1,000,000 parameters
- But the actual "information content" might be much smaller
- We can approximate it as: W ≈ A × B where A is 1000×50 and B is 50×1000
- Now we only store: 50,000 + 50,000 = 100,000 parameters (10x compression!)
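The parameter arithmetic above is easy to verify; a quick Python sketch (the helper name is illustrative, not part of AiDotNet):

```python
# Parameters needed to store an m x n matrix exactly vs. as rank-k factors
# A (m x k) and B (k x n).
def factored_params(m: int, n: int, k: int) -> int:
    return m * k + k * n

m, n, k = 1000, 1000, 50
original = m * n                    # 1,000,000 parameters
stored = factored_params(m, n, k)
print(stored)                       # 100000
print(original / stored)            # 10.0  (10x compression)
```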
How it works:
- Treat the weight vector as a matrix (reshape it)
- Perform approximate factorization (similar to SVD)
- Keep only the top-k singular values/vectors
- Store the factored matrices instead of the original
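The steps above can be sketched with a plain power-iteration routine. The following Python is a minimal, deflation-based illustration of the idea, not AiDotNet's actual implementation:

```python
import math
import random

def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def top_singular_triplet(A, max_iterations=100, tolerance=1e-6):
    """Power iteration on A^T A: returns (u, sigma, v) for the top singular value."""
    At = transpose(A)
    rng = random.Random(0)
    v = [rng.random() + 0.1 for _ in range(len(A[0]))]
    nv = norm(v)
    v = [x / nv for x in v]
    for _ in range(max_iterations):
        w = matvec(At, matvec(A, v))         # one step of A^T A power iteration
        nw = norm(w)
        if nw == 0.0:
            break
        w = [x / nw for x in w]
        delta = norm([a - b for a, b in zip(w, v)])
        v = w
        if delta < tolerance:                # converged to the right singular vector
            break
    Av = matvec(A, v)
    sigma = norm(Av)
    u = [x / sigma for x in Av] if sigma > 0 else [0.0] * len(Av)
    return u, sigma, v

def low_rank_factor(A, k):
    """Extract the top-k singular triplets by repeated deflation."""
    A = [row[:] for row in A]                # work on a copy
    triplets = []
    for _ in range(k):
        u, s, v = top_singular_triplet(A)
        triplets.append((u, s, v))
        for i in range(len(A)):              # deflate: A -= s * u * v^T
            for j in range(len(A[i])):
                A[i][j] -= s * u[i] * v[j]
    return triplets

A = [[4, 5, 6], [8, 10, 12], [12, 15, 18]]   # rank 1: outer([1,2,3],[4,5,6])
u, s, v = low_rank_factor(A, 1)[0]
print(abs(s - 1078 ** 0.5) < 1e-6)           # prints True: sigma = sqrt(14 * 77)
```

Because the example matrix is exactly rank 1, a single triplet reconstructs it (to floating-point precision): W[i][j] ≈ s * u[i] * v[j].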
Benefits:
- Compression ratio is controlled by the rank k
- Works especially well for fully-connected layers
- Maintains smoothness in the weight space
Trade-offs:
- Need to choose the rank k (compression vs accuracy trade-off)
- Works best when weights have inherent low-rank structure
Constructors
LowRankFactorizationCompression(int, double, int, double)
Initializes a new instance of the LowRankFactorizationCompression class.
public LowRankFactorizationCompression(int targetRank = 0, double energyThreshold = 0.95, int maxIterations = 100, double tolerance = 1E-06)
Parameters
targetRank int
Target rank for the factorization (default: 0 = auto, determined by energyThreshold).
energyThreshold double
Minimum fraction of energy to preserve (default: 0.95 = 95%).
maxIterations int
Maximum iterations for the power method (default: 100).
tolerance double
Convergence tolerance (default: 1e-6).
Remarks
For Beginners: These parameters control the factorization:
targetRank: How many dimensions to keep
- Lower rank = more compression but potentially less accuracy
- If 0, automatically determined by energyThreshold
energyThreshold: What fraction of "information" to preserve
- 0.95 = keep 95% of the variance (recommended)
- 0.99 = keep 99% (higher quality, less compression)
- 0.90 = keep 90% (more compression, lower quality)
maxIterations/tolerance: Control the numerical algorithm
- Defaults work well for most cases
Methods
Compress(Vector<T>)
Compresses weights using low-rank factorization.
public override (Vector<T> compressedWeights, ICompressionMetadata<T> metadata) Compress(Vector<T> weights)
Parameters
weights Vector<T>
The original model weights.
Returns
- (Vector<T> compressedWeights, ICompressionMetadata<T> metadata)
Factored representation and metadata.
Decompress(Vector<T>, ICompressionMetadata<T>)
Decompresses by reconstructing from U, S, V factors.
public override Vector<T> Decompress(Vector<T> compressedWeights, ICompressionMetadata<T> metadata)
Parameters
compressedWeights Vector<T>
The factored representation produced by Compress.
metadata ICompressionMetadata<T>
The metadata produced by Compress, needed for reconstruction.
Returns
- Vector<T>
The reconstructed weight vector.
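Reconstruction from U, S, V factors amounts to computing W = U · diag(S) · Vᵀ and flattening the result back into a weight vector. A Python sketch; the storage layout (U, then S, then V, row-major) is an assumption for illustration, not AiDotNet's actual format:

```python
# Rebuild an m x n weight matrix from flattened rank-k factors, then flatten
# it back to a weight vector. Layout assumed: U (m*k), S (k), V (n*k).
def decompress_low_rank(flat, m, n, k):
    U = [flat[i * k:(i + 1) * k] for i in range(m)]
    S = flat[m * k:m * k + k]
    off = m * k + k
    V = [flat[off + j * k:off + (j + 1) * k] for j in range(n)]
    # W[i][j] = sum_r S[r] * U[i][r] * V[j][r], i.e. W = U * diag(S) * V^T
    return [sum(S[r] * U[i][r] * V[j][r] for r in range(k))
            for i in range(m) for j in range(n)]

# Rank-1 example: W = 3 * [1, 2]^T [4, 5]
print(decompress_low_rank([1, 2, 3, 4, 5], m=2, n=2, k=1))   # [12, 15, 24, 30]
```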
GetCompressedSize(Vector<T>, ICompressionMetadata<T>)
Gets the compressed size.
public override long GetCompressedSize(Vector<T> compressedWeights, ICompressionMetadata<T> metadata)
Parameters
compressedWeights Vector<T>
The factored representation produced by Compress.
metadata ICompressionMetadata<T>
The metadata describing the factorization.
Returns
- long
The size of the compressed representation.