Class LowRankFactorizationCompression<T>

Namespace: AiDotNet.ModelCompression
Assembly: AiDotNet.dll

Implements Low-Rank Factorization compression using SVD-like decomposition.

public class LowRankFactorizationCompression<T> : ModelCompressionBase<T>, IModelCompressionStrategy<T>

Type Parameters

T

The numeric type used for calculations (e.g., float, double).

Inheritance
object ← ModelCompressionBase<T> ← LowRankFactorizationCompression<T>
Implements
IModelCompressionStrategy<T>

Remarks

Low-Rank Factorization approximates weight matrices by decomposing them into products of smaller matrices. This is based on the observation that many neural network weight matrices are approximately low-rank, meaning they can be represented with fewer parameters.

For Beginners: Low-Rank Factorization is like summarizing a book: you keep far fewer words, but preserve most of the meaning.

The concept:

  • A 1000×1000 weight matrix has 1,000,000 parameters
  • But the actual "information content" might be much smaller
  • We can approximate it as: W ≈ A × B where A is 1000×50 and B is 50×1000
  • Now we only store: 50,000 + 50,000 = 100,000 parameters (10x compression!)
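The arithmetic behind this example can be checked directly (a plain illustration of the parameter counts, not library code):

```python
# Parameter counts for the example above: a 1000x1000 weight matrix W
# approximated as A (1000x50) times B (50x1000).
m, n, k = 1000, 1000, 50

original = m * n              # full matrix: 1,000,000 parameters
factored = m * k + k * n      # two factors: 50,000 + 50,000 = 100,000
ratio = original / factored

print(original, factored, ratio)  # 1000000 100000 10.0
```

In general a rank-k factorization of an m×n matrix stores m·k + k·n values, so compression pays off whenever k < m·n / (m + n).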

How it works:

  1. Treat the weight vector as a matrix (reshape it)
  2. Perform approximate factorization (similar to SVD)
  3. Keep only the top-k singular values/vectors
  4. Store the factored matrices instead of the original
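The four steps above can be sketched in a few lines of Python using NumPy's exact SVD. (This class itself uses an approximate power-method factorization; the matrix shapes and names here are illustrative only.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: build a matrix with genuine low-rank structure plus a little
# noise, mimicking a reshaped weight vector that is approximately low-rank.
m, n, true_rank = 64, 64, 5
W = rng.standard_normal((m, true_rank)) @ rng.standard_normal((true_rank, n))
W += 0.01 * rng.standard_normal((m, n))

# Steps 2-3: factorize, then keep only the top-k singular values/vectors.
k = 5
U, S, Vt = np.linalg.svd(W, full_matrices=False)
U_k, S_k, Vt_k = U[:, :k], S[:k], Vt[:k, :]

# Step 4: store U_k, S_k, Vt_k instead of W; reconstruct on demand.
W_approx = U_k @ np.diag(S_k) @ Vt_k

rel_error = np.linalg.norm(W - W_approx) / np.linalg.norm(W)
print(rel_error)  # tiny, because W is nearly rank-5
```

Because W was built from a rank-5 product plus small noise, truncating to k = 5 discards almost nothing; for real weight matrices the reconstruction error grows as k shrinks.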

Benefits:

  • Compression ratio is controlled by the rank k
  • Works especially well for fully-connected layers
  • Maintains smoothness in the weight space

Trade-offs:

  • Need to choose the rank k (compression vs accuracy trade-off)
  • Works best when weights have inherent low-rank structure

Constructors

LowRankFactorizationCompression(int, double, int, double)

Initializes a new instance of the LowRankFactorizationCompression<T> class.

public LowRankFactorizationCompression(int targetRank = 0, double energyThreshold = 0.95, int maxIterations = 100, double tolerance = 1E-06)

Parameters

targetRank int

Target rank for the factorization (default: 0 = auto based on energy).

energyThreshold double

Minimum energy to preserve (default: 0.95 = 95%).

maxIterations int

Maximum iterations for power method (default: 100).

tolerance double

Convergence tolerance (default: 1e-6).

Remarks

For Beginners: These parameters control the factorization:

  • targetRank: How many dimensions to keep

    • Lower rank = more compression but potentially less accuracy
    • If 0, automatically determined by energyThreshold
  • energyThreshold: What fraction of "information" to preserve

    • 0.95 = keep 95% of the variance (recommended)
    • 0.99 = keep 99% (higher quality, less compression)
    • 0.90 = keep 90% (more compression, lower quality)
  • maxIterations/tolerance: Control the numerical algorithm

    • Defaults work well for most cases
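When targetRank is 0, the rank is chosen from the singular-value spectrum so that the kept values account for at least energyThreshold of the total energy. A minimal sketch of that selection rule (the function name and exact criterion are illustrative, not the library's implementation):

```python
import numpy as np

def rank_for_energy(singular_values, energy_threshold=0.95):
    """Smallest rank whose singular values capture the requested
    fraction of total energy (sum of squared singular values)."""
    energy = np.asarray(singular_values, dtype=float) ** 2
    cumulative = np.cumsum(energy) / energy.sum()
    # First index where the cumulative energy reaches the threshold.
    return int(np.searchsorted(cumulative, energy_threshold) + 1)

# A sharply decaying spectrum: almost all energy in the first two values.
s = np.array([10.0, 5.0, 1.0, 0.5, 0.1])
print(rank_for_energy(s, 0.95))  # 2
```

Raising energyThreshold toward 1.0 keeps more singular values (less compression, higher fidelity); lowering it keeps fewer.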

Methods

Compress(Vector<T>)

Compresses weights using low-rank factorization.

public override (Vector<T> compressedWeights, ICompressionMetadata<T> metadata) Compress(Vector<T> weights)

Parameters

weights Vector<T>

The original model weights.

Returns

(Vector<T> compressedWeights, ICompressionMetadata<T> metadata)

Factored representation and metadata.
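Conceptually, Compress reshapes the flat weight vector into a matrix, truncates its factorization, and packs the factors back into a flat vector alongside the shape metadata needed to invert the process. The Python sketch below shows that data flow; the actual C# implementation, factor packing order, and metadata layout may differ.

```python
import numpy as np

def compress(weights, k):
    """Illustrative Compress: reshape the flat weight vector into a
    near-square matrix, truncate its SVD to rank k, and return the
    flattened factors plus reconstruction metadata."""
    n = weights.size
    rows = int(np.floor(np.sqrt(n)))
    while n % rows != 0:      # pick a rows x cols shape that tiles n exactly
        rows -= 1
    cols = n // rows
    W = weights.reshape(rows, cols)

    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    U_k, S_k, Vt_k = U[:, :k], S[:k], Vt[:k, :]

    compressed = np.concatenate([U_k.ravel(), S_k, Vt_k.ravel()])
    metadata = {"rows": rows, "cols": cols, "rank": k}
    return compressed, metadata

rng = np.random.default_rng(1)
w = rng.standard_normal(100)          # 100 weights -> 10 x 10 matrix
c, meta = compress(w, k=3)
print(c.size, meta)                   # 63 stored values vs 100 originals
```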

Decompress(Vector<T>, ICompressionMetadata<T>)

Decompresses by reconstructing from U, S, V factors.

public override Vector<T> Decompress(Vector<T> compressedWeights, ICompressionMetadata<T> metadata)

Parameters

compressedWeights Vector<T>
metadata ICompressionMetadata<T>

Returns

Vector<T>
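Reconstruction is a pair of matrix multiplications: slice the compressed vector back into the stored factors and compute U · diag(S) · Vᵀ, then flatten. The sketch below assumes an illustrative flat layout of [U, S, Vᵀ] with shapes recorded in the metadata; the library's actual packing may differ.

```python
import numpy as np

def decompress(compressed, metadata):
    """Illustrative Decompress: slice the flat compressed vector back
    into U, S, Vt and reconstruct W = U @ diag(S) @ Vt."""
    rows, cols, k = metadata["rows"], metadata["cols"], metadata["rank"]
    u_end = rows * k
    s_end = u_end + k
    U = compressed[:u_end].reshape(rows, k)
    S = compressed[u_end:s_end]
    Vt = compressed[s_end:].reshape(k, cols)
    return (U @ np.diag(S) @ Vt).ravel()

# Factors for a rank-1 approximation of a 2x2 matrix, packed flat:
# U = [[1], [0]], S = [3], Vt = [[0, 1]]  =>  W = [[0, 3], [0, 0]]
packed = np.array([1.0, 0.0, 3.0, 0.0, 1.0])
meta = {"rows": 2, "cols": 2, "rank": 1}
print(decompress(packed, meta))  # [0. 3. 0. 0.]
```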

GetCompressedSize(Vector<T>, ICompressionMetadata<T>)

Gets the compressed size.

public override long GetCompressedSize(Vector<T> compressedWeights, ICompressionMetadata<T> metadata)

Parameters

compressedWeights Vector<T>
metadata ICompressionMetadata<T>

Returns

long
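The returned size is driven by the number of stored factor elements rather than the original element count. A hypothetical element count for a rank-k factorization of an m×n matrix (multiply by bytes per element and add metadata overhead to get an actual byte size):

```python
# Stored element count for a rank-k factorization of an m x n matrix:
# U is m x k, S holds k values, and V^T is k x n.
def compressed_count(m, n, k):
    return m * k + k + k * n

m, n, k = 1000, 1000, 50
print(compressed_count(m, n, k), m * n)   # 100050 vs 1000000
```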