Class WeightClusteringCompression<T>
- Namespace: AiDotNet.ModelCompression
- Assembly: AiDotNet.dll
Implements weight clustering compression using K-means clustering to group similar weights.
public class WeightClusteringCompression<T> : ModelCompressionBase<T>, IModelCompressionStrategy<T>
Type Parameters
T: The numeric type used for calculations (e.g., float, double).
- Inheritance: ModelCompressionBase<T> → WeightClusteringCompression<T>
- Implements: IModelCompressionStrategy<T>
Remarks
Weight clustering reduces model size by identifying groups of similar weight values and replacing them with their cluster representatives. This technique can achieve significant compression ratios (10-50x) while maintaining model accuracy.
For Beginners: Weight clustering is like organizing a messy toolbox.
Imagine you have thousands of screws that are almost the same size:
- Some are 2.01mm, some 2.02mm, some 2.03mm, etc.
- Instead of keeping track of each exact size, you group similar sizes together
- You replace all sizes in a group with one representative size (like 2.0mm)
For neural networks:
- Instead of storing millions of slightly different weight values
- We group similar weights into clusters (like 256 or 512 groups)
- Each weight is replaced with its cluster center
- We store only which cluster each weight belongs to, not the full values
This dramatically reduces storage because:
- Cluster IDs are much smaller than full weight values (8 bits vs 32 bits)
- We only need to store the cluster centers once
The result is a much smaller model that performs almost the same as the original!
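Example
To make the idea concrete, here is a minimal, self-contained sketch of the clustering step: a simplified 1-D K-means over a weight array. It is illustrative only, not the AiDotNet implementation. For scale, 1 million float weights take roughly 4 MB raw (32 bits each), but only about 1 MB as 8-bit cluster IDs plus 256 shared centers.
using System;
using System.Linq;

// Illustrative sketch only (not the AiDotNet internals): a tiny 1-D K-means
// that groups similar weights and snaps each one to its cluster center.
class ClusteringSketch
{
    static void Main()
    {
        double[] weights = { 0.11, 0.12, 0.13, -0.50, -0.49, 0.90, 0.91 };
        int k = 3;

        // Spread the initial centers evenly across the weight range.
        double min = weights.Min(), max = weights.Max();
        double[] centers = Enumerable.Range(0, k)
            .Select(i => min + (max - min) * i / (k - 1))
            .ToArray();

        var assignments = new int[weights.Length];
        for (int iter = 0; iter < 100; iter++)
        {
            // Assignment step: each weight joins its nearest center.
            for (int w = 0; w < weights.Length; w++)
                assignments[w] = Enumerable.Range(0, k)
                    .OrderBy(c => Math.Abs(weights[w] - centers[c]))
                    .First();

            // Update step: each center moves to the mean of its members.
            bool converged = true;
            for (int c = 0; c < k; c++)
            {
                var members = weights.Where((_, w) => assignments[w] == c).ToArray();
                if (members.Length == 0) continue;
                double updated = members.Average();
                if (Math.Abs(updated - centers[c]) > 1e-6) converged = false;
                centers[c] = updated;
            }
            if (converged) break;
        }

        // The compressed form is the small IDs plus the shared centers.
        Console.WriteLine(string.Join(", ",
            assignments.Select(a => centers[a].ToString("F2"))));
    }
}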
Constructors
WeightClusteringCompression(int, int, double, int?)
Initializes a new instance of the WeightClusteringCompression class.
public WeightClusteringCompression(int numClusters = 256, int maxIterations = 100, double tolerance = 1E-06, int? randomSeed = null)
Parameters
numClusters (int): The number of clusters to use (default: 256, for 8-bit quantization).
maxIterations (int): The maximum number of K-means iterations (default: 100).
tolerance (double): The convergence tolerance for K-means (default: 1e-6).
randomSeed (int?): Random seed for reproducibility (default: null for nondeterministic results).
Remarks
For Beginners: These parameters control the compression behavior:
numClusters: How many groups to create
- 256 clusters = 8-bit compression (very common, good balance)
- 128 clusters = 7-bit compression (more aggressive)
- 512 clusters = 9-bit compression (less aggressive, higher quality)
maxIterations: How hard to try finding the best clusters
- Higher = better clusters but slower compression
- 100 is usually plenty
tolerance: When to stop improving clusters
- Smaller = more precise but takes longer
- 1e-6 is a good default
randomSeed: For getting the same results each time (useful for testing)
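Example
A typical construction, using the documented defaults and then a more aggressive, reproducible configuration (the values shown are illustrative choices, not recommendations):
// Default: 256 clusters (8-bit IDs), up to 100 K-means iterations.
var compression = new WeightClusteringCompression<double>();

// More aggressive: 128 clusters (7-bit), seeded so the clustering is
// reproducible across runs (useful for testing).
var aggressive = new WeightClusteringCompression<double>(
    numClusters: 128,
    maxIterations: 100,
    tolerance: 1e-6,
    randomSeed: 42);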
Methods
Compress(Vector<T>)
Compresses weights using K-means clustering.
public override (Vector<T> compressedWeights, ICompressionMetadata<T> metadata) Compress(Vector<T> weights)
Parameters
weights (Vector<T>): The original model weights.
Returns
- (Vector<T> compressedWeights, ICompressionMetadata<T> metadata)
Compressed weights and metadata containing cluster centers and assignments.
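Example
Assuming weights is a Vector<double> of model parameters and compression is an instance of this class, a call looks like:
// Returns the clustered weights along with the metadata needed to undo it.
var (compressedWeights, metadata) = compression.Compress(weights);
// compressedWeights holds cluster assignments; metadata holds the centers.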
Decompress(Vector<T>, ICompressionMetadata<T>)
Decompresses weights by mapping cluster assignments back to cluster centers.
public override Vector<T> Decompress(Vector<T> compressedWeights, ICompressionMetadata<T> metadata)
Parameters
compressedWeights (Vector<T>): The compressed weights (cluster assignments).
metadata (ICompressionMetadata<T>): The metadata containing cluster centers.
Returns
- Vector<T>
The decompressed weights.
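Example
Continuing the sketch above, the round trip restores an approximation of the original vector:
// Map each stored cluster assignment back to its cluster center.
Vector<double> restored = compression.Decompress(compressedWeights, metadata);
// Each restored weight equals its center, approximating the original value.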
GetCompressedSize(Vector<T>, ICompressionMetadata<T>)
Gets the compressed size including cluster centers and assignments.
public override long GetCompressedSize(Vector<T> compressedWeights, ICompressionMetadata<T> metadata)
Parameters
compressedWeights (Vector<T>): The compressed weights.
metadata (ICompressionMetadata<T>): The compression metadata.
Returns
- long
The total size in bytes.
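Example
A quick way to check the achieved ratio. This sketch assumes 4-byte (float-sized) original weights and that Vector<T> exposes a Length property; adjust both for your numeric type:
// Compare the reported compressed footprint against the raw footprint.
long compressedBytes = compression.GetCompressedSize(compressedWeights, metadata);
long originalBytes = (long)weights.Length * sizeof(float);
Console.WriteLine($"Compression ratio: {(double)originalBytes / compressedBytes:F1}x");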