Class WeightClusteringCompression<T>
- Namespace: AiDotNet.ModelCompression
- Assembly: AiDotNet.dll
Implements weight clustering compression using K-means clustering to group similar weights.
public class WeightClusteringCompression<T> : ModelCompressionBase<T>, IModelCompressionStrategy<T>
Type Parameters
T: The numeric type used for calculations (e.g., float, double).
- Inheritance: ModelCompressionBase<T> → WeightClusteringCompression<T>
- Implements: IModelCompressionStrategy<T>
Remarks
Weight clustering reduces model size by identifying groups of similar weight values and replacing them with their cluster representatives. This technique can achieve significant compression ratios (10-50x) while maintaining model accuracy.
For Beginners: Weight clustering is like organizing a messy toolbox.
Imagine you have thousands of screws that are almost the same size:
- Some are 2.01mm, some 2.02mm, some 2.03mm, etc.
- Instead of keeping track of each exact size, you group similar sizes together
- You replace all sizes in a group with one representative size (like 2.0mm)
For neural networks:
- Instead of storing millions of slightly different weight values
- We group similar weights into clusters (like 256 or 512 groups)
- Each weight is replaced with its cluster center
- We store only which cluster each weight belongs to, not the full values
This dramatically reduces storage because:
- Cluster IDs are much smaller than full weight values (8 bits vs 32 bits)
- We only need to store the cluster centers once
The result is a much smaller model that performs almost the same as the original!
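Example
To make the idea concrete, here is a minimal, self-contained sketch of the clustering step: a simplified 1-D K-means over a weight array. It is illustrative only, not the AiDotNet implementation. For scale, 1 million float weights take roughly 4 MB raw (32 bits each), but only about 1 MB as 8-bit cluster IDs plus 256 shared centers.
using System;
using System.Linq;

// Illustrative sketch only (not the AiDotNet internals): a tiny 1-D K-means
// that groups similar weights and snaps each one to its cluster center.
class ClusteringSketch
{
    static void Main()
    {
        double[] weights = { 0.11, 0.12, 0.13, -0.50, -0.49, 0.90, 0.91 };
        int k = 3;

        // Spread the initial centers evenly across the weight range.
        double min = weights.Min(), max = weights.Max();
        double[] centers = Enumerable.Range(0, k)
            .Select(i => min + (max - min) * i / (k - 1))
            .ToArray();

        var assignments = new int[weights.Length];
        for (int iter = 0; iter < 100; iter++)
        {
            // Assignment step: each weight joins its nearest center.
            for (int w = 0; w < weights.Length; w++)
                assignments[w] = Enumerable.Range(0, k)
                    .OrderBy(c => Math.Abs(weights[w] - centers[c]))
                    .First();

            // Update step: each center moves to the mean of its members.
            bool converged = true;
            for (int c = 0; c < k; c++)
            {
                var members = weights.Where((_, w) => assignments[w] == c).ToArray();
                if (members.Length == 0) continue;
                double updated = members.Average();
                if (Math.Abs(updated - centers[c]) > 1e-6) converged = false;
                centers[c] = updated;
            }
            if (converged) break;
        }

        // The compressed form is the small IDs plus the shared centers.
        Console.WriteLine(string.Join(", ",
            assignments.Select(a => centers[a].ToString("F2"))));
    }
}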
Constructors
WeightClusteringCompression(int, int, double, int?)
Initializes a new instance of the WeightClusteringCompression class.
public WeightClusteringCompression(int numClusters = 256, int maxIterations = 100, double tolerance = 1E-06, int? randomSeed = null)
Parameters
numClusters (int): The number of clusters to use (default: 256, for 8-bit quantization).
maxIterations (int): The maximum number of K-means iterations (default: 100).
tolerance (double): The convergence tolerance for K-means (default: 1e-6).
randomSeed (int?): Random seed for reproducibility (default: null for nondeterministic results).
Remarks
For Beginners: These parameters control the compression behavior:
numClusters: How many groups to create
- 256 clusters = 8-bit compression (very common, good balance)
- 128 clusters = 7-bit compression (more aggressive)
- 512 clusters = 9-bit compression (less aggressive, higher quality)
maxIterations: How hard to try finding the best clusters
- Higher = better clusters but slower compression
- 100 is usually plenty
tolerance: When to stop improving clusters
- Smaller = more precise but takes longer
- 1e-6 is a good default
randomSeed: For getting the same results each time (useful for testing)
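Example
A typical construction, using the documented defaults and then a more aggressive, reproducible configuration (the values shown are illustrative choices, not recommendations):
// Default: 256 clusters (8-bit IDs), up to 100 K-means iterations.
var compression = new WeightClusteringCompression<double>();

// More aggressive: 128 clusters (7-bit), seeded so the clustering is
// reproducible across runs (useful for testing).
var aggressive = new WeightClusteringCompression<double>(
    numClusters: 128,
    maxIterations: 100,
    tolerance: 1e-6,
    randomSeed: 42);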
Methods
Compress(Vector<T>)
Compresses weights using K-means clustering.
public override (Vector<T> compressedWeights, ICompressionMetadata<T> metadata) Compress(Vector<T> weights)
Parameters
weights (Vector<T>): The original model weights.
Returns
- (Vector<T> compressedWeights, ICompressionMetadata<T> metadata)
Compressed weights and metadata containing cluster centers and assignments.
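Example
Assuming weights is a Vector<double> of model parameters and compression is an instance of this class, a call looks like:
// Returns the clustered weights along with the metadata needed to undo it.
var (compressedWeights, metadata) = compression.Compress(weights);
// compressedWeights holds cluster assignments; metadata holds the centers.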
Decompress(Vector<T>, ICompressionMetadata<T>)
Decompresses weights by mapping cluster assignments back to cluster centers.
public override Vector<T> Decompress(Vector<T> compressedWeights, ICompressionMetadata<T> metadata)
Parameters
compressedWeights (Vector<T>): The compressed weights (cluster assignments).
metadata (ICompressionMetadata<T>): The metadata containing cluster centers.
Returns
- Vector<T>
The decompressed weights.
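Example
Continuing the sketch above, the round trip restores an approximation of the original vector:
// Map each stored cluster assignment back to its cluster center.
Vector<double> restored = compression.Decompress(compressedWeights, metadata);
// Each restored weight equals its center, approximating the original value.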
GetCompressedSize(Vector<T>, ICompressionMetadata<T>)
Gets the compressed size including cluster centers and assignments.
public override long GetCompressedSize(Vector<T> compressedWeights, ICompressionMetadata<T> metadata)
Parameters
compressedWeights (Vector<T>): The compressed weights.
metadata (ICompressionMetadata<T>): The compression metadata.
Returns
- long
The total size in bytes.
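Example
A quick way to check the achieved ratio. This sketch assumes 4-byte (float-sized) original weights and that Vector<T> exposes a Length property; adjust both for your numeric type:
// Compare the reported compressed footprint against the raw footprint.
long compressedBytes = compression.GetCompressedSize(compressedWeights, metadata);
long originalBytes = (long)weights.Length * sizeof(float);
Console.WriteLine($"Compression ratio: {(double)originalBytes / compressedBytes:F1}x");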