Class ModelCompressionBase<T>
- Namespace
- AiDotNet.ModelCompression
- Assembly
- AiDotNet.dll
Provides a base implementation for model compression techniques used to reduce model size while preserving accuracy.
public abstract class ModelCompressionBase<T> : IModelCompressionStrategy<T>
Type Parameters
T: The numeric type used for calculations (e.g., float, double).
- Inheritance
- object
- ModelCompressionBase<T>
- Implements
- IModelCompressionStrategy<T>
Remarks
ModelCompressionBase serves as an abstract foundation for implementing various compression strategies. Model compression reduces the storage and computational requirements of machine learning models, making them more suitable for deployment on resource-constrained devices or in bandwidth-limited environments.
For Beginners: Think of model compression as packing for a trip - you want to fit everything you need into a smaller suitcase.
When you train an AI model:
- It learns millions or billions of parameters (weights)
- These weights need to be stored and loaded when making predictions
- Larger models are slower and use more memory
Model compression helps by:
- Reducing the file size (easier to download and store)
- Speeding up predictions (less data to process)
- Enabling deployment on phones, tablets, or embedded devices
- Lowering costs in cloud environments
This base class provides the common structure that all compression techniques share. Different compression approaches (like weight clustering, quantization, or Huffman coding) work in different ways, but they all aim to make your model smaller and faster while keeping it accurate.
Constructors
ModelCompressionBase()
Initializes a new instance of the ModelCompressionBase class.
protected ModelCompressionBase()
Remarks
This constructor initializes the base class for a compression implementation, setting up the numeric operations required for mathematical calculations.
For Beginners: This sets up the foundation for any type of compression.
When a compression object is created, it obtains the right calculator for the numeric type being used. This is like preparing your workspace before starting a project - gathering the tools you'll need.
Fields
NumOps
Provides numeric operations appropriate for the generic type T.
protected readonly INumericOperations<T> NumOps
Field Value
- INumericOperations<T>
Remarks
This field holds a reference to the appropriate numeric operations implementation for the generic type T, allowing the compression methods to perform mathematical operations regardless of whether T is float, double, or another numeric type.
For Beginners: This is a helper that allows the code to work with different number types.
Since this class uses a generic type T (which could be float, double, etc.):
- We need a way to perform math operations (+, -, *, /) on these values
- NumOps provides the right methods for whatever numeric type is being used
Think of it like having different calculators for different types of numbers, and NumOps makes sure we're using the right calculator for the job.
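The idea behind NumOps can be sketched with a small standalone example. The members of INumericOperations<T> are not listed on this page, so the interface below (`ISketchNumericOps<T>`) and its implementations are hypothetical stand-ins, not the library's API. The sketch only shows how one generic routine can run on float or double by delegating arithmetic to an operations object.

```csharp
using System;

// Hypothetical, simplified stand-in for INumericOperations<T> (assumption: the
// real interface is richer and may use different member names).
public interface ISketchNumericOps<T>
{
    T Zero { get; }
    T Add(T a, T b);
    T Multiply(T a, T b);
}

public sealed class DoubleOps : ISketchNumericOps<double>
{
    public double Zero => 0.0;
    public double Add(double a, double b) => a + b;
    public double Multiply(double a, double b) => a * b;
}

public sealed class FloatOps : ISketchNumericOps<float>
{
    public float Zero => 0f;
    public float Add(float a, float b) => a + b;
    public float Multiply(float a, float b) => a * b;
}

public static class NumOpsSketch
{
    // Written once, usable for any T that has an operations implementation.
    public static T SumOfSquares<T>(T[] values, ISketchNumericOps<T> ops)
    {
        T total = ops.Zero;
        foreach (T v in values)
        {
            total = ops.Add(total, ops.Multiply(v, v));
        }
        return total;
    }
}
```

Calling `NumOpsSketch.SumOfSquares(new double[] { 1, 2, 3 }, new DoubleOps())` and the float equivalent exercise the same code path with different "calculators".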
Methods
CalculateCompressionRatio(long, long)
Calculates the compression ratio achieved.
public virtual double CalculateCompressionRatio(long originalSize, long compressedSize)
Parameters
originalSize (long)
compressedSize (long)
Returns
- double
The compression ratio (original size / compressed size).
Remarks
For Beginners: This calculates how much smaller the compressed model is.
The formula is simple: compression ratio = original size ÷ compressed size
Examples:
- Original: 1000 MB, Compressed: 100 MB → Ratio: 10.0 (90% smaller)
- Original: 500 MB, Compressed: 250 MB → Ratio: 2.0 (50% smaller)
Higher ratios mean better compression!
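The arithmetic above can be sketched in a few lines. This is a standalone illustration rather than the library's implementation: the class name `CompressionRatioSketch`, the `SpaceSavings` helper, and the guards against non-positive sizes are assumptions, since this page does not say how the real method handles such inputs.

```csharp
using System;

public static class CompressionRatioSketch
{
    // Ratio = original size / compressed size.
    // Assumption: a non-positive compressed size is rejected; the library may differ.
    public static double CalculateCompressionRatio(long originalSize, long compressedSize)
    {
        if (compressedSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(compressedSize), "Compressed size must be positive.");
        }
        return (double)originalSize / compressedSize;
    }

    // Hypothetical helper: fraction of the original size that was saved,
    // e.g. 0.9 ("90% smaller") when the ratio is 10.
    public static double SpaceSavings(long originalSize, long compressedSize)
    {
        if (originalSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(originalSize), "Original size must be positive.");
        }
        return 1.0 - (double)compressedSize / originalSize;
    }
}
```

The two quantities are related by savings = 1 - 1 / ratio, which is why a ratio of 2.0 corresponds to a model that is 50% smaller.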
Compress(Vector<T>)
Compresses the given model weights.
public abstract (Vector<T> compressedWeights, ICompressionMetadata<T> metadata) Compress(Vector<T> weights)
Parameters
weights (Vector<T>): The original model weights to compress.
Returns
- (Vector<T> compressedWeights, ICompressionMetadata<T> metadata)
A tuple containing the compressed weights and compression metadata.
CompressMatrix(Matrix<T>)
Compresses a 2D matrix of weights. The default implementation flattens the matrix to a vector, compresses it, and reshapes the result back into a matrix.
public virtual (Matrix<T> compressedWeights, ICompressionMetadata<T> metadata) CompressMatrix(Matrix<T> weights)
Parameters
weights (Matrix<T>): The original weight matrix to compress.
Returns
- (Matrix<T> compressedWeights, ICompressionMetadata<T> metadata)
A tuple containing the compressed weights and compression metadata.
Remarks
For Beginners: This method compresses a 2D weight matrix by:
- Flattening the matrix into a 1D vector (row by row)
- Applying the vector compression algorithm
- Wrapping the result with shape information
The compressed matrix maintains the original dimensions for convenience, but the actual size reduction comes from the underlying vector compression.
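The flatten, compress, reshape flow described above can be sketched without the library types. Everything here is an assumption made for illustration: plain `double[,]` arrays stand in for Matrix<T>, a returned `(Rows, Cols)` pair stands in for the shape metadata, and `QuantizeVector` (snapping values to multiples of a step) is a toy stand-in for whatever vector compression a derived class provides.

```csharp
using System;

public static class CompressMatrixSketch
{
    // Step 1: flatten the matrix row by row.
    public static double[] Flatten(double[,] m)
    {
        int rows = m.GetLength(0), cols = m.GetLength(1);
        var flat = new double[rows * cols];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                flat[r * cols + c] = m[r, c];
        return flat;
    }

    // Toy vector "compression" (assumption): snap each value to a multiple of step.
    public static double[] QuantizeVector(double[] v, double step)
    {
        if (step <= 0) throw new ArgumentOutOfRangeException(nameof(step));
        var result = new double[v.Length];
        for (int i = 0; i < v.Length; i++)
            result[i] = Math.Round(v[i] / step) * step;
        return result;
    }

    // Inverse of Flatten: refill a rows x cols matrix in row-major order.
    public static double[,] Reshape(double[] flat, int rows, int cols)
    {
        if (flat.Length != rows * cols)
            throw new ArgumentException("Vector length must equal rows * cols.");
        var m = new double[rows, cols];
        for (int i = 0; i < flat.Length; i++)
            m[i / cols, i % cols] = flat[i];
        return m;
    }

    // Steps 1-3 together: flatten, compress the vector, keep the shape alongside the result.
    public static (double[,] Compressed, int Rows, int Cols) CompressMatrix(double[,] weights, double step)
    {
        int rows = weights.GetLength(0), cols = weights.GetLength(1);
        double[] compressed = QuantizeVector(Flatten(weights), step);
        return (Reshape(compressed, rows, cols), rows, cols);
    }
}
```

Because the toy compressor keeps one output value per input value, the result can be reshaped to the original dimensions, which mirrors the note above that the compressed matrix keeps its shape.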
CompressTensor(Tensor<T>)
Compresses an N-dimensional tensor of weights. The default implementation flattens the tensor to a vector, compresses it, and reshapes the result back into a tensor.
public virtual (Tensor<T> compressedWeights, ICompressionMetadata<T> metadata) CompressTensor(Tensor<T> weights)
Parameters
weights (Tensor<T>): The original weight tensor to compress.
Returns
- (Tensor<T> compressedWeights, ICompressionMetadata<T> metadata)
A tuple containing the compressed weights and compression metadata.
Remarks
For Beginners: This method compresses an N-dimensional tensor by:
- Flattening the tensor into a 1D vector
- Applying the vector compression algorithm
- Wrapping the result with shape information
Tensors are essential for convolutional layers (4D: [filters, channels, height, width]) and attention mechanisms. This method preserves the shape for reconstruction.
Decompress(Vector<T>, ICompressionMetadata<T>)
Decompresses the compressed weights back to their original form.
public abstract Vector<T> Decompress(Vector<T> compressedWeights, ICompressionMetadata<T> metadata)
Parameters
compressedWeights (Vector<T>): The compressed weights.
metadata (ICompressionMetadata<T>): The metadata needed for decompression.
Returns
- Vector<T>
The decompressed weights.
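To show why Decompress needs the metadata produced by Compress, here is a minimal standalone round trip. It is a sketch under stated assumptions, not one of the library's strategies: it uses 8-bit linear quantization, stores codes in a `byte[]` instead of a Vector<T>, and the `Metadata` struct (minimum and scale) is a hypothetical stand-in for ICompressionMetadata<T>.

```csharp
using System;

public static class QuantizeRoundTripSketch
{
    // Hypothetical metadata: exactly what Decompress needs to map codes back to values.
    public readonly struct Metadata
    {
        public Metadata(double min, double scale) { Min = min; Scale = scale; }
        public double Min { get; }
        public double Scale { get; }
    }

    public static (byte[] Codes, Metadata Meta) Compress(double[] weights)
    {
        if (weights.Length == 0)
            return (Array.Empty<byte>(), new Metadata(0.0, 1.0));

        double min = weights[0], max = weights[0];
        foreach (double w in weights)
        {
            min = Math.Min(min, w);
            max = Math.Max(max, w);
        }

        // A constant vector gets scale 1 so the division below stays safe.
        double scale = max > min ? (max - min) / 255.0 : 1.0;

        var codes = new byte[weights.Length];
        for (int i = 0; i < weights.Length; i++)
            codes[i] = (byte)Math.Round((weights[i] - min) / scale);

        return (codes, new Metadata(min, scale));
    }

    public static double[] Decompress(byte[] codes, Metadata meta)
    {
        var weights = new double[codes.Length];
        for (int i = 0; i < codes.Length; i++)
            weights[i] = meta.Min + codes[i] * meta.Scale;
        return weights;
    }
}
```

This scheme is lossy: each restored value differs from the original by at most half a quantization step (`Scale / 2`). Without `Min` and `Scale`, the byte codes alone could not be mapped back to weights.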
DecompressMatrix(Matrix<T>, ICompressionMetadata<T>)
Decompresses the compressed matrix weights back to their original form.
public virtual Matrix<T> DecompressMatrix(Matrix<T> compressedWeights, ICompressionMetadata<T> metadata)
Parameters
compressedWeights (Matrix<T>): The compressed weight matrix.
metadata (ICompressionMetadata<T>): The metadata needed for decompression.
Returns
- Matrix<T>
The decompressed weight matrix.
Remarks
For Beginners: This method reverses the compression process:
- Flattens the compressed matrix to a vector
- Applies the vector decompression algorithm
- Reshapes back to the original matrix dimensions
The metadata must be of the same type as the metadata returned by CompressMatrix.
DecompressTensor(Tensor<T>, ICompressionMetadata<T>)
Decompresses the compressed tensor weights back to their original form.
public virtual Tensor<T> DecompressTensor(Tensor<T> compressedWeights, ICompressionMetadata<T> metadata)
Parameters
compressedWeights (Tensor<T>): The compressed weight tensor.
metadata (ICompressionMetadata<T>): The metadata needed for decompression.
Returns
- Tensor<T>
The decompressed weight tensor.
Remarks
For Beginners: This method reverses the compression process:
- Flattens the compressed tensor to a vector
- Applies the vector decompression algorithm
- Reshapes back to the original tensor dimensions
The metadata must be of the same type as the metadata returned by CompressTensor.
GetCompressedSize(Matrix<T>, ICompressionMetadata<T>)
Gets the size in bytes of the compressed matrix representation.
public virtual long GetCompressedSize(Matrix<T> compressedWeights, ICompressionMetadata<T> metadata)
Parameters
compressedWeights (Matrix<T>): The compressed weight matrix.
metadata (ICompressionMetadata<T>): The compression metadata.
Returns
- long
The total size in bytes.
Remarks
For Beginners: This calculates the total storage needed for the compressed matrix, including both the compressed data and the metadata overhead (shape information).
GetCompressedSize(Tensor<T>, ICompressionMetadata<T>)
Gets the size in bytes of the compressed tensor representation.
public virtual long GetCompressedSize(Tensor<T> compressedWeights, ICompressionMetadata<T> metadata)
Parameters
compressedWeights (Tensor<T>): The compressed weight tensor.
metadata (ICompressionMetadata<T>): The compression metadata.
Returns
- long
The total size in bytes.
Remarks
For Beginners: This calculates the total storage needed for the compressed tensor, including both the compressed data and the metadata overhead (shape information).
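The "compressed data plus shape overhead" accounting used by the matrix and tensor overloads can be sketched as follows. This page does not state how the library counts the shape information, so the cost model here (one 4-byte int per dimension) and the names `CompressedSizeSketch` and `TotalSize` are assumptions for illustration only.

```csharp
using System;

public static class CompressedSizeSketch
{
    // Total bytes = bytes reported for the compressed vector data
    //             + bytes needed to remember the shape.
    // Assumption: each dimension is stored as one 4-byte int.
    public static long TotalSize(long compressedDataBytes, int[] shape)
    {
        if (compressedDataBytes < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(compressedDataBytes));
        }
        long shapeOverhead = (long)shape.Length * sizeof(int);
        return compressedDataBytes + shapeOverhead;
    }
}
```

Under this model a matrix (two dimensions) adds 8 bytes and a 4D tensor adds 16 bytes, which is negligible next to the weight data for any realistically sized layer.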
GetCompressedSize(Vector<T>, ICompressionMetadata<T>)
Gets the size in bytes of the compressed representation.
public abstract long GetCompressedSize(Vector<T> compressedWeights, ICompressionMetadata<T> metadata)
Parameters
compressedWeights (Vector<T>): The compressed weights.
metadata (ICompressionMetadata<T>): The compression metadata.
Returns
- long
The total size in bytes.
GetElementSize()
Gets the size in bytes of a value of type T.
protected virtual int GetElementSize()
Returns
- int
The size in bytes.
Remarks
For Beginners: Different number types take different amounts of memory.
Common sizes:
- float (single precision): 4 bytes
- double (double precision): 8 bytes
This method figures out the size automatically based on the type being used.
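One way such a lookup could work is sketched below. This is an assumption about the approach, not the library's code: the real method is a parameterless instance method on a class whose T may be unconstrained, whereas this standalone version is a generic static method restricted to value types, with `Marshal.SizeOf<T>()` as a fallback for types other than float and double.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ElementSizeSketch
{
    // Returns the size in bytes of one value of type T.
    public static int GetElementSize<T>() where T : struct
    {
        if (typeof(T) == typeof(float)) return sizeof(float);   // 4 bytes
        if (typeof(T) == typeof(double)) return sizeof(double); // 8 bytes

        // Fallback (assumption): the marshaled size. For some types, such as
        // bool and char, this differs from the managed in-memory size.
        return Marshal.SizeOf<T>();
    }
}
```

Multiplying this value by the number of stored elements gives the raw, uncompressed size that a compression ratio is measured against.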
MatrixToVector(Matrix<T>)
Converts a matrix to a flattened vector.
protected Vector<T> MatrixToVector(Matrix<T> matrix)
Parameters
matrix (Matrix<T>)
Returns
- Vector<T>
TensorToVector(Tensor<T>)
Converts a tensor to a flattened vector.
protected Vector<T> TensorToVector(Tensor<T> tensor)
Parameters
tensor (Tensor<T>)
Returns
- Vector<T>
Remarks
For Beginners: This method flattens an N-dimensional tensor into a 1D vector, preserving all values in row-major order. This is the first step in tensor compression - convert to 1D, compress, then convert back.
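Row-major order means the last dimension varies fastest. The sketch below shows how a multi-dimensional index maps to a position in the flattened vector. It is a standalone illustration using plain `int[]` shapes; the names `RowMajorSketch`, `Strides`, and `FlatIndex` are hypothetical and do not refer to members of Tensor<T>.

```csharp
using System;

public static class RowMajorSketch
{
    // Stride of a dimension = number of flat elements skipped when that index
    // increases by one. In row-major layout the last dimension has stride 1.
    public static int[] Strides(int[] shape)
    {
        var strides = new int[shape.Length];
        int stride = 1;
        for (int d = shape.Length - 1; d >= 0; d--)
        {
            strides[d] = stride;
            stride *= shape[d];
        }
        return strides;
    }

    // Position of the element at the given indices inside the flattened vector.
    public static int FlatIndex(int[] indices, int[] shape)
    {
        if (indices.Length != shape.Length)
        {
            throw new ArgumentException("Index rank must match shape rank.");
        }

        int[] strides = Strides(shape);
        int offset = 0;
        for (int d = 0; d < shape.Length; d++)
        {
            if (indices[d] < 0 || indices[d] >= shape[d])
            {
                throw new ArgumentOutOfRangeException(nameof(indices));
            }
            offset += indices[d] * strides[d];
        }
        return offset;
    }
}
```

For a tensor of shape [2, 3, 4], the strides are [12, 4, 1], so the element at index [1, 2, 3] lands at position 1*12 + 2*4 + 3*1 = 23, the last slot of the 24-element vector.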
VectorToMatrix(Vector<T>, int, int)
Converts a vector to a matrix with specified dimensions.
protected Matrix<T> VectorToMatrix(Vector<T> vector, int rows, int cols)
Parameters
vector (Vector<T>)
rows (int)
cols (int)
Returns
- Matrix<T>
VectorToTensor(Vector<T>, int[])
Converts a vector to a tensor with specified shape.
protected Tensor<T> VectorToTensor(Vector<T> vector, int[] shape)
Parameters
vector (Vector<T>)
shape (int[])
Returns
- Tensor<T>
Remarks
For Beginners: This method reshapes a 1D vector into an N-dimensional tensor with the specified shape. The vector values are placed in row-major order into the tensor. The total number of elements in the vector must match the product of all shape dimensions.
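The element-count requirement and the row-major placement can be sketched as follows. This is a standalone illustration with hypothetical names (`ReshapeSketch`, `ElementCount`, `EnsureReshapable`, `Unflatten`); how the library itself reports a mismatch is not documented on this page, so the exception type used here is an assumption.

```csharp
using System;

public static class ReshapeSketch
{
    // Product of all dimensions: the number of elements a tensor of this shape holds.
    public static long ElementCount(int[] shape)
    {
        long count = 1;
        foreach (int dim in shape)
        {
            if (dim < 0) throw new ArgumentException("Dimensions must be non-negative.", nameof(shape));
            count = checked(count * dim);
        }
        return count;
    }

    // Assumption: a mismatch is reported with ArgumentException.
    public static void EnsureReshapable(int vectorLength, int[] shape)
    {
        long expected = ElementCount(shape);
        if (vectorLength != expected)
        {
            throw new ArgumentException($"Vector has {vectorLength} elements but the shape requires {expected}.");
        }
    }

    // Row-major placement: where the element at flatIndex ends up in the tensor.
    public static int[] Unflatten(int flatIndex, int[] shape)
    {
        if (flatIndex < 0 || flatIndex >= ElementCount(shape))
        {
            throw new ArgumentOutOfRangeException(nameof(flatIndex));
        }

        var indices = new int[shape.Length];
        int remainder = flatIndex;
        for (int d = shape.Length - 1; d >= 0; d--)
        {
            indices[d] = remainder % shape[d];
            remainder /= shape[d];
        }
        return indices;
    }
}
```

For example, a 24-element vector fits the shape [2, 3, 4], and its last element (flat index 23) is placed at [1, 2, 3]; a 10-element vector cannot be reshaped to [2, 3] because that shape holds only 6 elements.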