Class ModelCompressionBase<T>

Namespace
AiDotNet.ModelCompression
Assembly
AiDotNet.dll

Provides a base implementation for model compression techniques used to reduce model size while preserving accuracy.

public abstract class ModelCompressionBase<T> : IModelCompressionStrategy<T>

Type Parameters

T

The numeric type used for calculations (e.g., float, double).

Inheritance
ModelCompressionBase<T>
Implements
IModelCompressionStrategy<T>

Remarks

ModelCompressionBase serves as an abstract foundation for implementing various compression strategies. Model compression reduces the storage and computational requirements of machine learning models, making them more suitable for deployment on resource-constrained devices or in bandwidth-limited environments.

For Beginners: Think of model compression as packing for a trip - you want to fit everything you need into a smaller suitcase.

When you train an AI model:

  • It learns millions or billions of parameters (weights)
  • These weights need to be stored and loaded when making predictions
  • Larger models are slower and use more memory

Model compression helps by:

  • Reducing the file size (easier to download and store)
  • Speeding up predictions (less data to process)
  • Enabling deployment on phones, tablets, or embedded devices
  • Lowering costs in cloud environments

This base class provides the common structure that all compression techniques share. Different compression approaches (like weight clustering, quantization, or Huffman coding) work in different ways, but they all aim to make your model smaller and faster while keeping it accurate.

Constructors

ModelCompressionBase()

Initializes a new instance of the ModelCompressionBase class.

protected ModelCompressionBase()

Remarks

This constructor initializes the base class for a compression implementation, setting up the numeric operations required for mathematical calculations.

For Beginners: This sets up the foundation for any type of compression.

When a compression object is created, the constructor obtains the right calculator for the numeric type being used. This is like preparing your workspace before starting a project: gathering the tools you'll need.

Fields

NumOps

Provides numeric operations appropriate for the generic type T.

protected readonly INumericOperations<T> NumOps

Field Value

INumericOperations<T>

Remarks

This field holds a reference to the appropriate numeric operations implementation for the generic type T, allowing the compression methods to perform mathematical operations regardless of whether T is float, double, or another numeric type.

For Beginners: This is a helper that allows the code to work with different number types.

Since this class uses a generic type T (which could be float, double, etc.):

  • We need a way to perform math operations (+, -, *, /) on these values
  • NumOps provides the right methods for whatever numeric type is being used

Think of it like having different calculators for different types of numbers, and NumOps makes sure we're using the right calculator for the job.
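
The idea of a per-type "calculator" can be sketched as follows. This is a minimal illustration, not the library's code: ISimpleNumOps<T>, FloatOps, DoubleOps, and GenericMath are hypothetical names, and the members shown are assumptions rather than the actual surface of INumericOperations<T>.

```csharp
using System;

// Hypothetical stand-in for a numeric-operations abstraction.
// The member names are assumptions, not the real INumericOperations<T> API.
public interface ISimpleNumOps<T>
{
    T Zero { get; }
    T Add(T a, T b);
    T Multiply(T a, T b);
}

public sealed class FloatOps : ISimpleNumOps<float>
{
    public float Zero => 0f;
    public float Add(float a, float b) => a + b;
    public float Multiply(float a, float b) => a * b;
}

public sealed class DoubleOps : ISimpleNumOps<double>
{
    public double Zero => 0.0;
    public double Add(double a, double b) => a + b;
    public double Multiply(double a, double b) => a * b;
}

public static class GenericMath
{
    // Sums the squares of the values without knowing the concrete type of T.
    public static T SumOfSquares<T>(T[] values, ISimpleNumOps<T> ops)
    {
        T total = ops.Zero;
        foreach (T v in values)
        {
            total = ops.Add(total, ops.Multiply(v, v));
        }
        return total;
    }
}
```

The generic method never uses the + or * operators directly; every arithmetic step goes through the operations object, which is what lets the same code serve float and double.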

Methods

CalculateCompressionRatio(long, long)

Calculates the compression ratio achieved.

public virtual double CalculateCompressionRatio(long originalSize, long compressedSize)

Parameters

originalSize long

The original size in bytes.

compressedSize long

The compressed size in bytes.

Returns

double

The compression ratio (original size / compressed size).

Remarks

For Beginners: This calculates how much smaller the compressed model is.

The formula is simple: compression ratio = original size ÷ compressed size

Examples:

  • Original: 1000 MB, Compressed: 100 MB → Ratio: 10.0 (90% smaller)
  • Original: 500 MB, Compressed: 250 MB → Ratio: 2.0 (50% smaller)

A higher ratio means a smaller compressed model. The ratio measures size only; it does not say how much accuracy was preserved.
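
The formula and the "percent smaller" figures in the examples can be reproduced with a short sketch. CompressionMath and SpaceSaving are hypothetical names, and rejecting non-positive sizes is an assumption of this sketch; the library method may treat such inputs differently.

```csharp
using System;

public static class CompressionMath
{
    // Ratio = original size / compressed size.
    // Throwing on non-positive sizes is an assumption of this sketch.
    public static double CompressionRatio(long originalSize, long compressedSize)
    {
        if (originalSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(originalSize));
        if (compressedSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(compressedSize));
        return (double)originalSize / compressedSize;
    }

    // Fraction of the original size that was saved: 1 - compressed / original.
    public static double SpaceSaving(long originalSize, long compressedSize)
    {
        if (originalSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(originalSize));
        if (compressedSize < 0)
            throw new ArgumentOutOfRangeException(nameof(compressedSize));
        return 1.0 - (double)compressedSize / originalSize;
    }
}
```

For example, CompressionMath.CompressionRatio(1000, 100) evaluates to 10.0, and the matching space saving is 0.9, which is the "90% smaller" figure.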

Compress(Vector<T>)

Compresses the given model weights.

public abstract (Vector<T> compressedWeights, ICompressionMetadata<T> metadata) Compress(Vector<T> weights)

Parameters

weights Vector<T>

The original model weights to compress.

Returns

(Vector<T> compressedWeights, ICompressionMetadata<T> metadata)

A tuple containing the compressed weights and compression metadata.

CompressMatrix(Matrix<T>)

Compresses a 2D matrix of weights. The default implementation flattens the matrix to a vector, compresses it, and reshapes the result back into a matrix.

public virtual (Matrix<T> compressedWeights, ICompressionMetadata<T> metadata) CompressMatrix(Matrix<T> weights)

Parameters

weights Matrix<T>

The original weight matrix to compress.

Returns

(Matrix<T> compressedWeights, ICompressionMetadata<T> metadata)

A tuple containing the compressed weights and compression metadata.

Remarks

For Beginners: This method compresses a 2D weight matrix by:

  1. Flattening the matrix into a 1D vector (row by row)
  2. Applying the vector compression algorithm
  3. Wrapping the result with shape information

The compressed matrix maintains the original dimensions for convenience, but the actual size reduction comes from the underlying vector compression.
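
The three steps can be sketched with plain arrays. This is an illustration only: MatrixPipeline is a hypothetical helper, double[,] stands in for Matrix<T>, and the sketch assumes the vector transform returns as many elements as it receives, which a real compression strategy need not do.

```csharp
using System;

public static class MatrixPipeline
{
    // Step 1: flatten row by row (row-major order).
    public static double[] Flatten(double[,] matrix)
    {
        int rows = matrix.GetLength(0), cols = matrix.GetLength(1);
        var flat = new double[rows * cols];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                flat[r * cols + c] = matrix[r, c];
        return flat;
    }

    // Step 3: rebuild a rows x cols matrix from the flat values.
    public static double[,] Reshape(double[] flat, int rows, int cols)
    {
        if (flat.Length != rows * cols)
            throw new ArgumentException("Element count does not match rows * cols.");
        var matrix = new double[rows, cols];
        for (int i = 0; i < flat.Length; i++)
            matrix[i / cols, i % cols] = flat[i];
        return matrix;
    }

    // Steps 1-3 together: flatten, apply a caller-supplied vector transform,
    // then restore the original dimensions.
    public static double[,] Apply(double[,] matrix, Func<double[], double[]> vectorCompress)
    {
        double[] compressed = vectorCompress(Flatten(matrix));
        return Reshape(compressed, matrix.GetLength(0), matrix.GetLength(1));
    }
}
```

Passing the vector transform as a delegate mirrors the design described above: the matrix overload only handles shape, while the size reduction itself is left to the vector-level algorithm.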

CompressTensor(Tensor<T>)

Compresses an N-dimensional tensor of weights. The default implementation flattens the tensor to a vector, compresses it, and reshapes the result back into a tensor.

public virtual (Tensor<T> compressedWeights, ICompressionMetadata<T> metadata) CompressTensor(Tensor<T> weights)

Parameters

weights Tensor<T>

The original weight tensor to compress.

Returns

(Tensor<T> compressedWeights, ICompressionMetadata<T> metadata)

A tuple containing the compressed weights and compression metadata.

Remarks

For Beginners: This method compresses an N-dimensional tensor by:

  1. Flattening the tensor into a 1D vector
  2. Applying the vector compression algorithm
  3. Wrapping the result with shape information

Tensors are essential for convolutional layers (4D: [filters, channels, height, width]) and attention mechanisms. This method preserves the shape for reconstruction.

Decompress(Vector<T>, ICompressionMetadata<T>)

Decompresses the compressed weights back to their original form.

public abstract Vector<T> Decompress(Vector<T> compressedWeights, ICompressionMetadata<T> metadata)

Parameters

compressedWeights Vector<T>

The compressed weights.

metadata ICompressionMetadata<T>

The metadata needed for decompression.

Returns

Vector<T>

The decompressed weights.
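
To illustrate how a Compress/Decompress pair relies on metadata, here is a toy 8-bit uniform quantizer. It is a sketch under stated assumptions, not one of the library's strategies: ToyQuantizer and QuantMetadata are hypothetical, the compressed form is a byte[] rather than a Vector<T>, and the metadata holds only the minimum value and step size.

```csharp
using System;

// Hypothetical metadata for this sketch only; not ICompressionMetadata<T>.
public sealed class QuantMetadata
{
    public double Min { get; }
    public double Step { get; }
    public QuantMetadata(double min, double step) { Min = min; Step = step; }
}

public static class ToyQuantizer
{
    private const int Levels = 256; // one byte per weight

    public static (byte[] codes, QuantMetadata metadata) Compress(double[] weights)
    {
        if (weights.Length == 0)
            return (new byte[0], new QuantMetadata(0.0, 1.0));

        double min = weights[0], max = weights[0];
        foreach (double w in weights)
        {
            if (w < min) min = w;
            if (w > max) max = w;
        }

        // Guard against a zero range so the division below is safe.
        double step = max > min ? (max - min) / (Levels - 1) : 1.0;

        var codes = new byte[weights.Length];
        for (int i = 0; i < weights.Length; i++)
        {
            double level = Math.Round((weights[i] - min) / step);
            codes[i] = (byte)Math.Min(Levels - 1, Math.Max(0, level));
        }
        return (codes, new QuantMetadata(min, step));
    }

    // Decompression needs the metadata: the codes alone carry no scale.
    public static double[] Decompress(byte[] codes, QuantMetadata metadata)
    {
        var weights = new double[codes.Length];
        for (int i = 0; i < codes.Length; i++)
            weights[i] = metadata.Min + codes[i] * metadata.Step;
        return weights;
    }
}
```

The round trip is lossy: each restored weight differs from the original by at most half a step. That is the trade-off the class summary refers to when it speaks of reducing size while preserving accuracy.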

DecompressMatrix(Matrix<T>, ICompressionMetadata<T>)

Decompresses the compressed matrix weights back to their original form.

public virtual Matrix<T> DecompressMatrix(Matrix<T> compressedWeights, ICompressionMetadata<T> metadata)

Parameters

compressedWeights Matrix<T>

The compressed weight matrix.

metadata ICompressionMetadata<T>

The metadata needed for decompression.

Returns

Matrix<T>

The decompressed weight matrix.

Remarks

For Beginners: This method reverses the compression process:

  1. Flattens the compressed matrix to a vector
  2. Applies the vector decompression algorithm
  3. Reshapes back to the original matrix dimensions

The metadata must be of the same type as the metadata returned by CompressMatrix.

DecompressTensor(Tensor<T>, ICompressionMetadata<T>)

Decompresses the compressed tensor weights back to their original form.

public virtual Tensor<T> DecompressTensor(Tensor<T> compressedWeights, ICompressionMetadata<T> metadata)

Parameters

compressedWeights Tensor<T>

The compressed weight tensor.

metadata ICompressionMetadata<T>

The metadata needed for decompression.

Returns

Tensor<T>

The decompressed weight tensor.

Remarks

For Beginners: This method reverses the compression process:

  1. Flattens the compressed tensor to a vector
  2. Applies the vector decompression algorithm
  3. Reshapes back to the original tensor dimensions

The metadata must be of the same type as the metadata returned by CompressTensor.

GetCompressedSize(Matrix<T>, ICompressionMetadata<T>)

Gets the size in bytes of the compressed matrix representation.

public virtual long GetCompressedSize(Matrix<T> compressedWeights, ICompressionMetadata<T> metadata)

Parameters

compressedWeights Matrix<T>

The compressed weight matrix.

metadata ICompressionMetadata<T>

The compression metadata.

Returns

long

The total size in bytes.

Remarks

For Beginners: This calculates the total storage needed for the compressed matrix, including both the compressed data and the metadata overhead (shape information).

GetCompressedSize(Tensor<T>, ICompressionMetadata<T>)

Gets the size in bytes of the compressed tensor representation.

public virtual long GetCompressedSize(Tensor<T> compressedWeights, ICompressionMetadata<T> metadata)

Parameters

compressedWeights Tensor<T>

The compressed weight tensor.

metadata ICompressionMetadata<T>

The compression metadata.

Returns

long

The total size in bytes.

Remarks

For Beginners: This calculates the total storage needed for the compressed tensor, including both the compressed data and the metadata overhead (shape information).
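
A rough sketch of "compressed data plus shape overhead" follows. SizeEstimate is a hypothetical helper, and two points are assumptions rather than documented behavior: that the payload is a dense array of fixed-size elements, and that the shape is stored as one 32-bit integer per dimension. A matrix is the two-dimension case.

```csharp
using System;

public static class SizeEstimate
{
    // Total bytes = payload (elements * bytes per element)
    //             + shape overhead (one 4-byte int per dimension).
    // Storing the shape as 32-bit ints is an assumption of this sketch.
    public static long CompressedSize(long elementCount, int elementSize, int[] shape)
    {
        if (elementCount < 0)
            throw new ArgumentOutOfRangeException(nameof(elementCount));
        if (elementSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(elementSize));

        long payload = elementCount * elementSize;
        long shapeOverhead = (long)shape.Length * sizeof(int);
        return payload + shapeOverhead;
    }
}
```

Under these assumptions, 1000 float values with shape [10, 100] come to 4000 bytes of payload plus 8 bytes of shape information, 4008 bytes in total.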

GetCompressedSize(Vector<T>, ICompressionMetadata<T>)

Gets the size in bytes of the compressed representation.

public abstract long GetCompressedSize(Vector<T> compressedWeights, ICompressionMetadata<T> metadata)

Parameters

compressedWeights Vector<T>

The compressed weights.

metadata ICompressionMetadata<T>

The compression metadata.

Returns

long

The total size in bytes.

GetElementSize()

Gets the size in bytes of a value of type T.

protected virtual int GetElementSize()

Returns

int

The size in bytes.

Remarks

For Beginners: Different number types take different amounts of memory.

Common sizes:

  • float (single precision): 4 bytes
  • double (double precision): 8 bytes

This method figures out the size automatically based on the type being used.
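
One way such a lookup could be written is shown below. This is a sketch, not the library's implementation: ElementSize.Of<T>() is a hypothetical helper, and falling back to Marshal.SizeOf<T>() for other value types is a design choice made here for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ElementSize
{
    // Returns the storage size of T in bytes. Common numeric types are
    // handled explicitly; other value types fall back to Marshal.SizeOf<T>().
    public static int Of<T>() where T : struct
    {
        if (typeof(T) == typeof(float)) return sizeof(float);     // 4 bytes
        if (typeof(T) == typeof(double)) return sizeof(double);   // 8 bytes
        if (typeof(T) == typeof(decimal)) return sizeof(decimal); // 16 bytes
        return Marshal.SizeOf<T>();
    }
}
```

The explicit branches keep the common cases independent of marshalling rules, which can differ from managed sizes for types such as bool and char.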

MatrixToVector(Matrix<T>)

Converts a matrix to a flattened vector.

protected Vector<T> MatrixToVector(Matrix<T> matrix)

Parameters

matrix Matrix<T>

Returns

Vector<T>

TensorToVector(Tensor<T>)

Converts a tensor to a flattened vector.

protected Vector<T> TensorToVector(Tensor<T> tensor)

Parameters

tensor Tensor<T>

Returns

Vector<T>

Remarks

For Beginners: This method flattens an N-dimensional tensor into a 1D vector, preserving all values in row-major order. This is the first step in tensor compression - convert to 1D, compress, then convert back.
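
Row-major order means the last dimension varies fastest. The sketch below computes where a given multi-dimensional index lands in the flattened vector; RowMajor.FlatIndex is a hypothetical helper, and plain int[] arrays stand in for the library's tensor shape and index types.

```csharp
using System;

public static class RowMajor
{
    // Maps a multi-dimensional index to its position in the flattened vector.
    // In row-major order the last dimension varies fastest.
    public static int FlatIndex(int[] shape, int[] index)
    {
        if (shape.Length != index.Length)
            throw new ArgumentException("Index rank must match shape rank.");

        int flat = 0;
        for (int d = 0; d < shape.Length; d++)
        {
            if (index[d] < 0 || index[d] >= shape[d])
                throw new ArgumentOutOfRangeException(nameof(index));
            // Shift everything accumulated so far by the size of this dimension.
            flat = flat * shape[d] + index[d];
        }
        return flat;
    }
}
```

For a tensor of shape [2, 3, 4], the element at index [1, 2, 3] is the last of 24 values, so it lands at flat position 23.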

VectorToMatrix(Vector<T>, int, int)

Converts a vector to a matrix with specified dimensions.

protected Matrix<T> VectorToMatrix(Vector<T> vector, int rows, int cols)

Parameters

vector Vector<T>

rows int

cols int

Returns

Matrix<T>

VectorToTensor(Vector<T>, int[])

Converts a vector to a tensor with specified shape.

protected Tensor<T> VectorToTensor(Vector<T> vector, int[] shape)

Parameters

vector Vector<T>

shape int[]

Returns

Tensor<T>

Remarks

For Beginners: This method reshapes a 1D vector into an N-dimensional tensor with the specified shape. The vector values are placed in row-major order into the tensor. The total number of elements in the vector must match the product of all shape dimensions.
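
The element-count requirement and the row-major placement can be sketched as follows. ShapeCheck is a hypothetical helper; how the library reports a mismatch (exception type and message) is not stated in this page, so the ArgumentException here is an assumption.

```csharp
using System;

public static class ShapeCheck
{
    // Product of all dimensions: the element count a shape requires.
    public static long ElementCount(int[] shape)
    {
        long count = 1;
        foreach (int dim in shape)
        {
            if (dim < 0)
                throw new ArgumentException("Dimensions must be non-negative.");
            count *= dim;
        }
        return count;
    }

    // Throws if a vector of the given length cannot fill the shape exactly.
    public static void EnsureCompatible(int vectorLength, int[] shape)
    {
        long expected = ElementCount(shape);
        if (vectorLength != expected)
            throw new ArgumentException(
                $"Vector has {vectorLength} elements but the shape requires {expected}.");
    }

    // Recovers the multi-dimensional index for a flat row-major position.
    public static int[] Unflatten(int flatIndex, int[] shape)
    {
        if (flatIndex < 0 || flatIndex >= ElementCount(shape))
            throw new ArgumentOutOfRangeException(nameof(flatIndex));

        var index = new int[shape.Length];
        for (int d = shape.Length - 1; d >= 0; d--)
        {
            index[d] = flatIndex % shape[d]; // the last dimension varies fastest
            flatIndex /= shape[d];
        }
        return index;
    }
}
```

With shape [2, 3, 4], a vector must hold exactly 24 values, and the value at flat position 23 is placed at index [1, 2, 3].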