Class CompressionMetrics<T>

Namespace
AiDotNet.ModelCompression
Assembly
AiDotNet.dll

Provides metrics and statistics for model compression operations.

public class CompressionMetrics<T>

Type Parameters

T

The numeric type used for calculations (e.g., float, double).

Inheritance
CompressionMetrics<T>

Remarks

CompressionMetrics tracks important statistics about the compression process, including compression ratio, model size reduction, inference speed impact, and accuracy preservation. These metrics help evaluate the effectiveness of different compression strategies.

For Beginners: CompressionMetrics helps you measure how well compression worked.

When you compress a model, you want to know:

  • How much smaller did it get? (compression ratio)
  • How much memory did we save? (size reduction)
  • Did it get faster? (inference speed)
  • Is it still accurate? (accuracy preservation)

This class tracks all these important measurements so you can:

  • Compare different compression techniques
  • Decide if the compression is worth it
  • Find the best balance between size and accuracy

Example:

  • Original model: 100 MB, 95% accuracy, 10ms inference
  • Compressed model: 10 MB, 94% accuracy, 5ms inference
  • Metrics show: 10x compression, 1 percentage point of accuracy loss, 2x speedup
  • Conclusion: Great compression! The small accuracy loss is worth the huge size reduction.
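The arithmetic behind this example can be sketched in a few lines of plain C#. This is only an illustration of the formulas documented on the properties below; the variable names are made up for the sketch and it does not call into AiDotNet.

```csharp
using System;

// Base measurements taken from the example above.
long originalSizeBytes = 100L * 1024 * 1024;   // 100 MB
long compressedSizeBytes = 10L * 1024 * 1024;  // 10 MB
double originalAccuracy = 95.0;                // percent
double compressedAccuracy = 94.0;              // percent
double originalInferenceMs = 10.0;
double compressedInferenceMs = 5.0;

// Derived values, following the documented formulas:
//   compression ratio = original size / compressed size
//   accuracy loss     = original accuracy - compressed accuracy
//   speedup           = original time / compressed time
double compressionRatio = (double)originalSizeBytes / compressedSizeBytes;
double accuracyLoss = originalAccuracy - compressedAccuracy;
double inferenceSpeedup = originalInferenceMs / compressedInferenceMs;

Console.WriteLine($"Compression ratio: {compressionRatio}x");
Console.WriteLine($"Accuracy loss: {accuracyLoss} percentage points");
Console.WriteLine($"Inference speedup: {inferenceSpeedup}x");
```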

Properties

AccuracyLoss

Gets or sets the accuracy loss percentage.

public T AccuracyLoss { get; set; }

Property Value

T

Remarks

For Beginners: This shows how much accuracy was lost due to compression.

Formula: original accuracy - compressed accuracy

Examples:

  • 0% = no accuracy loss (perfect!)
  • 1% = slight loss (usually acceptable)
  • 5% = significant loss (might be too much)

For most applications the goal is to keep this under 2%, which is also the default limit used by MeetsQualityThreshold. If you lose more than that, consider less aggressive compression or a different technique.

BitsPerWeight

Gets or sets the number of bits per weight after quantization.

public T BitsPerWeight { get; set; }

Property Value

T

Remarks

For Beginners: Shows how many bits are used to represent each weight.

Examples:

  • 32 bits = full precision (float)
  • 16 bits = half precision
  • 8 bits = int8 quantization
  • 5 bits = aggressive quantization (32 clusters)

Lower bits = more compression but potentially less accuracy.
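To make the trade-off concrete, the following sketch relates a bit width to the number of distinct values a weight can take and to the best-case size reduction against 32-bit floats. Both helpers are hypothetical and written for this illustration only; the ratio ignores codebook and metadata overhead, so real results will be somewhat lower.

```csharp
using System;

// Hypothetical helpers for illustration; not part of AiDotNet.

// Number of distinct values a weight can take when stored in `bits` bits.
static long DistinctLevels(int bits) => 1L << bits;

// Best-case compression ratio versus 32-bit floats (overhead ignored).
static double IdealRatioVsFloat32(int bits) => 32.0 / bits;

foreach (int bits in new[] { 32, 16, 8, 5 })
{
    Console.WriteLine(
        $"{bits} bits: {DistinctLevels(bits)} levels, ideal ratio {IdealRatioVsFloat32(bits):F1}x");
}
```

With 5 bits, each weight selects one of 32 values, which is where the "32 clusters" figure above comes from.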

CompressedAccuracy

Gets or sets the compressed model accuracy (after compression).

public T CompressedAccuracy { get; set; }

Property Value

T

CompressedInferenceTimeMs

Gets or sets the compressed model inference time in milliseconds.

public T CompressedInferenceTimeMs { get; set; }

Property Value

T

CompressedSize

Gets or sets the compressed model size in bytes.

public long CompressedSize { get; set; }

Property Value

long

CompressionRatio

Gets or sets the compression ratio (original size / compressed size).

public T CompressionRatio { get; set; }

Property Value

T

Remarks

For Beginners: The compression ratio shows how much smaller the model became.

Examples:

  • Ratio of 2.0 = model is half the size (50% reduction)
  • Ratio of 10.0 = model is 1/10th the size (90% reduction)
  • Ratio of 50.0 = model is 1/50th the size (98% reduction)

Higher is better! A ratio of 20 means you reduced the model to 5% of its original size.

CompressionTechnique

Gets or sets the compression technique used.

public string CompressionTechnique { get; set; }

Property Value

string

CompressionTimeMs

Gets or sets the time taken to perform compression in milliseconds.

public T CompressionTimeMs { get; set; }

Property Value

T

DecompressionTimeMs

Gets or sets the time taken to decompress in milliseconds.

public T DecompressionTimeMs { get; set; }

Property Value

T

EffectiveParameterCount

Gets or sets the effective number of unique parameters after compression.

public long EffectiveParameterCount { get; set; }

Property Value

long

InferenceSpeedup

Gets or sets the inference speedup factor.

public T InferenceSpeedup { get; set; }

Property Value

T

Remarks

For Beginners: This shows how much faster (or slower) the compressed model is.

Formula: original time / compressed time

Examples:

  • 1.0 = same speed
  • 2.0 = twice as fast
  • 0.5 = half as fast (slower, for example because of decompression overhead)

Compression often makes models faster because less data has to be moved from memory, but decompression can add overhead that offsets the gain.
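A minimal sketch of the speedup formula, using a hypothetical helper rather than the library itself, shows how values above and below 1.0 arise:

```csharp
using System;

// Hypothetical helper; mirrors the documented formula: original time / compressed time.
static double Speedup(double originalMs, double compressedMs)
{
    if (compressedMs <= 0)
        throw new ArgumentOutOfRangeException(nameof(compressedMs), "Time must be positive.");
    return originalMs / compressedMs;
}

double faster = Speedup(10.0, 5.0);   // 2.0: the compressed model is twice as fast
double slower = Speedup(10.0, 20.0);  // 0.5: the compressed model takes twice as long

Console.WriteLine($"faster: {faster}, slower: {slower}");
```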

MemoryBandwidthSavings

Gets or sets the memory bandwidth savings ratio.

public T MemoryBandwidthSavings { get; set; }

Property Value

T

Remarks

For Beginners: Shows how much memory bandwidth is saved during inference.

Memory bandwidth is often the bottleneck for neural network inference. Smaller models need less data moved from memory, making inference faster.

OriginalAccuracy

Gets or sets the original model accuracy (before compression).

public T OriginalAccuracy { get; set; }

Property Value

T

OriginalInferenceTimeMs

Gets or sets the original inference time in milliseconds.

public T OriginalInferenceTimeMs { get; set; }

Property Value

T

OriginalParameterCount

Gets or sets the number of parameters in the original model.

public long OriginalParameterCount { get; set; }

Property Value

long

OriginalSize

Gets or sets the original model size in bytes.

public long OriginalSize { get; set; }

Property Value

long

ReconstructionError

Gets or sets the reconstruction error (for lossy compression).

public T ReconstructionError { get; set; }

Property Value

T

Remarks

For Beginners: Shows the average error when decompressing weights.

For lossy compression techniques (like quantization), the decompressed weights are approximations of the original. This metric measures that approximation error. Lower is better.
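This page does not say which error measure the library reports. As one plausible reading, the sketch below computes the mean squared error between the original and decompressed weights; the helper and the sample values are assumptions made for illustration.

```csharp
using System;

// Hypothetical helper: mean squared error between original and reconstructed weights.
// Which error measure AiDotNet actually uses is not stated here; MSE is an assumption.
static double MeanSquaredError(double[] original, double[] reconstructed)
{
    if (original.Length != reconstructed.Length)
        throw new ArgumentException("Arrays must have the same length.");
    if (original.Length == 0)
        throw new ArgumentException("Arrays must not be empty.");

    double sum = 0.0;
    for (int i = 0; i < original.Length; i++)
    {
        double diff = original[i] - reconstructed[i];
        sum += diff * diff;
    }
    return sum / original.Length;
}

double[] weights = { 0.5, -0.25, 1.0, 0.0 };
double[] decompressed = { 0.5, -0.5, 1.0, 0.25 };  // e.g., after rounding to a coarse grid

double error = MeanSquaredError(weights, decompressed);
Console.WriteLine($"Reconstruction error (MSE): {error}");
```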

SizeReductionPercentage

Gets or sets the percentage of size reduction.

public T SizeReductionPercentage { get; set; }

Property Value

T

Remarks

For Beginners: This shows the size reduction as a percentage.

Formula: (1 - compressed size / original size) × 100%

Examples:

  • 50% = model is half the original size
  • 90% = model is 1/10th the original size
  • 98% = model is 1/50th the original size

This is just another way to express the compression ratio, often easier to understand.
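The relationship between the two measures can be sketched as follows. The helpers are hypothetical and only restate the formula above, together with the equivalent form (1 - 1 / compression ratio) × 100%.

```csharp
using System;

// Hypothetical helpers for illustration; not part of AiDotNet.

// Documented formula: (1 - compressed size / original size) * 100.
static double SizeReductionPercent(long originalBytes, long compressedBytes)
    => (1.0 - (double)compressedBytes / originalBytes) * 100.0;

// Same quantity expressed through the compression ratio.
static double ReductionPercentFromRatio(double compressionRatio)
    => (1.0 - 1.0 / compressionRatio) * 100.0;

double fromSizes = SizeReductionPercent(100_000_000, 10_000_000);  // about 90
double fromRatio = ReductionPercentFromRatio(10.0);                // about 90

Console.WriteLine($"From sizes: {fromSizes:F1}%, from ratio: {fromRatio:F1}%");
```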

Sparsity

Gets or sets the sparsity level achieved (fraction of zero weights).

public T Sparsity { get; set; }

Property Value

T

Remarks

For Beginners: Sparsity shows what fraction of weights are zero after pruning.

Examples:

  • 0.0 = no zeros (dense model)
  • 0.9 = 90% zeros (very sparse)
  • 0.99 = 99% zeros (extremely sparse)

Higher sparsity means better compression potential but may affect accuracy.
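As a rough illustration, sparsity can be computed by counting exact zeros in a weight array. The helper below is hypothetical and is not the library's implementation.

```csharp
using System;
using System.Linq;

// Hypothetical helper: fraction of weights that are exactly zero.
static double Sparsity(double[] weights)
{
    if (weights.Length == 0)
        throw new ArgumentException("Weights must not be empty.");
    int zeros = weights.Count(w => w == 0.0);
    return (double)zeros / weights.Length;
}

// Ten weights, of which eight were pruned to zero.
double[] pruned = { 0.0, 0.0, 0.7, 0.0, -0.3, 0.0, 0.0, 0.0, 0.0, 0.0 };
double sparsity = Sparsity(pruned);

Console.WriteLine($"Sparsity: {sparsity:F2}");
```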

Methods

CalculateCompositeFitness(double, double, double)

Calculates a composite fitness score for multi-objective optimization.

public T CalculateCompositeFitness(double accuracyWeight = 0.5, double compressionWeight = 0.3, double speedWeight = 0.2)

Parameters

accuracyWeight double

Weight for accuracy preservation (default: 0.5).

compressionWeight double

Weight for compression ratio (default: 0.3).

speedWeight double

Weight for inference speedup (default: 0.2).

Returns

T

A composite fitness score where higher is better.

Remarks

For Beginners: This calculates a single score that balances multiple objectives.

When optimizing compression, we care about multiple things:

  • High accuracy preservation (less accuracy loss = good)
  • High compression ratio (smaller model = good)
  • Fast inference (more speedup = good)

The weights control how much each factor matters. Default weights prioritize:

  • 50% accuracy preservation
  • 30% compression ratio
  • 20% inference speed

This is useful for AutoML and genetic algorithms that need a single fitness value.
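How the library normalizes each objective before weighting is not described on this page. The sketch below shows one way a weighted score of this kind could be built; the scaling of each term to the range 0 to 1 is an assumption, and CompositeScore is a hypothetical function, not the library method.

```csharp
using System;

// Hypothetical scoring function for illustration only.
// The normalization used by AiDotNet is not documented here; this scaling is assumed.
static double CompositeScore(
    double accuracyLossPercent, double compressionRatio, double inferenceSpeedup,
    double accuracyWeight = 0.5, double compressionWeight = 0.3, double speedWeight = 0.2)
{
    // Map each objective to [0, 1] so that the weights are comparable.
    double accuracyTerm = Math.Clamp(1.0 - accuracyLossPercent / 100.0, 0.0, 1.0);
    double compressionTerm = Math.Clamp(1.0 - 1.0 / compressionRatio, 0.0, 1.0);
    double speedTerm = Math.Clamp(1.0 - 1.0 / inferenceSpeedup, 0.0, 1.0);

    return accuracyWeight * accuracyTerm
         + compressionWeight * compressionTerm
         + speedWeight * speedTerm;
}

// Candidate A: 1% loss, 10x smaller, 2x faster.
double scoreA = CompositeScore(1.0, 10.0, 2.0);
// Candidate B: 5% loss, 20x smaller, 2x faster.
double scoreB = CompositeScore(5.0, 20.0, 2.0);

Console.WriteLine($"A: {scoreA:F3}, B: {scoreB:F3}");
```

Under these assumed terms and the default weights, candidate A scores slightly higher than B because the extra accuracy loss outweighs the additional compression.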

CalculateDerivedMetrics()

Calculates all derived metrics from the base measurements.

public void CalculateDerivedMetrics()

Remarks

For Beginners: This method calculates all the ratios and percentages automatically.

Call this after setting the base values:

  • OriginalSize and CompressedSize
  • OriginalInferenceTimeMs and CompressedInferenceTimeMs
  • OriginalAccuracy and CompressedAccuracy

It will then calculate:

  • CompressionRatio
  • SizeReductionPercentage
  • InferenceSpeedup
  • AccuracyLoss
  • MemoryBandwidthSavings

FromDeepCompressionStats(DeepCompressionStats, string)

Creates a CompressionMetrics instance from a DeepCompressionStats object.

public static CompressionMetrics<T> FromDeepCompressionStats(DeepCompressionStats stats, string technique = "Deep Compression")

Parameters

stats DeepCompressionStats

The DeepCompressionStats containing compression statistics.

technique string

The name of the compression technique used.

Returns

CompressionMetrics<T>

A populated CompressionMetrics instance.

IsBetterThan(CompressionMetrics<T>, double, double, double)

Compares this compression result to another and determines whether this one is better.

public bool IsBetterThan(CompressionMetrics<T> other, double accuracyWeight = 0.5, double compressionWeight = 0.3, double speedWeight = 0.2)

Parameters

other CompressionMetrics<T>

The other compression metrics to compare against.

accuracyWeight double

Weight for accuracy preservation (default: 0.5).

compressionWeight double

Weight for compression ratio (default: 0.3).

speedWeight double

Weight for inference speedup (default: 0.2).

Returns

bool

True if this compression is better than the other.

MeetsQualityThreshold(double, double)

Determines whether the compression meets the quality thresholds, which default to a maximum accuracy loss of 2% and a minimum compression ratio of 2x.

public bool MeetsQualityThreshold(double maxAccuracyLossPercentage = 2, double minCompressionRatio = 2)

Parameters

maxAccuracyLossPercentage double

Maximum acceptable accuracy loss percentage (default: 2.0).

minCompressionRatio double

Minimum acceptable compression ratio (default: 2.0).

Returns

bool

True if compression meets the quality criteria, false otherwise.

MeetsQualityThreshold(T, T)

Determines if the compression meets the specified quality threshold.

public bool MeetsQualityThreshold(T maxAccuracyLossPercentage, T minCompressionRatio)

Parameters

maxAccuracyLossPercentage T

Maximum acceptable accuracy loss percentage (for example, 2 for 2%).

minCompressionRatio T

Minimum acceptable compression ratio (for example, 2 for 2x).

Returns

bool

True if compression meets the quality criteria, false otherwise.

Remarks

For Beginners: This method checks if the compression is "good enough".

It verifies two things:

  1. Accuracy loss is acceptable (not too much)
  2. Compression is significant enough (worthwhile)

Example thresholds:

  • maxAccuracyLossPercentage = 2% means we accept up to 2% accuracy loss
  • minCompressionRatio = 2x means we want at least 50% size reduction

If both conditions are met, the compression is considered successful.
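The two-part check can be sketched as below. MeetsThreshold is a hypothetical stand-in for the library method, and treating both limits as inclusive is an assumption, since this page does not say how boundary values are handled.

```csharp
using System;

// Hypothetical stand-in for the documented check; not the library implementation.
// Inclusive comparisons (<= and >=) are an assumption.
static bool MeetsThreshold(
    double accuracyLossPercent, double compressionRatio,
    double maxAccuracyLossPercent = 2.0, double minCompressionRatio = 2.0)
    => accuracyLossPercent <= maxAccuracyLossPercent
       && compressionRatio >= minCompressionRatio;

bool good = MeetsThreshold(1.0, 10.0);         // true: small loss, large reduction
bool tooLossy = MeetsThreshold(5.0, 10.0);     // false: accuracy loss exceeds 2%
bool tooLittle = MeetsThreshold(0.5, 1.5);     // false: ratio is below 2x

Console.WriteLine($"{good}, {tooLossy}, {tooLittle}");
```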

ToString()

Gets a human-readable summary of the compression metrics.

public override string ToString()

Returns

string

A formatted string containing all metrics.