Class CompressionMetrics<T>
- Namespace: AiDotNet.ModelCompression
- Assembly: AiDotNet.dll
Provides metrics and statistics for model compression operations.
public class CompressionMetrics<T>
Type Parameters
T: The numeric type used for calculations (e.g., float, double).
- Inheritance: object → CompressionMetrics<T>
Remarks
CompressionMetrics tracks important statistics about the compression process, including compression ratio, model size reduction, inference speed impact, and accuracy preservation. These metrics help evaluate the effectiveness of different compression strategies.
For Beginners: CompressionMetrics helps you measure how well compression worked.
When you compress a model, you want to know:
- How much smaller did it get? (compression ratio)
- How much memory did we save? (size reduction)
- Did it get faster? (inference speed)
- Is it still accurate? (accuracy preservation)
This class tracks all these important measurements so you can:
- Compare different compression techniques
- Decide if the compression is worth it
- Find the best balance between size and accuracy
Example:
- Original model: 100 MB, 95% accuracy, 10ms inference
- Compressed model: 10 MB, 94% accuracy, 5ms inference
- Metrics show: 10x compression, 1 percentage point of accuracy loss, 2x speedup
- Conclusion: Great compression! The small accuracy loss is worth the huge size reduction.
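The arithmetic behind this example can be sketched in a few lines. This is a standalone illustration that does not call the library; the variable names are chosen for the sketch, and the formulas are the ones stated in the property remarks on this page.

```csharp
using System;

// Base measurements taken from the example above.
double originalSizeMb = 100.0, compressedSizeMb = 10.0;
double originalAccuracy = 95.0, compressedAccuracy = 94.0;   // percent
double originalTimeMs = 10.0, compressedTimeMs = 5.0;

// Derived values, following the formulas given in the property remarks.
double compressionRatio = originalSizeMb / compressedSizeMb;   // 10
double accuracyLoss = originalAccuracy - compressedAccuracy;   // 1 (percentage point)
double inferenceSpeedup = originalTimeMs / compressedTimeMs;   // 2

Console.WriteLine($"{compressionRatio}x compression, {accuracyLoss}% accuracy loss, {inferenceSpeedup}x speedup");
```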
Properties
AccuracyLoss
Gets or sets the accuracy loss percentage.
public T AccuracyLoss { get; set; }
Property Value
- T
Remarks
For Beginners: This shows how much accuracy was lost due to compression.
Formula: original accuracy - compressed accuracy
Examples:
- 0% = no accuracy loss (perfect!)
- 1% = slight loss (usually acceptable)
- 5% = significant loss (might be too much)
A common target is to keep this under 2%, which is also the default maxAccuracyLossPercentage in MeetsQualityThreshold. If you lose more than that, consider less aggressive compression or a different technique.
BitsPerWeight
Gets or sets the number of bits per weight after quantization.
public T BitsPerWeight { get; set; }
Property Value
- T
Remarks
For Beginners: Shows how many bits are used to represent each weight.
Examples:
- 32 bits = full precision (float)
- 16 bits = half precision
- 8 bits = int8 quantization
- 5 bits = aggressive quantization (32 clusters)
Lower bits = more compression but potentially less accuracy.
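To make the relationship between bit width, cluster count, and storage concrete, here is a small standalone sketch. DistinctLevels and ApproxWeightBytes are hypothetical helpers written for this illustration and are not part of the library; the size estimate ignores codebooks, indices, and other metadata.

```csharp
using System;

// Hypothetical helper: number of distinct values representable with the given bit width.
long DistinctLevels(int bitsPerWeight) => 1L << bitsPerWeight;

// Hypothetical helper: approximate weight storage in bytes, ignoring codebooks and metadata.
double ApproxWeightBytes(long parameterCount, double bitsPerWeight) =>
    parameterCount * bitsPerWeight / 8.0;

Console.WriteLine(DistinctLevels(5));                  // 32 clusters
Console.WriteLine(ApproxWeightBytes(1_000_000, 32));   // 4,000,000 bytes at full precision
Console.WriteLine(ApproxWeightBytes(1_000_000, 5));    // 625,000 bytes at 5 bits
```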
CompressedAccuracy
Gets or sets the compressed model accuracy (after compression).
public T CompressedAccuracy { get; set; }
Property Value
- T
CompressedInferenceTimeMs
Gets or sets the compressed model inference time in milliseconds.
public T CompressedInferenceTimeMs { get; set; }
Property Value
- T
CompressedSize
Gets or sets the compressed model size in bytes.
public long CompressedSize { get; set; }
Property Value
- long
CompressionRatio
Gets or sets the compression ratio (original size / compressed size).
public T CompressionRatio { get; set; }
Property Value
- T
Remarks
For Beginners: The compression ratio shows how much smaller the model became.
Examples:
- Ratio of 2.0 = model is half the size (50% reduction)
- Ratio of 10.0 = model is 1/10th the size (90% reduction)
- Ratio of 50.0 = model is 1/50th the size (98% reduction)
Higher is better! A ratio of 20 means you reduced the model to 5% of its original size.
CompressionTechnique
Gets or sets the compression technique used.
public string CompressionTechnique { get; set; }
Property Value
- string
CompressionTimeMs
Gets or sets the time taken to perform compression in milliseconds.
public T CompressionTimeMs { get; set; }
Property Value
- T
DecompressionTimeMs
Gets or sets the time taken to decompress in milliseconds.
public T DecompressionTimeMs { get; set; }
Property Value
- T
EffectiveParameterCount
Gets or sets the effective number of unique parameters after compression.
public long EffectiveParameterCount { get; set; }
Property Value
- long
InferenceSpeedup
Gets or sets the inference speedup factor.
public T InferenceSpeedup { get; set; }
Property Value
- T
Remarks
For Beginners: This shows how much faster (or slower) the compressed model is.
Formula: original time / compressed time
Examples:
- 1.0 = same speed
- 2.0 = twice as fast
- 0.5 = half as fast (slower due to decompression overhead)
Compression usually makes models faster because there's less data to move around, but sometimes decompression adds overhead.
MemoryBandwidthSavings
Gets or sets the memory bandwidth savings ratio.
public T MemoryBandwidthSavings { get; set; }
Property Value
- T
Remarks
For Beginners: Shows how much memory bandwidth is saved during inference.
Memory bandwidth is often the bottleneck for neural network inference. Smaller models need less data moved from memory, making inference faster.
OriginalAccuracy
Gets or sets the original model accuracy (before compression).
public T OriginalAccuracy { get; set; }
Property Value
- T
OriginalInferenceTimeMs
Gets or sets the original inference time in milliseconds.
public T OriginalInferenceTimeMs { get; set; }
Property Value
- T
OriginalParameterCount
Gets or sets the number of parameters in the original model.
public long OriginalParameterCount { get; set; }
Property Value
- long
OriginalSize
Gets or sets the original model size in bytes.
public long OriginalSize { get; set; }
Property Value
- long
ReconstructionError
Gets or sets the reconstruction error (for lossy compression).
public T ReconstructionError { get; set; }
Property Value
- T
Remarks
For Beginners: Shows the average error when decompressing weights.
For lossy compression techniques (like quantization), the decompressed weights are approximations of the original. This metric measures that approximation error. Lower is better.
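The page does not say which error measure the library uses. As one plausible reading of "average error", the sketch below computes the mean absolute difference between original and decompressed weights. MeanAbsoluteError is a hypothetical helper, and the choice of metric is an assumption made for illustration.

```csharp
using System;

// Hypothetical helper: mean absolute difference between original and decompressed weights.
// The library may use a different measure (for example, mean squared error).
double MeanAbsoluteError(double[] original, double[] reconstructed)
{
    if (original.Length != reconstructed.Length)
        throw new ArgumentException("Weight arrays must have the same length.");
    if (original.Length == 0)
        return 0.0;

    double total = 0.0;
    for (int i = 0; i < original.Length; i++)
        total += Math.Abs(original[i] - reconstructed[i]);
    return total / original.Length;
}

double[] originalWeights = { 0.5, -0.25, 1.0, 0.0 };
double[] decompressedWeights = { 0.5, -0.5, 0.75, 0.0 };   // e.g. after coarse quantization
double error = MeanAbsoluteError(originalWeights, decompressedWeights);   // 0.125
Console.WriteLine(error);
```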
SizeReductionPercentage
Gets or sets the percentage of size reduction.
public T SizeReductionPercentage { get; set; }
Property Value
- T
Remarks
For Beginners: This shows the size reduction as a percentage.
Formula: (1 - compressed size / original size) × 100%
Examples:
- 50% = model is half the original size
- 90% = model is 1/10th the original size
- 98% = model is 1/50th the original size
This is just another way to express the compression ratio, often easier to understand.
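Because the two properties carry the same information, one can be converted into the other. The helpers below are hypothetical and exist only to show the conversion implied by the formulas on this page.

```csharp
using System;

// Hypothetical helpers converting between the two representations.
double ReductionPercentFromRatio(double compressionRatio) =>
    (1.0 - 1.0 / compressionRatio) * 100.0;

double RatioFromReductionPercent(double reductionPercent) =>
    1.0 / (1.0 - reductionPercent / 100.0);

Console.WriteLine(ReductionPercentFromRatio(2.0));    // 50
Console.WriteLine(ReductionPercentFromRatio(10.0));   // 90
Console.WriteLine(RatioFromReductionPercent(50.0));   // 2
```

Note that a 100% reduction has no finite ratio, so RatioFromReductionPercent is only meaningful for inputs below 100.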
Sparsity
Gets or sets the sparsity level achieved (fraction of zero weights).
public T Sparsity { get; set; }
Property Value
- T
Remarks
For Beginners: Sparsity shows what fraction of weights are zero after pruning.
Examples:
- 0.0 = no zeros (dense model)
- 0.9 = 90% zeros (very sparse)
- 0.99 = 99% zeros (extremely sparse)
Higher sparsity means better compression potential but may affect accuracy.
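As an illustration of the definition, sparsity can be measured by counting exact zeros in a weight array. ZeroFraction is a hypothetical helper for this sketch; how the library counts near-zero values is not stated on this page.

```csharp
using System;
using System.Linq;

// Hypothetical helper: fraction of weights that are exactly zero.
double ZeroFraction(double[] weights) =>
    weights.Length == 0 ? 0.0 : (double)weights.Count(w => w == 0.0) / weights.Length;

double[] prunedWeights = { 0.0, 0.0, 0.7, 0.0, -0.2, 0.0, 0.0, 0.0, 0.0, 0.0 };
double sparsity = ZeroFraction(prunedWeights);   // 0.8 (8 of 10 weights are zero)
Console.WriteLine(sparsity);
```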
Methods
CalculateCompositeFitness(double, double, double)
Calculates a composite fitness score for multi-objective optimization.
public T CalculateCompositeFitness(double accuracyWeight = 0.5, double compressionWeight = 0.3, double speedWeight = 0.2)
Parameters
accuracyWeight (double): Weight for accuracy preservation (default: 0.5).
compressionWeight (double): Weight for compression ratio (default: 0.3).
speedWeight (double): Weight for inference speedup (default: 0.2).
Returns
- T
A composite fitness score where higher is better.
Remarks
For Beginners: This calculates a single score that balances multiple objectives.
When optimizing compression, we care about multiple things:
- High accuracy preservation (less accuracy loss = good)
- High compression ratio (smaller model = good)
- Fast inference (more speedup = good)
The weights control how much each factor matters. Default weights prioritize:
- 50% accuracy preservation
- 30% compression ratio
- 20% inference speed
This is useful for AutoML and genetic algorithms that need a single fitness value.
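The exact formula is not given on this page. The sketch below shows one way a weighted composite score could be built, assuming each objective is first mapped to roughly the range 0 to 1. WeightedFitness and its normalization are assumptions for illustration, not the library's implementation.

```csharp
using System;

// Hypothetical scoring function; the normalization below is an assumption,
// not the library's formula.
double WeightedFitness(
    double originalAccuracy, double compressedAccuracy,
    double compressionRatio, double inferenceSpeedup,
    double accuracyWeight = 0.5, double compressionWeight = 0.3, double speedWeight = 0.2)
{
    // Each term is mapped to roughly [0, 1] so the weights are comparable.
    double accuracyTerm = originalAccuracy > 0 ? compressedAccuracy / originalAccuracy : 0.0;
    double compressionTerm = Math.Max(0.0, 1.0 - 1.0 / compressionRatio);
    double speedTerm = Math.Max(0.0, 1.0 - 1.0 / inferenceSpeedup);

    return accuracyWeight * accuracyTerm
         + compressionWeight * compressionTerm
         + speedWeight * speedTerm;
}

// Same measurements, scored with the default weights and with size-focused weights.
double balanced = WeightedFitness(95.0, 94.0, 10.0, 2.0);
double sizeFocused = WeightedFitness(95.0, 94.0, 10.0, 2.0,
    accuracyWeight: 0.2, compressionWeight: 0.7, speedWeight: 0.1);
Console.WriteLine($"{balanced:F3} vs {sizeFocused:F3}");
```

Changing the weights changes which candidate wins, so a search procedure should use the same weights for every candidate it compares.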
CalculateDerivedMetrics()
Calculates all derived metrics from the base measurements.
public void CalculateDerivedMetrics()
Remarks
For Beginners: This method calculates all the ratios and percentages automatically.
Call this after setting the base values:
- OriginalSize and CompressedSize
- OriginalInferenceTimeMs and CompressedInferenceTimeMs
- OriginalAccuracy and CompressedAccuracy
It will then calculate:
- CompressionRatio
- SizeReductionPercentage
- InferenceSpeedup
- AccuracyLoss
- MemoryBandwidthSavings
FromDeepCompressionStats(DeepCompressionStats, string)
Creates a CompressionMetrics instance from a DeepCompressionStats object.
public static CompressionMetrics<T> FromDeepCompressionStats(DeepCompressionStats stats, string technique = "Deep Compression")
Parameters
stats (DeepCompressionStats): The DeepCompressionStats containing compression statistics.
technique (string): The name of the compression technique used.
Returns
- CompressionMetrics<T>
A populated CompressionMetrics instance.
IsBetterThan(CompressionMetrics<T>, double, double, double)
Compares this compression result to another and determines which is better.
public bool IsBetterThan(CompressionMetrics<T> other, double accuracyWeight = 0.5, double compressionWeight = 0.3, double speedWeight = 0.2)
Parameters
other (CompressionMetrics<T>): The other compression metrics to compare against.
accuracyWeight (double): Weight for accuracy preservation.
compressionWeight (double): Weight for compression ratio.
speedWeight (double): Weight for inference speedup.
Returns
- bool
True if this compression is better than the other.
MeetsQualityThreshold(double, double)
Determines if the compression meets the specified quality threshold using default values.
public bool MeetsQualityThreshold(double maxAccuracyLossPercentage = 2, double minCompressionRatio = 2)
Parameters
maxAccuracyLossPercentage (double): Maximum acceptable accuracy loss percentage (default: 2.0).
minCompressionRatio (double): Minimum acceptable compression ratio (default: 2.0).
Returns
- bool
True if compression meets the quality criteria, false otherwise.
MeetsQualityThreshold(T, T)
Determines if the compression meets the specified quality threshold.
public bool MeetsQualityThreshold(T maxAccuracyLossPercentage, T minCompressionRatio)
Parameters
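maxAccuracyLossPercentage (T): Maximum acceptable accuracy loss percentage (for example, 2 for 2%). This overload has no default value.
minCompressionRatio (T): Minimum acceptable compression ratio (for example, 2 for 2x). This overload has no default value.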
Returns
- bool
True if compression meets the quality criteria, false otherwise.
Remarks
For Beginners: This method checks if the compression is "good enough".
It verifies two things:
- Accuracy loss is acceptable (not too much)
- Compression is significant enough (worthwhile)
Example thresholds:
- maxAccuracyLossPercentage = 2% means we accept up to 2% accuracy loss
- minCompressionRatio = 2x means we want at least 50% size reduction
If both conditions are met, the compression is considered successful.
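The two conditions can be sketched as a single boolean expression. PassesThreshold is a hypothetical stand-in written for this illustration; in particular, whether the library treats the boundary values as passing is an assumption.

```csharp
using System;

// Hypothetical check mirroring the two conditions described above.
// Treating the boundaries as inclusive is an assumption.
bool PassesThreshold(double accuracyLossPercent, double compressionRatio,
                     double maxAccuracyLossPercentage = 2.0, double minCompressionRatio = 2.0) =>
    accuracyLossPercent <= maxAccuracyLossPercentage && compressionRatio >= minCompressionRatio;

Console.WriteLine(PassesThreshold(1.0, 10.0));   // True: 1% loss, 10x smaller
Console.WriteLine(PassesThreshold(5.0, 10.0));   // False: too much accuracy loss
Console.WriteLine(PassesThreshold(0.5, 1.5));    // False: compression too small
```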
ToString()
Gets a human-readable summary of the compression metrics.
public override string ToString()
Returns
- string
A formatted string containing all metrics.