Class CompressionAnalyzer<T>
- Namespace: AiDotNet.ModelCompression
- Assembly: AiDotNet.dll
Analyzes model weights to determine optimal compression strategies.
public class CompressionAnalyzer<T>
Type Parameters
T: The numeric type used for calculations (e.g., float, double).
- Inheritance: object → CompressionAnalyzer<T>
Remarks
CompressionAnalyzer examines model weight distributions to recommend the best compression technique and hyperparameters. It analyzes properties like weight sparsity, magnitude distribution, and redundancy to make informed recommendations.
For Beginners: Before compressing a model, it helps to understand its weights.
This analyzer looks at your model's weights and answers questions like:
- Are many weights already close to zero? (Good for pruning)
- Are weights clustered around certain values? (Good for quantization)
- What's the distribution of weight values? (Affects all techniques)
Based on this analysis, it recommends:
- Which compression technique to use
- What settings (hyperparameters) to use
- What compression ratio to expect
This helps you make informed decisions without trial-and-error.
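To make the "close to zero" question concrete, here is a minimal, self-contained sketch of how a near-zero fraction could be measured. It does not use AiDotNet and is not the analyzer's actual implementation: the class name `SparsitySketch`, the method `NearZeroFraction`, and the choice of a strict `<` comparison on the absolute value are all assumptions made for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of AiDotNet.
public static class SparsitySketch
{
    // Returns the fraction of weights whose magnitude is below the threshold.
    // Assumption: "near zero" means |w| < threshold.
    public static double NearZeroFraction(double[] weights, double threshold = 0.01)
    {
        if (weights == null) throw new ArgumentNullException(nameof(weights));
        if (weights.Length == 0) return 0.0; // avoid dividing by zero

        int nearZero = weights.Count(w => Math.Abs(w) < threshold);
        return (double)nearZero / weights.Length;
    }
}
```

For the array `{ 0.0, 0.004, -0.009, 0.5, -1.2, 0.02, 0.0001, 0.3 }`, four of the eight values have a magnitude below 0.01, so `SparsitySketch.NearZeroFraction` returns 0.5. A high fraction suggests that pruning would remove many weights at little cost.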
Constructors
CompressionAnalyzer(double, int)
Initializes a new instance of the CompressionAnalyzer class.
public CompressionAnalyzer(double nearZeroThreshold = 0.01, int histogramBins = 256)
Parameters
nearZeroThreshold (double): Threshold for considering a weight as "near zero" (default: 0.01).
histogramBins (int): Number of bins for histogram analysis (default: 256).
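As a rough illustration of what a bin count controls, the following self-contained sketch builds an equal-width histogram over an array of weights. It is an assumption about the general technique, not the analyzer's own binning logic; `HistogramSketch` and `Build` are hypothetical names.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of AiDotNet.
public static class HistogramSketch
{
    // Counts how many weights fall into each of `bins` equal-width
    // intervals spanning the range [min, max] of the input.
    public static int[] Build(double[] weights, int bins = 256)
    {
        if (weights == null) throw new ArgumentNullException(nameof(weights));
        if (bins <= 0) throw new ArgumentOutOfRangeException(nameof(bins));

        var counts = new int[bins];
        if (weights.Length == 0) return counts;

        double min = weights.Min();
        double max = weights.Max();
        double width = (max - min) / bins;

        foreach (double w in weights)
        {
            // If every weight is identical, width is 0; put everything in bin 0.
            int index = width > 0 ? (int)((w - min) / width) : 0;
            // The maximum value maps to index == bins, so clamp it into the last bin.
            counts[Math.Min(index, bins - 1)]++;
        }
        return counts;
    }
}
```

More bins resolve finer structure in the weight distribution, at the cost of fewer samples per bin. A histogram with a few tall, narrow peaks indicates clustered weights, which is the pattern that favors quantization.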
Methods
Analyze(Vector<T>, bool)
Analyzes model weights and returns compression recommendations.
public WeightAnalysisResult<T> Analyze(Vector<T> weights, bool isConvolutional = false)
Parameters
weights (Vector<T>): The model weights to analyze.
isConvolutional (bool): Whether the weights are from convolutional layers (default: false).
Returns
WeightAnalysisResult<T>: Analysis results with compression recommendations.
Remarks
For Beginners: Pass your model weights to get compression recommendations.
The isConvolutional parameter helps optimize recommendations:
- Convolutional layers: typically use lower pruning rates (60-70%)
- Fully-connected layers: can use higher pruning rates (90-95%)
This is because convolutional filters are reused at every spatial position, so each weight carries more of the layer's behavior and there is less redundancy to prune away.
GenerateReport(WeightAnalysisResult<T>)
Generates a detailed analysis report.
public string GenerateReport(WeightAnalysisResult<T> result)
Parameters
result (WeightAnalysisResult<T>): The analysis result to report on.
Returns
string: A formatted string containing the analysis report.
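The members documented on this page can be chained into a short end-to-end call. The sketch below uses only the constructor, `Analyze`, and `GenerateReport` signatures shown above. Two things are assumptions: the namespace that provides `Vector<T>` (written here as `AiDotNet.LinearAlgebra`, which may differ between library versions), and the wrapper class `AnalyzerUsageSketch` with its `Describe` method, which are hypothetical names for illustration.

```csharp
using AiDotNet.ModelCompression;
using AiDotNet.LinearAlgebra; // assumption: namespace that provides Vector<T>

// Hypothetical wrapper, not part of AiDotNet.
public static class AnalyzerUsageSketch
{
    // Analyzes the given weights and returns the formatted report text.
    public static string Describe(Vector<double> weights, bool isConvolutional)
    {
        // Both constructor arguments are optional; the defaults are spelled out here.
        var analyzer = new CompressionAnalyzer<double>(
            nearZeroThreshold: 0.01,
            histogramBins: 256);

        // Pass isConvolutional: true for convolutional layer weights.
        var result = analyzer.Analyze(weights, isConvolutional);

        return analyzer.GenerateReport(result);
    }
}
```

Calling `Describe` once per layer, with `isConvolutional` set according to the layer type, keeps the pruning-rate guidance matched to each layer.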