Interface IPruningStrategy<T>
- Namespace: AiDotNet.Interfaces
- Assembly: AiDotNet.dll
Interface for pruning strategies that remove unimportant weights to create sparsity.
public interface IPruningStrategy<T>
Type Parameters
T: Numeric type for weights and gradients.
Remarks
Pruning strategies determine which weights to remove from a neural network to reduce size and computational requirements. This interface supports all data types (Vector, Matrix, Tensor) and multiple sparsity patterns including unstructured, structured, and hardware-optimized formats.
For Beginners: Pruning removes unnecessary connections from neural networks.
Think of it like pruning a tree - you remove branches that don't contribute much:
- Magnitude pruning: Remove smallest weights
- Gradient pruning: Remove weights with smallest gradients (learning slowly)
- Structured pruning: Remove entire neurons/filters (cleaner architecture)
- Movement pruning: Remove weights that move toward zero during training
- Lottery ticket: Find sparse subnetworks that train well from their original initialization
Sparsity patterns:
- Unstructured: Individual weights anywhere in the tensor (flexible, but speedups need sparse libraries)
- Structured: Entire rows/columns (actual speedup on any hardware)
- 2:4 Sparsity: 2 zeros per 4 elements (up to 2x speedup on NVIDIA Ampere sparse Tensor Cores)
- N:M Sparsity: N zeros per M elements (customizable)
Depending on the model and task, pruning can remove 50-99% of weights with little accuracy loss.
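To make the sparsity figures above concrete, here is a minimal sketch of how a sparsity ratio can be measured, and what ratio an N:M pattern implies. It works on plain `double[]` arrays rather than AiDotNet types; the class and method names (`SparsityMath`, `MeasureSparsity`, `NtoMSparsity`) are hypothetical and not part of the library.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of AiDotNet.
public static class SparsityMath
{
    // Fraction of entries that are exactly zero (0.0 = dense, 1.0 = fully pruned).
    public static double MeasureSparsity(double[] weights)
    {
        if (weights.Length == 0) return 0.0;
        return weights.Count(w => w == 0.0) / (double)weights.Length;
    }

    // Sparsity implied by an N:M pattern with n zeros in every group of m elements.
    public static double NtoMSparsity(int n, int m)
    {
        if (m <= 0 || n < 0 || n > m)
            throw new ArgumentOutOfRangeException(nameof(n), "Require 0 <= n <= m and m > 0.");
        return n / (double)m;
    }
}
```

Under this definition, a 2:4 pattern always corresponds to a sparsity of 0.5, whereas unstructured pruning can target any ratio.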
Properties
IsStructured
Gets whether this is structured pruning (removes entire rows/cols/filters).
bool IsStructured { get; }
Property Value
- bool
Name
Gets the name of this pruning strategy.
string Name { get; }
Property Value
- string
RequiresGradients
Gets whether this strategy requires gradients.
bool RequiresGradients { get; }
Property Value
- bool
SupportedPatterns
Gets supported sparsity patterns.
IReadOnlyList<SparsityPattern> SupportedPatterns { get; }
Property Value
- IReadOnlyList<SparsityPattern>
Methods
ApplyPruning(Matrix<T>, IPruningMask<T>)
Applies pruning mask to matrix weights in-place.
void ApplyPruning(Matrix<T> weights, IPruningMask<T> mask)
Parameters
weights (Matrix<T>): Weight matrix to prune; modified in place.
mask (IPruningMask<T>): Pruning mask to apply.
ApplyPruning(Tensor<T>, IPruningMask<T>)
Applies pruning mask to tensor weights in-place.
void ApplyPruning(Tensor<T> weights, IPruningMask<T> mask)
Parameters
weights (Tensor<T>): Weight tensor to prune; modified in place.
mask (IPruningMask<T>): Pruning mask to apply.
ApplyPruning(Vector<T>, IPruningMask<T>)
Applies pruning mask to vector weights in-place.
void ApplyPruning(Vector<T> weights, IPruningMask<T> mask)
Parameters
weights (Vector<T>): Weight vector to prune; modified in place.
mask (IPruningMask<T>): Pruning mask to apply.
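As an illustration of what applying a mask in place means, the sketch below zeroes every weight whose mask entry says "prune". It is not the library's implementation: a plain `bool[]` stands in for `IPruningMask<T>` (whose members are not shown on this page), and `MaskApplication.ApplyInPlace` is a hypothetical name.

```csharp
using System;

// Hypothetical helper; bool[] stands in for IPruningMask<T>.
public static class MaskApplication
{
    // Zeroes each weight whose mask entry is false; kept weights are left untouched.
    public static void ApplyInPlace(double[] weights, bool[] keep)
    {
        if (weights.Length != keep.Length)
            throw new ArgumentException("Mask and weights must have the same length.");

        for (int i = 0; i < weights.Length; i++)
        {
            if (!keep[i]) weights[i] = 0.0;
        }
    }
}
```

Because the array is modified directly, callers that still need the dense weights should copy them before applying the mask.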
ComputeImportanceScores(Matrix<T>, Matrix<T>?)
Computes importance scores for matrix weights.
Matrix<T> ComputeImportanceScores(Matrix<T> weights, Matrix<T>? gradients = null)
Parameters
weights (Matrix<T>): Weight matrix.
gradients (Matrix<T>?): Gradient matrix (optional, can be null).
Returns
- Matrix<T>
Importance score for each weight (higher = more important).
ComputeImportanceScores(Tensor<T>, Tensor<T>?)
Computes importance scores for tensor weights.
Tensor<T> ComputeImportanceScores(Tensor<T> weights, Tensor<T>? gradients = null)
Parameters
weights (Tensor<T>): Weight tensor.
gradients (Tensor<T>?): Gradient tensor (optional, can be null).
Returns
- Tensor<T>
Importance score for each weight (higher = more important).
ComputeImportanceScores(Vector<T>, Vector<T>?)
Computes importance scores for vector weights.
Vector<T> ComputeImportanceScores(Vector<T> weights, Vector<T>? gradients = null)
Parameters
weights (Vector<T>): Weight vector.
gradients (Vector<T>?): Gradient vector (optional, can be null).
Returns
- Vector<T>
Importance score for each weight (higher = more important).
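How scores are computed depends on the implementing strategy. As a rough sketch under stated assumptions, the example below uses the weight magnitude |w| when no gradients are given and the product |w * g| (a first-order saliency, chosen here only for illustration) when they are. It uses plain arrays, and `ImportanceScores.Compute` is a hypothetical name rather than an AiDotNet API.

```csharp
#nullable enable
using System;

// Hypothetical helper illustrating two common scoring rules.
public static class ImportanceScores
{
    // Returns |w| per weight, or |w * g| when gradients are supplied.
    // Higher scores mean the weight is considered more important.
    public static double[] Compute(double[] weights, double[]? gradients = null)
    {
        if (gradients != null && gradients.Length != weights.Length)
            throw new ArgumentException("Gradients must match weights in length.", nameof(gradients));

        var scores = new double[weights.Length];
        for (int i = 0; i < weights.Length; i++)
        {
            scores[i] = gradients == null
                ? Math.Abs(weights[i])
                : Math.Abs(weights[i] * gradients[i]);
        }
        return scores;
    }
}
```

A strategy whose RequiresGradients property is true would presumably reject a null gradient argument instead of falling back to magnitudes; that behavior is not specified on this page.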
Create2to4Mask(Tensor<T>)
Creates a 2:4 structured sparsity mask (NVIDIA Ampere compatible).
IPruningMask<T> Create2to4Mask(Tensor<T> importanceScores)
Parameters
importanceScores (Tensor<T>): Importance scores.
Returns
- IPruningMask<T>
2:4 structured mask (exactly 2 zeros per 4 elements).
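The "exactly 2 zeros per 4 elements" property can be checked mechanically. The sketch below does so for a flat `bool[]` mask (true = keep), assuming groups are formed from consecutive elements; how the library groups elements of a multi-dimensional tensor is not stated on this page, and `TwoToFourCheck.IsTwoToFour` is a hypothetical name.

```csharp
// Hypothetical validator; bool[] stands in for IPruningMask<T>.
public static class TwoToFourCheck
{
    // True when every consecutive group of 4 entries keeps exactly 2 and prunes exactly 2.
    public static bool IsTwoToFour(bool[] keep)
    {
        if (keep.Length % 4 != 0) return false;

        for (int start = 0; start < keep.Length; start += 4)
        {
            int kept = 0;
            for (int j = 0; j < 4; j++)
            {
                if (keep[start + j]) kept++;
            }
            if (kept != 2) return false;
        }
        return true;
    }
}
```

A check like this is handy in unit tests for a strategy implementation, since hardware that accelerates 2:4 sparsity expects the pattern to hold in every group.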
CreateMask(Matrix<T>, double)
Creates a pruning mask for matrix weights based on target sparsity.
IPruningMask<T> CreateMask(Matrix<T> importanceScores, double targetSparsity)
Parameters
importanceScores (Matrix<T>): Importance scores from ComputeImportanceScores.
targetSparsity (double): Target fraction of weights to prune, from 0 (none) to 1 (all).
Returns
- IPruningMask<T>
Binary mask (1 = keep, 0 = prune).
CreateMask(Tensor<T>, double)
Creates a pruning mask for tensor weights based on target sparsity.
IPruningMask<T> CreateMask(Tensor<T> importanceScores, double targetSparsity)
Parameters
importanceScores (Tensor<T>): Importance scores from ComputeImportanceScores.
targetSparsity (double): Target fraction of weights to prune, from 0 (none) to 1 (all).
Returns
- IPruningMask<T>
Binary mask (1 = keep, 0 = prune).
CreateMask(Vector<T>, double)
Creates a pruning mask for vector weights based on target sparsity.
IPruningMask<T> CreateMask(Vector<T> importanceScores, double targetSparsity)
Parameters
importanceScores (Vector<T>): Importance scores from ComputeImportanceScores.
targetSparsity (double): Target fraction of weights to prune, from 0 (none) to 1 (all).
Returns
- IPruningMask<T>
Binary mask (1 = keep, 0 = prune).
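One plausible way to turn scores and a target sparsity into a mask is to rank the scores and prune the lowest ones. The sketch below does this on a flat array, pruning floor(targetSparsity * count) entries; the rounding rule and tie-breaking are assumptions, `bool[]` stands in for `IPruningMask<T>`, and `SparsityMask.Create` is a hypothetical name.

```csharp
using System;
using System.Linq;

// Hypothetical helper; true = keep, false = prune.
public static class SparsityMask
{
    // Prunes the floor(targetSparsity * count) lowest-scoring entries.
    public static bool[] Create(double[] scores, double targetSparsity)
    {
        if (!(targetSparsity >= 0.0 && targetSparsity <= 1.0))
            throw new ArgumentOutOfRangeException(nameof(targetSparsity), "Must be between 0 and 1.");

        int pruneCount = (int)Math.Floor(targetSparsity * scores.Length);
        var keep = Enumerable.Repeat(true, scores.Length).ToArray();

        // Rank indices by ascending score (stable for ties) and prune the first pruneCount.
        var lowest = Enumerable.Range(0, scores.Length)
                               .OrderBy(idx => scores[idx])
                               .Take(pruneCount);
        foreach (int i in lowest)
        {
            keep[i] = false;
        }
        return keep;
    }
}
```

For example, scores of 0.9, 0.1, 0.4, 0.05 with a target sparsity of 0.5 prune the entries scored 0.05 and 0.1 and keep the other two.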
CreateNtoMMask(Tensor<T>, int, int)
Creates an N:M structured sparsity mask.
IPruningMask<T> CreateNtoMMask(Tensor<T> importanceScores, int n, int m)
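Parameters
importanceScores (Tensor<T>): Importance scores.
n (int): Number of zeros in each group (N).
m (int): Number of elements in each group (M).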
Returns
- IPruningMask<T>
N:M structured mask.
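A minimal sketch of N:M mask construction, following this page's convention of N zeros per M elements: within each consecutive group of m scores, the n lowest are pruned. It operates on a flat array and requires the length to be a multiple of m; both choices are assumptions, and `NtoMMask.Create` is a hypothetical name, not the library's implementation.

```csharp
using System;
using System.Linq;

// Hypothetical helper; true = keep, false = prune.
public static class NtoMMask
{
    // In each consecutive group of m scores, prunes the n lowest and keeps the rest.
    public static bool[] Create(double[] scores, int n, int m)
    {
        if (m <= 0 || n < 0 || n > m)
            throw new ArgumentOutOfRangeException(nameof(n), "Require 0 <= n <= m and m > 0.");
        if (scores.Length % m != 0)
            throw new ArgumentException("Length must be a multiple of m.", nameof(scores));

        var keep = Enumerable.Repeat(true, scores.Length).ToArray();
        for (int start = 0; start < scores.Length; start += m)
        {
            // Indices of this group ordered by ascending score (stable for ties).
            var lowest = Enumerable.Range(start, m)
                                   .OrderBy(idx => scores[idx])
                                   .Take(n)
                                   .ToList();
            foreach (int i in lowest)
            {
                keep[i] = false;
            }
        }
        return keep;
    }
}
```

Calling `NtoMMask.Create(scores, 2, 4)` yields the 2:4 pattern as a special case.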
ToSparseFormat(Tensor<T>, SparseFormat)
Converts pruned weights to sparse format for efficient storage.
SparseCompressionResult<T> ToSparseFormat(Tensor<T> weights, SparseFormat format)
Parameters
weights (Tensor<T>): Pruned weights (containing zeros).
format (SparseFormat): Target sparse format.
Returns
- SparseCompressionResult<T>
Sparse representation.
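To show why a sparse format saves storage, the sketch below converts a dense 2-D array to compressed sparse row (CSR) form, which stores only the non-zero values plus index arrays. CSR is used here as an assumption: the members of `SparseFormat` and the shape of `SparseCompressionResult<T>` are not listed on this page, and `CsrSketch.ToCsr` is a hypothetical name.

```csharp
using System.Collections.Generic;

// Hypothetical helper illustrating CSR; not the library's SparseCompressionResult<T>.
public static class CsrSketch
{
    // Returns the non-zero values, their column indices, and row pointers, where
    // row r occupies the half-open range [RowPointers[r], RowPointers[r + 1]).
    public static (double[] Values, int[] ColumnIndices, int[] RowPointers) ToCsr(double[,] dense)
    {
        int rows = dense.GetLength(0);
        int cols = dense.GetLength(1);
        var values = new List<double>();
        var columnIndices = new List<int>();
        var rowPointers = new int[rows + 1];

        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                if (dense[r, c] != 0.0)
                {
                    values.Add(dense[r, c]);
                    columnIndices.Add(c);
                }
            }
            // Running count of non-zeros marks where the next row starts.
            rowPointers[r + 1] = values.Count;
        }
        return (values.ToArray(), columnIndices.ToArray(), rowPointers);
    }
}
```

The index arrays add overhead, so this layout only saves space once a sizeable share of the entries are zero.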