Interface IPruningStrategy<T>

Namespace
AiDotNet.Interfaces
Assembly
AiDotNet.dll

Interface for pruning strategies that remove unimportant weights to create sparsity.

public interface IPruningStrategy<T>

Type Parameters

T

Numeric type for weights and gradients.

Remarks

Pruning strategies determine which weights to remove from a neural network to reduce its size and computational cost. The interface provides overloads for vector, matrix, and tensor weights and supports multiple sparsity patterns, including unstructured, structured, and hardware-optimized formats.

For Beginners: Pruning removes unnecessary connections from neural networks.

Think of it like pruning a tree - you remove branches that don't contribute much:

  • Magnitude pruning: Remove smallest weights
  • Gradient pruning: Remove weights with smallest gradients (learning slowly)
  • Structured pruning: Remove entire neurons/filters (cleaner architecture)
  • Movement pruning: Remove weights that move toward zero during training
  • Lottery ticket: Find sparse subnetworks that train well on their own from the original initialization

Sparsity patterns:

  • Unstructured: Arbitrary individual weights (flexible, but needs sparse libraries for a speedup)
  • Structured: Entire rows/columns (actual speedup on standard dense hardware)
  • 2:4 Sparsity: 2 zeros per 4 consecutive elements (up to 2x speedup on NVIDIA Ampere)
  • N:M Sparsity: N zeros per M elements (customizable)

Depending on the model and task, pruning can remove 50-99% of weights with minimal accuracy loss, typically with some fine-tuning afterwards.

Properties

IsStructured

Gets whether this strategy performs structured pruning (removes entire rows, columns, or filters).

bool IsStructured { get; }

Property Value

bool

Name

Gets the name of this pruning strategy.

string Name { get; }

Property Value

string

RequiresGradients

Gets whether this strategy requires gradients.

bool RequiresGradients { get; }

Property Value

bool

SupportedPatterns

Gets the sparsity patterns this strategy supports.

IReadOnlyList<SparsityPattern> SupportedPatterns { get; }

Property Value

IReadOnlyList<SparsityPattern>

Methods

ApplyPruning(Matrix<T>, IPruningMask<T>)

Applies a pruning mask to matrix weights in place.

void ApplyPruning(Matrix<T> weights, IPruningMask<T> mask)

Parameters

weights Matrix<T>
mask IPruningMask<T>

ApplyPruning(Tensor<T>, IPruningMask<T>)

Applies a pruning mask to tensor weights in place.

void ApplyPruning(Tensor<T> weights, IPruningMask<T> mask)

Parameters

weights Tensor<T>
mask IPruningMask<T>

ApplyPruning(Vector<T>, IPruningMask<T>)

Applies a pruning mask to vector weights in place.

void ApplyPruning(Vector<T> weights, IPruningMask<T> mask)

Parameters

weights Vector<T>
mask IPruningMask<T>
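
As a rough illustration of what applying a mask in place means, the sketch below zeroes masked-out entries of a plain array. It does not use the library: `double[]` and `bool[]` stand in for Vector<T> and IPruningMask<T>, and the class and method names are hypothetical.

```csharp
using System;

public static class MaskApplicationSketch
{
    // Zeroes every weight whose mask entry is false; kept weights are untouched.
    // Assumption: plain arrays stand in for Vector<T> and IPruningMask<T>.
    public static void ApplyMaskInPlace(double[] weights, bool[] keep)
    {
        if (weights.Length != keep.Length)
            throw new ArgumentException("weights and mask must have the same length.");

        for (int i = 0; i < weights.Length; i++)
        {
            if (!keep[i])
            {
                weights[i] = 0.0;
            }
        }
    }
}
```

Because the array is modified directly, callers that need the dense weights afterwards would have to copy them first.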

ComputeImportanceScores(Matrix<T>, Matrix<T>?)

Computes importance scores for matrix weights.

Matrix<T> ComputeImportanceScores(Matrix<T> weights, Matrix<T>? gradients = null)

Parameters

weights Matrix<T>

Weight matrix.

gradients Matrix<T>

Gradient matrix (optional, can be null).

Returns

Matrix<T>

Importance score for each weight (higher = more important).

ComputeImportanceScores(Tensor<T>, Tensor<T>?)

Computes importance scores for tensor weights.

Tensor<T> ComputeImportanceScores(Tensor<T> weights, Tensor<T>? gradients = null)

Parameters

weights Tensor<T>

Weight tensor.

gradients Tensor<T>

Gradient tensor (optional, can be null).

Returns

Tensor<T>

Importance score for each weight (higher = more important).

ComputeImportanceScores(Vector<T>, Vector<T>?)

Computes importance scores for vector weights.

Vector<T> ComputeImportanceScores(Vector<T> weights, Vector<T>? gradients = null)

Parameters

weights Vector<T>

Weight vector.

gradients Vector<T>

Gradient vector (optional, can be null).

Returns

Vector<T>

Importance score for each weight (higher = more important).
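
How scores are computed depends on the concrete strategy, which this page does not specify. The sketch below shows two common heuristics on plain arrays: magnitude (|w|) when no gradients are given, and |w * g| when they are. The class name, method name, and the choice of heuristics are assumptions made for illustration, not the library's implementation.

```csharp
#nullable enable
using System;

public static class ImportanceSketch
{
    // Returns one score per weight; higher means more important.
    // Without gradients: |w| (magnitude). With gradients: |w * g|,
    // a first-order saliency heuristic picked here for illustration only.
    public static double[] Scores(double[] weights, double[]? gradients = null)
    {
        if (gradients != null && gradients.Length != weights.Length)
            throw new ArgumentException("gradients must match weights in length.");

        var scores = new double[weights.Length];
        for (int i = 0; i < weights.Length; i++)
        {
            scores[i] = gradients == null
                ? Math.Abs(weights[i])
                : Math.Abs(weights[i] * gradients[i]);
        }
        return scores;
    }
}
```

The optional parameter mirrors the nullable `gradients` argument in the signatures above, so a magnitude-only strategy can simply ignore it.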

Create2to4Mask(Tensor<T>)

Creates a 2:4 structured sparsity mask (NVIDIA Ampere compatible).

IPruningMask<T> Create2to4Mask(Tensor<T> importanceScores)

Parameters

importanceScores Tensor<T>

Importance scores.

Returns

IPruningMask<T>

2:4 structured mask (exactly 2 zeros per 4 elements).

CreateMask(Matrix<T>, double)

Creates a pruning mask for matrix weights based on target sparsity.

IPruningMask<T> CreateMask(Matrix<T> importanceScores, double targetSparsity)

Parameters

importanceScores Matrix<T>

Importance scores from ComputeImportanceScores.

targetSparsity double

Target sparsity ratio (0 to 1), i.e. the fraction of weights to prune.

Returns

IPruningMask<T>

Binary mask (1 = keep, 0 = prune).

CreateMask(Tensor<T>, double)

Creates a pruning mask for tensor weights based on target sparsity.

IPruningMask<T> CreateMask(Tensor<T> importanceScores, double targetSparsity)

Parameters

importanceScores Tensor<T>

Importance scores from ComputeImportanceScores.

targetSparsity double

Target sparsity ratio (0 to 1), i.e. the fraction of weights to prune.

Returns

IPruningMask<T>

Binary mask (1 = keep, 0 = prune).

CreateMask(Vector<T>, double)

Creates a pruning mask for vector weights based on target sparsity.

IPruningMask<T> CreateMask(Vector<T> importanceScores, double targetSparsity)

Parameters

importanceScores Vector<T>

Importance scores from ComputeImportanceScores.

targetSparsity double

Target sparsity ratio (0 to 1), i.e. the fraction of weights to prune.

Returns

IPruningMask<T>

Binary mask (1 = keep, 0 = prune).
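
One plausible way to turn scores and a target sparsity into a mask is to prune the lowest-scoring fraction of entries. The sketch below does this on plain arrays; the names are hypothetical, `bool[]` stands in for IPruningMask<T>, and rounding the prune count down is an assumption, since the page does not say how a real implementation rounds.

```csharp
using System;
using System.Linq;

public static class SparsityMaskSketch
{
    // Returns true for entries to keep and false for entries to prune.
    // Prunes the floor(targetSparsity * length) lowest-scoring entries.
    public static bool[] CreateMask(double[] scores, double targetSparsity)
    {
        // Written this way so that NaN is rejected as well.
        if (!(targetSparsity >= 0.0 && targetSparsity <= 1.0))
            throw new ArgumentOutOfRangeException(nameof(targetSparsity));

        int pruneCount = (int)Math.Floor(targetSparsity * scores.Length);

        // Indices ordered by ascending score; OrderBy is stable, so ties
        // are broken by original position.
        int[] order = Enumerable.Range(0, scores.Length)
            .OrderBy(i => scores[i])
            .ToArray();

        bool[] keep = Enumerable.Repeat(true, scores.Length).ToArray();
        for (int k = 0; k < pruneCount; k++)
        {
            keep[order[k]] = false;
        }
        return keep;
    }
}
```

With scores `{ 0.5, 0.1, 2.0, 0.3 }` and a target sparsity of 0.5, the two lowest scores (0.1 and 0.3) are pruned and the other two entries are kept.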

CreateNtoMMask(Tensor<T>, int, int)

Creates an N:M structured sparsity mask.

IPruningMask<T> CreateNtoMMask(Tensor<T> importanceScores, int n, int m)

Parameters

importanceScores Tensor<T>

Importance scores.

n int

Number of zeros per group.

m int

Group size.

Returns

IPruningMask<T>

N:M structured mask.
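
The N:M constraint can be sketched on a flat array: split the scores into consecutive groups of m and prune the n lowest-scoring entries in each group. The names below are hypothetical, `bool[]` stands in for IPruningMask<T>, and requiring the length to be a multiple of m is an assumption; how the library groups the elements of a multi-dimensional tensor is not stated on this page.

```csharp
using System;
using System.Linq;

public static class NtoMMaskSketch
{
    // Returns true for entries to keep. In every consecutive group of m
    // entries, exactly n entries (the lowest-scoring ones) are set to false.
    public static bool[] CreateNtoM(double[] scores, int n, int m)
    {
        if (m <= 0 || n < 0 || n > m)
            throw new ArgumentException("Require m > 0 and 0 <= n <= m.");
        if (scores.Length % m != 0)
            throw new ArgumentException("Length must be a multiple of m.");

        bool[] keep = Enumerable.Repeat(true, scores.Length).ToArray();
        for (int start = 0; start < scores.Length; start += m)
        {
            // Indices of this group, lowest score first; take the n to prune.
            var lowest = Enumerable.Range(start, m)
                .OrderBy(i => scores[i])
                .Take(n);

            foreach (int i in lowest)
            {
                keep[i] = false;
            }
        }
        return keep;
    }
}
```

Calling `CreateNtoM(scores, 2, 4)` yields the 2:4 pattern, with exactly two pruned entries in every group of four.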

ToSparseFormat(Tensor<T>, SparseFormat)

Converts pruned weights to sparse format for efficient storage.

SparseCompressionResult<T> ToSparseFormat(Tensor<T> weights, SparseFormat format)

Parameters

weights Tensor<T>

Pruned weights (containing zeros).

format SparseFormat

Target sparse format.

Returns

SparseCompressionResult<T>

Sparse representation.
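
To illustrate why a sparse format saves space, the sketch below stores only the non-zero entries of a pruned flat array as (index, value) pairs, a coordinate-list layout. The names are hypothetical, and neither the members of SparseFormat nor the shape of SparseCompressionResult<T> are documented on this page, so this is not a description of what ToSparseFormat returns.

```csharp
using System.Collections.Generic;

public static class SparseSketch
{
    // Collects the positions and values of all non-zero entries.
    // Assumption: a flat double[] stands in for a pruned Tensor<T>.
    public static (int[] Indices, double[] Values) ToCoordinateList(double[] pruned)
    {
        var indices = new List<int>();
        var values = new List<double>();

        for (int i = 0; i < pruned.Length; i++)
        {
            if (pruned[i] != 0.0)
            {
                indices.Add(i);
                values.Add(pruned[i]);
            }
        }
        return (indices.ToArray(), values.ToArray());
    }
}
```

Storing an index alongside each value only pays off when most entries are zero, which is one reason sparse storage is usually applied after pruning rather than before.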