Class SparseLinearLayer<T>
- Namespace: AiDotNet.NeuralNetworks.Layers
- Assembly: AiDotNet.dll
Represents a fully connected layer with a sparse weight matrix for efficient computation.
public class SparseLinearLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations (float or double).
- Inheritance: LayerBase<T> → SparseLinearLayer<T>
- Implements: ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Remarks
A sparse linear layer is similar to a dense layer but uses sparse weight storage. This is efficient when most weights are zero (or can be pruned), reducing both memory usage and computation time.
For Beginners: This layer works like a regular dense layer, but uses sparse matrices to store weights more efficiently.
Benefits of sparse layers:
- Much less memory for large layers with few active connections (see the sketch after this list)
- Faster computation (only non-zero weights are used)
- Natural for network pruning and compression
Use cases:
- Graph neural networks (sparse adjacency)
- Recommender systems (sparse user-item matrices)
- Pruned neural networks
- Very large embedding layers
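Examples
As a rough illustration of the memory benefit listed above, the sketch below compares dense storage against a CSR-style sparse layout (one value plus one column index per non-zero weight, plus row pointers). The CSR layout and 4-byte element sizes are assumptions made for this estimate; the layer's actual internal storage format is not documented here.
```csharp
using System;

// Illustrative estimate for a 1024 -> 256 layer at 90% sparsity.
int inputFeatures = 1024;
int outputFeatures = 256;
double sparsity = 0.9;

long totalWeights = (long)inputFeatures * outputFeatures;            // 262,144
long nonZeroWeights = (long)(totalWeights * (1 - sparsity));         // ~26,214

long denseBytes = totalWeights * sizeof(float);                      // all weights stored
long sparseBytes = nonZeroWeights * (sizeof(float) + sizeof(int))    // values + column indices
                 + (outputFeatures + 1) * sizeof(int);               // row pointers

Console.WriteLine($"Dense:  {denseBytes / 1024.0:F0} KB");           // ~1024 KB
Console.WriteLine($"Sparse: {sparseBytes / 1024.0:F0} KB");          // ~206 KB, roughly 5x smaller
```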
Constructors
SparseLinearLayer(int, int, double, IActivationFunction<T>?)
Initializes a new instance of the SparseLinearLayer.
public SparseLinearLayer(int inputFeatures, int outputFeatures, double sparsity = 0.9, IActivationFunction<T>? activationFunction = null)
Parameters
inputFeatures (int): Number of input features.
outputFeatures (int): Number of output features.
sparsity (double): Fraction of weights to be zero (0.0 to 1.0).
activationFunction (IActivationFunction<T>?): Optional activation function.
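Examples
A minimal construction sketch using the signature above; the namespace comes from this page, and leaving activationFunction unset uses the documented default of null.
```csharp
using AiDotNet.NeuralNetworks.Layers;

// 1024 inputs, 256 outputs, 95% of the weights forced to zero, no activation function.
var layer = new SparseLinearLayer<float>(
    inputFeatures: 1024,
    outputFeatures: 256,
    sparsity: 0.95);
```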
Properties
InputFeatures
Gets the number of input features.
public int InputFeatures { get; }
Property Value
int
OutputFeatures
Gets the number of output features.
public int OutputFeatures { get; }
Property Value
int
ParameterCount
Gets the total number of trainable parameters.
public override int ParameterCount { get; }
Property Value
int
Remarks
For sparse layers, this returns the number of non-zero weights plus biases.
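Examples
A sketch of the expected count, assuming roughly (1 - sparsity) of the weights end up non-zero at initialization; the exact number is an assumption here and may differ due to rounding or later pruning.
```csharp
using System;
using AiDotNet.NeuralNetworks.Layers;

var layer = new SparseLinearLayer<float>(inputFeatures: 1024, outputFeatures: 256, sparsity: 0.9);

// Rough expectation (an estimate, not an exact contract):
//   non-zero weights ≈ 1024 * 256 * 0.1 ≈ 26,214
//   biases           =  256
//   ParameterCount   ≈ 26,470
Console.WriteLine(layer.ParameterCount);
```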
SupportsJitCompilation
Gets whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
bool
Remarks
JIT compilation is supported by converting sparse weights to dense format at export time. This enables JIT compilation while preserving correct functionality, though the sparse memory benefits are not retained in the compiled graph. Returns true only if the activation function also supports JIT compilation.
SupportsTraining
Gets whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
bool
Methods
Backward(Tensor<T>)
Performs the backward pass through the layer.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): Gradient from the next layer.
Returns
- Tensor<T>
Gradient to pass to the previous layer.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the layer's forward pass as a JIT-compilable computation graph.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): List to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the linear transformation.
Remarks
This method converts sparse weights to dense format at export time to enable JIT compilation. The resulting computation graph performs a standard dense matrix multiplication: output = input * W^T + bias
While this approach loses the memory benefits of sparse storage during inference, it ensures correct functionality and enables JIT optimization of the compiled graph.
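Examples
A hedged sketch of exporting the graph: it checks SupportsJitCompilation first and passes an empty list for the layer to populate with its input nodes. An additional using directive for ComputationNode<T> may be required, and compiling or executing the returned node is handled by the library's JIT pipeline, which is not shown here.
```csharp
using System.Collections.Generic;
using AiDotNet.NeuralNetworks.Layers;

var layer = new SparseLinearLayer<float>(inputFeatures: 1024, outputFeatures: 256);

if (layer.SupportsJitCompilation)
{
    // The layer fills this list with its input nodes; the returned node represents
    // the densified linear transformation: output = input * W^T + bias.
    var inputNodes = new List<ComputationNode<float>>();
    ComputationNode<float> output = layer.ExportComputationGraph(inputNodes);
}
```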
Forward(Tensor<T>)
Performs the forward pass through the layer.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): Input tensor with shape [inputFeatures] or [batch, inputFeatures].
Returns
- Tensor<T>
Output tensor with shape [outputFeatures] or [batch, outputFeatures].
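Examples
A sketch of a batched forward pass. The shape contract ([batch, inputFeatures] in, [batch, outputFeatures] out) is the documented part; the shape-based Tensor<T> constructor shown here is a hypothetical stand-in for whatever tensor-creation API the library provides.
```csharp
using AiDotNet.NeuralNetworks.Layers;

var layer = new SparseLinearLayer<float>(inputFeatures: 128, outputFeatures: 64);

// Hypothetical tensor creation: a batch of 32 samples with 128 features each.
var input = new Tensor<float>(new[] { 32, 128 });

Tensor<float> output = layer.Forward(input);   // shape [32, 64]
```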
GetParameters()
Gets all trainable parameters as a single vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all non-zero weights and biases.
ResetState()
Resets the internal state of the layer.
public override void ResetState()
SetParameters(Vector<T>)
Sets all trainable parameters from a single vector.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): A vector containing all parameters to set.
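Examples
A sketch of saving and restoring the layer's parameters with GetParameters and SetParameters. The vector layout (non-zero weights followed by biases) comes from the GetParameters description; the round trip assumes the vector length matches ParameterCount.
```csharp
using AiDotNet.NeuralNetworks.Layers;

var layer = new SparseLinearLayer<float>(inputFeatures: 128, outputFeatures: 64);

// Snapshot all non-zero weights and biases as a single vector.
Vector<float> snapshot = layer.GetParameters();

// ... train, prune, or otherwise modify the layer here ...

// Restore the saved values; the vector length must equal ParameterCount.
layer.SetParameters(snapshot);
```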
UpdateParameters(T)
Updates the layer's parameters using the calculated gradients. The sparsity pattern is maintained during the update.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate to use for the update.
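Examples
A hedged sketch of one manual training step using Forward, Backward, and UpdateParameters. The tensor constructor and the way the output gradient is produced are hypothetical placeholders; in practice the gradient would come from your loss function, with the same shape as the layer's output.
```csharp
using AiDotNet.NeuralNetworks.Layers;

var layer = new SparseLinearLayer<float>(inputFeatures: 128, outputFeatures: 64);

// Hypothetical tensor creation for a batch of 32 samples; the exact API may differ.
var input = new Tensor<float>(new[] { 32, 128 });

Tensor<float> prediction = layer.Forward(input);

// Hypothetical placeholder: fill this with dLoss/dOutput from your loss function.
var outputGradient = new Tensor<float>(new[] { 32, 64 });

Tensor<float> inputGradient = layer.Backward(outputGradient);

// Apply the accumulated gradients. The sparsity pattern (which weights are zero)
// is preserved, so only the existing non-zero weights and the biases are updated.
layer.UpdateParameters(0.01f);
```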