Class LocallyConnectedLayer<T>
- Namespace
- AiDotNet.NeuralNetworks.Layers
- Assembly
- AiDotNet.dll
Represents a Locally Connected layer which applies different filters to different regions of the input, unlike a convolutional layer which shares filters.
public class LocallyConnectedLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
- LayerBase<T> → LocallyConnectedLayer<T>
- Implements
- ILayer<T>
- IJitCompilable<T>
- IDiagnosticsProvider
- IWeightLoadable<T>
- IDisposable
Remarks
The Locally Connected layer is similar to a convolutional layer in that it applies filters to local regions of the input, but differs in that it uses different filter weights for each spatial location. This increases the number of parameters and the expressiveness of the model, but reduces generalization capabilities. It's useful when the patterns in different regions of the input are inherently different, such as in face recognition where different parts of a face have different characteristics.
For Beginners: This layer is like a specialized convolutional layer where each region gets its own unique filter.
Think of a Locally Connected layer like having specialized detectors for different regions:
- In a regular convolutional layer, the same filter slides across the entire input
- In a locally connected layer, each position has its own unique filter
- This means the layer can learn location-specific features
For example, in face recognition:
- A convolutional layer would use the same detector for eyes, whether looking at the top-left or bottom-right
- A locally connected layer would use different detectors depending on where it's looking
This specialization increases the model's power but:
- Requires more parameters
- May not generalize as well to new examples
- Is more computationally intensive
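To make the trade-off concrete, here is a back-of-the-envelope parameter count for a 28x28 single-channel input with 16 output channels, a 3x3 kernel, and stride 1 (output size (28 - 3) / 1 + 1 = 26 in each spatial dimension). The counts assume one bias per output channel, as described in the Forward remarks below:
Convolutional layer: 16 x (3 x 3 x 1) weights + 16 biases = 160 parameters.
Locally connected layer: 26 x 26 x 16 x (3 x 3 x 1) weights + 16 biases = 97,360 parameters.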
Constructors
LocallyConnectedLayer(int, int, int, int, int, int, IActivationFunction<T>?)
Initializes a new instance of the LocallyConnectedLayer<T> class with the specified dimensions, kernel parameters, and element-wise activation function.
public LocallyConnectedLayer(int inputHeight, int inputWidth, int inputChannels, int outputChannels, int kernelSize, int stride, IActivationFunction<T>? activationFunction = null)
Parameters
inputHeight (int): The height of the input tensor.
inputWidth (int): The width of the input tensor.
inputChannels (int): The number of channels in the input tensor.
outputChannels (int): The number of channels in the output tensor.
kernelSize (int): The size of the kernel (filter) in both height and width dimensions.
stride (int): The stride (step size) of the kernel when moving across the input.
activationFunction (IActivationFunction<T>): The activation function to apply after the locally connected operation. Defaults to ReLU if not specified.
Remarks
This constructor creates a new Locally Connected layer with the specified dimensions, kernel parameters, and element-wise activation function. It initializes the weights and biases and calculates the output dimensions based on the input dimensions, kernel size, and stride.
For Beginners: This creates a new locally connected layer with a standard element-wise activation function.
When creating this layer, you specify:
- inputHeight, inputWidth: The dimensions of your input data
- inputChannels: How many channels your input data has
- outputChannels: How many different features you want the layer to detect
- kernelSize: The size of each filter window (e.g., 3 for a 3x3 filter)
- stride: How many pixels the filter moves each step
- activationFunction: What function to apply to the output (default is ReLU)
For example, to process 28x28 grayscale images with 16 output features, 3x3 filters, and a stride of 1, you would use: inputHeight=28, inputWidth=28, inputChannels=1, outputChannels=16, kernelSize=3, stride=1.
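A minimal construction sketch for that example (passing null for the activation selects the ReLU default):

// 28x28 grayscale input, 16 output feature maps, 3x3 filters, stride 1.
var layer = new LocallyConnectedLayer<float>(
    inputHeight: 28,
    inputWidth: 28,
    inputChannels: 1,
    outputChannels: 16,
    kernelSize: 3,
    stride: 1,
    activationFunction: null); // null => ReLU default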
LocallyConnectedLayer(int, int, int, int, int, int, IVectorActivationFunction<T>?)
Initializes a new instance of the LocallyConnectedLayer<T> class with the specified dimensions, kernel parameters, and vector activation function.
public LocallyConnectedLayer(int inputHeight, int inputWidth, int inputChannels, int outputChannels, int kernelSize, int stride, IVectorActivationFunction<T>? vectorActivationFunction = null)
Parameters
inputHeight (int): The height of the input tensor.
inputWidth (int): The width of the input tensor.
inputChannels (int): The number of channels in the input tensor.
outputChannels (int): The number of channels in the output tensor.
kernelSize (int): The size of the kernel (filter) in both height and width dimensions.
stride (int): The stride (step size) of the kernel when moving across the input.
vectorActivationFunction (IVectorActivationFunction<T>): The vector activation function to apply after the locally connected operation. Defaults to ReLU if not specified.
Remarks
This constructor creates a new Locally Connected layer with the specified dimensions, kernel parameters, and vector activation function. Vector activation functions operate on entire vectors rather than individual elements.
For Beginners: This creates a new locally connected layer with an advanced vector-based activation.
Vector activation functions:
- Process entire groups of numbers together, not just one at a time
- Can capture relationships between different features
- May be more powerful for complex patterns
Otherwise, this constructor works just like the standard one, setting up the layer with:
- The specified dimensions and parameters
- Proper calculation of output dimensions
- Initialization of weights and biases
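A sketch of this overload; SoftmaxActivation<float> is a hypothetical IVectorActivationFunction<float> implementation used purely for illustration:

// Same dimensions as the element-wise example, but with a vector activation
// (hypothetical class name; substitute any IVectorActivationFunction<T>).
var layer = new LocallyConnectedLayer<float>(
    28, 28, 1, 16, 3, 1,
    vectorActivationFunction: new SoftmaxActivation<float>());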
Properties
SupportsJitCompilation
Gets a value indicating whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
true when the weights are initialized and the activation function supports JIT.
Remarks
Locally connected layers support JIT compilation using the LocallyConnectedConv2D operation from TensorOperations. The layer applies different filters to different spatial locations.
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
true because this layer has trainable parameters (weights and biases).
Remarks
This property indicates whether the layer can be trained through backpropagation. The LocallyConnectedLayer always returns true because it contains trainable weights and biases.
For Beginners: This property tells you if the layer can learn from data.
A value of true means:
- The layer has parameters that can be adjusted during training
- It will improve its performance as it sees more data
- It participates in the learning process
The Locally Connected layer always supports training because it has weights and biases that are learned during training.
Methods
Backward(Tensor<T>)
Performs the backward pass of the locally connected layer.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): The gradient of the loss with respect to the layer's output.
Returns
- Tensor<T>
The gradient of the loss with respect to the layer's input.
Remarks
This method implements the backward pass of the locally connected layer, which is used during training to propagate error gradients back through the network. It calculates the gradients for the weights and biases, and returns the gradient with respect to the input for further backpropagation.
For Beginners: This method is used during training to calculate how the layer's input and parameters should change to reduce errors.
During the backward pass:
- The layer receives information about how its output contributed to errors
- It calculates how the weights and biases should change to reduce errors
- It calculates how the input should change, which will be used by earlier layers
This process involves:
- Applying the derivative of the activation function
- Computing gradients for each unique filter
- Computing gradients for biases
- Computing how the input should change
The method will throw an error if you try to run it before performing a forward pass.
Exceptions
- InvalidOperationException
Thrown when Forward has not been called before Backward.
BackwardGpu(IGpuTensor<T>)
Performs the backward pass using GPU-resident tensors.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient (IGpuTensor<T>): GPU-resident gradient tensor.
Returns
- IGpuTensor<T>
GPU-resident input gradient tensor.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the locally connected layer's forward pass as a JIT-compilable computation graph.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): List to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the locally connected layer output.
Remarks
The locally connected layer computation graph implements: output = activation(LocallyConnectedConv2D(input, weights) + bias)
For Beginners: This creates an optimized version of the locally connected layer's forward pass. Unlike convolution, which shares filters, a locally connected layer uses a unique filter for each position.
Forward(Tensor<T>)
Performs the forward pass of the locally connected layer.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to process. Shape should be [batchSize, inputHeight, inputWidth, inputChannels].
Returns
- Tensor<T>
The output tensor after applying the locally connected operation and activation. Shape will be [batchSize, outputHeight, outputWidth, outputChannels].
Remarks
This method implements the forward pass of the locally connected layer. It applies different filters to each spatial location of the input, followed by adding biases and applying the activation function.
For Beginners: This method processes your data through the locally connected filters.
During the forward pass:
- For each position in the output:
- Apply a unique filter to the corresponding region of the input
- Sum up the results of element-wise multiplications
- Add the bias for the output channel
- Apply the activation function to add non-linearity
This process is similar to a convolution, except that instead of one filter being reused at every position, each position has its own specialized filter.
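The per-position arithmetic can be sketched as plain loops. This is an illustrative reference implementation for a single example (no batch dimension), not the library's actual code; in particular, the weight layout [outY, outX, outChannel, kernelY, kernelX, inChannel] and the one-bias-per-output-channel convention are assumptions:

// Illustrative reference implementation for a single example.
static float[,,] LocallyConnectedForward(
    float[,,] input,        // [height, width, inChannels]
    float[,,,,,] weights,   // assumed layout: [outY, outX, outC, kY, kX, inC]
    float[] bias,           // assumed: one bias per output channel
    int kernelSize, int stride, Func<float, float> activate)
{
    int outH = (input.GetLength(0) - kernelSize) / stride + 1;
    int outW = (input.GetLength(1) - kernelSize) / stride + 1;
    int outC = bias.Length;
    int inC = input.GetLength(2);
    var output = new float[outH, outW, outC];
    for (int oy = 0; oy < outH; oy++)
    for (int ox = 0; ox < outW; ox++)
    for (int oc = 0; oc < outC; oc++)
    {
        float sum = bias[oc];
        for (int ky = 0; ky < kernelSize; ky++)
        for (int kx = 0; kx < kernelSize; kx++)
        for (int ic = 0; ic < inC; ic++)
            sum += input[oy * stride + ky, ox * stride + kx, ic]
                 * weights[oy, ox, oc, ky, kx, ic]; // filter unique to (oy, ox)
        output[oy, ox, oc] = activate(sum);
    }
    return output;
}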
ForwardGpu(params IGpuTensor<T>[])
Performs the forward pass using GPU-resident tensors, keeping all data on GPU.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): GPU-resident input tensor [batch, inChannels, inHeight, inWidth] in NCHW format.
Returns
- IGpuTensor<T>
GPU-resident output tensor [batch, outChannels, outHeight, outWidth] in NCHW format.
Remarks
For Beginners: This is the GPU-optimized version of the Forward method. All data stays on the GPU throughout the computation, avoiding expensive CPU-GPU transfers.
GetParameters()
Gets all trainable parameters of the layer as a single vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all trainable parameters.
Remarks
This method retrieves all trainable parameters (weights and biases) and combines them into a single vector. This is useful for optimization algorithms that operate on all parameters at once, or for saving and loading model weights.
For Beginners: This method collects all the learnable values from the layer.
The parameters:
- Are the numbers that the neural network learns during training
- Include all the unique filter weights (which can be very many!) and biases
- Are combined into a single long list (vector)
This is useful for:
- Saving the model to disk
- Loading parameters from a previously trained model
- Advanced optimization techniques that need access to all parameters
For locally connected layers, this vector can be very large due to the unique filters for each spatial location.
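A quick way to see this size in practice is to inspect the returned vector; note that the Length property on Vector<T> is an assumption here:

var layer = new LocallyConnectedLayer<float>(28, 28, 1, 16, 3, 1);
var parameters = layer.GetParameters();
// For this configuration, expect on the order of ~97,000 values
// (versus ~160 for an equivalent convolutional layer).
Console.WriteLine(parameters.Length); // Length is assumed API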
ResetState()
Resets the internal state of the layer.
public override void ResetState()
Remarks
This method resets the internal state of the layer, clearing cached values from forward and backward passes. This includes the last input tensor and the weight and bias gradients.
For Beginners: This method clears the layer's memory to start fresh.
When resetting the state:
- The saved input from the last forward pass is cleared
- All gradient information from the last backward pass is cleared
- The layer is ready for new data without being influenced by previous data
This is important for:
- Processing a new, unrelated batch of data
- Preventing information from one batch affecting another
- Starting a new training episode
It helps ensure that each training or prediction batch is processed independently.
SetParameters(Vector<T>)
Sets the trainable parameters of the layer.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): A vector containing all parameters to set.
Remarks
This method sets all the weights and biases of the layer from a single vector of parameters. The vector must have the correct length to match the total number of parameters in the layer.
For Beginners: This method updates all the learnable values in the layer.
When setting parameters:
- The input must be a vector with the correct length
- The values are distributed to all the weights and biases in the correct order
- An error is thrown if the input doesn't match the expected number of parameters
This is useful for:
- Loading a previously saved model
- Transferring parameters from another model
- Setting specific parameter values for testing
For locally connected layers, this vector needs to be very large to account for all the unique filters at each spatial location.
Exceptions
- ArgumentException
Thrown when the parameters vector has incorrect length.
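A hedged round-trip sketch: capture the parameters of one layer and restore them into a second layer constructed with identical arguments, so the vector lengths match:

var trained = new LocallyConnectedLayer<float>(28, 28, 1, 16, 3, 1);
// ... training happens here ...
var saved = trained.GetParameters();

var restored = new LocallyConnectedLayer<float>(28, 28, 1, 16, 3, 1);
restored.SetParameters(saved); // throws ArgumentException if the length differs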
UpdateParameters(T)
Updates the parameters of the layer using the calculated gradients.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate to use for the parameter updates.
Remarks
This method updates the weights and biases of the layer based on the gradients calculated during the backward pass. The learning rate controls the size of the parameter updates.
For Beginners: This method updates the layer's internal values during training.
When updating parameters:
- All weights and biases are adjusted to reduce prediction errors
- The learning rate controls how big each update step is
- Smaller learning rates mean slower but more stable learning
- Larger learning rates mean faster but potentially unstable learning
This is how the layer "learns" from data over time, gradually improving its ability to extract useful features from the input.
The method will throw an error if you try to run it before performing a backward pass.
Exceptions
- InvalidOperationException
Thrown when Backward has not been called before UpdateParameters.
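Putting Forward, Backward, and UpdateParameters together, one simplified training step might look like the sketch below. ComputeLossGradient is a hypothetical helper standing in for whatever loss the surrounding network uses, and inputBatch/targets are placeholder tensors:

var output = layer.Forward(inputBatch);                   // caches input, produces output
var lossGradient = ComputeLossGradient(output, targets);  // hypothetical helper
var inputGradient = layer.Backward(lossGradient);         // accumulates weight/bias gradients
layer.UpdateParameters(0.01f);                            // apply gradients, learning rate 0.01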
UpdateParametersGpu(IGpuOptimizerConfig)
Updates parameters using GPU-based optimizer.
public override void UpdateParametersGpu(IGpuOptimizerConfig config)
Parameters
config (IGpuOptimizerConfig): GPU optimizer configuration specifying the optimizer type and hyperparameters.