Class LocallyConnectedLayer<T>
- Namespace
- AiDotNet.NeuralNetworks.Layers
- Assembly
- AiDotNet.dll
Represents a Locally Connected layer which applies different filters to different regions of the input, unlike a convolutional layer which shares filters.
public class LocallyConnectedLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
- LayerBase<T> → LocallyConnectedLayer<T>
- Implements
- ILayer<T>
- IJitCompilable<T>
- IDiagnosticsProvider
- IWeightLoadable<T>
- IDisposable
Remarks
The Locally Connected layer is similar to a convolutional layer in that it applies filters to local regions of the input, but differs in that it uses different filter weights for each spatial location. This increases the number of parameters and the expressiveness of the model, but reduces generalization capabilities. It's useful when the patterns in different regions of the input are inherently different, such as in face recognition where different parts of a face have different characteristics.
For Beginners: This layer is like a specialized convolutional layer where each region gets its own unique filter.
Think of a Locally Connected layer like having specialized detectors for different regions:
- In a regular convolutional layer, the same filter slides across the entire input
- In a locally connected layer, each position has its own unique filter
- This means the layer can learn location-specific features
For example, in face recognition:
- A convolutional layer would use the same detector for eyes, whether looking at the top-left or bottom-right
- A locally connected layer would use different detectors depending on where it's looking
This specialization increases the model's power but:
- Requires more parameters
- May not generalize as well to new examples
- Is more computationally intensive
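To make the trade-off concrete, here is a back-of-the-envelope parameter count for a 28x28 single-channel input with 16 output channels, a 3x3 kernel, and stride 1 (output size (28 - 3) / 1 + 1 = 26 in each spatial dimension). The counts assume one bias per output channel, as described in the Forward remarks below:
Convolutional layer: 16 x (3 x 3 x 1) weights + 16 biases = 160 parameters.
Locally connected layer: 26 x 26 x 16 x (3 x 3 x 1) weights + 16 biases = 97,360 parameters.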
Constructors
LocallyConnectedLayer(int, int, int, int, int, int, IActivationFunction<T>?)
Initializes a new instance of the LocallyConnectedLayer<T> class with the specified dimensions, kernel parameters, and element-wise activation function.
public LocallyConnectedLayer(int inputHeight, int inputWidth, int inputChannels, int outputChannels, int kernelSize, int stride, IActivationFunction<T>? activationFunction = null)
Parameters
inputHeight (int): The height of the input tensor.
inputWidth (int): The width of the input tensor.
inputChannels (int): The number of channels in the input tensor.
outputChannels (int): The number of channels in the output tensor.
kernelSize (int): The size of the kernel (filter) in both height and width dimensions.
stride (int): The stride (step size) of the kernel when moving across the input.
activationFunction (IActivationFunction<T>): The activation function to apply after the locally connected operation. Defaults to ReLU if not specified.
Remarks
This constructor creates a new Locally Connected layer with the specified dimensions, kernel parameters, and element-wise activation function. It initializes the weights and biases and calculates the output dimensions based on the input dimensions, kernel size, and stride.
For Beginners: This creates a new locally connected layer with a standard element-wise activation function.
When creating this layer, you specify:
- inputHeight, inputWidth: The dimensions of your input data
- inputChannels: How many channels your input data has
- outputChannels: How many different features you want the layer to detect
- kernelSize: The size of each filter window (e.g., 3 for a 3x3 filter)
- stride: How many pixels the filter moves each step
- activationFunction: What function to apply to the output (default is ReLU)
For example, to process 28x28 grayscale images with 16 output features, 3x3 filters, and a stride of 1, you would use: inputHeight=28, inputWidth=28, inputChannels=1, outputChannels=16, kernelSize=3, stride=1.
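A minimal construction sketch for that example (passing null for the activation selects the ReLU default):

// 28x28 grayscale input, 16 output feature maps, 3x3 filters, stride 1.
var layer = new LocallyConnectedLayer<float>(
    inputHeight: 28,
    inputWidth: 28,
    inputChannels: 1,
    outputChannels: 16,
    kernelSize: 3,
    stride: 1,
    activationFunction: null); // null => ReLU default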
LocallyConnectedLayer(int, int, int, int, int, int, IVectorActivationFunction<T>?)
Initializes a new instance of the LocallyConnectedLayer<T> class with the specified dimensions, kernel parameters, and vector activation function.
public LocallyConnectedLayer(int inputHeight, int inputWidth, int inputChannels, int outputChannels, int kernelSize, int stride, IVectorActivationFunction<T>? vectorActivationFunction = null)
Parameters
inputHeight (int): The height of the input tensor.
inputWidth (int): The width of the input tensor.
inputChannels (int): The number of channels in the input tensor.
outputChannels (int): The number of channels in the output tensor.
kernelSize (int): The size of the kernel (filter) in both height and width dimensions.
stride (int): The stride (step size) of the kernel when moving across the input.
vectorActivationFunction (IVectorActivationFunction<T>): The vector activation function to apply after the locally connected operation. Defaults to ReLU if not specified.
Remarks
This constructor creates a new Locally Connected layer with the specified dimensions, kernel parameters, and vector activation function. Vector activation functions operate on entire vectors rather than individual elements.
For Beginners: This creates a new locally connected layer with an advanced vector-based activation.
Vector activation functions:
- Process entire groups of numbers together, not just one at a time
- Can capture relationships between different features
- May be more powerful for complex patterns
Otherwise, this constructor works just like the standard one, setting up the layer with:
- The specified dimensions and parameters
- Proper calculation of output dimensions
- Initialization of weights and biases
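A sketch of this overload; SoftmaxActivation<float> is a hypothetical IVectorActivationFunction<float> implementation used purely for illustration:

// Same dimensions as the element-wise example, but with a vector activation
// (hypothetical class name; substitute any IVectorActivationFunction<T>).
var layer = new LocallyConnectedLayer<float>(
    28, 28, 1, 16, 3, 1,
    vectorActivationFunction: new SoftmaxActivation<float>());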
Properties
SupportsJitCompilation
Gets a value indicating whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
true when the weights are initialized and the activation function supports JIT.
Remarks
Locally connected layers support JIT compilation using the LocallyConnectedConv2D operation from TensorOperations. The layer applies different filters to different spatial locations.
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
true because this layer has trainable parameters (weights and biases).
Remarks
This property indicates whether the layer can be trained through backpropagation. The LocallyConnectedLayer always returns true because it contains trainable weights and biases.
For Beginners: This property tells you if the layer can learn from data.
A value of true means:
- The layer has parameters that can be adjusted during training
- It will improve its performance as it sees more data
- It participates in the learning process
The Locally Connected layer always supports training because it has weights and biases that are learned during training.
Methods
Backward(Tensor<T>)
Performs the backward pass of the locally connected layer.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): The gradient of the loss with respect to the layer's output.
Returns
- Tensor<T>
The gradient of the loss with respect to the layer's input.
Remarks
This method implements the backward pass of the locally connected layer, which is used during training to propagate error gradients back through the network. It calculates the gradients for the weights and biases, and returns the gradient with respect to the input for further backpropagation.
For Beginners: This method is used during training to calculate how the layer's input and parameters should change to reduce errors.
During the backward pass:
- The layer receives information about how its output contributed to errors
- It calculates how the weights and biases should change to reduce errors
- It calculates how the input should change, which will be used by earlier layers
This process involves:
- Applying the derivative of the activation function
- Computing gradients for each unique filter
- Computing gradients for biases
- Computing how the input should change
The method will throw an error if you try to run it before performing a forward pass.
Exceptions
- InvalidOperationException
Thrown when Forward has not been called before Backward.
BackwardGpu(IGpuTensor<T>)
Performs the backward pass using GPU-resident tensors.
public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)
Parameters
outputGradient (IGpuTensor<T>): GPU-resident gradient tensor.
Returns
- IGpuTensor<T>
GPU-resident input gradient tensor.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the locally connected layer's forward pass as a JIT-compilable computation graph.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): List to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the locally connected layer output.
Remarks
The locally connected layer computation graph implements: output = activation(LocallyConnectedConv2D(input, weights) + bias)
For Beginners: This creates an optimized version of the locally connected layer's forward pass. Unlike convolution, which shares filters, a locally connected layer uses a unique filter for each position.
Forward(Tensor<T>)
Performs the forward pass of the locally connected layer.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to process. Shape should be [batchSize, inputHeight, inputWidth, inputChannels].
Returns
- Tensor<T>
The output tensor after applying the locally connected operation and activation. Shape will be [batchSize, outputHeight, outputWidth, outputChannels].
Remarks
This method implements the forward pass of the locally connected layer. It applies different filters to each spatial location of the input, followed by adding biases and applying the activation function.
For Beginners: This method processes your data through the locally connected filters.
During the forward pass:
- For each position in the output:
- Apply a unique filter to the corresponding region of the input
- Sum up the results of element-wise multiplications
- Add the bias for the output channel
- Apply the activation function to add non-linearity
This process is similar to a convolution, except that instead of one filter being reused at every position, each position has its own specialized filter.
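The per-position arithmetic can be sketched as plain loops. This is an illustrative reference implementation for a single example (no batch dimension), not the library's actual code; in particular, the weight layout [outY, outX, outChannel, kernelY, kernelX, inChannel] and the one-bias-per-output-channel convention are assumptions:

// Illustrative reference implementation for a single example.
static float[,,] LocallyConnectedForward(
    float[,,] input,        // [height, width, inChannels]
    float[,,,,,] weights,   // assumed layout: [outY, outX, outC, kY, kX, inC]
    float[] bias,           // assumed: one bias per output channel
    int kernelSize, int stride, Func<float, float> activate)
{
    int outH = (input.GetLength(0) - kernelSize) / stride + 1;
    int outW = (input.GetLength(1) - kernelSize) / stride + 1;
    int outC = bias.Length;
    int inC = input.GetLength(2);
    var output = new float[outH, outW, outC];
    for (int oy = 0; oy < outH; oy++)
    for (int ox = 0; ox < outW; ox++)
    for (int oc = 0; oc < outC; oc++)
    {
        float sum = bias[oc];
        for (int ky = 0; ky < kernelSize; ky++)
        for (int kx = 0; kx < kernelSize; kx++)
        for (int ic = 0; ic < inC; ic++)
            sum += input[oy * stride + ky, ox * stride + kx, ic]
                 * weights[oy, ox, oc, ky, kx, ic]; // filter unique to (oy, ox)
        output[oy, ox, oc] = activate(sum);
    }
    return output;
}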
ForwardGpu(params IGpuTensor<T>[])
Performs the forward pass using GPU-resident tensors, keeping all data on GPU.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): GPU-resident input tensor [batch, inChannels, inHeight, inWidth] in NCHW format.
Returns
- IGpuTensor<T>
GPU-resident output tensor [batch, outChannels, outHeight, outWidth] in NCHW format.
Remarks
For Beginners: This is the GPU-optimized version of the Forward method. All data stays on the GPU throughout the computation, avoiding expensive CPU-GPU transfers.
GetParameters()
Gets all trainable parameters of the layer as a single vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all trainable parameters.
Remarks
This method retrieves all trainable parameters (weights and biases) and combines them into a single vector. This is useful for optimization algorithms that operate on all parameters at once, or for saving and loading model weights.
For Beginners: This method collects all the learnable values from the layer.
The parameters:
- Are the numbers that the neural network learns during training
- Include all the unique filter weights (which can be very many!) and biases
- Are combined into a single long list (vector)
This is useful for:
- Saving the model to disk
- Loading parameters from a previously trained model
- Advanced optimization techniques that need access to all parameters
For locally connected layers, this vector can be very large due to the unique filters for each spatial location.
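A quick way to see this size in practice is to inspect the returned vector; note that the Length property on Vector<T> is an assumption here:

var layer = new LocallyConnectedLayer<float>(28, 28, 1, 16, 3, 1);
var parameters = layer.GetParameters();
// For this configuration, expect on the order of ~97,000 values
// (versus ~160 for an equivalent convolutional layer).
Console.WriteLine(parameters.Length); // Length is assumed API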
ResetState()
Resets the internal state of the layer.
public override void ResetState()
Remarks
This method resets the internal state of the layer, clearing cached values from forward and backward passes. This includes the last input tensor and the weight and bias gradients.
For Beginners: This method clears the layer's memory to start fresh.
When resetting the state:
- The saved input from the last forward pass is cleared
- All gradient information from the last backward pass is cleared
- The layer is ready for new data without being influenced by previous data
This is important for:
- Processing a new, unrelated batch of data
- Preventing information from one batch affecting another
- Starting a new training episode
It helps ensure that each training or prediction batch is processed independently.
SetParameters(Vector<T>)
Sets the trainable parameters of the layer.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): A vector containing all parameters to set.
Remarks
This method sets all the weights and biases of the layer from a single vector of parameters. The vector must have the correct length to match the total number of parameters in the layer.
For Beginners: This method updates all the learnable values in the layer.
When setting parameters:
- The input must be a vector with the correct length
- The values are distributed to all the weights and biases in the correct order
- An error is thrown if the input doesn't match the expected number of parameters
This is useful for:
- Loading a previously saved model
- Transferring parameters from another model
- Setting specific parameter values for testing
For locally connected layers, this vector needs to be very large to account for all the unique filters at each spatial location.
Exceptions
- ArgumentException
Thrown when the parameters vector has incorrect length.
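A hedged round-trip sketch: capture the parameters of one layer and restore them into a second layer constructed with identical arguments, so the vector lengths match:

var trained = new LocallyConnectedLayer<float>(28, 28, 1, 16, 3, 1);
// ... training happens here ...
var saved = trained.GetParameters();

var restored = new LocallyConnectedLayer<float>(28, 28, 1, 16, 3, 1);
restored.SetParameters(saved); // throws ArgumentException if the length differs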
UpdateParameters(T)
Updates the parameters of the layer using the calculated gradients.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate to use for the parameter updates.
Remarks
This method updates the weights and biases of the layer based on the gradients calculated during the backward pass. The learning rate controls the size of the parameter updates.
For Beginners: This method updates the layer's internal values during training.
When updating parameters:
- All weights and biases are adjusted to reduce prediction errors
- The learning rate controls how big each update step is
- Smaller learning rates mean slower but more stable learning
- Larger learning rates mean faster but potentially unstable learning
This is how the layer "learns" from data over time, gradually improving its ability to extract useful features from the input.
The method will throw an error if you try to run it before performing a backward pass.
Exceptions
- InvalidOperationException
Thrown when Backward has not been called before UpdateParameters.
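Putting Forward, Backward, and UpdateParameters together, one simplified training step might look like the sketch below. ComputeLossGradient is a hypothetical helper standing in for whatever loss the surrounding network uses, and inputBatch/targets are placeholder tensors:

var output = layer.Forward(inputBatch);                   // caches input, produces output
var lossGradient = ComputeLossGradient(output, targets);  // hypothetical helper
var inputGradient = layer.Backward(lossGradient);         // accumulates weight/bias gradients
layer.UpdateParameters(0.01f);                            // apply gradients, learning rate 0.01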
UpdateParametersGpu(IGpuOptimizerConfig)
Updates parameters using GPU-based optimizer.
public override void UpdateParametersGpu(IGpuOptimizerConfig config)
Parameters
config (IGpuOptimizerConfig): GPU optimizer configuration specifying the optimizer type and hyperparameters.