Class PrimaryCapsuleLayer<T>
- Namespace
- AiDotNet.NeuralNetworks.Layers
- Assembly
- AiDotNet.dll
Represents a primary capsule layer for capsule networks.
public class PrimaryCapsuleLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
- LayerBase<T> → PrimaryCapsuleLayer<T>
- Implements
- ILayer<T>
Remarks
The PrimaryCapsuleLayer is the first layer in a capsule network that transforms traditional scalar feature maps into capsule vectors. It performs a convolution operation followed by reshaping the output into capsules. Each capsule represents a group of neurons that encodes both the presence and properties of a particular entity. This layer serves as a bridge between standard convolutional layers and higher-level capsule layers.
For Beginners: This layer is the first step in creating a capsule network.
In traditional neural networks, each neuron outputs a single number indicating the presence of a feature. In capsule networks, neurons are grouped into "capsules" where each capsule outputs a vector:
- The length of the vector represents the presence of an entity
- The orientation of the vector represents properties of that entity
Think of it like this:
- Standard neurons: "I see a nose with 90% confidence"
- Capsule neurons: "I see a nose with 90% confidence, and it's pointing 30° to the left, it's 2cm long, it has a slightly curved shape..."
The primary capsule layer converts traditional feature maps (from convolutional layers) into these vector-based capsules that can capture more detailed information about the entities detected.
This approach helps the network understand spatial relationships and maintain information about pose, orientation, and other properties that are typically lost in traditional networks.
Constructors
PrimaryCapsuleLayer(int, int, int, int, int, IActivationFunction<T>?)
Initializes a new instance of the PrimaryCapsuleLayer<T> class with the specified parameters and a scalar activation function.
public PrimaryCapsuleLayer(int inputChannels, int capsuleChannels, int capsuleDimension, int kernelSize, int stride, IActivationFunction<T>? scalarActivation = null)
Parameters
inputChannels (int): The number of input channels.
capsuleChannels (int): The number of capsule channels.
capsuleDimension (int): The dimension of each capsule.
kernelSize (int): The size of the convolutional kernel.
stride (int): The stride of the convolution operation.
scalarActivation (IActivationFunction<T>): The activation function to apply after processing. Defaults to Squash if not specified.
Remarks
This constructor creates a PrimaryCapsuleLayer with the specified parameters. It initializes the convolution weights and biases and sets up the layer to transform input feature maps into primary capsules.
For Beginners: This constructor sets up the layer with the necessary parameters.
When creating a PrimaryCapsuleLayer, you need to specify:
- inputChannels: How many channels your input has (e.g., 3 for RGB images, or more if from a conv layer)
- capsuleChannels: How many different types of capsules to create
- capsuleDimension: How many values in each capsule's output vector
- kernelSize: The size of the area examined by the convolution (e.g., 3 for a 3×3 kernel)
- stride: How far to move the kernel each step
- scalarActivation: The function applied to each scalar value (defaults to Squash)
For example, if you set capsuleChannels=8 and capsuleDimension=16, you'll have 8 different types of capsules, each outputting a 16-dimensional vector.
The default Squash activation function is specifically designed for capsule networks. It ensures that the length of each capsule's output vector is between 0 and 1, which is ideal for representing the probability of an entity being present.
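The parameter arithmetic above can be sketched in a few lines. This is an illustration only: CapsuleShapeSketch and its methods are hypothetical helpers, not part of AiDotNet, and the output-size formula assumes a convolution without padding, which this page does not confirm for the layer.

```csharp
using System;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class CapsuleShapeSketch
{
    // Output length along one spatial axis, ASSUMING no padding.
    public static int ConvOutputSize(int inputSize, int kernelSize, int stride)
    {
        if (stride <= 0)
            throw new ArgumentException("Stride must be positive.", nameof(stride));
        if (inputSize < kernelSize)
            throw new ArgumentException("Input is smaller than the kernel.", nameof(inputSize));

        // Integer division drops positions where the kernel would not fully fit.
        return (inputSize - kernelSize) / stride + 1;
    }

    // Scalar values produced at each output location:
    // one capsuleDimension-sized vector per capsule channel.
    public static int ValuesPerLocation(int capsuleChannels, int capsuleDimension)
    {
        return capsuleChannels * capsuleDimension;
    }
}
```

With capsuleChannels=8 and capsuleDimension=16, ValuesPerLocation returns 128. Under the no-padding assumption, a 20-pixel-wide input with kernelSize=9 and stride=2 gives 6 output positions along that axis.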
PrimaryCapsuleLayer(int, int, int, int, int, IVectorActivationFunction<T>?)
Initializes a new instance of the PrimaryCapsuleLayer<T> class with the specified parameters and a vector activation function.
public PrimaryCapsuleLayer(int inputChannels, int capsuleChannels, int capsuleDimension, int kernelSize, int stride, IVectorActivationFunction<T>? vectorActivation = null)
Parameters
inputChannels (int): The number of input channels.
capsuleChannels (int): The number of capsule channels.
capsuleDimension (int): The dimension of each capsule.
kernelSize (int): The size of the convolutional kernel.
stride (int): The stride of the convolution operation.
vectorActivation (IVectorActivationFunction<T>): The vector activation function to apply after processing. Defaults to Squash if not specified.
Remarks
This constructor creates a PrimaryCapsuleLayer with the specified parameters and a vector activation function. A vector activation function operates on entire capsule vectors rather than individual elements.
For Beginners: This constructor is similar to the other one, but uses a vector-based activation function.
A vector activation function:
- Operates on entire capsule vectors at once, rather than one value at a time
- Can better preserve the relationship between values in a capsule
- The default Squash function ensures vector lengths are between 0 and 1
The Squash function is specifically designed for capsule networks. It scales vectors non-linearly so that small vectors shrink to nearly zero length, while large vectors shrink to have a length slightly below 1, preserving their direction.
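To make the squashing behavior concrete, here is a minimal sketch of the squash function as it is commonly defined for capsule networks: the vector is scaled by |v|² / (1 + |v|²) and divided by |v|. SquashSketch is a hypothetical stand-alone helper on plain double arrays. The library's own Squash implementation, including how it handles zero-length vectors, may differ, and the epsilon parameter is an assumption added here to avoid division by zero.

```csharp
using System;

// Hypothetical helper for illustration; not AiDotNet's Squash implementation.
public static class SquashSketch
{
    // Scales v so its length lies in [0, 1) while keeping its direction.
    public static double[] Squash(double[] v, double epsilon = 1e-8)
    {
        double normSquared = 0.0;
        foreach (double x in v)
        {
            normSquared += x * x;
        }

        double norm = Math.Sqrt(normSquared);

        // scale = (|v|^2 / (1 + |v|^2)) / |v|; epsilon guards the zero vector.
        double scale = normSquared / (1.0 + normSquared) / (norm + epsilon);

        var result = new double[v.Length];
        for (int i = 0; i < v.Length; i++)
        {
            result[i] = v[i] * scale;
        }

        return result;
    }
}
```

For example, the vector (3, 4) has length 5 and squashes to a vector of length 25/26, roughly 0.96, still pointing in the same direction. The vector (0.1, 0) squashes to a length of roughly 0.0099.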
Properties
SupportsGpuExecution
Gets whether this layer has a GPU execution implementation for inference.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
Remarks
Override this to return true when the layer implements ForwardGpu(params IGpuTensor<T>[]). The actual CanExecuteOnGpu property combines this with engine availability.
For Beginners: This flag indicates if the layer has GPU code for the forward pass. Set this to true in derived classes that implement ForwardGpu.
SupportsJitCompilation
Gets whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
True if the layer can be JIT compiled, false otherwise.
Remarks
This property indicates whether the layer has implemented ExportComputationGraph() and can benefit from JIT compilation. All layers MUST implement this property.
For Beginners: JIT compilation can make inference 5-10x faster by converting the layer's operations into optimized native code.
Layers should return false if they:
- Have not yet implemented a working ExportComputationGraph()
- Use dynamic operations that change based on input data
- Are too simple to benefit from JIT compilation
When false, the layer will use the standard Forward() method instead.
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
Always true, because the PrimaryCapsuleLayer has trainable parameters.
Remarks
This property indicates that PrimaryCapsuleLayer can be trained through backpropagation. The layer has trainable parameters (convolution weights and biases) that are updated during training to optimize the capsule transformation process.
For Beginners: This property tells you that this layer can learn from data.
A value of true means:
- The layer has internal values (weights and biases) that change during training
- It will improve its performance as it sees more data
- It learns to extract better capsule representations from the input
During training, the layer learns to transform input features into capsule vectors that best represent the entities in the data.
Methods
Backward(Tensor<T>)
Performs the backward pass of the primary capsule layer.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): The gradient of the loss with respect to the layer's output.
Returns
- Tensor<T>
The gradient of the loss with respect to the layer's input.
Remarks
This method implements the backward pass of the primary capsule layer, which is used during training to propagate error gradients back through the network. It computes the gradients of the convolution weights and biases, as well as the gradient with respect to the input tensor. The computed weight and bias gradients are stored for later use in the parameter update step.
For Beginners: This method calculates how all parameters should change to reduce errors.
During the backward pass:
- The layer receives gradients indicating how the output capsules should change
- It calculates how each weight, bias, and input value should change
- These gradients are used later to update the parameters during training
This process involves:
- Applying the derivative of the activation function
- For each location in the output:
  - Extracting the corresponding input patch
  - Computing the gradients for weights and biases
  - Computing the gradients for the input
- Aggregating all the gradients
This allows the layer to learn how to better transform input features into meaningful capsule representations.
Exceptions
- InvalidOperationException
Thrown when Backward is called before Forward.
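The per-location gradient steps described in the remarks can be sketched on a deliberately reduced case: a one-dimensional, single-channel convolution without padding. ConvBackwardSketch is a hypothetical helper, not the layer's code. It assumes the forward pass computed output[o] = bias + Σ weights[k] · input[o · stride + k], and it leaves out the activation derivative, so outputGradient here is taken to be the gradient at the convolution output.

```csharp
using System;

// Hypothetical helper for illustration; not the layer's actual Backward code.
public static class ConvBackwardSketch
{
    public static (double[] WeightGrad, double BiasGrad, double[] InputGrad) Backward(
        double[] input, double[] weights, double[] outputGradient, int stride)
    {
        int kernelSize = weights.Length;
        var weightGrad = new double[kernelSize];
        var inputGrad = new double[input.Length];
        double biasGrad = 0.0;

        // Visit each output location and its corresponding input patch.
        for (int o = 0; o < outputGradient.Length; o++)
        {
            int start = o * stride;
            double g = outputGradient[o];

            // The bias contributes to every output with coefficient 1.
            biasGrad += g;

            for (int k = 0; k < kernelSize; k++)
            {
                // d(loss)/d(weight k): output gradient times the input it multiplied.
                weightGrad[k] += g * input[start + k];

                // d(loss)/d(input): output gradient times the weight it was multiplied by.
                inputGrad[start + k] += g * weights[k];
            }
        }

        return (weightGrad, biasGrad, inputGrad);
    }
}
```

The accumulation with += is the aggregation step: overlapping patches contribute to the same input positions, and every output location contributes to the same weights and bias.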
ExportComputationGraph(List<ComputationNode<T>>)
Exports the layer's computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): List to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the layer's operation.
Remarks
This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.
For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code, which can make inference 5-10x faster.
To support JIT compilation, a layer must:
- Implement this method to export its computation graph
- Set SupportsJitCompilation to true
- Use ComputationNode and TensorOperations to build the graph
All layers are required to implement this method, even if they set SupportsJitCompilation = false.
Forward(Tensor<T>)
Performs the forward pass of the primary capsule layer.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to process.
Returns
- Tensor<T>
The output tensor containing capsule vectors.
Remarks
This method implements the forward pass of the primary capsule layer. It performs a convolution operation on the input tensor, reshapes the result into capsule vectors, and applies the activation function to produce the final output. The input and output tensors are cached for use during the backward pass.
For Beginners: This method transforms the input features into capsule vectors.
During the forward pass:
- The method applies a convolution operation to the input (similar to a standard convolutional layer)
- It reshapes the result into groups of vectors (the capsules)
- It applies the activation function (typically Squash) to each capsule vector
The output is a set of capsule vectors where:
- Each capsule vector's length represents the probability of detecting an entity
- The orientation of the vector represents properties of the detected entity
This is the key difference from traditional neural networks - instead of just detecting if something is present, the capsules also capture information about the properties of what they detect.
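The reshape step of the forward pass can be illustrated on the values produced at a single output location. CapsuleReshapeSketch is a hypothetical helper. It assumes that each capsule occupies a contiguous run of capsuleDimension values in the convolution output, a layout this page does not specify and that the real layer may not use.

```csharp
using System;

// Hypothetical helper for illustration; the layer's real memory layout may differ.
public static class CapsuleReshapeSketch
{
    // Splits the flat channel values at one location into capsule vectors.
    public static double[][] ToCapsules(double[] channelValues, int capsuleChannels, int capsuleDimension)
    {
        if (channelValues.Length != capsuleChannels * capsuleDimension)
        {
            throw new ArgumentException(
                "Expected capsuleChannels * capsuleDimension values.", nameof(channelValues));
        }

        var capsules = new double[capsuleChannels][];
        for (int c = 0; c < capsuleChannels; c++)
        {
            // ASSUMED layout: capsule c owns indices [c * dim, (c + 1) * dim).
            capsules[c] = new double[capsuleDimension];
            Array.Copy(channelValues, c * capsuleDimension, capsules[c], 0, capsuleDimension);
        }

        return capsules;
    }
}
```

Each resulting vector would then be passed through the activation function, so that its length can be read as a detection probability.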
ForwardGpu(params IGpuTensor<T>[])
Performs GPU-accelerated forward pass through the primary capsule layer.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): GPU-resident input tensors.
Returns
- IGpuTensor<T>
GPU-resident output tensor after capsule transformation.
Remarks
This method implements the forward pass using GPU-resident operations where possible. The convolution and reshape operations are kept on GPU for efficiency.
GetParameters()
Gets all trainable parameters from the primary capsule layer as a single vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all trainable parameters.
Remarks
This method retrieves all trainable parameters from the layer as a single vector. It concatenates the convolution weights and biases into a single vector. This is useful for optimization algorithms that operate on all parameters at once, or for saving and loading model weights.
For Beginners: This method collects all the learnable values in the layer.
The parameters:
- Are the numbers that the neural network learns during training
- Include all the weights and biases from this layer
- Are combined into a single long list (vector)
This is useful for:
- Saving the model to disk
- Loading parameters from a previously trained model
- Advanced optimization techniques that need access to all parameters
The method carefully arranges all parameters in a specific order so they can be correctly restored later.
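Flattening parameters into one vector amounts to a concatenation in a fixed order. The sketch below uses plain arrays and a hypothetical ParameterPackingSketch helper. Placing weights before biases is an assumption: the page says the two are concatenated but does not state the order.

```csharp
using System;

// Hypothetical helper for illustration; not the layer's GetParameters code.
public static class ParameterPackingSketch
{
    // Concatenates weights and biases; the weights-first order is ASSUMED.
    public static double[] Pack(double[] weights, double[] biases)
    {
        var packed = new double[weights.Length + biases.Length];
        Array.Copy(weights, 0, packed, 0, weights.Length);
        Array.Copy(biases, 0, packed, weights.Length, biases.Length);
        return packed;
    }
}
```

Whatever order is used, the routine that restores the parameters has to read them back in the same order.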
ResetState()
Resets the internal state of the primary capsule layer.
public override void ResetState()
Remarks
This method resets the internal state of the primary capsule layer, including the cached inputs, outputs, and gradients. This is useful when starting to process a new sequence or batch of data, or when implementing stateful networks.
For Beginners: This method clears the layer's memory to start fresh.
When resetting the state:
- Stored inputs and outputs from previous processing are cleared
- All calculated gradients are cleared
- The layer forgets any information from previous data batches
This is important for:
- Processing a new, unrelated batch of data
- Ensuring clean state before a new training epoch
- Preventing information from one batch affecting another
Resetting state helps ensure that each forward and backward pass is independent, which is important for correct behavior in many neural network architectures.
SetParameters(Vector<T>)
Sets the trainable parameters for the primary capsule layer.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): A vector containing all parameters to set.
Remarks
This method sets all trainable parameters of the layer from a single vector. It extracts the appropriate portions of the input vector for each parameter (convolution weights and biases). This is useful for loading saved model weights or for implementing optimization algorithms that operate on all parameters at once.
For Beginners: This method updates all the learnable values in the layer.
When setting parameters:
- The input must be a vector with the correct length
- The method extracts portions for each weight matrix and bias vector
- It places each value in its correct position
This is useful for:
- Loading a previously saved model
- Transferring parameters from another model
- Testing different parameter values
An error is thrown if the input vector doesn't have the expected number of parameters, ensuring that all matrices and vectors maintain their correct dimensions.
Exceptions
- ArgumentException
Thrown when the parameters vector has incorrect length.
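The length check and extraction can be sketched on plain arrays. ParameterUnpackingSketch is a hypothetical helper, and the weights-first order is an assumption, as is the exact exception message. Only the use of ArgumentException for a wrong length comes from this page.

```csharp
using System;

// Hypothetical helper for illustration; not the layer's SetParameters code.
public static class ParameterUnpackingSketch
{
    // Copies a flat parameter vector back into existing weight and bias arrays.
    public static void Unpack(double[] parameters, double[] weights, double[] biases)
    {
        int expected = weights.Length + biases.Length;
        if (parameters.Length != expected)
        {
            throw new ArgumentException(
                $"Expected {expected} parameters, but got {parameters.Length}.", nameof(parameters));
        }

        // ASSUMED order: weights first, then biases.
        Array.Copy(parameters, 0, weights, 0, weights.Length);
        Array.Copy(parameters, weights.Length, biases, 0, biases.Length);
    }
}
```

Validating the length before copying means a mismatched vector leaves the existing weights and biases untouched.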
UpdateParameters(T)
Updates the parameters of the primary capsule layer using the calculated gradients.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate to use for the parameter updates.
Remarks
This method updates the convolution weights and biases of the layer based on the gradients calculated during the backward pass. The learning rate controls the size of the parameter updates.
For Beginners: This method updates all the layer's weights and biases during training.
After the backward pass calculates how parameters should change, this method:
- Takes each weight and bias
- Subtracts the corresponding gradient scaled by the learning rate
- This moves the parameters in the direction that reduces errors
The learning rate controls how big each update step is:
- Smaller learning rates mean slower but more stable learning
- Larger learning rates mean faster but potentially unstable learning
This is how the layer gradually improves its ability to transform inputs into meaningful capsule representations over many training iterations.
Exceptions
- InvalidOperationException
Thrown when UpdateParameters is called before Backward.
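The update rule described in the remarks, subtracting each gradient scaled by the learning rate, can be sketched as follows. GradientStepSketch is a hypothetical helper operating on plain double arrays rather than the layer's tensors, and the length check is an addition made for the sketch.

```csharp
using System;

// Hypothetical helper for illustration; not the layer's UpdateParameters code.
public static class GradientStepSketch
{
    // In-place gradient descent step: p[i] = p[i] - learningRate * g[i].
    public static void Apply(double[] parameters, double[] gradients, double learningRate)
    {
        if (parameters.Length != gradients.Length)
        {
            throw new ArgumentException(
                "Parameters and gradients must have the same length.", nameof(gradients));
        }

        for (int i = 0; i < parameters.Length; i++)
        {
            parameters[i] -= learningRate * gradients[i];
        }
    }
}
```

With a learning rate of 0.1, a weight of 1.0 with gradient 0.5 becomes 0.95. A positive gradient lowers the weight and a negative gradient raises it.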