Class ReconstructionLayer<T>
Namespace: AiDotNet.NeuralNetworks.Layers
Assembly: AiDotNet.dll
Represents a reconstruction layer that uses multiple fully connected layers to transform inputs into outputs.
public class ReconstructionLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
Inheritance: LayerBase<T> → ReconstructionLayer<T>
Implements: ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Remarks
The ReconstructionLayer is a composite layer that consists of three fully connected layers in sequence. It is typically used in autoencoders or similar architectures to reconstruct the original input from a compressed representation. The layer provides a deeper transformation path through multiple hidden layers, allowing it to learn more complex reconstruction functions than a single layer could.
For Beginners: This layer works like a mini-network within your neural network.
Think of the ReconstructionLayer as a specialized team of artists:
- The first artist (first fully connected layer) makes a rough sketch
- The second artist (second fully connected layer) adds details to the sketch
- The third artist (third fully connected layer) finalizes the artwork
In an autoencoder network (a common use for this layer):
- Earlier layers compress your data into a compact form (like squeezing information)
- This reconstruction layer carefully expands that compact form back to the original size
- It learns how to restore information that was "squeezed out" during compression
For example, if you're building an image autoencoder, this layer would help transform the compressed representation back into an image that looks like the original.
By using three layers instead of just one, this layer can learn more sophisticated patterns for reconstructing the data.
Constructors
ReconstructionLayer(int, int, int, int, IActivationFunction<T>?, IActivationFunction<T>?)
Initializes a new instance of the ReconstructionLayer<T> class with scalar activation functions.
public ReconstructionLayer(int inputDimension, int hidden1Dimension, int hidden2Dimension, int outputDimension, IActivationFunction<T>? hiddenActivation = null, IActivationFunction<T>? outputActivation = null)
Parameters
inputDimension (int): The size of the input to the layer.
hidden1Dimension (int): The size of the first hidden layer.
hidden2Dimension (int): The size of the second hidden layer.
outputDimension (int): The size of the output from the layer.
hiddenActivation (IActivationFunction<T>?): The activation function to apply to the hidden layers. Defaults to ReLU if not specified.
outputActivation (IActivationFunction<T>?): The activation function to apply to the output layer. Defaults to Sigmoid if not specified.
Remarks
This constructor creates a new ReconstructionLayer with the specified dimensions and scalar activation functions. It initializes three fully connected layers in sequence, with the output of each layer feeding into the input of the next. The hidden layers use the specified hidden activation function (or ReLU by default), and the output layer uses the specified output activation function (or Sigmoid by default).
For Beginners: This creates a new reconstruction layer for your neural network using simple activation functions.
When you create this layer, you specify:
- inputDimension: How many features come into the layer
- hidden1Dimension: How many neurons are in the first internal layer
- hidden2Dimension: How many neurons are in the second internal layer
- outputDimension: How many features you want in the output
- hiddenActivation: How to transform values in the hidden layers (defaults to ReLU)
- outputActivation: How to transform the final output (defaults to Sigmoid)
The hidden dimensions control the "processing power" of the layer:
- Larger hidden dimensions can learn more complex patterns but require more memory
- In an autoencoder's decoder, these dimensions typically grow from the compressed input size toward the output size, expanding the representation stage by stage
The activation functions shape how information flows through the layer:
- ReLU is good for hidden layers as it helps with gradient flow
- Sigmoid is good for outputs that should be between 0 and 1 (like pixel values in images)
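For example, a decoder for a 784-pixel image autoencoder with a 32-dimensional compressed representation might be configured as follows. This is a minimal sketch; the dimensions are illustrative.

// Sketch: a decoder-style reconstruction layer that expands a
// 32-dimensional compressed representation back to 784 pixels.
var layer = new ReconstructionLayer<double>(
    inputDimension: 32,      // size of the compressed representation
    hidden1Dimension: 128,   // first expansion stage
    hidden2Dimension: 256,   // second expansion stage
    outputDimension: 784);   // a 28x28 image, flattened
// hiddenActivation defaults to ReLU; outputActivation defaults to Sigmoid,
// which keeps reconstructed pixel values between 0 and 1.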
ReconstructionLayer(int, int, int, int, IVectorActivationFunction<T>?, IVectorActivationFunction<T>?)
Initializes a new instance of the ReconstructionLayer<T> class with vector activation functions.
public ReconstructionLayer(int inputDimension, int hidden1Dimension, int hidden2Dimension, int outputDimension, IVectorActivationFunction<T>? hiddenVectorActivation = null, IVectorActivationFunction<T>? outputVectorActivation = null)
Parameters
inputDimension (int): The size of the input to the layer.
hidden1Dimension (int): The size of the first hidden layer.
hidden2Dimension (int): The size of the second hidden layer.
outputDimension (int): The size of the output from the layer.
hiddenVectorActivation (IVectorActivationFunction<T>?): The vector activation function to apply to the hidden layers. Defaults to ReLU if not specified.
outputVectorActivation (IVectorActivationFunction<T>?): The vector activation function to apply to the output layer. Defaults to Sigmoid if not specified.
Remarks
This constructor creates a new ReconstructionLayer with the specified dimensions and vector activation functions. It initializes three fully connected layers in sequence, with the output of each layer feeding into the input of the next. The hidden layers use the specified hidden vector activation function (or ReLU by default), and the output layer uses the specified output vector activation function (or Sigmoid by default). Vector activation functions operate on entire vectors at once, allowing for interactions between different elements.
For Beginners: This creates a new reconstruction layer for your neural network using advanced activation functions.
When you create this layer, you specify the same dimensions as the scalar version, but with vector activation functions:
- Vector activations process all outputs together as a group, rather than individually
- This can capture relationships between different output elements
- It's useful for outputs that need to maintain certain properties across all values
For example, in an image generation task:
- A vector activation might ensure proper contrast across the entire image
- It could maintain relationships between neighboring pixels
This version of the constructor is more advanced and used less frequently than the scalar version, but it can be powerful for specific types of reconstruction tasks.
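As a hedged sketch of this overload (SoftmaxActivation<double> is a hypothetical name standing in for whichever IVectorActivationFunction<T> implementation your build of AiDotNet provides):

// Sketch: the vector-activation overload. SoftmaxActivation<double> below is
// a placeholder name for a concrete IVectorActivationFunction<T> type.
var layer = new ReconstructionLayer<double>(
    inputDimension: 32,
    hidden1Dimension: 128,
    hidden2Dimension: 256,
    outputDimension: 784,
    hiddenVectorActivation: null,                              // falls back to ReLU
    outputVectorActivation: new SoftmaxActivation<double>());  // hypothetical class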
Properties
ParameterCount
Gets the total number of trainable parameters in the reconstruction layer.
public override int ParameterCount { get; }
Property Value
- int
The sum of parameter counts from all three fully connected layers.
Remarks
This property returns the total number of trainable parameters (weights and biases) across all three fully connected layers that make up this reconstruction layer. This is useful for monitoring the complexity of the layer or for parameter initialization strategies.
For Beginners: This property tells you how many numbers the layer can adjust during training.
Each parameter is a number that the neural network learns:
- More parameters mean the layer can learn more complex patterns
- More parameters also require more training data and time
- This layer has parameters in all three of its internal layers
Think of parameters like knobs that the network can turn to get better results. This property tells you the total number of knobs available to this layer.
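Assuming each internal layer is a plain dense layer with inputSize × outputSize weights plus one bias per output neuron, the total can be checked by hand:

var layer = new ReconstructionLayer<double>(32, 128, 256, 784);
// Each fully connected layer contributes (in * out) weights + out biases:
int expected = (32 * 128 + 128)      // first layer
             + (128 * 256 + 256)     // second layer
             + (256 * 784 + 784);    // third layer
Console.WriteLine(layer.ParameterCount == expected); // True under that assumption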
SupportsGpuExecution
Gets a value indicating whether this layer supports GPU execution.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
Remarks
Returns true since the internal FullyConnectedLayers support GPU execution.
SupportsJitCompilation
Gets whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
True if the layer can be JIT compiled, false otherwise.
Remarks
This property indicates whether the layer has implemented ExportComputationGraph() and can benefit from JIT compilation. All layers MUST implement this property.
For Beginners: JIT compilation can make inference 5-10x faster by converting the layer's operations into optimized native code.
Layers should return false if they:
- Have not yet implemented a working ExportComputationGraph()
- Use dynamic operations that change based on input data
- Are too simple to benefit from JIT compilation
When false, the layer will use the standard Forward() method instead.
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
Always true for ReconstructionLayer, indicating that the layer can be trained through backpropagation.
Remarks
This property indicates that the ReconstructionLayer has trainable parameters that can be optimized during the training process using backpropagation. The actual parameters are contained within the three fully connected layers that make up this layer.
For Beginners: This property tells you if the layer can learn from data.
A value of true means:
- The layer has values that can be adjusted during training
- It will improve its performance as it sees more data
- It participates in the learning process of the neural network
During training, this layer will learn how to best reconstruct outputs from inputs, adapting its internal parameters to the specific patterns in your data.
Methods
Backward(Tensor<T>)
Performs the backward pass of the reconstruction layer.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): The gradient of the loss with respect to the layer's output.
Returns
- Tensor<T>
The gradient of the loss with respect to the layer's input.
Remarks
This method implements the backward pass of the reconstruction layer, which is used during training to propagate error gradients back through the network. It sequentially passes the gradient backward through the three fully connected layers in reverse order, with each layer's input gradient becoming the output gradient for the previous layer.
For Beginners: This method is used during training to calculate how the layer should change to reduce errors.
During the backward pass:
- The error gradient arrives at the third layer (the output layer)
- The third layer calculates its updates and passes the gradient to the second layer
- The second layer calculates its updates and passes the gradient to the first layer
- The first layer calculates its updates and passes the gradient to the previous layer in the network
This reverse flow of information is like feedback being passed backward:
- "The final output was wrong in this way" (outputGradient)
- "To fix that, the third layer should change like this"
- "To help the third layer, the second layer should change like this"
- "To help the second layer, the first layer should change like this"
- "To help the first layer, the previous layers should change like this" (return value)
Deserialize(BinaryReader)
Deserializes the reconstruction layer from a binary reader.
public override void Deserialize(BinaryReader reader)
Parameters
reader (BinaryReader): The binary reader to deserialize from.
Remarks
This method deserializes the state of the reconstruction layer from a binary reader. It reads the vector activation flag and then deserializes each of the three fully connected layers in sequence. This is useful for loading the layer's state from disk or receiving it over a network.
For Beginners: This method loads a previously saved layer state.
When deserializing:
- First, it loads whether vector activation is used or not
- Then, it asks each of the three internal layers to load their states
- The result is a complete restoration of a previously saved layer
This is useful for:
- Loading a trained model from disk
- Continuing training from where you left off
- Using a model that someone else trained
Think of it like reconstructing the exact state of the layer from a detailed blueprint.
ExportComputationGraph(List<ComputationNode<T>>)
Exports the layer's computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): The list to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the layer's operation.
Remarks
This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.
For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.
To support JIT compilation, a layer must:
- Implement this method to export its computation graph
- Set SupportsJitCompilation to true
- Use ComputationNode and TensorOperations to build the graph
All layers are required to implement this method, even if they set SupportsJitCompilation = false.
Forward(Tensor<T>)
Performs the forward pass of the reconstruction layer.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to process.
Returns
- Tensor<T>
The output tensor after reconstruction processing.
Remarks
This method implements the forward pass of the reconstruction layer. It sequentially passes the input through the three fully connected layers, with each layer's output becoming the input to the next layer. The final output represents the reconstructed data.
For Beginners: This method processes your data through the reconstruction layer.
During the forward pass:
- Your input data goes through the first fully connected layer
- The output from the first layer goes through the second layer
- The output from the second layer goes through the third layer
- The output from the third layer is the final reconstruction
This step-by-step transformation allows the layer to gradually reconstruct complex patterns. Each layer in the sequence adds detail to the reconstruction, similar to how an artist might start with a rough sketch and gradually add more detail.
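For instance, in an autoencoder this layer consumes the output of the encoder half (encoder below is assumed; it stands for whatever compressed your data):

// Sketch: reconstruct a batch from its compressed representation.
Tensor<double> compressed = encoder.Forward(originalInput);  // assumed encoder half
Tensor<double> reconstructed = layer.Forward(compressed);    // back to the original size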
ForwardGpu(params IGpuTensor<T>[])
Performs GPU-accelerated forward pass by chaining through sublayers.
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): The input GPU tensors (only the first input is used).
Returns
- IGpuTensor<T>
GPU-resident output tensor.
GetParameters()
Gets all trainable parameters of the reconstruction layer as a single vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all trainable parameters from all three fully connected layers.
Remarks
This method retrieves all trainable parameters of the reconstruction layer as a single vector. It collects the parameters from each of the three fully connected layers in sequence and concatenates them into a single vector. This is useful for optimization algorithms that operate on all parameters at once, or for saving and loading model weights.
For Beginners: This method collects all the learnable values from the reconstruction layer.
The parameters:
- Are the weights and biases from all three internal layers
- Control how the layer processes information
- Are returned as a single list (vector)
This is useful for:
- Saving the model to disk
- Loading parameters from a previously trained model
- Advanced optimization techniques that need access to all parameters
The parameters from the first layer come first in the vector, followed by the second layer's parameters, and finally the third layer's parameters.
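A quick sanity check ties this method to ParameterCount (assuming Vector<T> exposes a Length property; the member name may differ):

Vector<double> parameters = layer.GetParameters();
Console.WriteLine(parameters.Length == layer.ParameterCount); // True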
ResetState()
Resets the internal state of the reconstruction layer.
public override void ResetState()
Remarks
This method resets the internal state of the reconstruction layer by resetting the state of each of the three fully connected layers. This is useful when starting to process a new sequence or batch of data.
For Beginners: This method clears the layer's memory to start fresh.
When resetting the state:
- Each of the three internal layers clears its own temporary state
- The layer forgets any information from previous batches
- The learned parameters (weights and biases) are not reset
This is important for:
- Processing a new, unrelated batch of data
- Preventing information from one batch affecting another
- Starting a new training episode
Think of it like wiping a chalkboard clean before drawing something new, but keeping all the chalk and erasers (the parameters) you've collected.
Serialize(BinaryWriter)
Serializes the reconstruction layer to a binary writer.
public override void Serialize(BinaryWriter writer)
Parameters
writer (BinaryWriter): The binary writer to serialize to.
Remarks
This method serializes the state of the reconstruction layer to a binary writer. It writes the vector activation flag and then serializes each of the three fully connected layers in sequence. This is useful for saving the layer's state to disk or sending it over a network.
For Beginners: This method saves the layer's state so it can be loaded later.
When serializing:
- First, it saves whether vector activation is used or not
- Then, it asks each of the three internal layers to save their states
- The result is a complete snapshot of the layer that can be restored
This is useful for:
- Saving a trained model to disk
- Pausing training and continuing later
- Sharing a trained model with others
Think of it like taking a detailed photograph of the layer's current state that can be used to recreate it exactly.
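A minimal save-and-restore round trip might look like this, assuming the restored layer is constructed with the same dimensions as the original:

// Sketch: save the layer's state to disk, then restore it into a fresh layer.
using (var stream = File.Create("reconstruction.bin"))
using (var writer = new BinaryWriter(stream))
{
    layer.Serialize(writer);
}

var restored = new ReconstructionLayer<double>(32, 128, 256, 784); // same dimensions
using (var stream = File.OpenRead("reconstruction.bin"))
using (var reader = new BinaryReader(stream))
{
    restored.Deserialize(reader);
}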
SetParameters(Vector<T>)
Sets the trainable parameters of the reconstruction layer.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): A vector containing all parameters for all three fully connected layers.
Remarks
This method sets the trainable parameters of the reconstruction layer from a single vector. It extracts the appropriate portions of the vector for each of the three fully connected layers and sets their parameters accordingly. This is useful for loading saved model weights or for implementing optimization algorithms that operate on all parameters at once.
For Beginners: This method updates all the weights and biases in the reconstruction layer.
When setting parameters:
- The input must be a vector with the correct total length
- The method divides this vector into three parts, one for each internal layer
- Each internal layer gets its own specific section of parameters
This is useful for:
- Loading a previously saved model
- Transferring parameters from another model
- Testing different parameter values
An error is thrown if the input vector doesn't have the expected number of parameters.
Exceptions
- ArgumentException
Thrown when the parameters vector has incorrect length.
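For example, transferring trained weights between two identically shaped layers (sourceLayer and targetLayer are assumed to exist):

// Sketch: copy all weights and biases from one layer to another.
Vector<double> trained = sourceLayer.GetParameters();
targetLayer.SetParameters(trained); // throws ArgumentException if the length is wrong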
UpdateParameters(T)
Updates the parameters of the reconstruction layer using the calculated gradients.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate to use for the parameter updates.
Remarks
This method updates the parameters of all three fully connected layers based on the gradients calculated during the backward pass. The learning rate controls the size of the parameter updates. This method should be called after the backward pass to apply the calculated updates.
For Beginners: This method updates the layer's internal values during training.
When updating parameters:
- Each of the three internal layers updates its own weights and biases
- The learning rate controls how big each update step is
- All three layers are updated in a single call to this method
Think of it like each of the three artists adjusting their technique based on feedback:
- "My lines were too thick, I'll make them thinner next time"
- "I missed some details, I'll pay more attention to them"
- "My colors were off, I'll mix them differently"
Smaller learning rates mean slower but more stable learning, while larger learning rates mean faster but potentially unstable learning.
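Putting the pieces together, one training step might look like this sketch (ComputeLossGradient is a hypothetical helper standing in for your loss function's gradient):

// Sketch of a single training step.
Tensor<double> output = layer.Forward(input);
Tensor<double> lossGradient = ComputeLossGradient(output, target); // hypothetical helper
layer.Backward(lossGradient);   // computes gradients in all three sublayers
layer.UpdateParameters(0.01);   // applies them with a learning rate of 0.01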