Class ContinuumMemorySystemLayer<T>
Namespace: AiDotNet.NeuralNetworks.Layers
Assembly: AiDotNet.dll
Continuum Memory System (CMS) layer for neural networks. Implements a sequential chain of MLP blocks with different update frequencies. Based on Equations 30-31 from the "Nested Learning" paper: y_t = MLP^(f_k)(MLP^(f_k-1)(...MLP^(f_1)(x_t)))
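The nested chain in the equation above can be sketched in plain Python (a conceptual illustration, independent of AiDotNet; the single linear-plus-ReLU block below stands in for each MLP):

```python
def mlp_block(x, W, b):
    """One illustrative MLP block: ReLU(W @ x + b)."""
    h = [sum(W[i][j] * x[j] for j in range(len(x))) + b[i] for i in range(len(W))]
    return [max(0.0, v) for v in h]

def cms_forward(x, blocks):
    """Chain the blocks in order f1 -> f2 -> ... -> fk, as in Eq. 30-31."""
    for W, b in blocks:
        x = mlp_block(x, W, b)
    return x

# Two 2x2 blocks chained: identity block, then scale-and-shift block.
blocks = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[2.0, 0.0], [0.0, 2.0]], [1.0, -1.0]),
]
y = cms_forward([1.0, 2.0], blocks)  # [3.0, 3.0]
```

Each level in the real layer is a DenseLayer block; only the update schedule during training distinguishes the levels, not the forward chaining shown here.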
public class ContinuumMemorySystemLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type.
- Inheritance
- LayerBase<T> → ContinuumMemorySystemLayer<T>
- Implements
- ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Constructors
ContinuumMemorySystemLayer(int[], int, int, int[]?, T[]?, IEngine?)
Creates a CMS layer as a chain of MLP blocks.
public ContinuumMemorySystemLayer(int[] inputShape, int hiddenDim, int numFrequencyLevels = 3, int[]? updateFrequencies = null, T[]? learningRates = null, IEngine? engine = null)
Parameters
inputShape (int[]): Input shape.
hiddenDim (int): Hidden dimension for each MLP block.
numFrequencyLevels (int): Number of frequency levels (MLP blocks) in the chain. Defaults to 3.
updateFrequencies (int[]?): Update frequencies for each level (f1, f2, ..., fk).
learningRates (T[]?): Learning rates per level.
engine (IEngine?): The computation engine for vectorized operations. Defaults to CPU if not specified.
Properties
ChunkSizes
Gets the chunk sizes for gradient accumulation.
public int[] ChunkSizes { get; }
Property Value
- int[]
SupportsGpuExecution
Indicates whether this layer supports GPU execution. CMS supports GPU because it chains DenseLayer blocks which all support GPU.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
SupportsJitCompilation
Gets a value indicating whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
true, because ContinuumMemorySystemLayer is a chain of DenseLayer blocks, each of which supports JIT compilation. The update frequency logic is only used during training and does not affect inference.
SupportsTraining
Indicates whether this layer supports training. CMS always supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
UpdateFrequencies
Gets the update frequencies for each level.
public int[] UpdateFrequencies { get; }
Property Value
- int[]
Methods
Backward(Tensor<T>)
Performs the backward pass of the layer.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): The gradient of the loss with respect to the layer's output.
Returns
- Tensor<T>
The gradient of the loss with respect to the layer's input.
Remarks
This abstract method must be implemented by derived classes to define the backward pass of the layer. The backward pass propagates error gradients from the output of the layer back to its input, and calculates gradients for any trainable parameters.
For Beginners: This method is used during training to calculate how the layer's input should change to reduce errors.
During the backward pass:
- The layer receives information about how its output contributed to errors
- It calculates how its parameters should change to reduce errors
- It calculates how its input should change, which will be used by earlier layers
This is the core of how neural networks learn from their mistakes during training.
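For a chain of blocks like CMS, the backward pass visits the blocks in reverse order, using the inputs cached during the forward pass. A minimal plain-Python sketch (an illustration, not the library's implementation), using purely linear blocks y = W·x for simplicity:

```python
def linear_backward(grad_out, W, x):
    """Backward through y = W @ x: returns (grad wrt x, grad wrt W)."""
    grad_x = [sum(W[i][j] * grad_out[i] for i in range(len(W))) for j in range(len(x))]
    grad_W = [[grad_out[i] * x[j] for j in range(len(x))] for i in range(len(W))]
    return grad_x, grad_W

def chain_backward(grad_out, Ws, cached_inputs):
    """Propagate the output gradient through the chain, last block first."""
    per_block_grads = []
    g = grad_out
    for W, x in zip(reversed(Ws), reversed(cached_inputs)):
        g, gW = linear_backward(g, W, x)
        per_block_grads.append(gW)
    # Return input gradient plus per-block weight gradients in forward order.
    return g, list(reversed(per_block_grads))

# Two 1x1 blocks: x=1 -> h=2 -> y=6; output gradient is 1.0.
grad_in, grads = chain_backward([1.0], [[[2.0]], [[3.0]]], [[1.0], [2.0]])
```

The weight gradients computed here are what the CMS layer accumulates per level until that level's chunk counter triggers an update.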
ClearGradients()
Clears all accumulated gradients across all levels.
public override void ClearGradients()
ConsolidateMemory()
Consolidates memory from faster to slower levels. Transfers knowledge from lower-level (faster) MLPs to higher-level (slower) MLPs.
public void ConsolidateMemory()
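The exact transfer rule is not specified in this page. One plausible mechanism (an assumption for illustration only, not the library's actual rule) is an exponential-moving-average blend of each faster level's parameters into the next slower level:

```python
# Hypothetical consolidation sketch: `tau` and `consolidate` are illustrative
# names, not part of the AiDotNet API. levels[0] is the fastest level;
# knowledge cascades toward levels[-1], the slowest.
def consolidate(levels, tau=0.1):
    for fast, slow in zip(levels, levels[1:]):
        for i in range(len(slow)):
            slow[i] = (1.0 - tau) * slow[i] + tau * fast[i]
    return levels

levels = [[1.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
consolidate(levels, tau=0.5)
```

With tau = 0.5, the middle level absorbs half of the fastest level's values, and the slowest level then absorbs half of the (already updated) middle level's values, so knowledge attenuates as it moves toward slower memory.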
ExportComputationGraph(List<ComputationNode<T>>)
Exports the layer's computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): List to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the layer's operation.
Remarks
This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.
For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.
To support JIT compilation, a layer must:
- Implement this method to export its computation graph
- Set SupportsJitCompilation to true
- Use ComputationNode and TensorOperations to build the graph
All layers are required to implement this method, even if they set SupportsJitCompilation = false.
Forward(Tensor<T>)
Performs the forward pass of the layer.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to process.
Returns
- Tensor<T>
The output tensor after processing.
Remarks
This abstract method must be implemented by derived classes to define the forward pass of the layer. The forward pass transforms the input tensor according to the layer's operation and activation function.
For Beginners: This method processes your data through the layer.
The forward pass:
- Takes input data from the previous layer or the network input
- Applies the layer's specific transformation (like convolution or matrix multiplication)
- Applies any activation function
- Passes the result to the next layer
This is where the actual data processing happens during both training and prediction.
ForwardGpu(params IGpuTensor<T>[])
GPU-accelerated forward pass chaining through all MLP blocks. Each DenseLayer block handles its own GPU operations (GEMM, bias, activation). y_t = MLP^(f_k)(MLP^(f_k-1)(...MLP^(f_1)(x_t)))
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): GPU-resident input tensors (uses the first input).
Returns
- IGpuTensor<T>
GPU-resident output tensor after chaining through all MLP blocks.
GetMLPBlocks()
Gets the MLP blocks in the chain.
public DenseLayer<T>[] GetMLPBlocks()
Returns
- DenseLayer<T>[]
GetParameterGradients()
Gets the parameter gradients for all MLP blocks. Returns concatenated gradients from all levels.
public override Vector<T> GetParameterGradients()
Returns
- Vector<T>
GetParameters()
Gets all parameters from all MLP blocks in the chain. Returns a concatenated vector of all parameters from all levels.
public override Vector<T> GetParameters()
Returns
- Vector<T>
Concatenated parameter vector
ResetMemory()
Resets all MLP blocks in the chain.
public void ResetMemory()
ResetState()
Resets the state of the layer (required by LayerBase). Resets all MLP blocks and clears gradient accumulation.
public override void ResetState()
SetParameters(Vector<T>)
Sets all parameters for all MLP blocks in the chain. Distributes the parameter vector across all levels.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): Concatenated parameter vector.
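GetParameters and SetParameters are inverse operations over the per-level parameter blocks. A plain-Python sketch of that concatenate/distribute round trip (an illustration of the layout, not the Vector<T> implementation):

```python
def get_parameters(level_params):
    """Concatenate per-level parameter lists into one flat vector."""
    flat = []
    for p in level_params:
        flat.extend(p)
    return flat

def set_parameters(flat, sizes):
    """Distribute a flat vector back across levels of the given sizes."""
    out, offset = [], 0
    for n in sizes:
        out.append(flat[offset:offset + n])
        offset += n
    return out

# Three levels with 2, 1, and 3 parameters respectively.
levels = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
flat = get_parameters(levels)
restored = set_parameters(flat, [2, 1, 3])
```

Because the vector is a simple concatenation in level order, the per-level sizes must match between the get and set calls for the round trip to be lossless.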
UpdateParameters(T)
Updates parameters using the specified learning rate. This is a no-op for CMS because parameters are updated exclusively via UpdateLevelParameters when chunk counters trigger (i ≡ 0 mod C(ℓ)). Updating here would double-apply gradients.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): Learning rate (unused; each level has its own rate).
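The gating rule mentioned above, where level ℓ updates only when the chunk counter satisfies i ≡ 0 mod C(ℓ), can be sketched directly. The chunk sizes below are illustrative, not the layer's defaults:

```python
def should_update(step, chunk_size):
    """A level updates only on steps where step % C(level) == 0."""
    return step % chunk_size == 0

# Illustrative C(l) per level: level 0 updates fastest, level 2 slowest.
chunk_sizes = [1, 4, 16]
updates = [[s for s in range(1, 17) if should_update(s, c)] for c in chunk_sizes]
```

This is why UpdateParameters(T) is a no-op here: applying a shared learning rate on every step would double-apply gradients that the chunk-gated per-level updates already consume.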