Class ContinuumMemorySystemLayer<T>
Namespace: AiDotNet.NeuralNetworks.Layers
Assembly: AiDotNet.dll
Continuum Memory System (CMS) layer for neural networks. Implements a sequential chain of MLP blocks with different update frequencies. Based on Equations 30-31 from the "Nested Learning" paper: y_t = MLP^(f_k)(MLP^(f_k-1)(...MLP^(f_1)(x_t)))
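The nested chain in the equation above can be sketched in plain Python (a conceptual illustration, independent of AiDotNet; the single linear-plus-ReLU block below stands in for each MLP):

```python
def mlp_block(x, W, b):
    """One illustrative MLP block: ReLU(W @ x + b)."""
    h = [sum(W[i][j] * x[j] for j in range(len(x))) + b[i] for i in range(len(W))]
    return [max(0.0, v) for v in h]

def cms_forward(x, blocks):
    """Chain the blocks in order f1 -> f2 -> ... -> fk, as in Eq. 30-31."""
    for W, b in blocks:
        x = mlp_block(x, W, b)
    return x

# Two 2x2 blocks chained: identity block, then scale-and-shift block.
blocks = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[2.0, 0.0], [0.0, 2.0]], [1.0, -1.0]),
]
y = cms_forward([1.0, 2.0], blocks)  # [3.0, 3.0]
```

Each level in the real layer is a DenseLayer block; only the update schedule during training distinguishes the levels, not the forward chaining shown here.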
public class ContinuumMemorySystemLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type.
- Inheritance
- LayerBase<T> → ContinuumMemorySystemLayer<T>
- Implements
- ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Constructors
ContinuumMemorySystemLayer(int[], int, int, int[]?, T[]?, IEngine?)
Creates a CMS layer as a chain of MLP blocks.
public ContinuumMemorySystemLayer(int[] inputShape, int hiddenDim, int numFrequencyLevels = 3, int[]? updateFrequencies = null, T[]? learningRates = null, IEngine? engine = null)
Parameters
inputShape (int[]): Input shape.
hiddenDim (int): Hidden dimension for each MLP block.
numFrequencyLevels (int): Number of frequency levels (MLP blocks) in the chain. Defaults to 3.
updateFrequencies (int[]?): Update frequencies for each level (f1, f2, ..., fk).
learningRates (T[]?): Learning rates per level.
engine (IEngine?): The computation engine for vectorized operations. Defaults to CPU if not specified.
Properties
ChunkSizes
Gets the chunk sizes for gradient accumulation.
public int[] ChunkSizes { get; }
Property Value
- int[]
SupportsGpuExecution
Indicates whether this layer supports GPU execution. CMS supports GPU because it chains DenseLayer blocks which all support GPU.
protected override bool SupportsGpuExecution { get; }
Property Value
- bool
SupportsJitCompilation
Gets a value indicating whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
true, because ContinuumMemorySystemLayer is a chain of DenseLayer blocks, each of which supports JIT compilation. The update frequency logic is only used during training and does not affect inference.
SupportsTraining
Indicates whether this layer supports training. CMS always supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
UpdateFrequencies
Gets the update frequencies for each level.
public int[] UpdateFrequencies { get; }
Property Value
- int[]
Methods
Backward(Tensor<T>)
Performs the backward pass of the layer.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): The gradient of the loss with respect to the layer's output.
Returns
- Tensor<T>
The gradient of the loss with respect to the layer's input.
Remarks
This abstract method must be implemented by derived classes to define the backward pass of the layer. The backward pass propagates error gradients from the output of the layer back to its input, and calculates gradients for any trainable parameters.
For Beginners: This method is used during training to calculate how the layer's input should change to reduce errors.
During the backward pass:
- The layer receives information about how its output contributed to errors
- It calculates how its parameters should change to reduce errors
- It calculates how its input should change, which will be used by earlier layers
This is the core of how neural networks learn from their mistakes during training.
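For a chain of blocks like CMS, the backward pass visits the blocks in reverse order, using the inputs cached during the forward pass. A minimal plain-Python sketch (an illustration, not the library's implementation), using purely linear blocks y = W·x for simplicity:

```python
def linear_backward(grad_out, W, x):
    """Backward through y = W @ x: returns (grad wrt x, grad wrt W)."""
    grad_x = [sum(W[i][j] * grad_out[i] for i in range(len(W))) for j in range(len(x))]
    grad_W = [[grad_out[i] * x[j] for j in range(len(x))] for i in range(len(W))]
    return grad_x, grad_W

def chain_backward(grad_out, Ws, cached_inputs):
    """Propagate the output gradient through the chain, last block first."""
    per_block_grads = []
    g = grad_out
    for W, x in zip(reversed(Ws), reversed(cached_inputs)):
        g, gW = linear_backward(g, W, x)
        per_block_grads.append(gW)
    # Return input gradient plus per-block weight gradients in forward order.
    return g, list(reversed(per_block_grads))

# Two 1x1 blocks: x=1 -> h=2 -> y=6; output gradient is 1.0.
grad_in, grads = chain_backward([1.0], [[[2.0]], [[3.0]]], [[1.0], [2.0]])
```

The weight gradients computed here are what the CMS layer accumulates per level until that level's chunk counter triggers an update.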
ClearGradients()
Clears all accumulated gradients across all levels.
public override void ClearGradients()
ConsolidateMemory()
Consolidates memory from faster to slower levels. Transfers knowledge from lower-level (faster) MLPs to higher-level (slower) MLPs.
public void ConsolidateMemory()
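The exact transfer rule is not specified in this page. One plausible mechanism (an assumption for illustration only, not the library's actual rule) is an exponential-moving-average blend of each faster level's parameters into the next slower level:

```python
# Hypothetical consolidation sketch: `tau` and `consolidate` are illustrative
# names, not part of the AiDotNet API. levels[0] is the fastest level;
# knowledge cascades toward levels[-1], the slowest.
def consolidate(levels, tau=0.1):
    for fast, slow in zip(levels, levels[1:]):
        for i in range(len(slow)):
            slow[i] = (1.0 - tau) * slow[i] + tau * fast[i]
    return levels

levels = [[1.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
consolidate(levels, tau=0.5)
```

With tau = 0.5, the middle level absorbs half of the fastest level's values, and the slowest level then absorbs half of the (already updated) middle level's values, so knowledge attenuates as it moves toward slower memory.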
ExportComputationGraph(List<ComputationNode<T>>)
Exports the layer's computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): List to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the layer's operation.
Remarks
This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.
For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.
To support JIT compilation, a layer must:
- Implement this method to export its computation graph
- Set SupportsJitCompilation to true
- Use ComputationNode and TensorOperations to build the graph
All layers are required to implement this method, even if they set SupportsJitCompilation = false.
Forward(Tensor<T>)
Performs the forward pass of the layer.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to process.
Returns
- Tensor<T>
The output tensor after processing.
Remarks
This abstract method must be implemented by derived classes to define the forward pass of the layer. The forward pass transforms the input tensor according to the layer's operation and activation function.
For Beginners: This method processes your data through the layer.
The forward pass:
- Takes input data from the previous layer or the network input
- Applies the layer's specific transformation (like convolution or matrix multiplication)
- Applies any activation function
- Passes the result to the next layer
This is where the actual data processing happens during both training and prediction.
ForwardGpu(params IGpuTensor<T>[])
GPU-accelerated forward pass chaining through all MLP blocks. Each DenseLayer block handles its own GPU operations (GEMM, bias, activation). y_t = MLP^(f_k)(MLP^(f_k-1)(...MLP^(f_1)(x_t)))
public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)
Parameters
inputs (IGpuTensor<T>[]): GPU-resident input tensors (uses the first input).
Returns
- IGpuTensor<T>
GPU-resident output tensor after chaining through all MLP blocks.
GetMLPBlocks()
Gets the MLP blocks in the chain.
public DenseLayer<T>[] GetMLPBlocks()
Returns
- DenseLayer<T>[]
GetParameterGradients()
Gets the parameter gradients for all MLP blocks. Returns concatenated gradients from all levels.
public override Vector<T> GetParameterGradients()
Returns
- Vector<T>
GetParameters()
Gets all parameters from all MLP blocks in the chain. Returns a concatenated vector of all parameters from all levels.
public override Vector<T> GetParameters()
Returns
- Vector<T>
Concatenated parameter vector
ResetMemory()
Resets all MLP blocks in the chain.
public void ResetMemory()
ResetState()
Resets the state of the layer (required by LayerBase). Resets all MLP blocks and clears gradient accumulation.
public override void ResetState()
SetParameters(Vector<T>)
Sets all parameters for all MLP blocks in the chain. Distributes the parameter vector across all levels.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): Concatenated parameter vector.
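GetParameters and SetParameters are inverse operations over the per-level parameter blocks. A plain-Python sketch of that concatenate/distribute round trip (an illustration of the layout, not the Vector<T> implementation):

```python
def get_parameters(level_params):
    """Concatenate per-level parameter lists into one flat vector."""
    flat = []
    for p in level_params:
        flat.extend(p)
    return flat

def set_parameters(flat, sizes):
    """Distribute a flat vector back across levels of the given sizes."""
    out, offset = [], 0
    for n in sizes:
        out.append(flat[offset:offset + n])
        offset += n
    return out

# Three levels with 2, 1, and 3 parameters respectively.
levels = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
flat = get_parameters(levels)
restored = set_parameters(flat, [2, 1, 3])
```

Because the vector is a simple concatenation in level order, the per-level sizes must match between the get and set calls for the round trip to be lossless.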
UpdateParameters(T)
Updates parameters using the specified learning rate. This is a no-op for CMS because parameters are updated exclusively via UpdateLevelParameters when chunk counters trigger (i ≡ 0 mod C(ℓ)). Updating here would double-apply gradients.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): Learning rate (unused; each level has its own rate).
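The gating rule mentioned above, where level ℓ updates only when the chunk counter satisfies i ≡ 0 mod C(ℓ), can be sketched directly. The chunk sizes below are illustrative, not the layer's defaults:

```python
def should_update(step, chunk_size):
    """A level updates only on steps where step % C(level) == 0."""
    return step % chunk_size == 0

# Illustrative C(l) per level: level 0 updates fastest, level 2 slowest.
chunk_sizes = [1, 4, 16]
updates = [[s for s in range(1, 17) if should_update(s, c)] for c in chunk_sizes]
```

This is why UpdateParameters(T) is a no-op here: applying a shared learning rate on every step would double-apply gradients that the chunk-gated per-level updates already consume.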