Class MomentumEncoder<T>

Namespace: AiDotNet.SelfSupervisedLearning

Assembly: AiDotNet.dll

Momentum-updated encoder for self-supervised learning methods.

public class MomentumEncoder<T> : IMomentumEncoder<T>

Type Parameters

T: The numeric type used for computations (typically float or double).

Inheritance: object

MomentumEncoder<T>

Implements: IMomentumEncoder<T>

Inherited Members: object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Remarks

For Beginners: A momentum encoder is a copy of the main encoder that updates more slowly using exponential moving average (EMA). This provides stable, consistent targets during self-supervised training.

Update formula:

θ_momentum = m * θ_momentum + (1 - m) * θ_main

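Here m is the momentum coefficient (typically 0.99 to 0.9999), θ_momentum denotes the momentum encoder's parameters, and θ_main denotes the main encoder's parameters.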
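The update formula above can be sketched on plain arrays. This is a minimal illustration, not the library's implementation: the helper EmaStep and the toy parameter arrays are hypothetical names introduced here, and a small momentum of 0.9 is used only so the effect is visible within a few steps.

```csharp
using System;

// Hypothetical helper (not part of AiDotNet): applies one EMA step in place,
// target[i] = m * target[i] + (1 - m) * online[i].
static void EmaStep(double[] target, double[] online, double m)
{
    if (target.Length != online.Length)
        throw new ArgumentException("Parameter vectors must have the same length.");
    if (m < 0.0 || m > 1.0)
        throw new ArgumentOutOfRangeException(nameof(m), "Momentum must lie in [0, 1].");

    for (int i = 0; i < target.Length; i++)
        target[i] = m * target[i] + (1.0 - m) * online[i];
}

// Toy parameters: the momentum copy starts at 0, the main encoder sits at 1.
double[] momentumParams = { 0.0, 0.0 };
double[] mainParams = { 1.0, 1.0 };

// With m = 0.9, each step closes 10% of the remaining gap to the main encoder.
for (int step = 1; step <= 3; step++)
{
    EmaStep(momentumParams, mainParams, 0.9);
    Console.WriteLine($"step {step}: {momentumParams[0]:F3}");
}
```

With the main parameters held fixed, the remaining gap after n steps is m^n times the initial gap, which is why values of m close to 1 make the momentum encoder move so slowly.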

Why slow updates?

Provides stable targets that do not change rapidly between training steps
Helps prevent representation collapse in methods like BYOL
Keeps the embeddings stored in a memory bank consistent with each other (MoCo)

Example usage:

// Initialize with copy of main encoder
var momentumEncoder = new MomentumEncoder<float>(mainEncoder.Clone(), momentum: 0.999);

// Training loop:
var targets = momentumEncoder.Encode(augmentedBatch2);  // Get targets
// ... compute loss with main encoder output ...
momentumEncoder.UpdateFromMainEncoder(mainEncoder);  // EMA update

Constructors

MomentumEncoder(INeuralNetwork<T>, double)

Initializes a new instance of the MomentumEncoder class.

public MomentumEncoder(INeuralNetwork<T> encoder, double momentum = 0.999)

Parameters

encoder INeuralNetwork<T>: The encoder network (should be a copy/clone of the main encoder).
momentum double: Initial momentum coefficient (0-1, typically 0.99-0.9999).

Properties

Encoder

Gets the underlying momentum-updated encoder network.

public INeuralNetwork<T> Encoder { get; }

Property Value

INeuralNetwork<T>

Momentum

Gets the momentum coefficient for EMA updates.

public double Momentum { get; }

Property Value

double

Remarks

Typical values: 0.99 to 0.9999. Higher values give slower updates and therefore more stable targets.

MoCo uses 0.999, BYOL uses 0.996 → 1.0 (scheduled).
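One way to get a feel for these values is to convert a momentum coefficient into a half-life: the number of EMA steps after which an old parameter value's weight has decayed to one half. The helper below is a hypothetical illustration, not part of the library.

```csharp
using System;

// Hypothetical helper: the n that solves m^n = 0.5, i.e. the number of EMA
// steps after which an old parameter value contributes half its weight.
static double HalfLifeInSteps(double m)
{
    if (m <= 0.0 || m >= 1.0)
        throw new ArgumentOutOfRangeException(nameof(m), "Momentum must lie strictly between 0 and 1.");
    return Math.Log(0.5) / Math.Log(m);
}

foreach (double m in new[] { 0.99, 0.996, 0.999, 0.9999 })
{
    Console.WriteLine($"m = {m}: half-life of about {HalfLifeInSteps(m):F0} steps");
}
```

The half-life grows from roughly 69 steps at 0.99 to roughly 6,900 steps at 0.9999, so a change in the third or fourth decimal place has a large effect on how quickly the targets move.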

Methods

CopyFromMainEncoder(INeuralNetwork<T>)

Copies all parameters from the main encoder (hard copy, not EMA).

public void CopyFromMainEncoder(INeuralNetwork<T> mainEncoder)

Parameters

mainEncoder INeuralNetwork<T>: The encoder to copy from.

Remarks

Typically used once at the start of training so that both encoders begin with identical parameters.

Create<TEncoder>(TEncoder, double, Func<TEncoder, TEncoder>)

Creates a momentum encoder from a main encoder by cloning.

public static MomentumEncoder<T> Create<TEncoder>(TEncoder mainEncoder, double momentum, Func<TEncoder, TEncoder> cloneFunc) where TEncoder : INeuralNetwork<T>

Parameters

mainEncoder TEncoder: The main encoder to clone.
momentum double: Initial momentum coefficient.
cloneFunc Func<TEncoder, TEncoder>: Function to clone the encoder.

Returns

MomentumEncoder<T>: A new momentum encoder wrapping a cloned encoder.

Type Parameters

TEncoder: The specific encoder type.
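The clone function matters because the momentum encoder needs its own parameter storage. The sketch below uses plain arrays as a stand-in for an encoder (an assumption made purely for illustration; real encoders are INeuralNetwork<T> instances) to contrast a shallow copy with a deep copy.

```csharp
using System;

// Toy stand-in: an "encoder" is just its weight array here.
double[] mainWeights = { 1.0, 2.0, 3.0 };

// A shallow "clone" hands back the same array; a deep clone allocates new storage.
Func<double[], double[]> shallowClone = w => w;
Func<double[], double[]> deepClone = w => (double[])w.Clone();

double[] aliased = shallowClone(mainWeights);
double[] independent = deepClone(mainWeights);

// Simulate an optimizer step that changes the main encoder's weights.
mainWeights[0] = 10.0;

Console.WriteLine(aliased[0]);      // shares storage, so it moved with the main encoder
Console.WriteLine(independent[0]);  // keeps its old value until an EMA update is applied
```

If cloneFunc behaves like the shallow variant, the momentum encoder simply mirrors the main encoder and the EMA update has no effect, so a deep copy is the safer choice.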

Encode(Tensor<T>)

Encodes input using the momentum encoder (no gradient computation).

public Tensor<T> Encode(Tensor<T> input)

Parameters

input Tensor<T>: The input tensor to encode.

Returns

Tensor<T>: The encoded representation (detached from computation graph).

Remarks

For Beginners: The momentum encoder output is treated as a fixed target - gradients don't flow through it. Only the main encoder is trained directly.

GetParameters()

Gets all parameters of the momentum encoder.

public Vector<T> GetParameters()

Returns

Vector<T>: A vector containing all parameters.

ScheduleMomentum(double, double, int, int)

Computes the scheduled momentum value based on training progress.

public static double ScheduleMomentum(double baseMomentum, double finalMomentum, int currentEpoch, int totalEpochs)

Parameters

baseMomentum double: Starting momentum value.
finalMomentum double: Final momentum value.
currentEpoch int: Current training epoch.
totalEpochs int: Total training epochs.

Returns

double: The scheduled momentum value.

Remarks

For Beginners: Some methods like BYOL schedule momentum to increase during training. This typically uses a cosine schedule from base to final momentum.
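A cosine schedule of this kind can be sketched as follows. CosineMomentum is a hypothetical re-implementation written for illustration; the actual ScheduleMomentum may differ in details such as clamping or how epochs are counted.

```csharp
using System;

// Hypothetical cosine momentum schedule (an assumption, not the library's code):
// m = final - (final - base) * (1 + cos(pi * progress)) / 2
static double CosineMomentum(double baseMomentum, double finalMomentum, int currentEpoch, int totalEpochs)
{
    if (totalEpochs <= 0)
        throw new ArgumentOutOfRangeException(nameof(totalEpochs), "Total epochs must be positive.");

    // Training progress in [0, 1]; clamped so out-of-range epochs stay at the endpoints.
    double progress = Math.Clamp((double)currentEpoch / totalEpochs, 0.0, 1.0);

    // The cosine factor falls smoothly from 1 at the start to 0 at the end.
    double cosine = 0.5 * (1.0 + Math.Cos(Math.PI * progress));
    return finalMomentum - (finalMomentum - baseMomentum) * cosine;
}

for (int epoch = 0; epoch <= 100; epoch += 25)
{
    Console.WriteLine($"epoch {epoch}: m = {CosineMomentum(0.996, 1.0, epoch, 100):F5}");
}
```

The schedule returns the base momentum at epoch 0 and the final momentum at the last epoch, changing most quickly around the midpoint. In a training loop, the value returned by ScheduleMomentum would be passed to SetMomentum once per epoch.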

SetMomentum(double)

Sets the momentum coefficient.

public void SetMomentum(double momentum)

Parameters

momentum double: New momentum value (0-1, typically 0.99-0.9999).

Remarks

Some methods schedule momentum during training (e.g., BYOL increases from 0.996 to 1.0).

SetParameters(Vector<T>)

Sets the parameters of the momentum encoder directly.

public void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>: The parameters to set.

UpdateFromMainEncoder(INeuralNetwork<T>)

Updates the momentum encoder parameters using EMA from the main encoder.

public void UpdateFromMainEncoder(INeuralNetwork<T> mainEncoder)

Parameters

mainEncoder INeuralNetwork<T>: The main encoder whose current parameters are blended into the momentum encoder.

Remarks

For Beginners: Call this after each training step, once the main encoder's parameters have been updated. The momentum encoder slowly tracks the main encoder, providing stable targets.

Update formula:

θ_momentum = m * θ_momentum + (1 - m) * θ_main

UpdateFromParameters(Vector<T>)

Updates the momentum encoder parameters using EMA from parameter vectors.

public void UpdateFromParameters(Vector<T> mainEncoderParams)

Parameters

mainEncoderParams Vector<T>: Parameters from the main encoder.
