
Class MomentumEncoder<T>

Namespace
AiDotNet.SelfSupervisedLearning
Assembly
AiDotNet.dll

Momentum-updated encoder for self-supervised learning methods.

public class MomentumEncoder<T> : IMomentumEncoder<T>

Type Parameters

T

The numeric type used for computations (typically float or double).

Inheritance
MomentumEncoder<T>
Implements
IMomentumEncoder<T>

Remarks

For Beginners: A momentum encoder is a copy of the main encoder that is updated more slowly, using an exponential moving average (EMA) of the main encoder's parameters. This provides stable, consistent targets during self-supervised training.

Update formula:

θ_momentum = m * θ_momentum + (1 - m) * θ_main

where m is the momentum coefficient (typically 0.99-0.9999). Values closer to 1 make the momentum encoder change more slowly.

Why slow updates?

  • Provides targets that change slowly from one training step to the next
  • Helps prevent representation collapse in methods like BYOL
  • Keeps the embeddings stored in the memory bank consistent with each other (MoCo)
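To make "slow" concrete, the update formula above can be unrolled over t steps. This is a short derivation rather than anything specific to this class; the step superscripts (k), (t) are notation introduced here for illustration.

```latex
\theta_{\text{momentum}}^{(t)}
  = m^{t}\,\theta_{\text{momentum}}^{(0)}
  + (1 - m)\sum_{k=1}^{t} m^{\,t-k}\,\theta_{\text{main}}^{(k)},
\qquad
m^{t} + (1 - m)\sum_{k=1}^{t} m^{\,t-k} = 1 .
```

The momentum encoder is therefore a weighted average of past main-encoder parameters, with weights that decay geometrically with age. The effective averaging window is roughly 1 / (1 - m) steps: about 100 steps for m = 0.99, 1,000 for m = 0.999, and 10,000 for m = 0.9999.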

Example usage:

// Initialize with copy of main encoder
var momentumEncoder = new MomentumEncoder<float>(mainEncoder.Clone(), momentum: 0.999);

// Training loop:
var targets = momentumEncoder.Encode(augmentedBatch2);  // Get targets
// ... compute loss with main encoder output ...
momentumEncoder.UpdateFromMainEncoder(mainEncoder);  // EMA update

Constructors

MomentumEncoder(INeuralNetwork<T>, double)

Initializes a new instance of the MomentumEncoder class.

public MomentumEncoder(INeuralNetwork<T> encoder, double momentum = 0.999)

Parameters

encoder INeuralNetwork<T>

The encoder network (should be a copy/clone of the main encoder).

momentum double

Initial momentum coefficient (0-1, typically 0.99-0.9999).

Properties

Encoder

Gets the underlying momentum-updated encoder network.

public INeuralNetwork<T> Encoder { get; }

Property Value

INeuralNetwork<T>

Momentum

Gets the momentum coefficient for EMA updates.

public double Momentum { get; }

Property Value

double

Remarks

Typical values are 0.99-0.9999. Higher values give slower updates and therefore more stable targets.

MoCo uses 0.999; BYOL schedules momentum from 0.996 up to 1.0 over training.

Methods

CopyFromMainEncoder(INeuralNetwork<T>)

Copies all parameters from the main encoder (hard copy, not EMA).

public void CopyFromMainEncoder(INeuralNetwork<T> mainEncoder)

Parameters

mainEncoder INeuralNetwork<T>

The encoder to copy from.

Remarks

Typically used once at the start of training so that both encoders begin with identical parameters.

Create<TEncoder>(TEncoder, double, Func<TEncoder, TEncoder>)

Creates a momentum encoder from a main encoder by cloning.

public static MomentumEncoder<T> Create<TEncoder>(TEncoder mainEncoder, double momentum, Func<TEncoder, TEncoder> cloneFunc) where TEncoder : INeuralNetwork<T>

Parameters

mainEncoder TEncoder

The main encoder to clone.

momentum double

Initial momentum coefficient.

cloneFunc Func<TEncoder, TEncoder>

Function to clone the encoder.

Returns

MomentumEncoder<T>

A new momentum encoder wrapping a cloned encoder.

Type Parameters

TEncoder

The specific encoder type.

Encode(Tensor<T>)

Encodes input using the momentum encoder (no gradient computation).

public Tensor<T> Encode(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor to encode.

Returns

Tensor<T>

The encoded representation (detached from computation graph).

Remarks

For Beginners: The momentum encoder output is treated as a fixed target - gradients don't flow through it. Only the main encoder is trained directly.

GetParameters()

Gets all parameters of the momentum encoder.

public Vector<T> GetParameters()

Returns

Vector<T>

A vector containing all parameters.

ScheduleMomentum(double, double, int, int)

Computes the scheduled momentum value based on training progress.

public static double ScheduleMomentum(double baseMomentum, double finalMomentum, int currentEpoch, int totalEpochs)

Parameters

baseMomentum double

Starting momentum value.

finalMomentum double

Final momentum value.

currentEpoch int

Current training epoch.

totalEpochs int

Total training epochs.

Returns

double

The scheduled momentum value.

Remarks

For Beginners: Some methods like BYOL schedule momentum to increase during training. This typically uses a cosine schedule from base to final momentum.
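The page does not give the exact formula that ScheduleMomentum uses. As a rough illustration only, the sketch below assumes one common cosine form that moves from the base value to the final value as training progresses. The class and method names (MomentumScheduleSketch, CosineMomentum) are hypothetical and not part of AiDotNet.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet: one common cosine momentum schedule.
public static class MomentumScheduleSketch
{
    // Returns baseMomentum at epoch 0 and finalMomentum at epoch totalEpochs,
    // following half a cosine wave in between.
    public static double CosineMomentum(
        double baseMomentum, double finalMomentum, int currentEpoch, int totalEpochs)
    {
        if (totalEpochs <= 0)
            throw new ArgumentOutOfRangeException(nameof(totalEpochs));

        // Fraction of training completed, clamped to [0, 1].
        double progress = Math.Clamp((double)currentEpoch / totalEpochs, 0.0, 1.0);

        // Cosine factor: 1 at the start of training, 0 at the end.
        double cosine = 0.5 * (1.0 + Math.Cos(Math.PI * progress));

        return finalMomentum - (finalMomentum - baseMomentum) * cosine;
    }
}
```

With baseMomentum = 0.996 and finalMomentum = 1.0, this form stays close to 0.996 early in training, passes about 0.998 at the halfway point, and reaches 1.0 at the end. The value for each epoch could then be passed to SetMomentum.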

SetMomentum(double)

Sets the momentum coefficient.

public void SetMomentum(double momentum)

Parameters

momentum double

New momentum value (0-1, typically 0.99-0.9999).

Remarks

Some methods schedule momentum during training (e.g., BYOL increases from 0.996 to 1.0).

SetParameters(Vector<T>)

Sets the parameters of the momentum encoder directly.

public void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

The parameters to set.

UpdateFromMainEncoder(INeuralNetwork<T>)

Updates the momentum encoder parameters using EMA from the main encoder.

public void UpdateFromMainEncoder(INeuralNetwork<T> mainEncoder)

Parameters

mainEncoder INeuralNetwork<T>

The main encoder whose parameters are blended into the momentum encoder via EMA.

Remarks

For Beginners: Call this after each training step. The momentum encoder slowly tracks the main encoder, providing stable targets.

Update formula:

θ_momentum = m * θ_momentum + (1 - m) * θ_main
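The formula can be sketched on plain arrays to show what one update step does. This is a minimal, standalone illustration, not the library's implementation: it assumes the parameters are available as flat double arrays, and the names EmaSketch and EmaUpdate are hypothetical.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet: EMA update on flat parameter arrays.
public static class EmaSketch
{
    // In-place update: target[i] = m * target[i] + (1 - m) * source[i].
    // "target" plays the role of the momentum encoder, "source" the main encoder.
    public static void EmaUpdate(double[] target, double[] source, double momentum)
    {
        if (target.Length != source.Length)
            throw new ArgumentException("Parameter arrays must have the same length.");
        if (momentum < 0.0 || momentum > 1.0)
            throw new ArgumentOutOfRangeException(nameof(momentum), "Momentum must be in [0, 1].");

        for (int i = 0; i < target.Length; i++)
        {
            target[i] = momentum * target[i] + (1.0 - momentum) * source[i];
        }
    }
}
```

Two limiting cases are worth noting: with momentum = 0 the update reduces to a hard copy of the source parameters, and with momentum = 1 the target never changes.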

UpdateFromParameters(Vector<T>)

Updates the momentum encoder parameters using EMA from a vector of main encoder parameters.

public void UpdateFromParameters(Vector<T> mainEncoderParams)

Parameters

mainEncoderParams Vector<T>

Parameters from the main encoder.