Interface IMomentumEncoder<T>

Namespace
AiDotNet.SelfSupervisedLearning
Assembly
AiDotNet.dll

Defines the contract for momentum-updated encoders used in self-supervised learning (SSL) methods.

public interface IMomentumEncoder<T>

Type Parameters

T

The numeric type used for computations (typically float or double).

Remarks

For Beginners: A momentum encoder is a copy of the main encoder that is updated more slowly, using an exponential moving average (EMA) of the main encoder's parameters. This provides stable, consistent targets during training.

How it works:

momentum_encoder_param = m * momentum_encoder_param + (1 - m) * encoder_param

Here m is the momentum coefficient, typically 0.99-0.9999, so each update moves the momentum encoder only slightly toward the main encoder.
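The formula above can be sketched as a single in-place update. This is a minimal illustration, not the library's implementation: it works on plain double[] arrays rather than Vector<T>, and the names EmaSketch and EmaUpdate are hypothetical.

```csharp
using System;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class EmaSketch
{
    // One EMA step, applied element-wise and in place:
    // momentumParams[i] = m * momentumParams[i] + (1 - m) * mainParams[i]
    public static void EmaUpdate(double[] momentumParams, double[] mainParams, double m)
    {
        if (momentumParams.Length != mainParams.Length)
            throw new ArgumentException("Parameter arrays must have the same length.");
        if (m < 0.0 || m > 1.0)
            throw new ArgumentOutOfRangeException(nameof(m), "Momentum must lie in [0, 1].");

        for (int i = 0; i < momentumParams.Length; i++)
        {
            momentumParams[i] = m * momentumParams[i] + (1.0 - m) * mainParams[i];
        }
    }
}
```

For example, with m = 0.99, a momentum-encoder value of 1.0 and a main-encoder value of 2.0 give 0.99 * 1.0 + 0.01 * 2.0 = 1.01, so the momentum encoder moves only 1% of the way toward the main encoder in one step.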

Why use a momentum encoder?

  • Provides stable targets that don't change rapidly
  • Helps prevent representation collapse in methods like BYOL
  • Ensures consistent negative embeddings in MoCo

Used by: MoCo, MoCo v2, MoCo v3, BYOL, DINO

Not used by: SimCLR, SimSiam (uses stop-gradient instead), Barlow Twins

Properties

Encoder

Gets the underlying momentum-updated encoder network.

INeuralNetwork<T> Encoder { get; }

Property Value

INeuralNetwork<T>

Momentum

Gets the momentum coefficient for EMA updates.

double Momentum { get; }

Property Value

double

Remarks

Typical values: 0.99-0.9999. Higher values give slower updates and therefore more stable targets.

MoCo uses 0.999; BYOL schedules the value from 0.996 up to 1.0 over the course of training.
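To make "slower updates" concrete, here is a short derivation under one simplifying assumption that is not part of the interface: the main encoder's parameters stay fixed at θ_main while the EMA update is applied n times. In real training the main encoder moves every step, so treat this as a rough intuition only.

```latex
\theta^{(n)}_{\text{momentum}} - \theta_{\text{main}}
  = m \left( \theta^{(n-1)}_{\text{momentum}} - \theta_{\text{main}} \right)
  = m^{n} \left( \theta^{(0)}_{\text{momentum}} - \theta_{\text{main}} \right),
\qquad
n_{1/2} = \frac{\ln (1/2)}{\ln m} \approx \frac{\ln 2}{1 - m} \quad (m \to 1)
```

Under that assumption the gap halves after roughly 69 updates for m = 0.99 and roughly 693 updates for m = 0.999.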

Methods

CopyFromMainEncoder(INeuralNetwork<T>)

Copies all parameters from the main encoder (hard copy, not EMA).

void CopyFromMainEncoder(INeuralNetwork<T> mainEncoder)

Parameters

mainEncoder INeuralNetwork<T>

The encoder to copy from.

Remarks

Used for initialization at the start of training.

Encode(Tensor<T>)

Encodes input using the momentum encoder (no gradient computation).

Tensor<T> Encode(Tensor<T> input)

Parameters

input Tensor<T>

The input tensor to encode.

Returns

Tensor<T>

The encoded representation (detached from computation graph).

Remarks

For Beginners: The momentum encoder output is treated as a fixed target - gradients don't flow through it. Only the main encoder is trained directly.

GetParameters()

Gets all parameters of the momentum encoder.

Vector<T> GetParameters()

Returns

Vector<T>

A vector containing all parameters.

SetMomentum(double)

Sets the momentum coefficient.

void SetMomentum(double momentum)

Parameters

momentum double

New momentum value, between 0 and 1 (typically 0.99-0.9999).

Remarks

Some methods schedule momentum during training (e.g., BYOL increases from 0.996 to 1.0).
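As an illustration of such a schedule, the sketch below computes the cosine schedule described in the BYOL paper, which rises from a base value to 1.0 over a fixed number of steps. Whether this library ships a built-in scheduler is not stated here; MomentumScheduleSketch and CosineMomentum are hypothetical names.

```csharp
using System;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class MomentumScheduleSketch
{
    // Cosine schedule: m = 1 - (1 - baseMomentum) * (cos(pi * k / K) + 1) / 2
    // Returns baseMomentum at step 0 and 1.0 at step == totalSteps.
    public static double CosineMomentum(int step, int totalSteps, double baseMomentum = 0.996)
    {
        if (totalSteps <= 0)
            throw new ArgumentOutOfRangeException(nameof(totalSteps), "Must be positive.");

        // Clamp the step so out-of-range values do not leave the schedule.
        int k = Math.Max(0, Math.Min(step, totalSteps));
        double cosine = (Math.Cos(Math.PI * k / totalSteps) + 1.0) / 2.0;
        return 1.0 - (1.0 - baseMomentum) * cosine;
    }
}
```

The result would typically be passed to SetMomentum once per training step, before the EMA update for that step.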

SetParameters(Vector<T>)

Sets the parameters of the momentum encoder directly.

void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

The parameters to set.

UpdateFromMainEncoder(INeuralNetwork<T>)

Updates the momentum encoder parameters using EMA from the main encoder.

void UpdateFromMainEncoder(INeuralNetwork<T> mainEncoder)

Parameters

mainEncoder INeuralNetwork<T>

The main encoder whose parameters are blended into the momentum encoder.

Remarks

For Beginners: Call this after each training step. The momentum encoder slowly tracks the main encoder, providing stable targets.

Update formula:

θ_momentum = m * θ_momentum + (1 - m) * θ_main
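The call order matters: a hard copy once at initialization, then one EMA update after every training step. The toy sketch below shows that order using plain double[] arrays. ToyMomentumEncoder and ToyTrainingLoop are hypothetical stand-ins, and the "optimizer step" is simulated by adding 1.0 to a single parameter.

```csharp
using System;

// Hypothetical stand-in that stores raw doubles instead of an INeuralNetwork<T>.
public sealed class ToyMomentumEncoder
{
    private readonly double[] _parameters;

    public double Momentum { get; private set; }

    public ToyMomentumEncoder(int parameterCount, double momentum)
    {
        _parameters = new double[parameterCount];
        SetMomentum(momentum);
    }

    public void SetMomentum(double momentum)
    {
        if (momentum < 0.0 || momentum > 1.0)
            throw new ArgumentOutOfRangeException(nameof(momentum), "Momentum must lie in [0, 1].");
        Momentum = momentum;
    }

    // Hard copy, used once before training starts.
    public void CopyFrom(double[] mainParams)
    {
        CheckLength(mainParams);
        Array.Copy(mainParams, _parameters, _parameters.Length);
    }

    // EMA update, used after every training step.
    public void UpdateFrom(double[] mainParams)
    {
        CheckLength(mainParams);
        for (int i = 0; i < _parameters.Length; i++)
        {
            _parameters[i] = Momentum * _parameters[i] + (1.0 - Momentum) * mainParams[i];
        }
    }

    // Returns a copy so callers cannot mutate the internal state.
    public double[] GetParameters() => (double[])_parameters.Clone();

    private void CheckLength(double[] mainParams)
    {
        if (mainParams.Length != _parameters.Length)
            throw new ArgumentException("Parameter arrays must have the same length.");
    }
}

public static class ToyTrainingLoop
{
    // Returns { final main parameter, final momentum parameter }.
    public static double[] Run(int steps, double momentum = 0.9)
    {
        double[] mainParams = { 0.0 };
        var target = new ToyMomentumEncoder(mainParams.Length, momentum);
        target.CopyFrom(mainParams);          // initialize once with a hard copy

        for (int step = 0; step < steps; step++)
        {
            mainParams[0] += 1.0;             // stand-in for an optimizer step
            target.UpdateFrom(mainParams);    // EMA update after the step
        }

        return new[] { mainParams[0], target.GetParameters()[0] };
    }
}
```

With a deliberately low momentum of 0.9 and three steps, the main parameter reaches 3.0 while the momentum copy reaches only 0.561, which shows the lag that produces stable targets.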

UpdateFromParameters(Vector<T>)

Updates the momentum encoder parameters using EMA from a parameter vector rather than a network instance.

void UpdateFromParameters(Vector<T> mainEncoderParams)

Parameters

mainEncoderParams Vector<T>

Parameters from the main encoder.