Interface IMomentumEncoder<T>
Namespace: AiDotNet.SelfSupervisedLearning
Assembly: AiDotNet.dll
Defines the contract for momentum-updated encoders used in self-supervised learning (SSL) methods.
public interface IMomentumEncoder<T>
Type Parameters
- T
The numeric type used for computations (typically float or double).
Remarks
For Beginners: A momentum encoder is a copy of the main encoder whose parameters are updated more slowly, as an exponential moving average (EMA) of the main encoder's parameters. This provides stable, consistent targets during training.
How it works:
momentum_encoder_param = m * momentum_encoder_param + (1 - m) * encoder_param
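Where m is the momentum coefficient, typically 0.99-0.999, so each update moves the momentum encoder only 0.1% to 1% of the way toward the main encoder.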
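The update rule above can be sketched on plain arrays. This is a minimal illustration, not AiDotNet's implementation: `EmaUpdate` is a hypothetical helper, and `double[]` stands in for the library's parameter vectors.

```csharp
using System;

// Hypothetical helper (not part of AiDotNet): one in-place EMA step over flat parameter arrays.
static void EmaUpdate(double[] momentumParams, double[] encoderParams, double m)
{
    if (momentumParams.Length != encoderParams.Length)
        throw new ArgumentException("Parameter arrays must have the same length.");

    for (int i = 0; i < momentumParams.Length; i++)
    {
        // momentum_encoder_param = m * momentum_encoder_param + (1 - m) * encoder_param
        momentumParams[i] = m * momentumParams[i] + (1.0 - m) * encoderParams[i];
    }
}

double[] momentumEncoder = { 1.0, -2.0 };
double[] mainEncoder = { 2.0, 0.0 };

EmaUpdate(momentumEncoder, mainEncoder, 0.99);

// With m = 0.99, each momentum parameter has moved 1% of the way toward the main encoder's value.
Console.WriteLine(string.Join(", ", momentumEncoder));
```

Updating in place avoids allocating a second parameter buffer on every training step, which matters when the encoder has millions of parameters.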
Why use a momentum encoder?
- Provides stable targets that don't change rapidly
- Prevents model collapse in methods like BYOL
- Ensures consistent negative embeddings in MoCo
Used by: MoCo, MoCo v2, MoCo v3, BYOL, DINO
Not used by: SimCLR, SimSiam (uses stop-gradient instead), Barlow Twins
Properties
Encoder
Gets the underlying momentum-updated encoder network.
INeuralNetwork<T> Encoder { get; }
Property Value
- INeuralNetwork<T>
Momentum
Gets the momentum coefficient for EMA updates.
double Momentum { get; }
Property Value
- double
Remarks
Typical values: 0.99-0.9999. Higher values = slower updates = more stable targets.
MoCo uses 0.999, BYOL uses 0.996 → 1.0 (scheduled).
Methods
CopyFromMainEncoder(INeuralNetwork<T>)
Copies all parameters from the main encoder (hard copy, not EMA).
void CopyFromMainEncoder(INeuralNetwork<T> mainEncoder)
Parameters
- mainEncoder INeuralNetwork<T>
The encoder to copy from.
Remarks
Used for initialization at the start of training.
Encode(Tensor<T>)
Encodes input using the momentum encoder (no gradient computation).
Tensor<T> Encode(Tensor<T> input)
Parameters
- input Tensor<T>
The input tensor to encode.
Returns
- Tensor<T>
The encoded representation (detached from computation graph).
Remarks
For Beginners: The momentum encoder output is treated as a fixed target - gradients don't flow through it. Only the main encoder is trained directly.
GetParameters()
Gets all parameters of the momentum encoder.
Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all parameters.
SetMomentum(double)
Sets the momentum coefficient.
void SetMomentum(double momentum)
Parameters
- momentum double
New momentum value (0-1, typically 0.99-0.9999).
Remarks
Some methods schedule momentum during training (e.g., BYOL increases from 0.996 to 1.0).
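As an illustration of such a schedule, the BYOL paper describes a cosine ramp from a base value up to 1.0. The sketch below assumes that formula; `ScheduledMomentum` is a hypothetical helper, and whether AiDotNet's BYOL implementation uses exactly this schedule is not stated here.

```csharp
using System;

// Hypothetical helper (not part of AiDotNet): cosine momentum schedule as described for BYOL.
// m_k = 1 - (1 - baseMomentum) * (cos(pi * k / K) + 1) / 2
static double ScheduledMomentum(double baseMomentum, int step, int totalSteps)
{
    if (totalSteps <= 0)
        throw new ArgumentOutOfRangeException(nameof(totalSteps));

    // Clamp the step so the schedule stays at its end value if training runs long.
    int clamped = Math.Max(0, Math.Min(step, totalSteps));
    double progress = (double)clamped / totalSteps;
    return 1.0 - (1.0 - baseMomentum) * (Math.Cos(Math.PI * progress) + 1.0) / 2.0;
}

const int trainingSteps = 1000;
foreach (int step in new[] { 0, 500, 1000 })
{
    double m = ScheduledMomentum(0.996, step, trainingSteps);
    // In a training loop, this value would be passed to SetMomentum(m) before the EMA update.
    Console.WriteLine($"step {step}: m = {m:F4}");
}
```

The schedule starts at the base value and reaches 1.0 at the final step. At m = 1.0 the EMA update leaves the momentum encoder unchanged, so the targets are effectively frozen by the end of training.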
SetParameters(Vector<T>)
Sets the parameters of the momentum encoder directly.
void SetParameters(Vector<T> parameters)
Parameters
- parameters Vector<T>
The parameters to set.
UpdateFromMainEncoder(INeuralNetwork<T>)
Updates the momentum encoder parameters using EMA from the main encoder.
void UpdateFromMainEncoder(INeuralNetwork<T> mainEncoder)
Parameters
- mainEncoder INeuralNetwork<T>
The main encoder whose current parameters are blended into the momentum encoder.
Remarks
For Beginners: Call this after each training step. The momentum encoder slowly tracks the main encoder, providing stable targets.
Update formula:
θ_momentum = m * θ_momentum + (1 - m) * θ_main
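To make "slowly tracks" concrete, the update formula can be unrolled over k steps. The derivation below assumes, for illustration only, that θ_main is held fixed during those steps; in real training it changes on every step.

```latex
\begin{aligned}
\theta_{\text{momentum}}^{(k)}
  &= m\,\theta_{\text{momentum}}^{(k-1)} + (1 - m)\,\theta_{\text{main}} \\
  &= m^{k}\,\theta_{\text{momentum}}^{(0)} + \left(1 - m^{k}\right)\theta_{\text{main}} \\[4pt]
\theta_{\text{momentum}}^{(k)} - \theta_{\text{main}}
  &= m^{k}\left(\theta_{\text{momentum}}^{(0)} - \theta_{\text{main}}\right)
\end{aligned}
```

The gap to the main encoder shrinks by a factor of m per step. With m = 0.999, halving the gap takes about ln 2 / (1 - m) ≈ 693 steps; with m = 0.99, about 69 steps.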
UpdateFromParameters(Vector<T>)
Updates the momentum encoder parameters using EMA from parameter vectors.
void UpdateFromParameters(Vector<T> mainEncoderParams)
Parameters
- mainEncoderParams Vector<T>
Parameters from the main encoder.