Class MomentumEncoder<T>
- Namespace: AiDotNet.SelfSupervisedLearning
- Assembly: AiDotNet.dll
Momentum-updated encoder for self-supervised learning methods.
public class MomentumEncoder<T> : IMomentumEncoder<T>
Type Parameters
T: The numeric type used for computations (typically float or double).
- Inheritance
- object
- MomentumEncoder<T>
- Implements
- IMomentumEncoder<T>
Remarks
For Beginners: A momentum encoder is a copy of the main encoder whose parameters are updated more slowly, as an exponential moving average (EMA) of the main encoder's parameters. This provides stable, consistent targets during self-supervised training.
Update formula:
θ_momentum = m * θ_momentum + (1 - m) * θ_main
where m is the momentum coefficient (typically 0.99-0.9999).
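To see what "updates more slowly" means in practice, the update can be unrolled over t training steps. The superscripts below are step indices introduced here for illustration; they are not part of the library's notation.

```latex
\theta_{\text{momentum}}^{(t)}
  = m^{t}\,\theta_{\text{momentum}}^{(0)}
  + (1 - m)\sum_{k=1}^{t} m^{\,t-k}\,\theta_{\text{main}}^{(k)}
```

The weight on main-encoder parameters from n steps ago decays as m^n, so the momentum encoder effectively averages over roughly 1 / (1 - m) recent steps: about 100 steps for m = 0.99 and about 1000 steps for m = 0.999.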
Why slow updates?
- Provides stable targets that don't change rapidly
- Prevents collapse in methods like BYOL
- Ensures consistent embeddings in memory bank (MoCo)
Example usage:
// Initialize with copy of main encoder
var momentumEncoder = new MomentumEncoder<float>(mainEncoder.Clone(), momentum: 0.999);
// Training loop:
var targets = momentumEncoder.Encode(augmentedBatch2); // Get targets
// ... compute loss with main encoder output ...
momentumEncoder.UpdateFromMainEncoder(mainEncoder); // EMA update
Constructors
MomentumEncoder(INeuralNetwork<T>, double)
Initializes a new instance of the MomentumEncoder class.
public MomentumEncoder(INeuralNetwork<T> encoder, double momentum = 0.999)
Parameters
encoder (INeuralNetwork<T>): The encoder network (should be a copy or clone of the main encoder).
momentum (double): Initial momentum coefficient (0-1, typically 0.99-0.9999).
Properties
Encoder
Gets the underlying momentum-updated encoder network.
public INeuralNetwork<T> Encoder { get; }
Property Value
- INeuralNetwork<T>
Momentum
Gets the momentum coefficient for EMA updates.
public double Momentum { get; }
Property Value
- double
Remarks
Typical values: 0.99-0.9999. A higher value means slower updates and therefore more stable targets.
MoCo uses a fixed 0.999; BYOL schedules the value from 0.996 toward 1.0 over training.
Methods
CopyFromMainEncoder(INeuralNetwork<T>)
Copies all parameters from the main encoder (hard copy, not EMA).
public void CopyFromMainEncoder(INeuralNetwork<T> mainEncoder)
Parameters
mainEncoder (INeuralNetwork<T>): The encoder to copy from.
Remarks
Used for initialization at the start of training.
Create<TEncoder>(TEncoder, double, Func<TEncoder, TEncoder>)
Creates a momentum encoder from a main encoder by cloning.
public static MomentumEncoder<T> Create<TEncoder>(TEncoder mainEncoder, double momentum, Func<TEncoder, TEncoder> cloneFunc) where TEncoder : INeuralNetwork<T>
Parameters
mainEncoder (TEncoder): The main encoder to clone.
momentum (double): Initial momentum coefficient.
cloneFunc (Func<TEncoder, TEncoder>): Function to clone the encoder.
Returns
- MomentumEncoder<T>
A new momentum encoder wrapping a cloned encoder.
Type Parameters
TEncoder: The specific encoder type.
Encode(Tensor<T>)
Encodes input using the momentum encoder (no gradient computation).
public Tensor<T> Encode(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to encode.
Returns
- Tensor<T>
The encoded representation (detached from computation graph).
Remarks
For Beginners: The momentum encoder output is treated as a fixed target - gradients don't flow through it. Only the main encoder is trained directly.
GetParameters()
Gets all parameters of the momentum encoder.
public Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all parameters.
ScheduleMomentum(double, double, int, int)
Computes the scheduled momentum value based on training progress.
public static double ScheduleMomentum(double baseMomentum, double finalMomentum, int currentEpoch, int totalEpochs)
Parameters
baseMomentum (double): Starting momentum value.
finalMomentum (double): Final momentum value.
currentEpoch (int): Current training epoch.
totalEpochs (int): Total training epochs.
Returns
- double
The scheduled momentum value.
Remarks
For Beginners: Some methods like BYOL schedule momentum to increase during training. This typically uses a cosine schedule from base to final momentum.
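As an illustration of the cosine schedule mentioned above, here is a minimal standalone sketch. The class and method names (MomentumScheduleSketch, CosineMomentum) are hypothetical, and the formula is the BYOL-style cosine ramp; the actual body of ScheduleMomentum is not shown on this page and may differ.

```csharp
using System;

public static class MomentumScheduleSketch
{
    // Hypothetical helper, not the AiDotNet implementation.
    // Ramps from baseMomentum (start of training) to finalMomentum (end)
    // along a half cosine, so the value changes slowly near both ends.
    public static double CosineMomentum(
        double baseMomentum, double finalMomentum, int currentEpoch, int totalEpochs)
    {
        if (totalEpochs <= 0)
            throw new ArgumentOutOfRangeException(nameof(totalEpochs));

        // Fraction of training completed, clamped to [0, 1].
        double progress = Math.Clamp((double)currentEpoch / totalEpochs, 0.0, 1.0);

        // Goes from 1 at progress = 0 down to 0 at progress = 1.
        double cosine = (Math.Cos(Math.PI * progress) + 1.0) / 2.0;

        return finalMomentum - (finalMomentum - baseMomentum) * cosine;
    }
}
```

With baseMomentum = 0.996 and finalMomentum = 1.0, this sketch returns 0.996 at epoch 0, approximately 0.998 halfway through, and 1.0 at the final epoch. In a training loop, the scheduled value would then be passed to SetMomentum once per epoch.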
SetMomentum(double)
Sets the momentum coefficient.
public void SetMomentum(double momentum)
Parameters
momentum (double): New momentum value (0-1, typically 0.99-0.9999).
Remarks
Some methods schedule momentum during training (e.g., BYOL increases from 0.996 to 1.0).
SetParameters(Vector<T>)
Sets the parameters of the momentum encoder directly.
public void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): The parameters to set.
UpdateFromMainEncoder(INeuralNetwork<T>)
Updates the momentum encoder parameters using EMA from the main encoder.
public void UpdateFromMainEncoder(INeuralNetwork<T> mainEncoder)
Parameters
mainEncoder (INeuralNetwork<T>): The main encoder whose current parameters are blended into the momentum encoder.
Remarks
For Beginners: Call this after each training step. The momentum encoder slowly tracks the main encoder, providing stable targets.
Update formula:
θ_momentum = m * θ_momentum + (1 - m) * θ_main
UpdateFromParameters(Vector<T>)
Updates the momentum encoder parameters using EMA from parameter vectors.
public void UpdateFromParameters(Vector<T> mainEncoderParams)
Parameters
mainEncoderParams (Vector<T>): Parameters from the main encoder.
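The EMA update over a flat parameter vector can be sketched on plain arrays. This is an illustration of the formula only: EmaSketch and EmaUpdate are hypothetical names, and double[] stands in for Vector<T>, whose API is not documented on this page.

```csharp
using System;

public static class EmaSketch
{
    // Hypothetical helper, not the AiDotNet implementation.
    // In-place EMA: target[i] = m * target[i] + (1 - m) * source[i].
    public static void EmaUpdate(double[] target, double[] source, double momentum)
    {
        if (target.Length != source.Length)
            throw new ArgumentException("Parameter vectors must have the same length.");
        if (momentum < 0.0 || momentum > 1.0)
            throw new ArgumentOutOfRangeException(nameof(momentum));

        for (int i = 0; i < target.Length; i++)
        {
            // Keep a fraction m of the old value and take (1 - m) of the new one.
            target[i] = momentum * target[i] + (1.0 - momentum) * source[i];
        }
    }
}
```

For example, with momentum 0.9, a target of { 1.0, 2.0 } and a source of { 0.0, 4.0 } become approximately { 0.9, 2.2 } after one update. A momentum of 1.0 leaves the target unchanged, and a momentum of 0.0 reduces to a hard copy.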