Class MLPProjector<T>
- Namespace: AiDotNet.SelfSupervisedLearning
- Assembly: AiDotNet.dll
Multi-layer perceptron (MLP) projection head for self-supervised learning.
public class MLPProjector<T> : IProjectorHead<T>
Type Parameters
- T: The numeric type used for computations (typically float or double).
- Inheritance: MLPProjector<T>
- Implements: IProjectorHead<T>
Remarks
For Beginners: An MLP projector transforms encoder outputs into a lower-dimensional space in which the self-supervised learning (SSL) loss is computed. This is the standard projector used in SimCLR, MoCo v2, and BYOL.
Architecture:
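Input (d_in) → Linear (d_hid) → BatchNorm (d_hid) → ReLU → Linear (d_out) → [BatchNorm] (d_out) → Output
Parentheses give the feature dimension after each stage. Square brackets mark the optional output BatchNorm, which is controlled by useBatchNormOnOutput.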
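Why an MLP instead of a single linear layer?
- The non-linearity allows the projector to learn more complex projections.
- The extra capacity lets the projector absorb the demands of the SSL loss, so the encoder's representations are less constrained by it.
- Empirically, it improves downstream performance compared with a linear projection or no projection head (reported, for example, for SimCLR).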
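To make the architecture above concrete, here is a minimal sketch of the forward pass on plain arrays. It is not the AiDotNet implementation: the class name MlpProjectorSketch, the array layout, the epsilon of 1e-5, and the fixed batch-norm scale of 1 and shift of 0 are all assumptions chosen for illustration.

```csharp
using System;

// Illustrative stand-in, NOT the AiDotNet implementation: a plain-array forward
// pass for Linear -> BatchNorm -> ReLU -> Linear -> [BatchNorm] on a batch of rows.
public static class MlpProjectorSketch
{
    // y = x W + b, with W shaped [inDim, outDim].
    public static double[][] Linear(double[][] x, double[,] w, double[] b)
    {
        int outDim = w.GetLength(1);
        var y = new double[x.Length][];
        for (int n = 0; n < x.Length; n++)
        {
            y[n] = new double[outDim];
            for (int j = 0; j < outDim; j++)
            {
                double sum = b[j];
                for (int i = 0; i < x[n].Length; i++) sum += x[n][i] * w[i, j];
                y[n][j] = sum;
            }
        }
        return y;
    }

    // Training-mode batch norm with scale 1 and shift 0 (assumed): each feature is
    // normalized with the mean and biased variance of the current batch.
    public static double[][] BatchNorm(double[][] x, double eps = 1e-5)
    {
        int batch = x.Length, dim = x[0].Length;
        var y = new double[batch][];
        for (int n = 0; n < batch; n++) y[n] = new double[dim];
        for (int j = 0; j < dim; j++)
        {
            double mean = 0, variance = 0;
            for (int n = 0; n < batch; n++) mean += x[n][j];
            mean /= batch;
            for (int n = 0; n < batch; n++) variance += (x[n][j] - mean) * (x[n][j] - mean);
            variance /= batch;
            double invStd = 1.0 / Math.Sqrt(variance + eps);
            for (int n = 0; n < batch; n++) y[n][j] = (x[n][j] - mean) * invStd;
        }
        return y;
    }

    // Element-wise max(0, x).
    public static double[][] Relu(double[][] x)
    {
        var y = new double[x.Length][];
        for (int n = 0; n < x.Length; n++)
        {
            y[n] = new double[x[n].Length];
            for (int j = 0; j < x[n].Length; j++) y[n][j] = Math.Max(0.0, x[n][j]);
        }
        return y;
    }

    // Chains the stages in the order shown in the architecture diagram.
    public static double[][] Project(double[][] x, double[,] w1, double[] b1,
                                     double[,] w2, double[] b2,
                                     bool batchNormOnOutput = false)
    {
        var hidden = Relu(BatchNorm(Linear(x, w1, b1)));
        var output = Linear(hidden, w2, b2);
        return batchNormOnOutput ? BatchNorm(output) : output;
    }
}
```

The sketch keeps each stage as a separate function so the data flow matches the diagram one arrow at a time. A real projector would also track running statistics for evaluation mode and learn the batch-norm scale and shift.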
Constructors
MLPProjector(int, int, int, bool, int?)
Initializes a new instance of the MLPProjector class.
public MLPProjector(int inputDim, int hiddenDim = 2048, int outputDim = 128, bool useBatchNormOnOutput = false, int? seed = null)
Parameters
- inputDim (int): Input dimension (encoder output size).
- hiddenDim (int): Hidden dimension (typically 2048-4096).
- outputDim (int): Output dimension (typically 128-256).
- useBatchNormOnOutput (bool): Whether to apply BatchNorm on the output layer.
- seed (int?): Optional random seed for initialization.
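As a usage sketch, the constructor can be called with named arguments so that each dimension is explicit. The 2048-dimensional encoder output, the seed value, and the helper class ProjectorSetup are assumptions for illustration; only the constructor signature and namespace come from this page.

```csharp
using AiDotNet.SelfSupervisedLearning;

public static class ProjectorSetup
{
    // Builds a projector for a hypothetical encoder that emits
    // 2048-dimensional features and projects them down to 128 dimensions.
    public static MLPProjector<float> CreateProjector()
    {
        return new MLPProjector<float>(
            inputDim: 2048,              // must match the encoder output size
            hiddenDim: 2048,             // documented default
            outputDim: 128,              // documented default
            useBatchNormOnOutput: false, // documented default
            seed: 42);                   // arbitrary seed, fixed for reproducible initialization
    }
}
```

Because every parameter except inputDim has a default, `new MLPProjector<float>(2048)` should build a projector with the same dimensions, only without a fixed seed.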
Properties
Engine
Gets the global execution engine for vector operations and GPU/CPU acceleration.
protected IEngine Engine { get; }
Property Value
- IEngine
HiddenDimension
Gets the hidden dimension (for MLP projectors).
public int? HiddenDimension { get; }
Property Value
- int?
Remarks
Typical values: 2048-4096. Usually larger than output dimension.
InputDimension
Gets the input dimension expected by this projector.
public int InputDimension { get; }
Property Value
- int
OutputDimension
Gets the output dimension produced by this projector.
public int OutputDimension { get; }
Property Value
- int
Remarks
Typical values: 128-2048. SimCLR uses 128, MoCo uses 128, BYOL uses 256.
ParameterCount
Gets the total number of trainable parameters.
public int ParameterCount { get; }
Property Value
- int
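For a rough sense of scale, the parameter count of a two-layer projector can be estimated by hand. The sketch below assumes both linear layers have a bias term and each BatchNorm layer has a learnable scale and shift per feature. This page does not confirm those details, so the value reported by ParameterCount may differ. The class name ProjectorParameterCount is hypothetical.

```csharp
// Hypothetical helper, not part of AiDotNet: estimates the number of trainable
// parameters in Linear -> BatchNorm -> ReLU -> Linear -> [BatchNorm].
public static class ProjectorParameterCount
{
    public static long Estimate(int inputDim, int hiddenDim, int outputDim,
                                bool useBatchNormOnOutput)
    {
        // Assumption: each linear layer has a weight matrix plus a bias vector.
        long linear1 = (long)inputDim * hiddenDim + hiddenDim;
        long linear2 = (long)hiddenDim * outputDim + outputDim;

        // Assumption: each batch norm has one scale and one shift per feature.
        long batchNorm1 = 2L * hiddenDim;
        long batchNorm2 = useBatchNormOnOutput ? 2L * outputDim : 0L;

        return linear1 + batchNorm1 + linear2 + batchNorm2;
    }
}
```

Under these assumptions, inputDim = 2048 with the default hiddenDim = 2048 and outputDim = 128 gives 4,462,720 parameters, almost all of them in the first linear layer (2048 × 2048 weights).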
Methods
Backward(Tensor<T>)
Performs the backward pass through the projector.
public Tensor<T> Backward(Tensor<T> gradients)
Parameters
- gradients (Tensor<T>): The gradients of the loss with respect to the projector output.
Returns
- Tensor<T>
The gradients with respect to projector input (for encoder backprop).
ClearGradients()
Clears accumulated gradients.
public void ClearGradients()
GetParameterGradients()
Gets the gradients computed during the last backward pass.
public Vector<T> GetParameterGradients()
Returns
- Vector<T>
A vector containing gradients for all parameters.
GetParameters()
Gets all trainable parameters of the projector.
public Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all parameters.
Project(Tensor<T>)
Projects encoder output to the SSL embedding space.
public Tensor<T> Project(Tensor<T> input)
Parameters
- input (Tensor<T>): The encoder output tensor.
Returns
- Tensor<T>
The projected embedding tensor.
Remarks
For Beginners: This transforms encoder features into a lower-dimensional space where the SSL loss is computed. The projection helps separate the pretraining objective from the learned representations.
Reset()
Resets the projector state (clears any internal buffers).
public void Reset()
SetParameters(Vector<T>)
Sets the parameters of the projector.
public void SetParameters(Vector<T> parameters)
Parameters
- parameters (Vector<T>): The parameter vector to load.
SetTrainingMode(bool)
Sets training or evaluation mode.
public void SetTrainingMode(bool isTraining)
Parameters
- isTraining (bool): true for training mode, false for evaluation.
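The methods above are typically called in a fixed order during one training step. The sketch below shows that order against a hypothetical stand-in interface, IProjectorLike, which mirrors the method names documented on this page so that the example compiles without referencing AiDotNet. The lossGradient delegate and the RecordingProjector fake are also assumptions for illustration, as is the idea that gradients should be cleared before every step.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in mirroring the method names documented on this page.
public interface IProjectorLike<TTensor>
{
    void SetTrainingMode(bool isTraining);
    void ClearGradients();
    TTensor Project(TTensor input);
    TTensor Backward(TTensor gradients);
}

public static class ProjectorTrainingStep
{
    // Runs one forward/backward pass and returns the gradient with respect to
    // the projector input, which the caller would pass on to the encoder.
    public static TTensor Run<TTensor>(
        IProjectorLike<TTensor> projector,
        TTensor encoderOutput,
        Func<TTensor, TTensor> lossGradient) // hypothetical: dLoss/dEmbedding
    {
        projector.SetTrainingMode(true); // switch to training mode before the forward pass
        projector.ClearGradients();      // assumed: drop gradients left from the previous step
        TTensor embedding = projector.Project(encoderOutput);
        TTensor gradEmbedding = lossGradient(embedding);
        return projector.Backward(gradEmbedding);
    }
}

// Recording fake used only to demonstrate the call order; it does no real math.
public sealed class RecordingProjector : IProjectorLike<double[]>
{
    public List<string> Calls { get; } = new List<string>();

    public void SetTrainingMode(bool isTraining) => Calls.Add("SetTrainingMode:" + isTraining);
    public void ClearGradients() => Calls.Add("ClearGradients");

    public double[] Project(double[] input)
    {
        Calls.Add("Project");
        return (double[])input.Clone();
    }

    public double[] Backward(double[] gradients)
    {
        Calls.Add("Backward");
        return (double[])gradients.Clone();
    }
}
```

After such a step, GetParameterGradients() and SetParameters(Vector<T>) would be the natural hooks for an optimizer update, and SetTrainingMode(false) would precede evaluation.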