Class DINO<T>
- Namespace
- AiDotNet.SelfSupervisedLearning
- Assembly
- AiDotNet.dll
DINO: Self-Distillation with No Labels - a self-supervised method for Vision Transformers.
public class DINO<T> : TeacherStudentSSL<T>, ISSLMethod<T>
Type Parameters
T: The numeric type used for computations.
- Inheritance
- DINO<T>
- Implements
- ISSLMethod<T>
Remarks
For Beginners: DINO is a self-supervised method designed specifically for Vision Transformers (ViT). A student network learns to predict the output of a teacher network, where the teacher's weights are an exponential moving average (EMA) of the student's weights.
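As a minimal sketch of what "the teacher is an EMA of the student" means, the update below blends each teacher parameter toward the matching student parameter. The class `EmaSketch`, the method `UpdateTeacher`, and the use of flat `double[]` parameter vectors are illustrative assumptions, not part of the AiDotNet API; the library performs this step through `IMomentumEncoder<T>`, whose internals are not shown on this page.

```csharp
using System;

// Hypothetical helper for illustration only; not an AiDotNet type.
public static class EmaSketch
{
    // Element-wise, in-place update: teacher <- m * teacher + (1 - m) * student.
    // A momentum close to 1 makes the teacher change slowly.
    public static void UpdateTeacher(double[] teacher, double[] student, double momentum)
    {
        if (teacher.Length != student.Length)
            throw new ArgumentException("Parameter vectors must have the same length.");

        for (int i = 0; i < teacher.Length; i++)
            teacher[i] = momentum * teacher[i] + (1.0 - momentum) * student[i];
    }
}
```

Because only the student receives gradients, the teacher acts as a slowly moving average of past student states.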
Key innovations:
- Self-distillation: Student learns from teacher's soft labels
- Centering and sharpening: Prevents collapse without negative samples
- Multi-crop training: Uses global and local crops for efficiency
- Emergent properties: Learns features that segment objects without supervision
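The centering and sharpening steps listed above can be sketched on plain arrays. This is an illustration under stated assumptions, not the library's implementation: `DinoTeacherSketch` and its methods are hypothetical names, and the ordering (subtract the center, then divide by the temperature) follows the formulation in the referenced paper. The paper reports a student temperature of 0.1 and a teacher temperature between 0.04 and 0.07; the defaults used by `DINO<T>` are not stated on this page.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not an AiDotNet type.
public static class DinoTeacherSketch
{
    // Numerically stable softmax of logits divided by a temperature.
    // A lower temperature gives a sharper (more peaked) distribution.
    public static double[] Softmax(double[] logits, double temperature)
    {
        double[] scaled = logits.Select(z => z / temperature).ToArray();
        double max = scaled.Max();
        double[] exp = scaled.Select(z => Math.Exp(z - max)).ToArray();
        double sum = exp.Sum();
        return exp.Select(e => e / sum).ToArray();
    }

    // Teacher distribution: centering removes the running mean of the logits,
    // then a low temperature sharpens the result.
    public static double[] TeacherProbs(double[] logits, double[] center, double teacherTemp)
    {
        double[] centered = logits.Zip(center, (z, c) => z - c).ToArray();
        return Softmax(centered, teacherTemp);
    }

    // Center update: EMA of the per-dimension batch mean of the teacher logits.
    public static void UpdateCenter(double[] center, double[][] teacherLogits, double momentum)
    {
        for (int k = 0; k < center.Length; k++)
        {
            double batchMean = teacherLogits.Average(row => row[k]);
            center[k] = momentum * center[k] + (1.0 - momentum) * batchMean;
        }
    }
}
```

The two operations pull in opposite directions: centering alone pushes the teacher output toward a uniform distribution, while sharpening alone lets a single dimension dominate. Using both is what avoids collapse without negative samples.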
Architecture:
Global views → Teacher → Softmax(z/τ_t - center) → P_t
All views → Student → Softmax(z/τ_s) → P_s
Loss: Cross-entropy(P_s, P_t)
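Written out, the loss in the diagram above takes roughly the following form. The notation is introduced here for illustration: z_t and z_s are the teacher and student projector outputs, c is the center, G is the set of global views, V is the set of all views, and K is the softmax dimension (`outputDim`). The teacher term follows the paper's formulation, which subtracts the center before dividing by the temperature; the Architecture line above writes it as z/τ_t - center, and which ordering the library uses is not confirmed here.

```latex
\begin{aligned}
P_t(x)  &= \operatorname{softmax}\!\left(\frac{z_t(x) - c}{\tau_t}\right), \qquad
P_s(x') = \operatorname{softmax}\!\left(\frac{z_s(x')}{\tau_s}\right), \\[4pt]
\mathcal{L} &= \sum_{x \in G} \; \sum_{\substack{x' \in V \\ x' \neq x}}
               H\big(P_t(x),\, P_s(x')\big), \qquad
H(a, b) = -\sum_{k=1}^{K} a^{(k)} \log b^{(k)} .
\end{aligned}
```

Only the student parameters receive gradients from this loss; the teacher distribution P_t is treated as a fixed target.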
Reference: Caron et al., "Emerging Properties in Self-Supervised Vision Transformers" (ICCV 2021)
Constructors
DINO(INeuralNetwork<T>, IMomentumEncoder<T>, IProjectorHead<T>, IProjectorHead<T>, int, SSLConfig?)
Initializes a new instance of the DINO class.
public DINO(INeuralNetwork<T> studentEncoder, IMomentumEncoder<T> teacherEncoder, IProjectorHead<T> studentProjector, IProjectorHead<T> teacherProjector, int outputDim = 65536, SSLConfig? config = null)
Parameters
studentEncoder (INeuralNetwork<T>): The student encoder (ViT recommended).
teacherEncoder (IMomentumEncoder<T>): The teacher encoder (momentum-updated copy).
studentProjector (IProjectorHead<T>): Projection head for the student.
teacherProjector (IProjectorHead<T>): Projection head for the teacher.
outputDim (int): Output dimension of the projection heads (default: 65536).
config (SSLConfig): Optional SSL configuration.
Properties
Category
Gets the category of this SSL method.
public override SSLMethodCategory Category { get; }
Property Value
- SSLMethodCategory
Remarks
Categories include Contrastive, NonContrastive, Generative, and SelfDistillation.
Name
Gets the name of this SSL method.
public override string Name { get; }
Property Value
- string
Remarks
Examples: "SimCLR", "MoCo v2", "BYOL", "DINO", "MAE"
Methods
Create(INeuralNetwork<T>, Func<INeuralNetwork<T>, INeuralNetwork<T>>, int, int, int, int)
Creates a DINO instance with default configuration.
public static DINO<T> Create(INeuralNetwork<T> encoder, Func<INeuralNetwork<T>, INeuralNetwork<T>> createEncoderCopy, int encoderOutputDim, int projectionDim = 256, int hiddenDim = 2048, int outputDim = 65536)
Parameters
encoder (INeuralNetwork<T>): The backbone encoder (ViT recommended).
createEncoderCopy (Func<INeuralNetwork<T>, INeuralNetwork<T>>): Function that creates a copy of the encoder for the teacher.
encoderOutputDim (int): Output dimension of the encoder.
projectionDim (int): Dimension of the projection space (default: 256).
hiddenDim (int): Hidden dimension of the projector MLP (default: 2048).
outputDim (int): Output dimension for softmax (default: 65536).
Returns
- DINO<T>
A configured DINO instance.
TrainStepCore(Tensor<T>, SSLAugmentationContext<T>?)
Implementation-specific training step logic.
protected override SSLStepResult<T> TrainStepCore(Tensor<T> batch, SSLAugmentationContext<T>? augmentationContext)
Parameters
batch (Tensor<T>): The input batch tensor.
augmentationContext (SSLAugmentationContext<T>): Optional augmentation context.
Returns
- SSLStepResult<T>
The result of the training step.