Class DINO<T>

Namespace: AiDotNet.SelfSupervisedLearning

Assembly: AiDotNet.dll

DINO: Self-Distillation with No Labels - a self-supervised method for Vision Transformers.

public class DINO<T> : TeacherStudentSSL<T>, ISSLMethod<T>

Type Parameters

T: The numeric type used for computations.

Inheritance: object → SSLMethodBase<T> → TeacherStudentSSL<T> → DINO<T>

Implements: ISSLMethod<T>

Inherited Members: TeacherStudentSSL<T>.TeacherEncoder

TeacherStudentSSL<T>.TeacherProjector

TeacherStudentSSL<T>.Centering

TeacherStudentSSL<T>.BaseMomentum

TeacherStudentSSL<T>.Augmentation

TeacherStudentSSL<T>.NumGlobalCrops

TeacherStudentSSL<T>.NumLocalCrops

TeacherStudentSSL<T>.UsesMomentumEncoder

TeacherStudentSSL<T>.RequiresMemoryBank

TeacherStudentSSL<T>.CreateMultiCropViews(Tensor<T>)

TeacherStudentSSL<T>.ForwardStudent(Tensor<T>)

TeacherStudentSSL<T>.ForwardTeacher(Tensor<T>)

TeacherStudentSSL<T>.UpdateTeacher()

TeacherStudentSSL<T>.UpdateStudent(T)

TeacherStudentSSL<T>.OnEpochStart(int)

TeacherStudentSSL<T>.GetAdditionalParameterCount()

TeacherStudentSSL<T>.GetAdditionalParameters()

SSLMethodBase<T>.NumOps

SSLMethodBase<T>.Engine

SSLMethodBase<T>._encoder

SSLMethodBase<T>._projector

SSLMethodBase<T>._config

SSLMethodBase<T>._isTraining

SSLMethodBase<T>._currentStep

SSLMethodBase<T>._currentEpoch

SSLMethodBase<T>.ParameterCount

SSLMethodBase<T>.GetEncoder()

SSLMethodBase<T>.TrainStep(Tensor<T>, SSLAugmentationContext<T>)

SSLMethodBase<T>.Encode(Tensor<T>)

SSLMethodBase<T>.EncodeAndProject(Tensor<T>)

SSLMethodBase<T>.Reset()

SSLMethodBase<T>.GetParameters()

SSLMethodBase<T>.SetParameters(Vector<T>)

SSLMethodBase<T>.SetAdditionalParameters(Vector<T>, ref int)

SSLMethodBase<T>.SetTrainingMode(bool)

SSLMethodBase<T>.OnEpochEnd(int)

SSLMethodBase<T>.GetEffectiveTemperature()

SSLMethodBase<T>.GetEffectiveLearningRate()

SSLMethodBase<T>.CreateStepResult(T)

SSLMethodBase<T>.CosineSimilarity(Tensor<T>, Tensor<T>)

SSLMethodBase<T>.L2Normalize(Tensor<T>)

SSLMethodBase<T>.MatMul(Tensor<T>, Tensor<T>)

SSLMethodBase<T>.ComputeSimilarityMatrix(Tensor<T>, Tensor<T>, bool)

SSLMethodBase<T>.ComputePairwiseDistances(Tensor<T>)

object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Remarks

For Beginners: DINO is a self-supervised method designed for Vision Transformers (ViT). A student network learns to predict the output of a teacher network, and the teacher's weights are an exponential moving average (EMA) of the student's weights.
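
The EMA relationship between teacher and student can be sketched as follows. This is a minimal illustration on plain arrays, not the library's UpdateTeacher() implementation: the class name EmaSketch, the method signature, and the use of double[] for parameters are all assumptions made for the example. The paper uses a momentum close to 1 (0.996, increased toward 1 during training).

```csharp
using System;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class EmaSketch
{
    // In-place update: teacher <- m * teacher + (1 - m) * student.
    // With m close to 1 the teacher changes slowly and smooths out
    // the step-to-step noise in the student's weights.
    public static void UpdateTeacher(double[] teacher, double[] student, double momentum)
    {
        if (teacher.Length != student.Length)
            throw new ArgumentException("Parameter vectors must have the same length.");

        for (int i = 0; i < teacher.Length; i++)
        {
            teacher[i] = momentum * teacher[i] + (1.0 - momentum) * student[i];
        }
    }
}
```

No gradient flows into the teacher in this scheme; only the student is trained by backpropagation.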

Key innovations:

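Self-distillation: The student learns from the teacher's soft labels (a probability distribution rather than a hard class label).
Centering and sharpening: Together they prevent collapse to a trivial output without requiring negative samples.
Multi-crop training: The teacher sees only the global crops while the student sees all crops, so the small local crops add training signal at low compute cost.
Emergent properties: The learned features segment objects even though no segmentation labels are used.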
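
As a rough sketch of how multi-crop training pairs views, the loop below enumerates (teacher view, student view) pairs in the way the paper describes: every global crop passed through the teacher is compared with every other crop passed through the student. The class name, the method, and the convention that global crops occupy the first indices are assumptions for this example and may not match how CreateMultiCropViews orders its output.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class MultiCropPairsSketch
{
    // Assumes views 0..numGlobal-1 are global crops and the rest are local crops.
    public static List<(int Teacher, int Student)> EnumeratePairs(int numGlobal, int numLocal)
    {
        var pairs = new List<(int Teacher, int Student)>();
        int totalViews = numGlobal + numLocal;

        for (int t = 0; t < numGlobal; t++)        // teacher sees global crops only
        {
            for (int s = 0; s < totalViews; s++)   // student sees every crop
            {
                if (s != t)                        // skip comparing a view with itself
                {
                    pairs.Add((t, s));
                }
            }
        }
        return pairs;
    }
}
```

With 2 global and 6 local crops this yields 2 × 7 = 14 loss terms per image, which are typically averaged.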

Architecture:

Global views → Teacher → Softmax((z - center)/τ_t) → P_t
All views → Student → Softmax(z/τ_s) → P_s
Loss: Cross-entropy H(P_t, P_s) = -Σ P_t log P_s, with the teacher output P_t as the target
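
The pipeline above can be sketched for a single teacher/student pair of output vectors. This is an illustrative version on plain arrays, following the ordering in Caron et al., where the center is subtracted before dividing by the teacher temperature. The class and method names are hypothetical, and the default temperatures (0.04 for the teacher, 0.1 for the student) are values reported in the paper, not necessarily this library's defaults.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class DinoLossSketch
{
    // Numerically stable softmax of (logits / temperature).
    public static double[] Softmax(double[] logits, double temperature)
    {
        double[] scaled = logits.Select(v => v / temperature).ToArray();
        double max = scaled.Max();
        double[] exps = scaled.Select(v => Math.Exp(v - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }

    // Teacher branch: centering (subtract the running center) followed by
    // sharpening (a low temperature makes the distribution peakier).
    public static double[] TeacherProbs(double[] logits, double[] center, double teacherTemp)
    {
        double[] centered = logits.Zip(center, (z, c) => z - c).ToArray();
        return Softmax(centered, teacherTemp);
    }

    // H(P_t, P_s) = -sum_k P_t[k] * log(P_s[k]); the epsilon guards against log(0).
    public static double CrossEntropy(double[] teacherProbs, double[] studentProbs)
    {
        double loss = 0.0;
        for (int k = 0; k < teacherProbs.Length; k++)
        {
            loss -= teacherProbs[k] * Math.Log(studentProbs[k] + 1e-12);
        }
        return loss;
    }

    public static double Loss(double[] teacherLogits, double[] studentLogits, double[] center,
                              double teacherTemp = 0.04, double studentTemp = 0.1)
    {
        double[] pT = TeacherProbs(teacherLogits, center, teacherTemp);
        double[] pS = Softmax(studentLogits, studentTemp);
        return CrossEntropy(pT, pS);
    }
}
```

Centering alone pushes the teacher toward a uniform output and sharpening alone pushes it toward a single dominant dimension; applying both balances the two failure modes.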

Reference: Caron et al., "Emerging Properties in Self-Supervised Vision Transformers" (ICCV 2021)

Constructors

DINO(INeuralNetwork<T>, IMomentumEncoder<T>, IProjectorHead<T>, IProjectorHead<T>, int, SSLConfig?)

Initializes a new instance of the DINO class.

public DINO(INeuralNetwork<T> studentEncoder, IMomentumEncoder<T> teacherEncoder, IProjectorHead<T> studentProjector, IProjectorHead<T> teacherProjector, int outputDim = 65536, SSLConfig? config = null)

Parameters

studentEncoder INeuralNetwork<T>: The student encoder (ViT recommended).
teacherEncoder IMomentumEncoder<T>: The teacher encoder (momentum-updated copy).
studentProjector IProjectorHead<T>: Projection head for student.
teacherProjector IProjectorHead<T>: Projection head for teacher.
outputDim int: Output dimension of the projection heads (default: 65536).
config SSLConfig?: Optional SSL configuration (default: null).

Properties

Name

Gets the name of this SSL method.

public override string Name { get; }

Property Value

string

Remarks

Examples: "SimCLR", "MoCo v2", "BYOL", "DINO", "MAE"

Methods

Create(INeuralNetwork<T>, Func<INeuralNetwork<T>, INeuralNetwork<T>>, int, int, int, int)

Creates a DINO instance with default configuration.

public static DINO<T> Create(INeuralNetwork<T> encoder, Func<INeuralNetwork<T>, INeuralNetwork<T>> createEncoderCopy, int encoderOutputDim, int projectionDim = 256, int hiddenDim = 2048, int outputDim = 65536)

Parameters

encoder INeuralNetwork<T>: The backbone encoder (ViT recommended).
createEncoderCopy Func<INeuralNetwork<T>, INeuralNetwork<T>>: Function to create a copy of the encoder for teacher.
encoderOutputDim int: Output dimension of the encoder.
projectionDim int: Dimension of the projection space (default: 256).
hiddenDim int: Hidden dimension of the projector MLP (default: 2048).
outputDim int: Output dimension for softmax (default: 65536).

Returns

DINO<T>: A configured DINO instance.
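
A call to Create might look like the sketch below. It relies only on the signature shown above; the AiDotNet.Interfaces namespace, the wrapper class DinoSetupSketch, and the encoder output size of 384 are assumptions for illustration. The encoder and the copy function must be supplied by the caller, since this page does not describe how to construct them.

```csharp
using System;
using AiDotNet.Interfaces;               // assumption: namespace containing INeuralNetwork<T>
using AiDotNet.SelfSupervisedLearning;

// Hypothetical wrapper for illustration only.
public static class DinoSetupSketch
{
    public static DINO<float> Build(
        INeuralNetwork<float> encoder,
        Func<INeuralNetwork<float>, INeuralNetwork<float>> createEncoderCopy)
    {
        return DINO<float>.Create(
            encoder,
            createEncoderCopy,
            encoderOutputDim: 384,   // assumption: must match the encoder's actual output size
            projectionDim: 256,      // documented default
            hiddenDim: 2048,         // documented default
            outputDim: 65536);       // documented default
    }
}
```

The last three arguments repeat the documented defaults and could be omitted; they are spelled out here only to show which dimension is which.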

TrainStepCore(Tensor<T>, SSLAugmentationContext<T>?)

Implementation-specific training step logic.

protected override SSLStepResult<T> TrainStepCore(Tensor<T> batch, SSLAugmentationContext<T>? augmentationContext)

Parameters

batch Tensor<T>: The input batch tensor.
augmentationContext SSLAugmentationContext<T>?: Optional augmentation context; may be null.

Returns

SSLStepResult<T>: The result of the training step.
