Class iBOT<T>

Namespace: AiDotNet.SelfSupervisedLearning

Assembly: AiDotNet.dll

iBOT: Image BERT Pre-Training with Online Tokenizer - combining DINO with masked image modeling.

public class iBOT<T> : TeacherStudentSSL<T>, ISSLMethod<T>

Type Parameters

T: The numeric type used for computations.

Inheritance: object → SSLMethodBase<T> → TeacherStudentSSL<T> → iBOT<T>

Implements: ISSLMethod<T>

Inherited Members: TeacherStudentSSL<T>.TeacherEncoder

TeacherStudentSSL<T>.TeacherProjector

TeacherStudentSSL<T>.Centering

TeacherStudentSSL<T>.BaseMomentum

TeacherStudentSSL<T>.Augmentation

TeacherStudentSSL<T>.NumGlobalCrops

TeacherStudentSSL<T>.NumLocalCrops

TeacherStudentSSL<T>.UsesMomentumEncoder

TeacherStudentSSL<T>.RequiresMemoryBank

TeacherStudentSSL<T>.CreateMultiCropViews(Tensor<T>)

TeacherStudentSSL<T>.ForwardStudent(Tensor<T>)

TeacherStudentSSL<T>.ForwardTeacher(Tensor<T>)

TeacherStudentSSL<T>.UpdateTeacher()

TeacherStudentSSL<T>.UpdateStudent(T)

TeacherStudentSSL<T>.OnEpochStart(int)

TeacherStudentSSL<T>.GetAdditionalParameterCount()

TeacherStudentSSL<T>.GetAdditionalParameters()

SSLMethodBase<T>.NumOps

SSLMethodBase<T>.Engine

SSLMethodBase<T>._encoder

SSLMethodBase<T>._projector

SSLMethodBase<T>._config

SSLMethodBase<T>._isTraining

SSLMethodBase<T>._currentStep

SSLMethodBase<T>._currentEpoch

SSLMethodBase<T>.ParameterCount

SSLMethodBase<T>.GetEncoder()

SSLMethodBase<T>.TrainStep(Tensor<T>, SSLAugmentationContext<T>)

SSLMethodBase<T>.Encode(Tensor<T>)

SSLMethodBase<T>.EncodeAndProject(Tensor<T>)

SSLMethodBase<T>.Reset()

SSLMethodBase<T>.GetParameters()

SSLMethodBase<T>.SetParameters(Vector<T>)

SSLMethodBase<T>.SetAdditionalParameters(Vector<T>, ref int)

SSLMethodBase<T>.SetTrainingMode(bool)

SSLMethodBase<T>.OnEpochEnd(int)

SSLMethodBase<T>.GetEffectiveTemperature()

SSLMethodBase<T>.GetEffectiveLearningRate()

SSLMethodBase<T>.CreateStepResult(T)

SSLMethodBase<T>.CosineSimilarity(Tensor<T>, Tensor<T>)

SSLMethodBase<T>.L2Normalize(Tensor<T>)

SSLMethodBase<T>.MatMul(Tensor<T>, Tensor<T>)

SSLMethodBase<T>.ComputeSimilarityMatrix(Tensor<T>, Tensor<T>, bool)

SSLMethodBase<T>.ComputePairwiseDistances(Tensor<T>)

object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Remarks

For Beginners: iBOT combines DINO-style self-distillation with masked image modeling (as in MAE). It masks a subset of patches in the student's view and trains the student to match the teacher's output on both the CLS token (like DINO) and the masked patches (like BERT's masked-token prediction, applied to images).

Key innovations:

Dual objective: CLS token distillation + masked patch prediction
Online tokenizer: The teacher provides the targets for masked patches, so no separately pre-trained tokenizer is needed
Shared architecture: Single network handles both objectives
Better representations: Combines global (CLS) and local (patch) learning

Loss formula:

L = L_cls + λ * L_mim, where L_cls is the DINO loss on the CLS token, L_mim is the masked patch prediction loss, and λ is the MIM weight (mimWeight).
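The formula above can be illustrated with a small standalone sketch. It is not the AiDotNet implementation: the class IbotLossSketch, its method names, and the temperature defaults (0.04 for the teacher, 0.1 for the student) are assumptions made for illustration. It also leaves out teacher centering and the cross-view pairing used in DINO-style training, and works on plain double arrays rather than Tensor<T>.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of AiDotNet.
public static class IbotLossSketch
{
    // Numerically stable softmax with a temperature.
    public static double[] Softmax(double[] logits, double temperature)
    {
        double max = logits.Max();
        double[] exps = logits.Select(v => Math.Exp((v - max) / temperature)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }

    // Cross-entropy H(teacher, student) = -sum_k p_t[k] * log(p_s[k]).
    public static double CrossEntropy(double[] teacherProbs, double[] studentProbs)
    {
        double loss = 0.0;
        for (int k = 0; k < teacherProbs.Length; k++)
        {
            // Small epsilon guards against log(0).
            loss -= teacherProbs[k] * Math.Log(studentProbs[k] + 1e-12);
        }
        return loss;
    }

    // L = L_cls + lambda * L_mim, with L_mim averaged over the masked patches only.
    public static double TotalLoss(
        double[] teacherCls, double[] studentCls,
        double[][] teacherPatches, double[][] studentPatches,
        bool[] masked, double lambda,
        double teacherTemp = 0.04, double studentTemp = 0.1) // assumed temperatures
    {
        double lCls = CrossEntropy(
            Softmax(teacherCls, teacherTemp),
            Softmax(studentCls, studentTemp));

        double lMim = 0.0;
        int count = 0;
        for (int i = 0; i < masked.Length; i++)
        {
            if (!masked[i]) continue; // unmasked patches do not contribute
            lMim += CrossEntropy(
                Softmax(teacherPatches[i], teacherTemp),
                Softmax(studentPatches[i], studentTemp));
            count++;
        }
        if (count > 0) lMim /= count;

        return lCls + lambda * lMim;
    }
}
```

Setting lambda to 0, or masking no patches, reduces the total to the CLS distillation term alone, which is a quick way to sanity-check the weighting.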

Reference: Zhou et al., "iBOT: Image BERT Pre-Training with Online Tokenizer" (ICLR 2022)

Constructors

iBOT(INeuralNetwork<T>, IMomentumEncoder<T>, IProjectorHead<T>, IProjectorHead<T>, int, double, double, SSLConfig?)

Initializes a new instance of the iBOT class.

public iBOT(INeuralNetwork<T> studentEncoder, IMomentumEncoder<T> teacherEncoder, IProjectorHead<T> studentProjector, IProjectorHead<T> teacherProjector, int outputDim = 8192, double mimWeight = 1, double maskRatio = 0.4, SSLConfig? config = null)

Parameters

studentEncoder INeuralNetwork<T>: The student encoder. Must be a Vision Transformer (ViT).
teacherEncoder IMomentumEncoder<T>: The teacher encoder (momentum-updated copy).
studentProjector IProjectorHead<T>: Projection head for student.
teacherProjector IProjectorHead<T>: Projection head for teacher.
outputDim int: Output dimension of the projection heads.
mimWeight double: Weight for masked image modeling loss (default: 1.0).
maskRatio double: Ratio of patches to mask (default: 0.4).
config SSLConfig: Optional SSL configuration.
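The teacher encoder is described above as a momentum-updated copy of the student. The exact update rule used by IMomentumEncoder<T> is not shown on this page; the sketch below assumes the standard exponential moving average used in DINO and iBOT, applied to flat parameter arrays. The class name EmaSketch and the method EmaUpdate are hypothetical.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class EmaSketch
{
    // In-place EMA: teacher[i] = m * teacher[i] + (1 - m) * student[i].
    public static void EmaUpdate(double[] teacher, double[] student, double momentum)
    {
        if (teacher.Length != student.Length)
        {
            throw new ArgumentException("Teacher and student must have the same number of parameters.");
        }
        if (momentum < 0.0 || momentum > 1.0)
        {
            throw new ArgumentOutOfRangeException(nameof(momentum), "Momentum must be in [0, 1].");
        }

        for (int i = 0; i < teacher.Length; i++)
        {
            teacher[i] = momentum * teacher[i] + (1.0 - momentum) * student[i];
        }
    }
}
```

With a momentum close to 1 the teacher changes slowly, which is what makes it a stable source of targets for the student.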

Properties

MIMWeight

Gets the weight for masked image modeling loss.

public double MIMWeight { get; }

Property Value

double

MaskRatio

Gets the mask ratio for patches.

public double MaskRatio { get; }

Property Value

double
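To make the mask ratio concrete, here is a minimal sketch of how a ratio could be turned into a per-patch mask. It assumes uniform random masking with a rounded patch count; the masking strategy the library actually uses is not documented on this page and may differ (for example, block-wise masking). PatchMaskSketch and SampleMask are hypothetical names.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class PatchMaskSketch
{
    // Returns a mask with round(numPatches * maskRatio) entries set to true,
    // chosen uniformly at random without replacement.
    public static bool[] SampleMask(int numPatches, double maskRatio, Random rng)
    {
        if (numPatches < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(numPatches));
        }
        if (maskRatio < 0.0 || maskRatio > 1.0)
        {
            throw new ArgumentOutOfRangeException(nameof(maskRatio), "Mask ratio must be in [0, 1].");
        }

        int numMasked = (int)Math.Round(numPatches * maskRatio);

        int[] indices = new int[numPatches];
        for (int i = 0; i < numPatches; i++) indices[i] = i;

        // Partial Fisher-Yates shuffle: the first numMasked slots end up
        // holding a uniform random sample of distinct patch indices.
        bool[] mask = new bool[numPatches];
        for (int i = 0; i < numMasked; i++)
        {
            int j = rng.Next(i, numPatches);
            (indices[i], indices[j]) = (indices[j], indices[i]);
            mask[indices[i]] = true;
        }
        return mask;
    }
}
```

For a 14 x 14 patch grid (196 patches) and the default ratio of 0.4, this sketch masks 78 patches per image.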

Name

Gets the name of this SSL method.

public override string Name { get; }

Property Value

string

Remarks

Examples: "SimCLR", "MoCo v2", "BYOL", "DINO", "MAE"

Methods

Create(INeuralNetwork<T>, Func<INeuralNetwork<T>, INeuralNetwork<T>>, int, int, int, double, double)

Creates an iBOT instance with default configuration.

public static iBOT<T> Create(INeuralNetwork<T> encoder, Func<INeuralNetwork<T>, INeuralNetwork<T>> createEncoderCopy, int encoderOutputDim, int outputDim = 8192, int hiddenDim = 2048, double mimWeight = 1, double maskRatio = 0.4)

Parameters

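encoder INeuralNetwork<T>: The encoder network.
createEncoderCopy Func<INeuralNetwork<T>, INeuralNetwork<T>>: Function that returns a copy of a given encoder.
encoderOutputDim int: Output dimension of the encoder.
outputDim int: Output dimension of the projection heads (default: 8192).
hiddenDim int: Hidden dimension (default: 2048).
mimWeight double: Weight for masked image modeling loss (default: 1.0).
maskRatio double: Ratio of patches to mask (default: 0.4).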

Returns

iBOT<T>

TrainStepCore(Tensor<T>, SSLAugmentationContext<T>?)

Implementation-specific training step logic.

protected override SSLStepResult<T> TrainStepCore(Tensor<T> batch, SSLAugmentationContext<T>? augmentationContext)

Parameters

batch Tensor<T>: The input batch tensor.
augmentationContext SSLAugmentationContext<T>: Optional augmentation context.

Returns

SSLStepResult<T>: The result of the training step.
