Class QuantizedTeacherModel<T>

Namespace: AiDotNet.KnowledgeDistillation.Teachers

Assembly: AiDotNet.dll

Quantized teacher model with reduced precision for efficient deployment.

public class QuantizedTeacherModel<T> : TeacherModelBase<Vector<T>, Vector<T>, T>, ITeacherModel<Vector<T>, Vector<T>>, IJitCompilable<T>

Type Parameters

T

Inheritance: object -> TeacherModelBase<Vector<T>, Vector<T>, T> -> QuantizedTeacherModel<T>

Implements: ITeacherModel<Vector<T>, Vector<T>>, IJitCompilable<T>

Inherited Members: TeacherModelBase<Vector<T>, Vector<T>, T>.NumOps

TeacherModelBase<Vector<T>, Vector<T>, T>.OutputDimension

TeacherModelBase<Vector<T>, Vector<T>, T>.GetLogits(Vector<T>)

TeacherModelBase<Vector<T>, Vector<T>, T>.SupportsJitCompilation

TeacherModelBase<Vector<T>, Vector<T>, T>.ExportComputationGraph(List<ComputationNode<T>>)

TeacherModelBase<Vector<T>, Vector<T>, T>.CheckWrappedModelJitSupport(ITeacherModel<Vector<T>, Vector<T>>)

TeacherModelBase<Vector<T>, Vector<T>, T>.DelegateJitExport(ITeacherModel<Vector<T>, Vector<T>>, List<ComputationNode<T>>, string)

TeacherModelBase<Vector<T>, Vector<T>, T>.ThrowJitNotSupported(string, string)

TeacherModelBase<Vector<T>, Vector<T>, T>.ValidateInput(Vector<T>, string)

TeacherModelBase<Vector<T>, Vector<T>, T>.Softmax(Vector<T>, double)

TeacherModelBase<Vector<T>, Vector<T>, T>.ApplyTemperatureSoftmax(Vector<T>, double)

object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Remarks

For Beginners: Quantization reduces the numerical precision of model weights and activations to use fewer bits (e.g., 8-bit instead of 32-bit floating point). This enables:

Smaller model size
Faster inference on hardware with integer support
Reduced memory bandwidth requirements

JIT Support: When constructed with an IJitCompilable base model, this teacher supports JIT compilation using FakeQuantization with a Straight-Through Estimator (STE). The STE treats the non-differentiable rounding step as the identity in the backward pass, so gradients can flow through the quantized model during training while the forward pass still simulates quantization effects.
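The two ideas above (reduced precision in the forward pass, identity gradient in the backward pass) can be sketched in a few lines. This is a minimal illustration on plain double values, not the AiDotNet implementation; the class FakeQuantSketch and its methods Forward and Backward are hypothetical names, and signed clamping to [-2^(bits-1), 2^(bits-1) - 1] is an assumption.

```csharp
using System;

// Hypothetical sketch of fake quantization; not part of AiDotNet.
public static class FakeQuantSketch
{
    // Forward pass: snap x onto a signed integer grid, then map it back
    // to floating point so downstream code still sees real numbers.
    public static double Forward(double x, double scale, int bits)
    {
        double qMin = -Math.Pow(2, bits - 1);     // -128 for 8 bits
        double qMax = Math.Pow(2, bits - 1) - 1;  //  127 for 8 bits

        // Math.Round rounds midpoints to the nearest even integer by default.
        double q = Math.Clamp(Math.Round(x / scale), qMin, qMax);
        return q * scale;
    }

    // Backward pass under a straight-through estimator: the rounding step
    // is treated as the identity, so the gradient passes through unchanged.
    // (Assumption: some STE variants also zero the gradient outside the
    // clamp range; that refinement is omitted here.)
    public static double Backward(double upstreamGradient) => upstreamGradient;
}
```

With scale = 1.0 / 128 and bits = 8, Forward(0.3, ...) lands on grid step 38 and returns 0.296875, while an out-of-range input such as 2.0 is clamped to step 127.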

Constructors

QuantizedTeacherModel(IJitCompilable<T>, int, int, T?, T?, bool)

Initializes a new instance of QuantizedTeacherModel wrapping a JIT-compilable model.

public QuantizedTeacherModel(IJitCompilable<T> jitCompilableBase, int outputDimension, int quantizationBits = 8, T? scale = default, T? zeroPoint = default, bool symmetric = true)

Parameters

jitCompilableBase IJitCompilable<T>: The JIT-compilable base model to quantize.
outputDimension int: Output dimension of the model.
quantizationBits int: Number of bits for quantization (1-32).
scale T: Scale factor for quantization. If left at its default, 1/(2^(bits-1)) is used (for example, 1/128 for 8 bits).
zeroPoint T: Zero point for asymmetric quantization. Default is 0.
symmetric bool: Whether to use symmetric quantization (centered at 0).

Remarks

JIT Support: This constructor enables JIT compilation using FakeQuantization with a Straight-Through Estimator (STE). The scale and zero point are fixed at construction time, which allows the graph to be compiled statically.

Symmetric vs Asymmetric:

Symmetric: Range is [-max, max], zero point is 0. Good for weights.
Asymmetric: Range is [min, max], zero point may be non-zero. Good for activations whose values are not centered on zero (for example, after a bias term).
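To make the difference concrete, the sketch below derives a scale and zero point from a value range in each mode. It illustrates the general technique only; the formulas used inside QuantizedTeacherModel<T> are not given on this page, and the class QuantParamsSketch and its methods are hypothetical.

```csharp
using System;

// Hypothetical sketch; assumes 2 <= bits <= 31 and max > min.
public static class QuantParamsSketch
{
    // Symmetric: the range [-maxAbs, maxAbs] maps onto signed integers,
    // and the zero point is always 0.
    public static (double Scale, double ZeroPoint) Symmetric(double maxAbs, int bits)
    {
        double qMax = Math.Pow(2, bits - 1) - 1;  // 127 for 8 bits
        return (maxAbs / qMax, 0.0);
    }

    // Asymmetric: the range [min, max] maps onto [0, 2^bits - 1], and the
    // zero point is the integer that represents the real value 0.
    public static (double Scale, double ZeroPoint) Asymmetric(double min, double max, int bits)
    {
        double qMax = Math.Pow(2, bits) - 1;      // 255 for 8 bits
        double scale = (max - min) / qMax;
        double zeroPoint = Math.Round(-min / scale);
        return (scale, zeroPoint);
    }
}
```

For the range [-1, 3] at 8 bits, the asymmetric mapping gives a scale of 4/255 and a zero point of 64, so the full integer range is used even though the values are not centered on zero.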

QuantizedTeacherModel(ITeacherModel<Vector<T>, Vector<T>>, int)

Initializes a new instance of QuantizedTeacherModel wrapping a teacher interface.

public QuantizedTeacherModel(ITeacherModel<Vector<T>, Vector<T>> baseTeacher, int quantizationBits = 8)

Parameters

baseTeacher ITeacherModel<Vector<T>, Vector<T>>: The base teacher model to quantize.
quantizationBits int: Number of bits for quantization (1-32).

Remarks

This constructor uses dynamic quantization, which finds the min/max range per batch at runtime and therefore does not support JIT compilation. Use the constructor that takes an IJitCompilable model for JIT support.
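A rough sketch of dynamic quantization on a plain array shows why it resists static compilation: the scale depends on the minimum and maximum of each input. The class DynamicQuantSketch is hypothetical, works on double[] rather than Vector<T>, and assumes a non-empty input; it is not the library's code.

```csharp
using System;
using System.Linq;

// Hypothetical sketch of per-input (dynamic) quantization.
public static class DynamicQuantSketch
{
    public static double[] QuantizeDequantize(double[] logits, int bits)
    {
        // The range is discovered from the data at call time.
        double min = logits.Min();
        double max = logits.Max();

        // A constant vector has no range to divide; return a copy unchanged.
        if (max == min) return (double[])logits.Clone();

        double levels = Math.Pow(2, bits) - 1;   // number of steps between min and max
        double scale = (max - min) / levels;

        // Snap each value to the nearest step, then map back to real values.
        return logits
            .Select(v => min + Math.Round((v - min) / scale) * scale)
            .ToArray();
    }
}
```

Because min, max, and scale change with every call, they cannot be baked into a graph as constants, which is the limitation this constructor's remarks describe.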

Properties

OutputDimension

Gets the output dimension of the teacher model.

public override int OutputDimension { get; }

Property Value

int

SupportsJitCompilation

Gets whether this teacher supports JIT compilation.

public override bool SupportsJitCompilation { get; }

Property Value

bool: true if constructed with an IJitCompilable model that supports JIT; false if using dynamic quantization with runtime min/max finding.

Methods

ExportComputationGraph(List<ComputationNode<T>>)

Exports the computation graph for JIT compilation with FakeQuantization.

public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)

Parameters

inputNodes List<ComputationNode<T>>: List to populate with input nodes.

Returns

ComputationNode<T>: The output computation node with quantization applied.

Remarks

When constructed with an IJitCompilable model, this method exports the base model's computation graph and wraps the output in a FakeQuantization operation. FakeQuantization uses a Straight-Through Estimator (STE) for gradients, which allows backpropagation through the quantization operation.

When using dynamic quantization (per-batch min/max), JIT compilation is not supported because the quantization parameters are computed at runtime.

Exceptions

NotSupportedException: Thrown when using dynamic quantization mode.
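One way to avoid the exception is to check SupportsJitCompilation before exporting. The sketch below shows that guard pattern using stand-in types so that it compiles without the library: IGraphExporter, DynamicModeTeacher, and ExportGuardSketch are hypothetical, and the graph is represented by a string rather than ComputationNode<T>.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in that mirrors only the two members named on this page.
public interface IGraphExporter
{
    bool SupportsJitCompilation { get; }
    string ExportComputationGraph(List<string> inputNodes);
}

// Stand-in for a teacher built in dynamic quantization mode.
public sealed class DynamicModeTeacher : IGraphExporter
{
    public bool SupportsJitCompilation => false;

    public string ExportComputationGraph(List<string> inputNodes) =>
        throw new NotSupportedException("Dynamic quantization cannot be exported.");
}

public static class ExportGuardSketch
{
    // Returns false instead of throwing when the teacher cannot be exported.
    public static bool TryExport(IGraphExporter teacher, List<string> inputNodes, out string graph)
    {
        graph = string.Empty;
        if (!teacher.SupportsJitCompilation) return false;

        graph = teacher.ExportComputationGraph(inputNodes);
        return true;
    }
}
```

Checking the property first keeps the fallback path (for example, running the teacher without JIT) free of exception handling.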

GetLogits(Vector<T>)

Gets quantized logits from the teacher model.

public override Vector<T> GetLogits(Vector<T> input)

Parameters

input Vector<T>: Input to the model.

Returns

Vector<T>: Quantized logits.
