Class SEALOptions<T, TInput, TOutput>

Namespace: AiDotNet.MetaLearning.Options

Assembly: AiDotNet.dll

Configuration options for the SEAL (Sample-Efficient Adaptive Learning) meta-learning algorithm.

public class SEALOptions<T, TInput, TOutput> : IMetaLearnerOptions<T>

Type Parameters

T: The numeric data type used for calculations (e.g., float, double).
TInput: The input data type (e.g., Matrix<T>, Tensor<T>).
TOutput: The output data type (e.g., Vector<T>, Tensor<T>).

Inheritance: object → SEALOptions<T, TInput, TOutput>

Implements: IMetaLearnerOptions<T>

Inherited Members:

object.Equals(object)
object.Equals(object, object)
object.GetHashCode()
object.GetType()
object.MemberwiseClone()
object.ReferenceEquals(object, object)
object.ToString()

Remarks

SEAL is a gradient-based meta-learning algorithm that combines ideas from MAML with sample-efficiency improvements. It learns initial parameters that can be quickly adapted to new tasks with just a few examples, incorporating temperature scaling, entropy regularization, and optional adaptive learning rates.

For Beginners: SEAL learns the best starting point for a model so that it can quickly adapt to new tasks with minimal data. Think of it like learning how to learn - after seeing many tasks, the model knows how to pick up new skills quickly.

Imagine learning to play musical instruments:

Learning your first instrument (piano) is hard
Learning your second instrument (guitar) is easier
By your 5th instrument, you've learned principles that help you pick up any new instrument much faster

SEAL does the same with machine learning models!

Key features of SEAL:

- Temperature scaling: Controls confidence in predictions during meta-training
- Entropy regularization: Encourages diverse predictions to prevent overconfident models
- Adaptive learning rates: Per-parameter learning rate adaptation based on gradient norms
- Weight decay: Prevents overfitting to meta-training tasks

Constructors

SEALOptions(IFullModel<T, TInput, TOutput>)

Initializes a new instance of the SEALOptions class with the required meta-model.

public SEALOptions(IFullModel<T, TInput, TOutput> metaModel)

Parameters

metaModel IFullModel<T, TInput, TOutput>: The meta-model to be trained (required).

Examples

// Create SEAL options with minimal configuration (uses all defaults)
var options = new SEALOptions<double, Matrix<double>, Vector<double>>(myNeuralNetwork);
var seal = new SEALAlgorithm<double, Matrix<double>, Vector<double>>(options);

// Create SEAL options with entropy regularization for better generalization
var entropyOptions = new SEALOptions<double, Matrix<double>, Vector<double>>(myNeuralNetwork)
{
    EntropyCoefficient = 0.01,
    Temperature = 1.5,
    UseAdaptiveInnerLR = true
};

// Create SEAL options with weight decay and gradient clipping
var regularizedOptions = new SEALOptions<double, Matrix<double>, Vector<double>>(myNeuralNetwork)
{
    WeightDecay = 0.001,
    GradientClipThreshold = 5.0,
    AdaptationSteps = 10
};

Exceptions

ArgumentNullException: Thrown when metaModel is null.

Properties

AdaptationSteps

Gets or sets the number of gradient steps to take during inner loop adaptation.

public int AdaptationSteps { get; set; }

Property Value

int: Default: 5 (standard for few-shot learning scenarios).

Remarks

More adaptation steps allow better task-specific fitting but increase computational cost and memory usage (especially for second-order methods).

For Beginners: This is how many practice rounds you get on each task before being tested. More practice usually helps, but takes more time.

AdaptiveLearningRateDecay

Gets or sets the decay rate of the running mean used by adaptive learning rates.

public double AdaptiveLearningRateDecay { get; set; }

Property Value

double: Default: 0.99 (slow decay for stable estimates).

Remarks

Only used when AdaptiveLearningRateMode is RunningMean. Higher values give more weight to historical gradients (more stable but slower to adapt).
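The effect of the decay value can be seen in a minimal sketch of an RMSprop-style running mean. The update rule below is the standard exponential moving average and is an assumption about how RunningMean behaves; `UpdateRunningMean` is a hypothetical helper, not part of AiDotNet.

```csharp
using System;

// Hypothetical helper: exponential moving average of squared gradients.
// A decay close to 1.0 keeps most of the history and adds little of the new gradient.
double UpdateRunningMean(double runningMean, double gradient, double decay)
{
    return decay * runningMean + (1.0 - decay) * gradient * gradient;
}

double runningMean = 0.0;
foreach (double gradient in new[] { 1.0, 1.0, 1.0 })
{
    runningMean = UpdateRunningMean(runningMean, gradient, decay: 0.99);
}

// The running mean then scales the base learning rate (0.01 here, epsilon 1e-8).
double rate = 0.01 / (Math.Sqrt(runningMean) + 1e-8);
Console.WriteLine($"running mean = {runningMean}, rate = {rate}");
```

With a decay of 0.99 the running mean is still far below the true squared gradient of 1.0 after three steps, which is the "slower to adapt" behaviour described above.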

AdaptiveLearningRateEpsilon

Gets or sets the epsilon value for numerical stability in adaptive learning rates.

public double AdaptiveLearningRateEpsilon { get; set; }

Property Value

double: Default: 1e-8 (small value to prevent division by zero).

Remarks

Used in the denominator when computing adaptive learning rates to prevent division by zero:

lr = base_lr / (sqrt(squared_grad) + epsilon)
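As a rough illustration of the formula, the sketch below computes one rate per parameter. `AdaptiveRates` is a hypothetical helper written for this example and does not reflect the library's internal code.

```csharp
using System;

// Hypothetical helper: lr_i = baseLr / (sqrt(g_i^2) + epsilon) for each parameter.
double[] AdaptiveRates(double[] gradients, double baseLr, double epsilon)
{
    var rates = new double[gradients.Length];
    for (int i = 0; i < gradients.Length; i++)
    {
        double squaredGrad = gradients[i] * gradients[i];
        rates[i] = baseLr / (Math.Sqrt(squaredGrad) + epsilon);
    }
    return rates;
}

// The third gradient is exactly zero; epsilon keeps its rate finite.
double[] rates = AdaptiveRates(new[] { 2.0, -0.5, 0.0 }, baseLr: 0.01, epsilon: 1e-8);
Console.WriteLine(string.Join(", ", rates));
```

Larger gradients receive smaller rates. For the zero gradient the rate is large but finite, and the resulting step (rate times gradient) is still zero.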

AdaptiveLearningRateMode

Gets or sets the mode for adaptive learning rate computation.

public SEALAdaptiveLearningRateMode AdaptiveLearningRateMode { get; set; }

Property Value

SEALAdaptiveLearningRateMode: Default: GradientNorm (scale by inverse gradient norm).

Remarks

Different modes for computing adaptive learning rates:

- GradientNorm: lr = base_lr / (sqrt(grad^2) + epsilon) [AdaGrad-like]
- RunningMean: Maintains an exponential moving average of squared gradients [RMSprop-like]
- PerLayer: Applies the same adaptive rate to all parameters in each layer

CheckpointFrequency

Gets or sets how often to save checkpoints.

public int CheckpointFrequency { get; set; }

Property Value

int: Default: 500.

DataLoader

Gets or sets the episodic data loader for sampling tasks.

public IEpisodicDataLoader<T, TInput, TOutput>? DataLoader { get; set; }

Property Value

IEpisodicDataLoader<T, TInput, TOutput>: Default: null (tasks must be provided manually to MetaTrain).

EnableCheckpointing

Gets or sets whether to save checkpoints during training.

public bool EnableCheckpointing { get; set; }

Property Value

bool: Default: false.

EntropyCoefficient

Gets or sets the entropy regularization coefficient.

public double EntropyCoefficient { get; set; }

Property Value

double: Default: 0.0 (no entropy regularization).

Remarks

Entropy regularization adds a bonus for diverse predictions:

Loss = Original_Loss - EntropyCoefficient * Entropy(predictions)

Higher values encourage the model to be less confident and more exploratory, which can help prevent overfitting on meta-training tasks.

For Beginners: Entropy measures how "spread out" the model's predictions are. By adding entropy to the objective, we encourage the model to not be too confident, which helps it generalize better to new tasks.
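The regularized loss can be sketched as follows, assuming the predictions are class probabilities and entropy is the Shannon entropy in nats. `Entropy` and `RegularizedLoss` are hypothetical helpers for illustration only.

```csharp
using System;
using System.Linq;

// Hypothetical helper: Shannon entropy, -sum(p * ln p), skipping zero probabilities.
double Entropy(double[] probabilities)
{
    return -probabilities.Where(p => p > 0.0).Sum(p => p * Math.Log(p));
}

// Loss = Original_Loss - EntropyCoefficient * Entropy(predictions)
double RegularizedLoss(double originalLoss, double[] probabilities, double entropyCoefficient)
{
    return originalLoss - entropyCoefficient * Entropy(probabilities);
}

// Same base loss, different prediction shapes.
double confident = RegularizedLoss(0.5, new[] { 0.98, 0.01, 0.01 }, entropyCoefficient: 0.01);
double spread = RegularizedLoss(0.5, new[] { 0.4, 0.3, 0.3 }, entropyCoefficient: 0.01);
Console.WriteLine($"confident: {confident}, spread: {spread}");
```

The spread-out prediction has higher entropy, so it ends up with the lower regularized loss. This is the sense in which the term rewards less confident predictions.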

EntropyOnlyDuringMetaTrain

Gets or sets whether to apply entropy regularization only during meta-training.

public bool EntropyOnlyDuringMetaTrain { get; set; }

Property Value

bool: Default: true (entropy regularization disabled during adaptation).

Remarks

When true, entropy regularization is only applied during meta-training (outer loop) and not during task adaptation (inner loop). This is often desirable as we want focused adaptation during the inner loop.

EvaluationFrequency

Gets or sets how often to evaluate during meta-training.

public int EvaluationFrequency { get; set; }

Property Value

int: Default: 100 (evaluate every 100 iterations).

EvaluationTasks

Gets or sets the number of tasks to use for evaluation.

public int EvaluationTasks { get; set; }

Property Value

int: Default: 100.

GradientClipThreshold

Gets or sets the maximum gradient norm for gradient clipping.

public double? GradientClipThreshold { get; set; }

Property Value

double?: Default: 10.0 (prevents exploding gradients).

Remarks

Gradient clipping limits the magnitude of gradients during training, preventing numerical instability from exploding gradients.
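A minimal sketch of clipping by global L2 norm is shown below. Whether SEAL clips by global norm or per element is not stated here, so treat the approach and the `ClipByNorm` helper as assumptions.

```csharp
using System;
using System.Linq;

// Hypothetical helper: rescale the gradient vector so its L2 norm is at most maxNorm.
double[] ClipByNorm(double[] gradients, double maxNorm)
{
    double norm = Math.Sqrt(gradients.Sum(g => g * g));
    if (norm <= maxNorm)
    {
        return gradients; // already within the threshold, leave unchanged
    }
    double scale = maxNorm / norm;
    return gradients.Select(g => g * scale).ToArray();
}

// The input has norm 50, so it is scaled down to norm 10 (roughly 6 and 8).
double[] clipped = ClipByNorm(new[] { 30.0, 40.0 }, maxNorm: 10.0);
Console.WriteLine(string.Join(", ", clipped));
```

Clipping by norm preserves the direction of the gradient and only limits its length.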

InnerLearningRate

Gets or sets the learning rate for the inner loop (task adaptation).

public double InnerLearningRate { get; set; }

Property Value

double: Default: 0.01 (standard for meta-learning inner loops).

Remarks

This controls how quickly the model adapts to each task during the inner loop. A smaller value leads to more stable but slower adaptation; larger values can cause unstable training.

For Beginners: This is like the size of the steps you take when learning a specific task. Smaller steps are safer but slower; bigger steps are faster but might overshoot the optimal solution.

InnerOptimizer

Gets or sets the optimizer for inner-loop adaptation.

public IGradientBasedOptimizer<T, TInput, TOutput>? InnerOptimizer { get; set; }

Property Value

IGradientBasedOptimizer<T, TInput, TOutput>: Default: null (uses built-in SGD optimizer with InnerLearningRate).

Remarks

The inner optimizer performs task-specific adaptation on the support set. SGD is typically used for simplicity and computational efficiency.

LossFunction

Gets or sets the loss function for training.

public ILossFunction<T>? LossFunction { get; set; }

Property Value

ILossFunction<T>: Default: null (uses model's default loss function if available).

MetaBatchSize

Gets or sets the number of tasks to sample per meta-training iteration.

public int MetaBatchSize { get; set; }

Property Value

int: Default: 4 (typical meta-batch size).

Remarks

Larger batch sizes provide more stable gradient estimates but require more memory and computation per iteration.

MetaModel

Gets or sets the meta-model to be trained. This is the only required property.

public IFullModel<T, TInput, TOutput> MetaModel { get; set; }

Property Value

IFullModel<T, TInput, TOutput>

Remarks

The model must implement IFullModel to support parameter getting/setting and gradient computation required for SEAL's meta-learning process.

MetaOptimizer

Gets or sets the optimizer for meta-parameter updates (outer loop).

public IGradientBasedOptimizer<T, TInput, TOutput>? MetaOptimizer { get; set; }

Property Value

IGradientBasedOptimizer<T, TInput, TOutput>: Default: null (uses built-in Adam optimizer with OuterLearningRate).

Remarks

The meta-optimizer updates the initial model parameters based on performance across all tasks in the meta-batch. Adam is recommended for stable convergence.

MinTemperature

Gets or sets the minimum temperature for temperature annealing.

public double MinTemperature { get; set; }

Property Value

double: Default: 1.0 (no annealing, constant temperature).

Remarks

When MinTemperature is less than Temperature, the temperature will linearly decrease from Temperature to MinTemperature over the course of training. This allows the model to be more exploratory early and more confident later.
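One way to picture the linear schedule is the sketch below. The exact interpolation SEAL uses (for example, whether the final iteration lands exactly on MinTemperature) is an assumption, and `AnnealedTemperature` is a hypothetical helper.

```csharp
using System;

// Hypothetical helper: linear interpolation from temperature to minTemperature.
double AnnealedTemperature(double temperature, double minTemperature, int iteration, int numMetaIterations)
{
    // No annealing when the minimum is not below the starting temperature.
    if (minTemperature >= temperature || numMetaIterations <= 1)
    {
        return temperature;
    }
    double progress = Math.Clamp((double)iteration / (numMetaIterations - 1), 0.0, 1.0);
    return temperature + (minTemperature - temperature) * progress;
}

// Temperature = 2.0 annealed to MinTemperature = 1.0 over 1000 iterations.
double start = AnnealedTemperature(2.0, 1.0, iteration: 0, numMetaIterations: 1000);
double middle = AnnealedTemperature(2.0, 1.0, iteration: 500, numMetaIterations: 1000);
double end = AnnealedTemperature(2.0, 1.0, iteration: 999, numMetaIterations: 1000);
Console.WriteLine($"{start}, {middle}, {end}");
```

With the defaults (Temperature and MinTemperature both 1.0) the function returns a constant, matching the "no annealing" default.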

NumMetaIterations

Gets or sets the total number of meta-training iterations to perform.

public int NumMetaIterations { get; set; }

Property Value

int: Default: 1000 (typical meta-training length).

OuterLearningRate

Gets or sets the learning rate for the outer loop (meta-optimization).

public double OuterLearningRate { get; set; }

Property Value

double: Default: 0.001 (typically 10x smaller than inner rate).

Remarks

This controls how the meta-parameters (initial model weights) are updated based on performance across all tasks. Typically smaller than inner learning rate for stable meta-learning.

RandomSeed

Gets or sets the random seed for reproducibility.

public int? RandomSeed { get; set; }

Property Value

int?: Default: null (non-deterministic).

Temperature

Gets or sets the temperature scaling factor for the loss function.

public double Temperature { get; set; }

Property Value

double: Default: 1.0 (no temperature scaling).

Remarks

Temperature scaling divides the loss by the temperature value:

- Temperature > 1.0: Softens predictions, reduces confidence
- Temperature < 1.0: Sharpens predictions, increases confidence
- Temperature = 1.0: No effect (standard loss computation)

For Beginners: Temperature is like adjusting how "certain" the model should be about its predictions:

- High temperature: Model becomes more humble, spreads probability across options
- Low temperature: Model becomes more confident, concentrates probability on top choice
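The intuition about spreading or concentrating probability can be illustrated with a generic temperature-scaled softmax. This is a textbook illustration only: as described above, SEAL applies the temperature to the loss, and `SoftmaxWithTemperature` is a hypothetical helper rather than a library function.

```csharp
using System;
using System.Linq;

// Hypothetical helper: softmax over logits divided by the temperature.
double[] SoftmaxWithTemperature(double[] logits, double temperature)
{
    double max = logits.Max(); // subtract the max for numerical stability
    double[] exps = logits.Select(z => Math.Exp((z - max) / temperature)).ToArray();
    double sum = exps.Sum();
    return exps.Select(e => e / sum).ToArray();
}

double[] logits = { 2.0, 1.0, 0.1 };
double[] soft = SoftmaxWithTemperature(logits, temperature: 2.0);  // flatter distribution
double[] sharp = SoftmaxWithTemperature(logits, temperature: 0.5); // more peaked distribution
Console.WriteLine($"top probability at T=2.0: {soft[0]}, at T=0.5: {sharp[0]}");
```

The same logits give a lower top probability at the higher temperature and a higher one at the lower temperature.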

UseAdaptiveInnerLR

Gets or sets whether to use adaptive inner learning rates.

public bool UseAdaptiveInnerLR { get; set; }

Property Value

bool: Default: false (use fixed inner learning rate).

Remarks

When enabled, SEAL computes per-parameter learning rates based on gradient norms during inner loop adaptation. Parameters with larger gradients get smaller learning rates (similar to AdaGrad's approach).

For Beginners: Instead of using the same step size for all parameters, adaptive learning rates adjust the step size for each parameter based on how much it has been changing. Parameters that change a lot get smaller steps to prevent overshooting.

UseFirstOrder

Gets or sets whether to use first-order approximation (FOMAML-style).

public bool UseFirstOrder { get; set; }

Property Value

bool: Default: true (recommended for computational efficiency).

Remarks

First-order approximation ignores second-order gradient terms, so the meta-gradient is computed without differentiating through the inner-loop updates. This substantially reduces computation and memory use, usually with minimal performance loss. Most production implementations use first-order methods.

For Beginners: When true, SEAL uses a faster but slightly less accurate way to compute gradients. In practice, this works almost as well as the exact method but is much faster.
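For a single inner-loop step, the difference between the exact meta-gradient and the first-order approximation can be written out as below. The notation is the standard MAML formulation and is introduced here for illustration: θ are the meta-parameters, θ' the adapted parameters, α the inner learning rate, and the support and query losses are those of one task. It is not taken from the SEAL source.

```latex
\begin{aligned}
\theta' &= \theta - \alpha \nabla_{\theta} \mathcal{L}_{\text{support}}(\theta) \\
\nabla_{\theta} \mathcal{L}_{\text{query}}(\theta')
  &= \left(I - \alpha \nabla_{\theta}^{2} \mathcal{L}_{\text{support}}(\theta)\right)
     \nabla_{\theta'} \mathcal{L}_{\text{query}}(\theta')
  && \text{(exact, second-order)} \\
\nabla_{\theta} \mathcal{L}_{\text{query}}(\theta')
  &\approx \nabla_{\theta'} \mathcal{L}_{\text{query}}(\theta')
  && \text{(first-order)}
\end{aligned}
```

The first-order form drops the Hessian term, which is why it needs no second derivatives of the support loss.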

WeightDecay

Gets or sets the weight decay (L2 regularization) coefficient.

public double WeightDecay { get; set; }

Property Value

double: Default: 0.0 (no weight decay).

Remarks

Weight decay adds a penalty proportional to the squared magnitude of weights:

Gradient = Original_Gradient + WeightDecay * Parameters

This helps prevent overfitting by keeping weights small.

For Beginners: Weight decay is like a "simplicity penalty" that discourages the model from having very large weights. Large weights often indicate overfitting, so keeping them small helps the model generalize.
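The gradient adjustment can be sketched element by element as follows. `ApplyWeightDecay` is a hypothetical helper used only to illustrate the formula.

```csharp
using System;

// Hypothetical helper: Gradient = Original_Gradient + WeightDecay * Parameters
double[] ApplyWeightDecay(double[] gradients, double[] parameters, double weightDecay)
{
    if (gradients.Length != parameters.Length)
    {
        throw new ArgumentException("gradients and parameters must have the same length");
    }
    var result = new double[gradients.Length];
    for (int i = 0; i < gradients.Length; i++)
    {
        result[i] = gradients[i] + weightDecay * parameters[i];
    }
    return result;
}

// Large weights (10.0 and -4.0) each add a term that pushes them toward zero.
double[] decayed = ApplyWeightDecay(new[] { 0.5, -0.2 }, new[] { 10.0, -4.0 }, weightDecay: 0.001);
Console.WriteLine(string.Join(", ", decayed));
```

Because the added term has the same sign as the parameter, a gradient-descent step always shrinks the parameter slightly, independent of the task loss.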

Methods

Clone()

Creates a deep copy of the SEAL options.

public IMetaLearnerOptions<T> Clone()

Returns

IMetaLearnerOptions<T>: A new SEALOptions instance with the same configuration values.

IsValid()

Validates that all SEAL configuration options are properly set.

public bool IsValid()

Returns

bool: True if the configuration is valid for SEAL training; otherwise, false.

Remarks

Checks all required hyperparameters including SEAL-specific ones:

- Standard meta-learning parameters (learning rates, steps, etc.)
- Temperature must be positive
- Entropy coefficient must be non-negative
- Weight decay must be non-negative
- Adaptive learning rate parameters must be valid
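A typical pattern is to validate the options before training starts. The fragment below is a sketch in the style of the constructor examples: it assumes `myNeuralNetwork` is an existing IFullModel<double, Matrix<double>, Vector<double>> instance and that the namespaces for Matrix<T> and Vector<T> are already imported.

```csharp
using System;
using AiDotNet.MetaLearning.Options;

// Assumption: myNeuralNetwork is defined elsewhere.
var options = new SEALOptions<double, Matrix<double>, Vector<double>>(myNeuralNetwork)
{
    Temperature = 2.0,
    MinTemperature = 1.0,
    EntropyCoefficient = 0.01
};

// IsValid() reports problems through its return value, so check it explicitly.
if (!options.IsValid())
{
    throw new InvalidOperationException("SEAL options failed validation.");
}
```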
