Class SEALOptions<T, TInput, TOutput>

Namespace
AiDotNet.MetaLearning.Options
Assembly
AiDotNet.dll

Configuration options for the SEAL (Sample-Efficient Adaptive Learning) meta-learning algorithm.

public class SEALOptions<T, TInput, TOutput> : IMetaLearnerOptions<T>

Type Parameters

T

The numeric data type used for calculations (e.g., float, double).

TInput

The input data type (e.g., Matrix<T>, Tensor<T>).

TOutput

The output data type (e.g., Vector<T>, Tensor<T>).

Inheritance
object
SEALOptions<T, TInput, TOutput>
Implements
IMetaLearnerOptions<T>

Remarks

SEAL is a gradient-based meta-learning algorithm that combines ideas from MAML with sample-efficiency improvements. It learns initial parameters that can be quickly adapted to new tasks with just a few examples, incorporating temperature scaling, entropy regularization, and optional adaptive learning rates.

For Beginners: SEAL learns the best starting point for a model so that it can quickly adapt to new tasks with minimal data. Think of it like learning how to learn - after seeing many tasks, the model knows how to pick up new skills quickly.

Imagine learning to play musical instruments:

  • Learning your first instrument (piano) is hard
  • Learning your second instrument (guitar) is easier
  • By your 5th instrument, you've learned principles that help you pick up any new instrument much faster

SEAL does the same with machine learning models!

Key features of SEAL:

  • Temperature scaling: Controls confidence in predictions during meta-training
  • Entropy regularization: Encourages diverse predictions to prevent overconfident models
  • Adaptive learning rates: Per-parameter learning rate adaptation based on gradient norms
  • Weight decay: Prevents overfitting to meta-training tasks

Constructors

SEALOptions(IFullModel<T, TInput, TOutput>)

Initializes a new instance of the SEALOptions class with the required meta-model.

public SEALOptions(IFullModel<T, TInput, TOutput> metaModel)

Parameters

metaModel IFullModel<T, TInput, TOutput>

The meta-model to be trained (required).

Examples

// Create SEAL options with minimal configuration (uses all defaults)
var defaultOptions = new SEALOptions<double, Matrix<double>, Vector<double>>(myNeuralNetwork);
var seal = new SEALAlgorithm<double, Matrix<double>, Vector<double>>(defaultOptions);

// Create SEAL with entropy regularization for better generalization
var entropyOptions = new SEALOptions<double, Matrix<double>, Vector<double>>(myNeuralNetwork)
{
    EntropyCoefficient = 0.01,
    Temperature = 1.5,
    UseAdaptiveInnerLR = true
};

// Create SEAL with weight decay and gradient clipping
var regularizedOptions = new SEALOptions<double, Matrix<double>, Vector<double>>(myNeuralNetwork)
{
    WeightDecay = 0.001,
    GradientClipThreshold = 5.0,
    AdaptationSteps = 10
};

Exceptions

ArgumentNullException

Thrown when metaModel is null.

Properties

AdaptationSteps

Gets or sets the number of gradient steps to take during inner loop adaptation.

public int AdaptationSteps { get; set; }

Property Value

int

Default: 5 (standard for few-shot learning scenarios).

Remarks

More adaptation steps allow better task-specific fitting but increase computational cost and memory usage (especially for second-order methods).

For Beginners: This is how many practice rounds you get on each task before being tested. More practice usually helps, but takes more time.
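
To make the trade-off concrete, the following is a minimal sketch of an inner loop on a toy one-dimensional loss. It does not use the library: the Adapt helper and the quadratic loss are hypothetical stand-ins chosen so the effect of the step count is easy to see.

```csharp
using System;

// Hypothetical helper: runs `steps` plain SGD updates on the toy loss 0.5 * (theta - target)^2.
double Adapt(double theta, double target, double innerLearningRate, int steps)
{
    for (int i = 0; i < steps; i++)
    {
        double gradient = theta - target;      // derivative of the toy loss
        theta -= innerLearningRate * gradient; // one inner-loop step
    }
    return theta;
}

double afterFive = Adapt(0.0, 1.0, 0.01, 5);
double afterFifty = Adapt(0.0, 1.0, 0.01, 50);
Console.WriteLine($"5 steps:  distance to target = {Math.Abs(1.0 - afterFive):F4}");
Console.WriteLine($"50 steps: distance to target = {Math.Abs(1.0 - afterFifty):F4}");
```

More steps move the parameter closer to the task optimum, at the price of more gradient evaluations per task.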

AdaptiveLearningRateDecay

Gets or sets the decay rate for running mean in adaptive learning rates.

public double AdaptiveLearningRateDecay { get; set; }

Property Value

double

Default: 0.99 (slow decay for stable estimates).

Remarks

Only used when AdaptiveLearningRateMode is RunningMean. Higher values give more weight to historical gradients (more stable but slower to adapt).
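
The effect of the decay value can be sketched with a standard RMSprop-style exponential moving average. The library's exact update is not shown on this page, so the UpdateRunningMean helper below is an assumption used only for illustration.

```csharp
using System;

// Hypothetical helper: one exponential-moving-average update of the squared gradient.
double UpdateRunningMean(double runningMean, double gradient, double decay)
    => decay * runningMean + (1.0 - decay) * gradient * gradient;

double slow = 0.0, fast = 0.0;
foreach (double g in new[] { 1.0, 1.0, 1.0, 1.0 })
{
    slow = UpdateRunningMean(slow, g, 0.99); // default decay: long memory
    fast = UpdateRunningMean(fast, g, 0.5);  // short memory, reacts quickly
}
Console.WriteLine($"decay 0.99 -> {slow:F4}, decay 0.5 -> {fast:F4}");
```

After the same four gradients, the estimate with decay 0.99 has barely moved from its initial value, while the estimate with decay 0.5 is already close to the true squared gradient.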

AdaptiveLearningRateEpsilon

Gets or sets the epsilon value for numerical stability in adaptive learning rates.

public double AdaptiveLearningRateEpsilon { get; set; }

Property Value

double

Default: 1e-8 (small value to prevent division by zero).

Remarks

Used in the denominator when computing adaptive learning rates to prevent division by zero: lr = base_lr / (sqrt(squared_grad) + epsilon)
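
As a small illustration of why epsilon matters, the sketch below evaluates the formula quoted above for a zero gradient and for a large one. The AdaptiveRate function and its parameter names are illustrative and are not part of the library.

```csharp
using System;

// Mirrors lr = base_lr / (sqrt(squared_grad) + epsilon); names are illustrative.
double AdaptiveRate(double baseLearningRate, double squaredGradient, double epsilon)
    => baseLearningRate / (Math.Sqrt(squaredGradient) + epsilon);

double zeroGradRate = AdaptiveRate(0.01, 0.0, 1e-8);    // finite thanks to epsilon
double largeGradRate = AdaptiveRate(0.01, 100.0, 1e-8); // gradient magnitude 10
Console.WriteLine($"zero gradient:  {zeroGradRate:E2}");
Console.WriteLine($"large gradient: {largeGradRate:E4}");
```

With a zero gradient the rate is very large but still finite, and the resulting parameter step (rate times gradient) is zero.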

AdaptiveLearningRateMode

Gets or sets the mode for adaptive learning rate computation.

public SEALAdaptiveLearningRateMode AdaptiveLearningRateMode { get; set; }

Property Value

SEALAdaptiveLearningRateMode

Default: GradientNorm (scale by inverse gradient norm).

Remarks

Different modes for computing adaptive learning rates:

  • GradientNorm: lr = base_lr / (sqrt(grad^2) + epsilon) [AdaGrad-like]
  • RunningMean: Maintains exponential moving average of squared gradients [RMSprop-like]
  • PerLayer: Applies same adaptive rate to all parameters in each layer

CheckpointFrequency

Gets or sets how often to save checkpoints.

public int CheckpointFrequency { get; set; }

Property Value

int

Default: 500.

DataLoader

Gets or sets the episodic data loader for sampling tasks.

public IEpisodicDataLoader<T, TInput, TOutput>? DataLoader { get; set; }

Property Value

IEpisodicDataLoader<T, TInput, TOutput>

Default: null (tasks must be provided manually to MetaTrain).

EnableCheckpointing

Gets or sets whether to save checkpoints during training.

public bool EnableCheckpointing { get; set; }

Property Value

bool

Default: false.

EntropyCoefficient

Gets or sets the entropy regularization coefficient.

public double EntropyCoefficient { get; set; }

Property Value

double

Default: 0.0 (no entropy regularization).

Remarks

Entropy regularization adds a bonus for diverse predictions: Loss = Original_Loss - EntropyCoefficient * Entropy(predictions)

Higher values encourage the model to be less confident and more exploratory, which can help prevent overfitting on meta-training tasks.

For Beginners: Entropy measures how "spread out" the model's predictions are. By adding entropy to the objective, we encourage the model to not be too confident, which helps it generalize better to new tasks.
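
The loss formula above can be sketched in a few lines. This assumes the predictions form a probability vector and uses Shannon entropy with the natural logarithm. The Entropy and RegularizedLoss helpers are hypothetical and only mirror the formula, not the library's internals.

```csharp
using System;

// Shannon entropy (natural log) of a probability vector; zero entries contribute nothing.
double Entropy(double[] probabilities)
{
    double entropy = 0.0;
    foreach (double p in probabilities)
    {
        if (p > 0.0) entropy -= p * Math.Log(p);
    }
    return entropy;
}

// Loss = Original_Loss - EntropyCoefficient * Entropy(predictions)
double RegularizedLoss(double originalLoss, double[] predictions, double entropyCoefficient)
    => originalLoss - entropyCoefficient * Entropy(predictions);

double[] confident = { 0.98, 0.01, 0.01 };
double[] spread = { 0.4, 0.3, 0.3 };
Console.WriteLine($"confident: {RegularizedLoss(1.0, confident, 0.01):F4}");
Console.WriteLine($"spread:    {RegularizedLoss(1.0, spread, 0.01):F4}");
```

For the same original loss, the more spread-out prediction receives the lower regularized loss, which is the "bonus for diverse predictions" described above.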

EntropyOnlyDuringMetaTrain

Gets or sets whether to apply entropy regularization only during meta-training.

public bool EntropyOnlyDuringMetaTrain { get; set; }

Property Value

bool

Default: true (entropy regularization disabled during adaptation).

Remarks

When true, entropy regularization is only applied during meta-training (outer loop) and not during task adaptation (inner loop). This is often desirable as we want focused adaptation during the inner loop.

EvaluationFrequency

Gets or sets how often to evaluate during meta-training.

public int EvaluationFrequency { get; set; }

Property Value

int

Default: 100 (evaluate every 100 iterations).

EvaluationTasks

Gets or sets the number of tasks to use for evaluation.

public int EvaluationTasks { get; set; }

Property Value

int

Default: 100.

GradientClipThreshold

Gets or sets the maximum gradient norm for gradient clipping.

public double? GradientClipThreshold { get; set; }

Property Value

double?

Default: 10.0 (prevents exploding gradients).

Remarks

Gradient clipping limits the magnitude of gradients during training, preventing numerical instability from exploding gradients.
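
One common way to clip by norm is to rescale the whole gradient vector when its L2 norm exceeds the threshold. The sketch below assumes that approach and treats a null threshold as "no clipping". Both choices are assumptions, since this page does not spell out the library's behavior, and ClipByNorm is a hypothetical helper.

```csharp
using System;
using System.Linq;

// Rescales the gradient so its L2 norm does not exceed maxNorm.
// A null threshold means "do not clip" in this sketch (assumption).
double[] ClipByNorm(double[] gradient, double? maxNorm)
{
    double norm = Math.Sqrt(gradient.Sum(g => g * g));
    if (maxNorm is null || norm <= maxNorm.Value) return gradient;
    double scale = maxNorm.Value / norm;
    return gradient.Select(g => g * scale).ToArray();
}

double[] clipped = ClipByNorm(new[] { 30.0, 40.0 }, 10.0); // norm 50, rescaled to norm 10
Console.WriteLine(string.Join(", ", clipped));
```

Rescaling preserves the direction of the gradient and only limits its length.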

InnerLearningRate

Gets or sets the learning rate for the inner loop (task adaptation).

public double InnerLearningRate { get; set; }

Property Value

double

Default: 0.01 (standard for meta-learning inner loops).

Remarks

This controls how quickly the model adapts to each task during the inner loop. A smaller value leads to more stable but slower adaptation; larger values can cause unstable training.

For Beginners: This is like how big of steps you take when learning a specific task. Smaller steps are safer but slower; bigger steps are faster but might overshoot the optimal solution.

InnerOptimizer

Gets or sets the optimizer for inner-loop adaptation.

public IGradientBasedOptimizer<T, TInput, TOutput>? InnerOptimizer { get; set; }

Property Value

IGradientBasedOptimizer<T, TInput, TOutput>

Default: null (uses built-in SGD optimizer with InnerLearningRate).

Remarks

The inner optimizer performs task-specific adaptation on the support set. SGD is typically used for simplicity and computational efficiency.

LossFunction

Gets or sets the loss function for training.

public ILossFunction<T>? LossFunction { get; set; }

Property Value

ILossFunction<T>

Default: null (uses model's default loss function if available).

MetaBatchSize

Gets or sets the number of tasks to sample per meta-training iteration.

public int MetaBatchSize { get; set; }

Property Value

int

Default: 4 (typical meta-batch size).

Remarks

Larger batch sizes provide more stable gradient estimates but require more memory and computation per iteration.

MetaModel

Gets or sets the meta-model to be trained. This is the only required property.

public IFullModel<T, TInput, TOutput> MetaModel { get; set; }

Property Value

IFullModel<T, TInput, TOutput>

Remarks

The model must implement IFullModel to support parameter getting/setting and gradient computation required for SEAL's meta-learning process.

MetaOptimizer

Gets or sets the optimizer for meta-parameter updates (outer loop).

public IGradientBasedOptimizer<T, TInput, TOutput>? MetaOptimizer { get; set; }

Property Value

IGradientBasedOptimizer<T, TInput, TOutput>

Default: null (uses built-in Adam optimizer with OuterLearningRate).

Remarks

The meta-optimizer updates the initial model parameters based on performance across all tasks in the meta-batch. Adam is recommended for stable convergence.

MinTemperature

Gets or sets the minimum temperature for temperature annealing.

public double MinTemperature { get; set; }

Property Value

double

Default: 1.0 (equal to the default Temperature, so no annealing occurs unless Temperature is set above it).

Remarks

When MinTemperature is less than Temperature, the temperature will linearly decrease from Temperature to MinTemperature over the course of training. This allows the model to be more exploratory early and more confident later.
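
The linear schedule described above can be sketched as follows. The AnnealedTemperature helper is hypothetical, and the choice of endpoints (iteration 0 through totalIterations - 1) is an assumption; the library may index the schedule differently.

```csharp
using System;

// Linear interpolation from `temperature` down to `minTemperature` across training.
double AnnealedTemperature(double temperature, double minTemperature, int iteration, int totalIterations)
{
    // No annealing when MinTemperature is not below Temperature.
    if (minTemperature >= temperature || totalIterations <= 1) return temperature;
    double progress = (double)iteration / (totalIterations - 1);
    return temperature + (minTemperature - temperature) * progress;
}

foreach (int step in new[] { 0, 500, 999 })
{
    Console.WriteLine($"iteration {step}: T = {AnnealedTemperature(2.0, 1.0, step, 1000):F3}");
}
```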

NumMetaIterations

Gets or sets the total number of meta-training iterations to perform.

public int NumMetaIterations { get; set; }

Property Value

int

Default: 1000 (typical meta-training length).

OuterLearningRate

Gets or sets the learning rate for the outer loop (meta-optimization).

public double OuterLearningRate { get; set; }

Property Value

double

Default: 0.001 (typically 10x smaller than inner rate).

Remarks

This controls how the meta-parameters (initial model weights) are updated based on performance across all tasks. Typically smaller than inner learning rate for stable meta-learning.

RandomSeed

Gets or sets the random seed for reproducibility.

public int? RandomSeed { get; set; }

Property Value

int?

Default: null (non-deterministic).

Temperature

Gets or sets the temperature scaling factor for the loss function.

public double Temperature { get; set; }

Property Value

double

Default: 1.0 (no temperature scaling).

Remarks

Temperature scaling divides the loss by the temperature value:

  • Temperature > 1.0: Softens predictions, reduces confidence
  • Temperature < 1.0: Sharpens predictions, increases confidence
  • Temperature = 1.0: No effect (standard loss computation)

For Beginners: Temperature is like adjusting how "certain" the model should be about its predictions:

  • High temperature: Model becomes more humble, spreads probability across options
  • Low temperature: Model becomes more confident, concentrates probability on top choice
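
The intuition above is often illustrated with a temperature-scaled softmax, where scores are divided by the temperature before normalization. The sketch below uses that common formulation purely for illustration. The remarks describe SEAL as dividing the loss by the temperature, so this is not necessarily how the library applies the value. SoftmaxWithTemperature is a hypothetical helper.

```csharp
using System;
using System.Linq;

// Softmax over scores divided by the temperature (illustrative formulation only).
double[] SoftmaxWithTemperature(double[] logits, double temperature)
{
    double max = logits.Max(); // subtract the max for numerical stability
    double[] exps = logits.Select(z => Math.Exp((z - max) / temperature)).ToArray();
    double sum = exps.Sum();
    return exps.Select(e => e / sum).ToArray();
}

double[] scores = { 2.0, 1.0, 0.0 };
foreach (double t in new[] { 0.5, 1.0, 2.0 })
{
    double[] p = SoftmaxWithTemperature(scores, t);
    Console.WriteLine($"T = {t}: top-class probability = {p[0]:F3}");
}
```

Lower temperatures concentrate probability on the highest score, and higher temperatures spread it across the options.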

UseAdaptiveInnerLR

Gets or sets whether to use adaptive inner learning rates.

public bool UseAdaptiveInnerLR { get; set; }

Property Value

bool

Default: false (use fixed inner learning rate).

Remarks

When enabled, SEAL computes per-parameter learning rates based on gradient norms during inner loop adaptation. Parameters with larger gradients get smaller learning rates (similar to AdaGrad's approach).

For Beginners: Instead of using the same step size for all parameters, adaptive learning rates adjust the step size for each parameter based on how much it has been changing. Parameters that change a lot get smaller steps to prevent overshooting.

UseFirstOrder

Gets or sets whether to use first-order approximation (FOMAML-style).

public bool UseFirstOrder { get; set; }

Property Value

bool

Default: true (recommended for computational efficiency).

Remarks

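First-order approximation ignores second-order gradient terms, so the meta-gradient does not require backpropagating through the inner-loop updates (no Hessian-vector products). This substantially reduces computation and memory, typically with minimal performance loss. Most production implementations use first-order methods.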

For Beginners: When true, SEAL uses a faster but slightly less accurate way to compute gradients. In practice, this works almost as well as the exact method but is much faster.
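
A toy example helps show what the approximation drops. The sketch below assumes a one-dimensional quadratic task loss, a single inner SGD step, and the same loss for adaptation and evaluation. All names are illustrative and nothing here calls the library.

```csharp
using System;

// Toy task loss: L(theta) = 0.5 * (theta - target)^2, so dL/dtheta = theta - target.
// One inner SGD step gives adapted = theta - alpha * (theta - target).
double theta = 0.0, target = 1.0, alpha = 0.01;
double adapted = theta - alpha * (theta - target);

// First-order (FOMAML-style): treat d(adapted)/d(theta) as 1.
double firstOrderGradient = adapted - target;

// Exact: chain rule through the inner step,
// d(adapted)/d(theta) = 1 - alpha * L''(theta) = 1 - alpha for this loss.
double exactGradient = (1.0 - alpha) * (adapted - target);

Console.WriteLine($"first-order: {firstOrderGradient:F6}, exact: {exactGradient:F6}");
```

In this toy case the two gradients differ only by the factor (1 - alpha), which is close to 1 for small inner learning rates. For real models the dropped factor involves the Hessian of the loss, which is what makes the exact computation expensive.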

WeightDecay

Gets or sets the weight decay (L2 regularization) coefficient.

public double WeightDecay { get; set; }

Property Value

double

Default: 0.0 (no weight decay).

Remarks

Weight decay adds a penalty proportional to the squared magnitude of weights: Gradient = Original_Gradient + WeightDecay * Parameters

This helps prevent overfitting by keeping weights small.

For Beginners: Weight decay is like a "simplicity penalty" that discourages the model from having very large weights. Large weights often indicate overfitting, so keeping them small helps the model generalize.
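
The gradient formula above can be sketched element-wise as follows. ApplyWeightDecay is a hypothetical helper that only mirrors the formula. Where the library applies the term (inner loop, outer loop, or both) is not stated on this page.

```csharp
using System;
using System.Linq;

// Gradient = Original_Gradient + WeightDecay * Parameters, applied element-wise.
double[] ApplyWeightDecay(double[] gradient, double[] parameters, double weightDecay)
    => gradient.Zip(parameters, (g, p) => g + weightDecay * p).ToArray();

double[] decayed = ApplyWeightDecay(new[] { 0.5, -0.5 }, new[] { 10.0, -10.0 }, 0.001);
Console.WriteLine(string.Join(", ", decayed));
```

Because the added term has the same sign as each weight, a gradient-descent step with this gradient pulls every weight toward zero in proportion to its size.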

Methods

Clone()

Creates a deep copy of the SEAL options.

public IMetaLearnerOptions<T> Clone()

Returns

IMetaLearnerOptions<T>

A new SEALOptions instance with the same configuration values.

IsValid()

Validates that all SEAL configuration options are properly set.

public bool IsValid()

Returns

bool

True if the configuration is valid for SEAL training; otherwise, false.

Remarks

Checks all required hyperparameters including SEAL-specific ones:

  • Standard meta-learning parameters (learning rates, steps, etc.)
  • Temperature must be positive
  • Entropy coefficient must be non-negative
  • Weight decay must be non-negative
  • Adaptive learning rate parameters must be valid