Class LEOAlgorithm<T, TInput, TOutput>

Namespace
AiDotNet.MetaLearning.Algorithms
Assembly
AiDotNet.dll

Implementation of Latent Embedding Optimization (LEO) meta-learning algorithm.

public class LEOAlgorithm<T, TInput, TOutput> : MetaLearnerBase<T, TInput, TOutput>, IMetaLearner<T, TInput, TOutput>

Type Parameters

T

The numeric type used for calculations (e.g., double, float).

TInput

The input data type (e.g., Matrix<T>, Tensor<T>).

TOutput

The output data type (e.g., Vector<T>, Tensor<T>).

Inheritance
MetaLearnerBase<T, TInput, TOutput>
LEOAlgorithm<T, TInput, TOutput>
Implements
IMetaLearner<T, TInput, TOutput>

Remarks

LEO (Latent Embedding Optimization) performs meta-learning by learning a low-dimensional latent space for model parameters. This enables fast adaptation even for large models by working in a compressed representation space.

Key Innovation: Instead of adapting parameters directly (like MAML), LEO:

  1. Encodes support set into a latent code z
  2. Decodes z into classifier parameters θ = g(z)
  3. Adapts in latent space: z' = z - α∇_z L(θ)
  4. Decodes adapted code: θ' = g(z')
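
As a compact illustration of this flow, the sketch below uses hypothetical encode/decode/gradient delegates in place of LEO's encoder and decoder networks; it is illustrative only and not AiDotNet's internal implementation.

using System;
using System.Linq;

// Hypothetical stand-ins for LEO's encoder, decoder, and latent gradient (illustration only).
Func<double[][], double[]> encode = support => new double[64];        // support set -> latent code z
Func<double[], double[]> decode = z => new double[1_000_000];         // θ = g(z): latent code -> parameters
Func<double[], double[]> gradWrtZ = z => new double[z.Length];        // ∇_z L(g(z)) on the support set
double alpha = 0.1;                                                    // latent-space learning rate

double[] z = encode(new double[5][]);                                  // 1. encode the support set
double[] theta = decode(z);                                            // 2. decode initial parameters θ
double[] zPrime = z.Zip(gradWrtZ(z), (zi, gi) => zi - alpha * gi).ToArray(); // 3. z' = z - α∇_z L
double[] thetaPrime = decode(zPrime);                                  // 4. θ' = g(z')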

For Beginners: Imagine your neural network has millions of parameters. Updating them all with just 5 examples is risky - you might overfit badly. LEO learns to "compress" the parameter space into maybe 64 numbers. When adapting to a new task:

  1. Look at the support examples and generate 64 numbers (latent code)
  2. Convert those 64 numbers into full model parameters
  3. If it doesn't work well, adjust the 64 numbers (not millions!)
  4. Convert again to get updated parameters

This is safer because adjusting 64 numbers can't cause as much overfitting as adjusting millions of parameters.

Variational Aspect: LEO uses a variational autoencoder-like setup where:

  - Encoder outputs mean μ and variance σ² of a Gaussian distribution
  - Latent code is sampled: z ~ N(μ, σ²)
  - KL divergence regularizes z toward a standard Gaussian

This prevents the latent space from collapsing and enables uncertainty estimation.
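
For concreteness, the sketch below shows these two pieces as standalone helpers (illustrative only, not part of the AiDotNet API): reparameterized sampling z = μ + σ·ε with ε ~ N(0, 1), and the closed-form KL term 0.5 Σ(μ² + σ² - ln σ² - 1).

using System;

// Illustrative helpers (not part of the AiDotNet API).
public static class LeoVariationalSketch
{
    public static double[] SampleLatent(double[] mu, double[] sigma, Random rng)
    {
        var z = new double[mu.Length];
        for (int i = 0; i < mu.Length; i++)
        {
            // Box-Muller transform gives a standard normal ε
            double u1 = 1.0 - rng.NextDouble();
            double u2 = rng.NextDouble();
            double eps = Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
            z[i] = mu[i] + sigma[i] * eps;                // z ~ N(μ, σ²)
        }
        return z;
    }

    public static double KlToStandardNormal(double[] mu, double[] sigma)
    {
        // KL(N(μ, σ²) || N(0, I)) = 0.5 Σ (μ² + σ² - ln σ² - 1) for a diagonal Gaussian
        double kl = 0.0;
        for (int i = 0; i < mu.Length; i++)
            kl += mu[i] * mu[i] + sigma[i] * sigma[i] - Math.Log(sigma[i] * sigma[i]) - 1.0;
        return 0.5 * kl;
    }
}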

Reference: Rusu, A. A., Rao, D., Sygnowski, J., et al. (2019). Meta-Learning with Latent Embedding Optimization. ICLR 2019.

Constructors

LEOAlgorithm(LEOOptions<T, TInput, TOutput>)

Initializes a new instance of the LEOAlgorithm class.

public LEOAlgorithm(LEOOptions<T, TInput, TOutput> options)

Parameters

options LEOOptions<T, TInput, TOutput>

LEO configuration options containing the model and all hyperparameters.

Examples

// Create LEO with minimal configuration
var options = new LEOOptions<double, Tensor<double>, Tensor<double>>(myNeuralNetwork);
var leo = new LEOAlgorithm<double, Tensor<double>, Tensor<double>>(options);

// Create LEO with custom configuration
var customOptions = new LEOOptions<double, Tensor<double>, Tensor<double>>(myNeuralNetwork)
{
    LatentDimension = 64,
    HiddenDimension = 256,
    KLWeight = 0.01,
    AdaptationSteps = 5
};
var customLeo = new LEOAlgorithm<double, Tensor<double>, Tensor<double>>(customOptions);
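
A hedged usage sketch follows, meta-training on task batches and then adapting to a new task with the documented MetaTrain and Adapt methods. Constructing the taskBatch and newTask objects depends on your data pipeline and is omitted here.

// Usage sketch (illustrative): taskBatch and newTask construction omitted.
for (int step = 0; step < 1000; step++)
{
    double metaLoss = leo.MetaTrain(taskBatch);    // one meta-training step; returns the average query-set loss
    if (step % 100 == 0)
        Console.WriteLine($"step {step}: meta-loss = {metaLoss}");
}

var adaptedModel = leo.Adapt(newTask);             // adapts in latent space using the task's support set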

Exceptions

ArgumentNullException

Thrown when options is null.

InvalidOperationException

Thrown when required components are not set in options.

Properties

AlgorithmType

Gets the algorithm type identifier for this meta-learner.

public override MetaLearningAlgorithmType AlgorithmType { get; }

Property Value

MetaLearningAlgorithmType

Returns LEO.

Remarks

This property identifies the algorithm as LEO (Latent Embedding Optimization), which performs meta-learning by adapting in a low-dimensional latent space.

Methods

Adapt(IMetaLearningTask<T, TInput, TOutput>)

Adapts the meta-learned model to a new task using latent space optimization.

public override IModel<TInput, TOutput, ModelMetadata<T>> Adapt(IMetaLearningTask<T, TInput, TOutput> task)

Parameters

task IMetaLearningTask<T, TInput, TOutput>

The new task containing support set examples for adaptation.

Returns

IModel<TInput, TOutput, ModelMetadata<T>>

A new model instance that has been adapted to the given task.

Remarks

LEO adaptation is performed entirely in the latent space:

  1. Extract embeddings from support examples
  2. Encode embeddings to get latent distribution (μ, σ²)
  3. Use mean μ as initial latent code (no sampling at test time)
  4. Decode to get initial classifier parameters
  5. Perform gradient descent steps in latent space
  6. Decode final latent code to get adapted parameters

For Beginners: At test time with a new task:

  1. Look at the support examples and figure out "what kind of task this is"
  2. Generate a small code (like 64 numbers) representing the task
  3. Convert that code into classifier weights
  4. Try classifying, and if it's not good, adjust the code
  5. Convert the adjusted code back to get better weights

Key Advantage: Adaptation is very fast because only ~64 latent values are optimized instead of potentially millions of parameters. This also mitigates overfitting because the latent space constrains what kinds of parameter updates are possible.
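
A minimal sketch of this test-time loop is shown below, assuming the encoder mean, decoder, and support-loss gradient are supplied as delegates; the real networks live inside LEOAlgorithm, so this is illustrative only.

using System;

// Illustration only: latent-space adaptation starting from the encoder mean μ
// (no sampling at test time). The delegates are hypothetical stand-ins.
public static class LeoAdaptSketch
{
    public static double[] AdaptInLatentSpace(
        double[] mu,                                      // step 3: μ from the encoder
        Func<double[], double[]> decode,                  // z -> classifier parameters
        Func<double[], double[]> supportLossGradWrtZ,     // ∇_z of the support loss at z
        double latentLearningRate,
        int adaptationSteps)
    {
        double[] z = (double[])mu.Clone();                // start from the mean latent code
        for (int step = 0; step < adaptationSteps; step++)
        {
            double[] grad = supportLossGradWrtZ(z);       // step 5: gradient w.r.t. z, not θ
            for (int i = 0; i < z.Length; i++)
                z[i] -= latentLearningRate * grad[i];     //         gradient step in latent space
        }
        return decode(z);                                 // step 6: adapted parameters θ'
    }
}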

Exceptions

ArgumentNullException

Thrown when task is null.

MetaTrain(TaskBatch<T, TInput, TOutput>)

Performs one meta-training step using LEO's latent space adaptation approach.

public override T MetaTrain(TaskBatch<T, TInput, TOutput> taskBatch)

Parameters

taskBatch TaskBatch<T, TInput, TOutput>

A batch of tasks to meta-train on, each containing support and query sets.

Returns

T

The average meta-loss across all tasks in the batch (evaluated on query sets).

Remarks

LEO meta-training involves training three main components:

  1. Encoder: Maps support embeddings to latent distribution (μ, σ²)
  2. Decoder: Maps latent code to classifier parameters
  3. Feature Encoder: Maps inputs to embeddings

Training Loop per Task:

1. Extract embeddings from support set
2. Encode to get latent distribution: (μ, σ²) = encoder(embeddings)
3. Sample latent code: z ~ N(μ, σ²)
4. Decode to get initial parameters: θ = decoder(z)
5. For each adaptation step:
   a. Compute support loss with current parameters
   b. Compute gradient with respect to z (not θ!)
   c. Update z' = z - α∇_z L
   d. Decode: θ' = decoder(z')
6. Compute query loss with adapted parameters
7. Add KL divergence loss: KL(N(μ, σ²) || N(0, I))

For Beginners: During training, LEO learns:

  - How to look at examples and generate a good "summary" (latent code)
  - How to convert that summary into classifier weights
  - How to adjust the summary when the initial guess doesn't work
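
The sketch below illustrates one task's contribution to the meta-loss under the loop above, assuming hypothetical delegates for the encoder output, latent-space adaptation, query loss, and KL term; klWeight corresponds to the KLWeight option shown in the constructor example.

using System;

// Illustration of one task's contribution to the meta-loss: adapt z on the
// support set, evaluate the query loss with the decoded parameters, and add
// the weighted KL term. All delegates are hypothetical stand-ins.
public static class LeoMetaTrainSketch
{
    public static double PerTaskMetaLoss(
        double[] mu, double[] sigma,                        // step 2: (μ, σ²) from the encoder
        Func<double[], double[], double[]> sampleLatent,    // step 3: z ~ N(μ, σ²) via reparameterization
        Func<double[], double[]> adaptInLatentSpace,        // steps 4-5: inner-loop updates of z
        Func<double[], double> queryLossOfAdaptedCode,      // step 6: query loss with θ' = decoder(z')
        Func<double[], double[], double> klToStandardNormal,
        double klWeight)                                    // the KLWeight option
    {
        double[] z = sampleLatent(mu, sigma);
        double[] zAdapted = adaptInLatentSpace(z);
        double queryLoss = queryLossOfAdaptedCode(zAdapted);
        return queryLoss + klWeight * klToStandardNormal(mu, sigma); // step 7: KL-regularized loss
    }
}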

Exceptions

ArgumentException

Thrown when the task batch is null or empty.

InvalidOperationException

Thrown when meta-gradient computation fails.