Class LEOAlgorithm<T, TInput, TOutput>
- Namespace
- AiDotNet.MetaLearning.Algorithms
- Assembly
- AiDotNet.dll
Implementation of the Latent Embedding Optimization (LEO) meta-learning algorithm.
public class LEOAlgorithm<T, TInput, TOutput> : MetaLearnerBase<T, TInput, TOutput>, IMetaLearner<T, TInput, TOutput>
Type Parameters
T
The numeric type used for calculations (e.g., double, float).
TInput
The input data type (e.g., Matrix<T>, Tensor<T>).
TOutput
The output data type (e.g., Vector<T>, Tensor<T>).
- Inheritance
- MetaLearnerBase<T, TInput, TOutput> → LEOAlgorithm<T, TInput, TOutput>
- Implements
- IMetaLearner<T, TInput, TOutput>
Remarks
LEO (Latent Embedding Optimization) performs meta-learning by learning a low-dimensional latent space for model parameters. This enables fast adaptation even for large models by working in a compressed representation space.
Key Innovation: Instead of adapting parameters directly (like MAML), LEO:
- Encodes support set into a latent code z
- Decodes z into classifier parameters θ = g(z)
- Adapts in latent space: z' = z - α∇_z L(θ)
- Decodes adapted code: θ' = g(z')
For Beginners: Imagine your neural network has millions of parameters. Updating them all with just 5 examples is risky - you might overfit badly. LEO learns to "compress" the parameter space into maybe 64 numbers. When adapting to a new task:
- Look at the support examples and generate 64 numbers (latent code)
- Convert those 64 numbers into full model parameters
- If it doesn't work well, adjust the 64 numbers (not millions!)
- Convert again to get updated parameters
This is safer because adjusting 64 numbers can't cause as much overfitting as adjusting millions of parameters.
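The loop below is a minimal conceptual sketch of this idea. The helper names (EncodeSupportSet, DecodeToParameters, LatentGradient) and the local variables are hypothetical stand-ins for illustration, not part of the AiDotNet API:
// Conceptual sketch only: the helpers below are hypothetical, not AiDotNet API.
double[] z = EncodeSupportSet(supportSet);          // small latent code, e.g., 64 numbers
for (int step = 0; step < adaptationSteps; step++)
{
    // Decode z into full parameters, measure the support loss there,
    // then take the gradient with respect to z (64 numbers), not θ (millions).
    double[] gradZ = LatentGradient(z, supportSet); // ∇_z L(g(z)) via the chain rule
    for (int i = 0; i < z.Length; i++)
        z[i] -= learningRate * gradZ[i];            // z' = z - α∇_z L
}
double[] adaptedTheta = DecodeToParameters(z);      // θ' = g(z'), used for prediction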
Variational Aspect: LEO uses a variational autoencoder-like setup where:
- Encoder outputs mean μ and variance σ² of a Gaussian distribution
- Latent code is sampled: z ~ N(μ, σ²)
- KL divergence regularizes z toward a standard Gaussian
This prevents the latent space from collapsing and enables uncertainty estimation.
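For reference, this KL term has the standard closed form for a diagonal Gaussian against the standard normal (a textbook identity, written in LaTeX; d is the latent dimension, i.e., LatentDimension):
\mathrm{KL}\big(\mathcal{N}(\mu,\sigma^2)\,\|\,\mathcal{N}(0,I)\big) = \tfrac{1}{2}\sum_{i=1}^{d}\left(\mu_i^2 + \sigma_i^2 - \ln\sigma_i^2 - 1\right)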
Reference: Rusu, A. A., Rao, D., Sygnowski, J., et al. (2019). Meta-Learning with Latent Embedding Optimization. ICLR 2019.
Constructors
LEOAlgorithm(LEOOptions<T, TInput, TOutput>)
Initializes a new instance of the LEOAlgorithm class.
public LEOAlgorithm(LEOOptions<T, TInput, TOutput> options)
Parameters
options
LEOOptions<T, TInput, TOutput>
LEO configuration options containing the model and all hyperparameters.
Examples
// Create LEO with minimal configuration
var options = new LEOOptions<double, Tensor<double>, Tensor<double>>(myNeuralNetwork);
var leo = new LEOAlgorithm<double, Tensor<double>, Tensor<double>>(options);

// Create LEO with custom configuration
var customOptions = new LEOOptions<double, Tensor<double>, Tensor<double>>(myNeuralNetwork)
{
    LatentDimension = 64,    // size of the latent code z
    HiddenDimension = 256,   // hidden width of the encoder/decoder networks
    KLWeight = 0.01,         // weight on the KL divergence regularizer
    AdaptationSteps = 5      // gradient steps taken in latent space
};
var customLeo = new LEOAlgorithm<double, Tensor<double>, Tensor<double>>(customOptions);
Exceptions
- ArgumentNullException
Thrown when options is null.
- InvalidOperationException
Thrown when required components are not set in options.
Properties
AlgorithmType
Gets the algorithm type identifier for this meta-learner.
public override MetaLearningAlgorithmType AlgorithmType { get; }
Property Value
- MetaLearningAlgorithmType
Returns MetaLearningAlgorithmType.LEO.
Remarks
This property identifies the algorithm as LEO (Latent Embedding Optimization), which performs meta-learning by adapting in a low-dimensional latent space.
Methods
Adapt(IMetaLearningTask<T, TInput, TOutput>)
Adapts the meta-learned model to a new task using latent space optimization.
public override IModel<TInput, TOutput, ModelMetadata<T>> Adapt(IMetaLearningTask<T, TInput, TOutput> task)
Parameters
task
IMetaLearningTask<T, TInput, TOutput>
The new task containing support set examples for adaptation.
Returns
- IModel<TInput, TOutput, ModelMetadata<T>>
A new model instance that has been adapted to the given task.
Remarks
LEO adaptation is performed entirely in the latent space:
- Extract embeddings from support examples
- Encode embeddings to get latent distribution (μ, σ²)
- Use mean μ as initial latent code (no sampling at test time)
- Decode to get initial classifier parameters
- Perform gradient descent steps in latent space
- Decode final latent code to get adapted parameters
For Beginners: At test time with a new task:
1. Look at the support examples and figure out "what kind of task this is"
2. Generate a small code (like 64 numbers) representing the task
3. Convert that code into classifier weights
4. Try classifying, and if it's not good, adjust the code
5. Convert the adjusted code back to get better weights
Key Advantage: Adaptation is very fast because we're only optimizing ~64 numbers instead of potentially millions of parameters. This also prevents overfitting because the latent space constrains what kinds of parameter updates are possible.
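A minimal usage sketch, assuming leo was constructed as in the constructor examples above and myTask is an IMetaLearningTask<double, Tensor<double>, Tensor<double>> obtained from your own task pipeline (which this page does not define):
// Adapt to a new few-shot task; internally only the small latent code
// is optimized, never the full parameter vector.
IModel<Tensor<double>, Tensor<double>, ModelMetadata<double>> adaptedModel = leo.Adapt(myTask);
// adaptedModel is a new, separate instance that can now be evaluated on
// the task's query set; the meta-learned model itself is left unchanged.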
Exceptions
- ArgumentNullException
Thrown when task is null.
MetaTrain(TaskBatch<T, TInput, TOutput>)
Performs one meta-training step using LEO's latent space adaptation approach.
public override T MetaTrain(TaskBatch<T, TInput, TOutput> taskBatch)
Parameters
taskBatch
TaskBatch<T, TInput, TOutput>
A batch of tasks to meta-train on, each containing support and query sets.
Returns
- T
The average meta-loss across all tasks in the batch (evaluated on query sets).
Remarks
LEO meta-training involves training three main components:
- Encoder: Maps support embeddings to latent distribution (μ, σ²)
- Decoder: Maps latent code to classifier parameters
- Feature Encoder: Maps inputs to embeddings
Training Loop per Task:
1. Extract embeddings from support set
2. Encode to get latent distribution: (μ, σ²) = encoder(embeddings)
3. Sample latent code: z ~ N(μ, σ²)
4. Decode to get initial parameters: θ = decoder(z)
5. For each adaptation step:
a. Compute support loss with current parameters
b. Compute gradient with respect to z (not θ!)
c. Update z' = z - α∇_z L
d. Decode: θ' = decoder(z')
6. Compute query loss with adapted parameters
7. Add KL divergence loss: KL(N(μ, σ²) || N(0, I))
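Step 5 is the part that distinguishes LEO from MAML: the support loss is differentiated with respect to the latent code, so the gradient flows back through the decoder by the chain rule (a standard identity, with decoder g and latent learning rate α):
\nabla_z L = \Big(\frac{\partial g(z)}{\partial z}\Big)^{\top} \nabla_\theta L, \qquad z' = z - \alpha\,\nabla_z L, \qquad \theta' = g(z')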
For Beginners: During training, LEO learns:
- How to look at examples and generate a good "summary" (latent code)
- How to convert that summary into classifier weights
- How to adjust the summary when the initial guess doesn't work
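A sketch of the outer loop that would drive this method, assuming taskBatches is an IEnumerable<TaskBatch<double, Tensor<double>, Tensor<double>>> produced by your own task sampler (not part of this API) and leo was constructed as shown earlier:
// Outer meta-training loop (sketch). Each call performs one meta-update
// over the batch and returns the average query-set meta-loss.
foreach (var taskBatch in taskBatches)
{
    double metaLoss = leo.MetaTrain(taskBatch);
    Console.WriteLine($"Meta-loss: {metaLoss}");
}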
Exceptions
- ArgumentException
Thrown when the task batch is null or empty.
- InvalidOperationException
Thrown when meta-gradient computation fails.