Interface IContinualLearnerConfig<T>

Namespace
AiDotNet.ContinualLearning.Interfaces
Assembly
AiDotNet.dll

Configuration interface for continual learning algorithms.

public interface IContinualLearnerConfig<T>

Type Parameters

T

The numeric type used for calculations.

Remarks

For Beginners: This interface defines the settings needed for continual learning, such as learning rates, memory constraints, and regularization parameters.

Continual Learning is the ability to learn new tasks sequentially without forgetting previously learned knowledge. This is challenging because neural networks tend to suffer from "catastrophic forgetting": learning new information overwrites old knowledge.

Common Strategies Include:

  • EWC (Elastic Weight Consolidation): Protects important weights from changing
  • LwF (Learning without Forgetting): Uses knowledge distillation from a teacher model
  • GEM (Gradient Episodic Memory): Projects gradients to prevent forgetting
  • SI (Synaptic Intelligence): Tracks online importance of weights
  • Experience Replay: Stores and replays examples from previous tasks

Reference: Parisi et al. "Continual Lifelong Learning with Neural Networks: A Review" (2019)

Properties

AGemMargin

Margin for A-GEM (Averaged GEM). Default: 0.0.

T AGemMargin { get; }

Property Value

T

Remarks

Reference: Chaudhry et al. "Efficient Lifelong Learning with A-GEM" (2019)
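
The margin enters A-GEM's constraint check: a proposed gradient is used unchanged when its dot product with the averaged reference gradient from memory is at least the margin; otherwise it is projected back onto the constraint boundary. A minimal sketch of that projection (illustrative names, not AiDotNet code):

```csharp
using System;

static class AGemSketch
{
    // A-GEM projection: if g . gRef < margin, remove the conflicting
    // component so the projected gradient satisfies g' . gRef == margin.
    public static double[] Project(double[] g, double[] gRef, double margin = 0.0)
    {
        double dot = 0.0, refNormSq = 0.0;
        for (int i = 0; i < g.Length; i++)
        {
            dot += g[i] * gRef[i];
            refNormSq += gRef[i] * gRef[i];
        }
        if (dot >= margin || refNormSq == 0.0)
            return (double[])g.Clone();          // no conflict with old tasks

        double scale = (dot - margin) / refNormSq;
        var projected = new double[g.Length];
        for (int i = 0; i < g.Length; i++)
            projected[i] = g[i] - scale * gRef[i];
        return projected;
    }
}
```

With the default margin of 0.0 this reduces to the projection in the A-GEM paper; a positive margin demands strict improvement on the memory batch.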

AGemReferenceGradients

Number of reference gradients for A-GEM. Default: 256.

int AGemReferenceGradients { get; }

Property Value

int

BatchSize

Batch size for training. Default: 32.

int BatchSize { get; }

Property Value

int

Remarks

For Beginners: Number of samples processed together before updating weights. Larger batches are more stable but use more memory. Common values: 16, 32, 64, 128.

BiCValidationFraction

Validation set fraction for BiC bias correction. Default: 0.1.

T BiCValidationFraction { get; }

Property Value

T

Remarks

Reference: Wu et al. "Large Scale Incremental Learning" (2019)

ComputeBackwardTransfer

Compute backward transfer metric. Default: true.

bool ComputeBackwardTransfer { get; }

Property Value

bool

ComputeForwardTransfer

Compute forward transfer metric. Default: true.

bool ComputeForwardTransfer { get; }

Property Value

bool

DistillationTemperature

Temperature for knowledge distillation softmax. Default: 2.0.

T DistillationTemperature { get; }

Property Value

T

Remarks

For Beginners: Higher temperature makes probability distributions softer, capturing more information about relative class similarities.

Reference: Li and Hoiem "Learning without Forgetting" (2017)
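
The temperature divides the logits before the softmax, so a larger value spreads probability mass across non-target classes. A standalone sketch of temperature-scaled softmax (not the library's implementation):

```csharp
using System;

static class DistillSketch
{
    // Softmax with temperature: higher temperature flattens the
    // distribution, exposing the teacher's relative class similarities.
    public static double[] Softmax(double[] logits, double temperature)
    {
        var p = new double[logits.Length];
        double max = double.NegativeInfinity;
        for (int i = 0; i < logits.Length; i++)
        {
            p[i] = logits[i] / temperature;
            if (p[i] > max) max = p[i];
        }
        double sum = 0.0;
        for (int i = 0; i < p.Length; i++)
        {
            p[i] = Math.Exp(p[i] - max);         // subtract max for stability
            sum += p[i];
        }
        for (int i = 0; i < p.Length; i++) p[i] /= sum;
        return p;
    }
}
```

For logits (2, 0), temperature 1 puts about 0.88 on the first class; temperature 2 softens that to about 0.73.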

DistillationWeight

Weight of the distillation loss relative to the task loss. Default: 1.0.

T DistillationWeight { get; }

Property Value

T

EpochsPerTask

Number of training epochs per task. Default: 10.

int EpochsPerTask { get; }

Property Value

int

Remarks

For Beginners: One epoch means the model sees all training data once. More epochs can improve learning but may also cause overfitting.

EvaluationFrequency

Evaluation frequency (every N epochs). Default: 1.

int EvaluationFrequency { get; }

Property Value

int

EwcLambda

EWC regularization strength (lambda). Default: 1000.

T EwcLambda { get; }

Property Value

T

Remarks

For Beginners: Controls how strongly to protect important weights. Higher values (e.g., 5000) prevent more forgetting but may reduce plasticity.

Reference: Kirkpatrick et al. "Overcoming catastrophic forgetting in neural networks" (2017)
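
Lambda scales the quadratic penalty EWC adds to the loss: (lambda / 2) * sum_i F_i * (theta_i - thetaStar_i)^2, where F_i is the Fisher importance of weight i and thetaStar the weights saved after the previous task. A minimal sketch of that term (illustrative, not AiDotNet code):

```csharp
static class EwcSketch
{
    // EWC regularization term: weights with high Fisher importance
    // are penalized heavily for drifting from their old values.
    public static double Penalty(double[] theta, double[] thetaStar,
                                 double[] fisher, double lambda)
    {
        double sum = 0.0;
        for (int i = 0; i < theta.Length; i++)
        {
            double d = theta[i] - thetaStar[i];
            sum += fisher[i] * d * d;
        }
        return 0.5 * lambda * sum;
    }
}
```

A weight with Fisher importance 0 can move freely; raising lambda tightens the anchor on every important weight at once.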

FisherSamples

Number of samples for Fisher Information computation. Default: 200.

int FisherSamples { get; }

Property Value

int

GemMemoryStrength

Memory strength for GEM constraint. Default: 0.5.

T GemMemoryStrength { get; }

Property Value

T

Remarks

Controls how strictly to enforce the non-forgetting constraint. Higher values more strictly prevent any increase in loss on old tasks.

Reference: Lopez-Paz and Ranzato "Gradient Episodic Memory for Continual Learning" (2017)

GradientClipNorm

Gradient clipping max norm. Default: 1.0.

T GradientClipNorm { get; }

Property Value

T

HatSmax

Maximum steepness (smax) of HAT's hard attention gates. Default: 400.

T HatSmax { get; }

Property Value

T

HatSparsity

Sparsity coefficient for HAT. Default: 0.01.

T HatSparsity { get; }

Property Value

T

Remarks

Reference: Serra et al. "Overcoming Catastrophic Forgetting with Hard Attention to the Task" (2018)

ICarlExemplarsPerClass

Number of exemplars per class for iCaRL. Default: 20.

int ICarlExemplarsPerClass { get; }

Property Value

int

Remarks

Reference: Rebuffi et al. "iCaRL: Incremental Classifier and Representation Learning" (2017)

ICarlUseHerding

Use herding for exemplar selection. Default: true.

bool ICarlUseHerding { get; }

Property Value

bool

LearningRate

Learning rate for training. Default: 0.001.

T LearningRate { get; }

Property Value

T

Remarks

For Beginners: Controls how much to update the model in each step. Lower values (e.g., 0.0001) mean slower but more stable learning. Higher values (e.g., 0.01) mean faster but potentially unstable learning.
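
The update the remark describes can be written out directly; this is a generic plain-SGD step, not AiDotNet code:

```csharp
static class SgdSketch
{
    // One SGD step: each weight moves against its gradient,
    // scaled by the learning rate.
    public static void Step(double[] weights, double[] gradients, double learningRate)
    {
        for (int i = 0; i < weights.Length; i++)
            weights[i] -= learningRate * gradients[i];
    }
}
```

With the default rate of 0.001, a gradient of 10 moves a weight by only 0.01 per step.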

MasLambda

MAS regularization coefficient (lambda). Default: 1.0.

T MasLambda { get; }

Property Value

T

Remarks

Reference: Aljundi et al. "Memory Aware Synapses" (2018)

MaxTasks

Maximum number of tasks to support. Default: 100.

int MaxTasks { get; }

Property Value

int

MemorySize

Maximum number of examples to store from previous tasks. Default: 1000.

int MemorySize { get; }

Property Value

int

Remarks

For Beginners: How many old examples to keep for experience replay. More examples reduce forgetting but use more memory.

MemoryStrategy

Memory sampling strategy. Default: Reservoir.

MemorySamplingStrategy MemoryStrategy { get; }

Property Value

MemorySamplingStrategy
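
The default Reservoir strategy keeps a uniform random sample of the whole training stream in fixed memory: item n replaces a random existing slot with probability capacity / n. A self-contained sketch (illustrative, not the library's buffer):

```csharp
using System;
using System.Collections.Generic;

static class ReservoirSketch
{
    // Classic reservoir sampling: every item seen so far has an equal
    // chance of being in the buffer, regardless of stream length.
    public static List<int> Sample(IEnumerable<int> stream, int capacity, Random rng)
    {
        var buffer = new List<int>(capacity);
        int seen = 0;
        foreach (var item in stream)
        {
            seen++;
            if (buffer.Count < capacity)
            {
                buffer.Add(item);                 // fill phase
            }
            else
            {
                int slot = rng.Next(seen);        // uniform in [0, seen)
                if (slot < capacity) buffer[slot] = item;
            }
        }
        return buffer;
    }
}
```

This is why the buffer stays representative of all tasks without ever knowing the stream length in advance.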

NormalizeFisher

Normalize Fisher Information matrix. Default: true.

bool NormalizeFisher { get; }

Property Value

bool

OnlineEwcGamma

Decay factor (gamma) applied to the accumulated Fisher Information in Online EWC.

T OnlineEwcGamma { get; }

Property Value

T

PackNetPruneRatio

Pruning ratio for PackNet. Default: 0.75.

T PackNetPruneRatio { get; }

Property Value

T

Remarks

Fraction of weights to prune after each task, freeing capacity for new tasks.

Reference: Mallya and Lazebnik "PackNet" (2018)
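
PackNet's pruning is by weight magnitude: after a task, the smallest-magnitude fraction (the prune ratio) is released as free capacity and the rest is frozen for that task. A sketch of building such a keep-mask (illustrative names, not AiDotNet code):

```csharp
using System;
using System.Linq;

static class PruneSketch
{
    // Magnitude pruning: mark the largest (1 - pruneRatio) fraction of
    // weights as kept (frozen for this task); the rest become free
    // capacity for future tasks.
    public static bool[] KeepMask(double[] weights, double pruneRatio)
    {
        int prune = (int)(weights.Length * pruneRatio);
        var order = Enumerable.Range(0, weights.Length)
                              .OrderBy(i => Math.Abs(weights[i]))
                              .ToArray();
        var keep = new bool[weights.Length];
        for (int k = prune; k < weights.Length; k++)
            keep[order[k]] = true;               // survivors, by magnitude
        return keep;
    }
}
```

The retrain epochs that follow (see PackNetRetrainEpochs) let the surviving weights recover the accuracy lost to pruning.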

PackNetRetrainEpochs

Retrain epochs after pruning. Default: 5.

int PackNetRetrainEpochs { get; }

Property Value

int

PnnLateralScaling

Lateral connection scaling factor. Default: 1.0.

T PnnLateralScaling { get; }

Property Value

T

PnnUseLateralConnections

Use lateral connections in progressive networks. Default: true.

bool PnnUseLateralConnections { get; }

Property Value

bool

Remarks

Reference: Rusu et al. "Progressive Neural Networks" (2016)

RandomSeed

Random seed for reproducibility. Default: null (random).

int? RandomSeed { get; }

Property Value

int?

SamplesPerTask

Number of samples per task to store in memory. Default: auto-calculated based on MemorySize.

int SamplesPerTask { get; }

Property Value

int

SiC

SI regularization coefficient (c). Default: 0.1.

T SiC { get; }

Property Value

T

Remarks

Reference: Zenke et al. "Continual Learning Through Synaptic Intelligence" (2017)

SiXi

SI dampening factor (xi). Default: 0.1.

T SiXi { get; }

Property Value

T

UseEmpiricalFisher

Use the empirical Fisher (squared gradients) instead of the true Fisher. Default: true.

bool UseEmpiricalFisher { get; }

Property Value

bool

UseGradientClipping

Enable gradient clipping. Default: false.

bool UseGradientClipping { get; }

Property Value

bool

UsePrioritizedReplay

Use prioritized experience replay based on sample importance. Default: false.

bool UsePrioritizedReplay { get; }

Property Value

bool

UseSoftTargets

Use soft targets from teacher model. Default: true.

bool UseSoftTargets { get; }

Property Value

bool

UseWeightDecay

Enable weight decay regularization. Default: false.

bool UseWeightDecay { get; }

Property Value

bool

WeightDecay

Weight decay coefficient. Default: 0.0001.

T WeightDecay { get; }

Property Value

T

Methods

IsValid()

Validates the configuration.

bool IsValid()

Returns

bool

True if the configuration is valid; otherwise, false.
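
The exact checks are implementation-defined, but a typical IsValid() verifies that each setting is in a sensible range. A hedged sketch of the kind of checks an implementation might perform (parameter names here are illustrative):

```csharp
static class ConfigValidationSketch
{
    // Example range checks a continual-learning config might enforce;
    // the real IsValid() may check more (or different) rules.
    public static bool IsValid(double learningRate, int batchSize,
                               int memorySize, double validationFraction)
    {
        return learningRate > 0
            && batchSize > 0
            && memorySize >= 0
            && validationFraction > 0 && validationFraction < 1;
    }
}
```

Calling this before training fails fast on nonsense settings such as a negative learning rate or a validation fraction of 1.5.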