Interface IContinualLearnerConfig<T>
Namespace: AiDotNet.ContinualLearning.Interfaces
Assembly: AiDotNet.dll
Configuration interface for continual learning algorithms.
public interface IContinualLearnerConfig<T>
Type Parameters
T
The numeric type used for calculations.
Remarks
For Beginners: This interface defines the settings needed for continual learning, such as learning rates, memory constraints, and regularization parameters.
Continual Learning is the ability to learn new tasks sequentially without forgetting previously learned knowledge. This is challenging because neural networks tend to suffer from "catastrophic forgetting," where learning new information overwrites old knowledge.
Common Strategies Include:
- EWC (Elastic Weight Consolidation): Protects important weights from changing
- LwF (Learning without Forgetting): Uses knowledge distillation from a teacher model
- GEM (Gradient Episodic Memory): Projects gradients to prevent forgetting
- SI (Synaptic Intelligence): Tracks online importance of weights
- Experience Replay: Stores and replays examples from previous tasks
Reference: Parisi et al. "Continual Lifelong Learning with Neural Networks: A Review" (2019)
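A minimal sketch of how a trainer might consume this configuration, assuming double as the numeric type; SequentialTrainer and TrainSequence are hypothetical names, not part of AiDotNet:

using System;
using AiDotNet.ContinualLearning.Interfaces;

// Hypothetical trainer skeleton; the per-epoch fitting step is elided, and
// only the configuration reads come from this interface.
public static class SequentialTrainer
{
    public static void TrainSequence<T>(IContinualLearnerConfig<T> config, int taskCount)
    {
        taskCount = Math.Min(taskCount, config.MaxTasks);
        for (int task = 0; task < taskCount; task++)
        {
            for (int epoch = 0; epoch < config.EpochsPerTask; epoch++)
            {
                // ... fit one epoch using config.BatchSize and config.LearningRate ...
                if ((epoch + 1) % config.EvaluationFrequency == 0)
                    Console.WriteLine($"task {task}, epoch {epoch + 1}: evaluate here");
            }
        }
    }
}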
Properties
AGemMargin
Margin for A-GEM (Averaged GEM). Default: 0.0.
T AGemMargin { get; }
Property Value
- T
Remarks
Reference: Chaudhry et al. "Efficient Lifelong Learning with A-GEM" (2019)
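A simplified sketch of how the margin might enter the A-GEM gradient projection; AGemProjection is an illustrative helper, not AiDotNet code. Here gRef stands for the reference gradient averaged over samples drawn from memory (see AGemReferenceGradients):

public static class AGemProjection
{
    // If the proposed update would increase loss on memory (dot < margin),
    // project the gradient back onto the constraint boundary:
    // g' = g - ((g.gRef - margin) / ||gRef||^2) * gRef, so that g'.gRef = margin.
    public static double[] Project(double[] g, double[] gRef, double margin)
    {
        double dot = 0.0, refNormSq = 0.0;
        for (int i = 0; i < g.Length; i++)
        {
            dot += g[i] * gRef[i];
            refNormSq += gRef[i] * gRef[i];
        }
        if (dot >= margin || refNormSq == 0.0)
            return (double[])g.Clone(); // constraint already satisfied

        double scale = (dot - margin) / refNormSq;
        var projected = new double[g.Length];
        for (int i = 0; i < g.Length; i++)
            projected[i] = g[i] - scale * gRef[i];
        return projected;
    }
}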
AGemReferenceGradients
Number of reference gradients for A-GEM. Default: 256.
int AGemReferenceGradients { get; }
Property Value
- int
BatchSize
Batch size for training. Default: 32.
int BatchSize { get; }
Property Value
- int
Remarks
For Beginners: Number of samples processed together before updating weights. Larger batches are more stable but use more memory. Common values: 16, 32, 64, 128.
BiCValidationFraction
Validation set fraction for BiC bias correction. Default: 0.1.
T BiCValidationFraction { get; }
Property Value
- T
Remarks
Reference: Wu et al. "Large Scale Incremental Learning" (2019)
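For illustration, BiC keeps old-class logits untouched and applies a learned linear transform (alpha, beta), fit on the held-out validation split, to the logits of newly added classes; BiasCorrection below is a hypothetical sketch of that step:

public static class BiasCorrection
{
    // Rescales new-class logits with (alpha, beta) learned on the
    // BiC validation split; old-class logits pass through unchanged.
    public static double[] Correct(double[] logits, int firstNewClass,
                                   double alpha, double beta)
    {
        var corrected = (double[])logits.Clone();
        for (int c = firstNewClass; c < corrected.Length; c++)
            corrected[c] = alpha * corrected[c] + beta;
        return corrected;
    }
}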
ComputeBackwardTransfer
Compute backward transfer metric. Default: true.
bool ComputeBackwardTransfer { get; }
Property Value
- bool
ComputeForwardTransfer
Compute forward transfer metric. Default: true.
bool ComputeForwardTransfer { get; }
Property Value
- bool
DistillationTemperature
Temperature for knowledge distillation softmax. Default: 2.0.
T DistillationTemperature { get; }
Property Value
- T
Remarks
For Beginners: Higher temperature makes probability distributions softer, capturing more information about relative class similarities.
Reference: Li and Hoiem "Learning without Forgetting" (2017)
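A small sketch of temperature-scaled soft targets, assuming a plain array of logits; the Distillation helper is illustrative only:

using System;
using System.Linq;

public static class Distillation
{
    // Temperature-scaled softmax: dividing logits by T > 1 flattens the
    // distribution, exposing the teacher's relative class similarities.
    public static double[] SoftTargets(double[] logits, double temperature)
    {
        double max = logits.Max(); // subtract max for numerical stability
        var exps = logits.Select(z => Math.Exp((z - max) / temperature)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }
}

At temperature 1 this is the ordinary softmax; at the default 2.0 the distribution is softer, so the student also learns which wrong classes the teacher considers plausible.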
DistillationWeight
Weight for distillation loss vs task loss. Default: 1.0.
T DistillationWeight { get; }
Property Value
- T
EpochsPerTask
Number of training epochs per task. Default: 10.
int EpochsPerTask { get; }
Property Value
- int
Remarks
For Beginners: One epoch means the model sees all training data once. More epochs can improve learning but may also cause overfitting.
EvaluationFrequency
Evaluation frequency (every N epochs). Default: 1.
int EvaluationFrequency { get; }
Property Value
- int
EwcLambda
EWC regularization strength (lambda). Default: 1000.
T EwcLambda { get; }
Property Value
- T
Remarks
For Beginners: Controls how strongly to protect important weights. Higher values (e.g., 5000) prevent more forgetting but may reduce plasticity.
Reference: Kirkpatrick et al. "Overcoming catastrophic forgetting in neural networks" (2017)
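For reference, EWC adds lambda/2 * sum_i F_i (theta_i - theta*_i)^2 to the new task's loss, where F is the diagonal Fisher Information (estimated from FisherSamples) and theta* the weights after the previous task. A minimal sketch with hypothetical names:

public static class EwcPenalty
{
    // Quadratic penalty anchoring weights to their post-task values,
    // weighted per parameter by the diagonal Fisher Information F.
    public static double Compute(double[] theta, double[] thetaStar,
                                 double[] fisher, double lambda)
    {
        double penalty = 0.0;
        for (int i = 0; i < theta.Length; i++)
        {
            double diff = theta[i] - thetaStar[i];
            penalty += fisher[i] * diff * diff;
        }
        return 0.5 * lambda * penalty; // added to the new task's loss
    }
}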
FisherSamples
Number of samples for Fisher Information computation. Default: 200.
int FisherSamples { get; }
Property Value
- int
GemMemoryStrength
Memory strength for GEM constraint. Default: 0.5.
T GemMemoryStrength { get; }
Property Value
- T
Remarks
Controls how strictly the non-forgetting constraint is enforced. Higher values are stricter about preventing any increase in loss on old tasks.
Reference: Lopez-Paz and Ranzato "Gradient Episodic Memory for Continual Learning" (2017)
GradientClipNorm
Gradient clipping max norm. Default: 1.0.
T GradientClipNorm { get; }
Property Value
- T
HatSmax
Maximum scaling (s_max) for the hard attention gates. Default: 400.
T HatSmax { get; }
Property Value
- T
HatSparsity
Sparsity coefficient for HAT. Default: 0.01.
T HatSparsity { get; }
Property Value
- T
Remarks
Reference: Serra et al. "Overcoming Catastrophic Forgetting with Hard Attention to the Task" (2018)
ICarlExemplarsPerClass
Number of exemplars per class for iCaRL. Default: 20.
int ICarlExemplarsPerClass { get; }
Property Value
- int
Remarks
Reference: Rebuffi et al. "iCaRL: Incremental Classifier and Representation Learning" (2017)
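A sketch of herding selection, under the simplifying assumption that per-sample feature vectors and the class mean are already computed; Herding.SelectExemplars is a hypothetical helper:

using System.Collections.Generic;

public static class Herding
{
    // Greedy herding: repeatedly pick the sample whose addition moves the
    // running exemplar mean closest to the true class mean in feature space.
    public static List<int> SelectExemplars(double[][] features, double[] classMean, int count)
    {
        if (count > features.Length) count = features.Length;
        var chosen = new List<int>();
        var runningSum = new double[classMean.Length];

        while (chosen.Count < count)
        {
            int best = -1;
            double bestDist = double.MaxValue;
            for (int i = 0; i < features.Length; i++)
            {
                if (chosen.Contains(i)) continue;
                double dist = 0.0;
                for (int d = 0; d < classMean.Length; d++)
                {
                    double mean = (runningSum[d] + features[i][d]) / (chosen.Count + 1);
                    double diff = classMean[d] - mean;
                    dist += diff * diff;
                }
                if (dist < bestDist) { bestDist = dist; best = i; }
            }
            chosen.Add(best);
            for (int d = 0; d < classMean.Length; d++)
                runningSum[d] += features[best][d];
        }
        return chosen;
    }
}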
ICarlUseHerding
Use herding for exemplar selection. Default: true.
bool ICarlUseHerding { get; }
Property Value
- bool
LearningRate
Learning rate for training. Default: 0.001.
T LearningRate { get; }
Property Value
- T
Remarks
For Beginners: Controls how much to update the model in each step. Lower values (e.g., 0.0001) mean slower but more stable learning. Higher values (e.g., 0.01) mean faster but potentially unstable learning.
MasLambda
MAS regularization coefficient (lambda). Default: 1.0.
T MasLambda { get; }
Property Value
- T
Remarks
Reference: Aljundi et al. "Memory Aware Synapses" (2018)
MaxTasks
Maximum number of tasks to support. Default: 100.
int MaxTasks { get; }
Property Value
- int
MemorySize
Maximum number of examples to store from previous tasks. Default: 1000.
int MemorySize { get; }
Property Value
- int
Remarks
For Beginners: How many old examples to keep for experience replay. More examples reduce forgetting but use more memory.
MemoryStrategy
Memory sampling strategy. Default: Reservoir.
MemorySamplingStrategy MemoryStrategy { get; }
Property Value
- MemorySamplingStrategy
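To illustrate the Reservoir default, a minimal reservoir-sampling buffer (Vitter's Algorithm R); ReservoirBuffer is a hypothetical class, not the AiDotNet implementation:

using System;
using System.Collections.Generic;

public sealed class ReservoirBuffer<TSample>
{
    private readonly List<TSample> _buffer;
    private readonly int _capacity;
    private readonly Random _rng;
    private int _seen;

    public ReservoirBuffer(int capacity, int? seed = null)
    {
        _capacity = capacity;
        _buffer = new List<TSample>(capacity);
        _rng = seed.HasValue ? new Random(seed.Value) : new Random();
    }

    // Each incoming sample is retained with probability capacity/seen,
    // so the buffer remains a uniform sample over the whole stream.
    public void Add(TSample sample)
    {
        _seen++;
        if (_buffer.Count < _capacity) { _buffer.Add(sample); return; }
        int slot = _rng.Next(_seen);
        if (slot < _capacity) _buffer[slot] = sample;
    }

    public IReadOnlyList<TSample> Samples => _buffer;
}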
NormalizeFisher
Normalize Fisher Information matrix. Default: true.
bool NormalizeFisher { get; }
Property Value
- bool
OnlineEwcGamma
Decay factor (gamma) applied to the accumulated Fisher Information in Online EWC.
T OnlineEwcGamma { get; }
Property Value
- T
PackNetPruneRatio
Pruning ratio for PackNet. Default: 0.75.
T PackNetPruneRatio { get; }
Property Value
- T
Remarks
Fraction of weights to prune after each task, freeing capacity for new tasks.
Reference: Mallya and Lazebnik "PackNet" (2018)
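A simplified sketch of the magnitude-pruning step, assuming a flat weight array; ties at the threshold may keep slightly more weights than the exact ratio. MagnitudePruning is a hypothetical helper:

using System;
using System.Linq;

public static class MagnitudePruning
{
    // Zeroes out the smallest-magnitude fraction of weights and returns a
    // mask of the weights kept (frozen) for the current task.
    public static bool[] Prune(double[] weights, double pruneRatio)
    {
        int keep = (int)Math.Round(weights.Length * (1.0 - pruneRatio));
        double threshold = weights.Select(Math.Abs)
                                  .OrderByDescending(a => a)
                                  .Take(Math.Max(keep, 1))
                                  .Last();
        var mask = new bool[weights.Length];
        for (int i = 0; i < weights.Length; i++)
        {
            mask[i] = Math.Abs(weights[i]) >= threshold;
            if (!mask[i]) weights[i] = 0.0; // freed capacity for later tasks
        }
        return mask;
    }
}

After pruning, PackNetRetrainEpochs controls how long the surviving weights are retrained to recover accuracy on the current task.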
PackNetRetrainEpochs
Retrain epochs after pruning. Default: 5.
int PackNetRetrainEpochs { get; }
Property Value
- int
PnnLateralScaling
Lateral connection scaling factor. Default: 1.0.
T PnnLateralScaling { get; }
Property Value
- T
PnnUseLateralConnections
Use lateral connections in progressive networks. Default: true.
bool PnnUseLateralConnections { get; }
Property Value
- bool
Remarks
Reference: Rusu et al. "Progressive Neural Networks" (2016)
RandomSeed
Random seed for reproducibility. Default: null (random).
int? RandomSeed { get; }
Property Value
- int?
SamplesPerTask
Number of samples per task to store in memory. Default: auto-calculated based on MemorySize.
int SamplesPerTask { get; }
Property Value
- int
SiC
SI regularization coefficient (c). Default: 0.1.
T SiC { get; }
Property Value
- T
Remarks
Reference: Zenke et al. "Continual Learning Through Synaptic Intelligence" (2017)
SiXi
SI dampening factor (xi). Default: 0.1.
T SiXi { get; }
Property Value
- T
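Putting SiC and SiXi together: SI accumulates per parameter a path integral w of loss reduction during training, then computes an importance omega = w / ((delta theta)^2 + xi); SiC scales the resulting quadratic penalty, playing the role EwcLambda plays for EWC. A minimal sketch with hypothetical names:

public static class SynapticIntelligence
{
    // omega_i = w_i / (deltaTheta_i^2 + xi): per-parameter importance,
    // damped by xi so small weight drifts do not blow up the ratio.
    public static double[] ComputeOmega(double[] pathIntegral, double[] thetaDelta, double xi)
    {
        var omega = new double[pathIntegral.Length];
        for (int i = 0; i < omega.Length; i++)
            omega[i] = pathIntegral[i] / (thetaDelta[i] * thetaDelta[i] + xi);
        return omega;
    }
}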
UseEmpiricalFisher
Use empirical Fisher (gradient squared) vs true Fisher. Default: true.
bool UseEmpiricalFisher { get; }
Property Value
- bool
UseGradientClipping
Enable gradient clipping. Default: false.
bool UseGradientClipping { get; }
Property Value
- bool
UsePrioritizedReplay
Use prioritized experience replay based on sample importance. Default: false.
bool UsePrioritizedReplay { get; }
Property Value
- bool
UseSoftTargets
Use soft targets from teacher model. Default: true.
bool UseSoftTargets { get; }
Property Value
- bool
UseWeightDecay
Enable weight decay regularization. Default: false.
bool UseWeightDecay { get; }
Property Value
- bool
WeightDecay
Weight decay coefficient. Default: 0.0001.
T WeightDecay { get; }
Property Value
- T
Methods
IsValid()
Validates the configuration.
bool IsValid()
Returns
- bool
True if the configuration is valid; otherwise, false.
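A usage sketch: call IsValid() before training starts and fail fast on a bad configuration; TrainingGuard is a hypothetical helper, not part of AiDotNet:

using System;
using AiDotNet.ContinualLearning.Interfaces;

public static class TrainingGuard
{
    // Rejects an invalid configuration up front, before any task is trained.
    public static void EnsureValid<T>(IContinualLearnerConfig<T> config)
    {
        if (!config.IsValid())
            throw new InvalidOperationException(
                "IContinualLearnerConfig validation failed; check learning rate, " +
                "batch size, and memory settings.");
    }
}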