Class EasyToHardCurriculumStrategy<T>
- Namespace
- AiDotNet.KnowledgeDistillation.Strategies
- Assembly
- AiDotNet.dll
Curriculum distillation strategy that progresses from easy to hard samples.
public class EasyToHardCurriculumStrategy<T> : CurriculumDistillationStrategyBase<T>, IDistillationStrategy<T>, ICurriculumDistillationStrategy<T>
Type Parameters
- T: The numeric type for calculations (e.g., double, float).
- Inheritance
- CurriculumDistillationStrategyBase<T>
- EasyToHardCurriculumStrategy<T>
- Implements
- IDistillationStrategy<T>
- ICurriculumDistillationStrategy<T>
Remarks
For Beginners: Easy-to-hard curriculum learning mimics how humans learn best: start with simple concepts and gradually introduce more complex ones. This strategy filters training samples and adjusts temperature based on difficulty and training progress.
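How It Works: The inclusion threshold and the temperature both change continuously with progress. The three phases below summarize the effect.
1. **Early Training** (progress 0.0-0.3):
   - Include only easy samples (difficulty up to 0.3 by the end of the phase)
   - Use high temperature (soft targets, gentle learning)
2. **Mid Training** (progress 0.3-0.7):
   - Include easy and medium samples (difficulty up to 0.7 by the end of the phase)
   - Gradually decrease temperature
3. **Late Training** (progress 0.7-1.0):
   - Include all samples by the end of training (even hard ones)
   - Use low temperature (sharp targets, challenging)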
Temperature Progression: Starts at MaxTemperature (e.g., 5.0) and linearly decreases to MinTemperature (e.g., 2.0) as training progresses. This makes distillation progressively more challenging.
Sample Filtering: At progress P, only include samples with difficulty ≤ P. Example: At 50% progress, only samples with difficulty ≤ 0.5 are included.
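The two rules above can be sketched in a few lines. This is a minimal illustration rather than the library's code: `CurriculumTemperature` and `IsWithinCurriculum` are hypothetical helper names, and the bounds 2.0 and 5.0 are simply the example values quoted above.

```csharp
using System;

// Hypothetical helper: linear interpolation from maxTemp (progress 0) to minTemp (progress 1).
static double CurriculumTemperature(double progress, double minTemp = 2.0, double maxTemp = 5.0)
{
    return maxTemp - progress * (maxTemp - minTemp);
}

// Hypothetical helper: a sample is admitted once progress reaches its difficulty.
static bool IsWithinCurriculum(double difficulty, double progress)
{
    return difficulty <= progress;
}

foreach (double progress in new[] { 0.0, 0.25, 0.5, 0.75, 1.0 })
{
    Console.WriteLine(
        $"progress={progress:F2}  temperature={CurriculumTemperature(progress):F2}  " +
        $"admits difficulty 0.5: {IsWithinCurriculum(0.5, progress)}");
}
```

Both values depend only on the progress fraction, so a single progress update per epoch is enough to drive the whole curriculum.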
Real-World Analogy: Learning mathematics: Start with addition (easy), then multiplication (medium), then algebra (hard). Don't try to teach calculus to someone who hasn't learned addition!
Best For:
- Training from scratch
- Datasets with clear difficulty levels
- Preventing the student from being overwhelmed early
- Improving convergence speed and final performance
References:
- Bengio et al. (2009). Curriculum Learning. ICML.
- Kumar et al. (2010). Self-Paced Learning for Latent Variable Models.
Constructors
EasyToHardCurriculumStrategy(double, double, double, double, int, Dictionary<int, double>?)
Initializes a new instance of the EasyToHardCurriculumStrategy class.
public EasyToHardCurriculumStrategy(double baseTemperature = 3, double alpha = 0.3, double minTemperature = 2, double maxTemperature = 5, int totalSteps = 100, Dictionary<int, double>? sampleDifficulties = null)
Parameters
- baseTemperature (double): Base temperature for distillation (default: 3.0).
- alpha (double): Balance between hard and soft loss (default: 0.3).
- minTemperature (double): Ending temperature for hard samples (default: 2.0).
- maxTemperature (double): Starting temperature for easy samples (default: 5.0).
- totalSteps (int): Total training steps/epochs (default: 100).
- sampleDifficulties (Dictionary<int, double>?): Optional pre-defined difficulty scores (0.0=easy, 1.0=hard).
Remarks
For Beginners: Set totalSteps to your number of training epochs. Optionally provide sampleDifficulties to control which samples appear when.
Example:
using System.Collections.Generic;
using AiDotNet.KnowledgeDistillation.Strategies;

// Define sample difficulties (optional; they can also be estimated automatically)
var difficulties = new Dictionary<int, double>
{
    { 0, 0.1 },  // Very easy sample
    { 1, 0.3 },  // Easy sample
    { 2, 0.5 },  // Medium sample
    { 3, 0.8 },  // Hard sample
    { 4, 0.95 }  // Very hard sample
};

const int totalEpochs = 100;

var strategy = new EasyToHardCurriculumStrategy<double>(
    minTemperature: 2.0,       // Final temperature (hard phase)
    maxTemperature: 5.0,       // Initial temperature (easy phase)
    totalSteps: totalEpochs,   // 100 epochs
    sampleDifficulties: difficulties
);

// Training loop
for (int epoch = 0; epoch < totalEpochs; epoch++)
{
    strategy.UpdateProgress(epoch);

    // trainingSamples is your own indexed collection of training data
    for (int index = 0; index < trainingSamples.Count; index++)
    {
        // Filter samples by curriculum
        if (!strategy.ShouldIncludeSample(index))
            continue; // Too hard for current stage

        var sample = trainingSamples[index];
        // Train on this sample...
    }
}
Automatic Difficulty Scoring: If you don't provide difficulties, you can estimate them from:
- Teacher confidence (lower = harder)
- Validation loss (higher = harder)
- Number of similar samples (fewer = harder, rare edge case)
- Expert annotation
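As one possible way to apply the teacher-confidence idea, the sketch below scores each sample as one minus the teacher's highest class probability. `DifficultyFromConfidence` is a hypothetical helper and the probability rows are made-up values; neither comes from the library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: difficulty = 1 - (teacher's highest class probability).
static double DifficultyFromConfidence(double[] teacherProbabilities)
{
    return 1.0 - teacherProbabilities.Max();
}

// Made-up teacher outputs for three samples (each row sums to 1).
var teacherOutputs = new List<double[]>
{
    new[] { 0.95, 0.03, 0.02 }, // confident prediction -> easy
    new[] { 0.60, 0.30, 0.10 }, // moderately confident -> medium
    new[] { 0.40, 0.35, 0.25 }  // uncertain -> hard
};

// Build the index -> difficulty map in the shape the constructor accepts.
var difficulties = new Dictionary<int, double>();
for (int i = 0; i < teacherOutputs.Count; i++)
{
    difficulties[i] = DifficultyFromConfidence(teacherOutputs[i]);
}

foreach (var pair in difficulties)
{
    Console.WriteLine($"sample {pair.Key}: difficulty {pair.Value:F2}");
}
```

The resulting dictionary has the same shape as the `sampleDifficulties` constructor argument. With K classes this score never exceeds 1 - 1/K, so you may want to rescale it to cover the full 0.0 to 1.0 range before use.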
Methods
ComputeCurriculumTemperature()
Computes curriculum temperature that decreases over time (easy to hard).
public override double ComputeCurriculumTemperature()
Returns
- double
Current temperature based on curriculum progress.
Remarks
Algorithm: Temperature decreases linearly from MaxTemperature to MinTemperature:
temp = MaxTemperature - progress * (MaxTemperature - MinTemperature)
Example with defaults (min=2.0, max=5.0):
- Progress 0.0 (start): Temp = 5.0 (very soft, easy)
- Progress 0.25: Temp = 4.25 (soft)
- Progress 0.50: Temp = 3.5 (medium)
- Progress 0.75: Temp = 2.75 (sharper)
- Progress 1.0 (end): Temp = 2.0 (sharp, hard)
Intuition:
- **High temp early**: Soft targets help the student learn basic patterns gently
- **Low temp late**: Sharp targets force the student to learn precise boundaries
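To make the soft-versus-sharp intuition concrete, the following sketch applies a standard temperature-scaled softmax to one set of logits at three temperatures from the default schedule. `SoftmaxWithTemperature` is a hypothetical helper written for this illustration, and the logits are made-up values; how the library applies temperature internally is not shown here.

```csharp
using System;
using System.Linq;

// Standard temperature-scaled softmax: softmax(logits / T).
static double[] SoftmaxWithTemperature(double[] logits, double temperature)
{
    double[] scaled = logits.Select(z => z / temperature).ToArray();
    double max = scaled.Max(); // subtract the max for numerical stability
    double[] exps = scaled.Select(z => Math.Exp(z - max)).ToArray();
    double sum = exps.Sum();
    return exps.Select(e => e / sum).ToArray();
}

double[] teacherLogits = { 4.0, 2.0, 1.0 }; // made-up values

// Start, midpoint, and end of the default temperature schedule.
foreach (double temperature in new[] { 5.0, 3.5, 2.0 })
{
    double[] probs = SoftmaxWithTemperature(teacherLogits, temperature);
    Console.WriteLine(
        $"T={temperature:F1}: " + string.Join(", ", probs.Select(p => p.ToString("F3"))));
}
```

As the temperature drops, more probability mass concentrates on the top class, which is what "sharp targets" refers to above.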
ShouldIncludeSample(int)
Determines if a sample should be included based on curriculum progress.
public override bool ShouldIncludeSample(int sampleIndex)
Parameters
sampleIndexintIndex of the sample.
Returns
- bool
True if sample difficulty is within current curriculum stage.
Remarks
Algorithm:
- Look up the sample's difficulty score, if one was provided
- Include the sample if its difficulty ≤ current progress
- Example: At 60% progress, include samples with difficulty ≤ 0.6
Progression Example (100 epochs):
- Epoch 0 (0% progress): difficulty ≤ 0.0 (almost nothing passes, so make sure some samples are scored 0.0 or left unscored)
- Epoch 25 (25%): difficulty ≤ 0.25 (easy samples)
- Epoch 50 (50%): difficulty ≤ 0.50 (easy + medium)
- Epoch 75 (75%): difficulty ≤ 0.75 (easy + medium + some hard)
- Epoch 99 (99%): difficulty ≤ 0.99 (all samples except those scored above 0.99)
Note: Samples without difficulty scores are always included (treated as appropriate for all stages).
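The inclusion rule, including the treatment of unscored samples described in the note above, might look like the sketch below. `ShouldInclude` is a hypothetical stand-in rather than the library's method, and computing progress as `epoch / totalSteps` is an assumption based on the progression example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the inclusion rule; not the library's implementation.
static bool ShouldInclude(IReadOnlyDictionary<int, double> difficulties, int sampleIndex, double progress)
{
    // Unscored samples are always included.
    if (!difficulties.TryGetValue(sampleIndex, out double difficulty))
        return true;

    return difficulty <= progress;
}

var sampleDifficulties = new Dictionary<int, double>
{
    { 0, 0.1 }, { 1, 0.3 }, { 2, 0.5 }, { 3, 0.8 }, { 4, 0.95 }
    // sample 5 deliberately has no score
};

const int totalSteps = 100;
foreach (int epoch in new[] { 0, 25, 50, 75, 99 })
{
    double progress = (double)epoch / totalSteps; // assumed progress mapping
    var included = Enumerable.Range(0, 6)
        .Where(i => ShouldInclude(sampleDifficulties, i, progress))
        .ToList();
    Console.WriteLine($"epoch {epoch}: samples [{string.Join(", ", included)}]");
}
```

Under these assumptions only the unscored sample 5 passes at epoch 0, because every scored sample has a difficulty above 0.0. If that leaves too little data early on, consider scoring the easiest samples as 0.0.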