Class DataSamplerBase

Namespace
AiDotNet.Data.Sampling
Assembly
AiDotNet.dll

Base class for all data samplers providing common functionality.

public abstract class DataSamplerBase : IDataSampler
Inheritance
object
DataSamplerBase
Implements
IDataSampler

Remarks

DataSamplerBase provides default implementations for common sampler operations like random seed management and epoch callbacks. All concrete samplers should inherit from this base class rather than implementing IDataSampler directly.

For Beginners: This base class handles the common plumbing that all samplers need, like managing random number generators for reproducibility. When creating a custom sampler, inherit from this class and override GetIndicesCore().
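
As a rough illustration of the inherit-and-override pattern described above, the sketch below defines a hypothetical sampler that yields every second index. So that it compiles on its own, it derives from `SamplerBaseSketch`, a stand-in that mirrors only the members listed on this page. The stand-in's method bodies, the `EveryOtherSampler` class, and its `datasetSize` parameter are assumptions, not part of AiDotNet. With the real library, the class would presumably derive from DataSamplerBase instead.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the documented base class so the sketch compiles on its own.
// Only members listed on this page are mirrored; the bodies are assumptions.
public abstract class SamplerBaseSketch
{
    protected Random Random;
    protected int CurrentEpoch;

    protected SamplerBaseSketch(int? seed = null)
    {
        // Seeded generator when a seed is supplied, otherwise a default one.
        Random = seed.HasValue ? new Random(seed.Value) : new Random();
    }

    public abstract int Length { get; }

    public IEnumerable<int> GetIndices() => GetIndicesCore();

    protected abstract IEnumerable<int> GetIndicesCore();
}

// Hypothetical custom sampler: yields every second index of the dataset.
public sealed class EveryOtherSampler : SamplerBaseSketch
{
    private readonly int _datasetSize;

    public EveryOtherSampler(int datasetSize, int? seed = null) : base(seed)
    {
        _datasetSize = datasetSize;
    }

    // Half the dataset, rounded up, so Length differs from the dataset size.
    public override int Length => (_datasetSize + 1) / 2;

    protected override IEnumerable<int> GetIndicesCore()
    {
        for (int i = 0; i < _datasetSize; i += 2)
        {
            yield return i;
        }
    }
}
```

Only Length and GetIndicesCore() need to be supplied by the derived class; seed handling stays in the base constructor.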

Constructors

DataSamplerBase(int?)

Initializes a new instance of the DataSamplerBase class.

protected DataSamplerBase(int? seed = null)

Parameters

seed int?

Optional random seed for reproducibility.

Fields

CurrentEpoch

The current epoch number (0-based).

protected int CurrentEpoch

Field Value

int

Random

The random number generator used for sampling.

protected Random Random

Field Value

Random

Properties

Length

Gets the total number of samples this sampler will produce per epoch.

public abstract int Length { get; }

Property Value

int

Remarks

This may differ from the dataset size for oversampling or undersampling strategies.

Methods

CreateSequentialIndices(int)

Creates a sequential array of indices from 0 to count-1.

protected static int[] CreateSequentialIndices(int count)

Parameters

count int

The number of indices to create.

Returns

int[]

An array containing indices [0, 1, 2, ..., count-1].
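
A minimal sketch of what such a helper might look like is shown here. The class name `SequentialIndexSketch` and the decision to reject negative counts are assumptions; the library's actual implementation is not shown on this page.

```csharp
using System;

public static class SequentialIndexSketch
{
    // Builds [0, 1, ..., count-1]. Rejecting negative counts is an assumed policy.
    public static int[] Create(int count)
    {
        if (count < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(count));
        }

        var indices = new int[count];
        for (int i = 0; i < count; i++)
        {
            indices[i] = i;
        }

        return indices;
    }
}
```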

GetIndices()

Returns an enumerable of indices for one epoch of sampling.

public IEnumerable<int> GetIndices()

Returns

IEnumerable<int>

An enumerable of sample indices in the order they should be processed.

Remarks

Each call to this method starts a new epoch. The returned indices determine which samples are included and in what order.

For Beginners: This method provides the "shopping list" of data points to include in this round of training. The order matters for learning!
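
To make the "which samples, in what order" point concrete, the sketch below shows one way a caller might consume an index sequence. `IndexConsumerSketch` and `Gather` are hypothetical helpers, not library members; the `indices` argument stands in for whatever GetIndices() returns.

```csharp
using System.Collections.Generic;

public static class IndexConsumerSketch
{
    // Yields data items in exactly the order given by the index sequence.
    // An index may repeat (oversampling) or be absent (undersampling).
    public static IEnumerable<T> Gather<T>(IReadOnlyList<T> data, IEnumerable<int> indices)
    {
        foreach (int i in indices)
        {
            yield return data[i];
        }
    }
}
```

For example, gathering from `{ "a", "b", "c" }` with indices `{ 2, 0, 2 }` yields "c", "a", "c": item "b" is skipped and item "c" is visited twice.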

GetIndicesCore()

Core implementation for generating indices. Override this in derived classes.

protected abstract IEnumerable<int> GetIndicesCore()

Returns

IEnumerable<int>

An enumerable of sample indices.

OnEpochStart(int)

Called at the start of each epoch to allow the sampler to adjust its behavior.

public virtual void OnEpochStart(int epoch)

Parameters

epoch int

The current epoch number (0-based).

Remarks

This method allows samplers to implement epoch-dependent behavior such as:
- Curriculum learning: adjusting difficulty thresholds as training progresses
- Self-paced learning: updating sample inclusion thresholds
- Active learning: refreshing uncertainty estimates

For Beginners: Some sampling strategies change over time. For example, curriculum learning starts with easy examples and gradually adds harder ones. This method tells the sampler "we're starting epoch N" so it can adjust accordingly.
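
The curriculum idea described above can be sketched as follows. This is a standalone illustration under stated assumptions: `CurriculumSamplerSketch` is hypothetical, does not derive from the real base class, assumes indices are already sorted from easiest to hardest, and uses a linear warm-up schedule chosen purely for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical curriculum-style sampler, kept independent of the library
// so that the sketch is self-contained.
public sealed class CurriculumSamplerSketch
{
    private readonly int _datasetSize;
    private readonly int _warmupEpochs;
    private int _currentEpoch;

    public CurriculumSamplerSketch(int datasetSize, int warmupEpochs)
    {
        _datasetSize = datasetSize;
        _warmupEpochs = Math.Max(1, warmupEpochs);
    }

    // Mirrors OnEpochStart(int): remember which epoch is starting.
    public void OnEpochStart(int epoch) => _currentEpoch = epoch;

    // Number of samples unlocked in the current epoch (grows linearly,
    // then stays at the full dataset size after the warm-up).
    public int Length
    {
        get
        {
            int step = Math.Min(_currentEpoch + 1, _warmupEpochs);
            int unlocked = Math.Max(1, _datasetSize * step / _warmupEpochs);
            return Math.Min(_datasetSize, unlocked);
        }
    }

    // Assumes index 0 is the easiest sample and higher indices are harder.
    public IEnumerable<int> GetIndices()
    {
        int count = Length;
        for (int i = 0; i < count; i++)
        {
            yield return i;
        }
    }
}
```

A training loop would call OnEpochStart(epoch) once at the top of each epoch and then iterate over GetIndices(); with 10 samples and 5 warm-up epochs, epoch 0 exposes 2 samples and epoch 4 onward exposes all 10.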

SetSeed(int)

Sets the random seed for reproducible sampling.

public virtual void SetSeed(int seed)

Parameters

seed int

The random seed value.

Remarks

Setting a seed makes the sampler produce the same index order on every run that uses that seed, which is important for reproducibility and debugging.
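
The effect of seeding can be seen with System.Random directly, as in the sketch below. `SeedDemo` and `Draw` are hypothetical names, and whether SetSeed re-creates the protected Random field in this way is an assumption, since the implementation is not shown here.

```csharp
using System;
using System.Linq;

public static class SeedDemo
{
    // Draws `count` values in [0, upperBound) from a freshly seeded generator.
    public static int[] Draw(int seed, int count, int upperBound)
    {
        var rng = new Random(seed);
        return Enumerable.Range(0, count)
            .Select(_ => rng.Next(upperBound))
            .ToArray();
    }
}
```

Two calls to `Draw` with the same seed return the same sequence on the same runtime.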

ShuffleIndices(int[])

Performs an in-place Fisher-Yates shuffle on an array of indices.

protected void ShuffleIndices(int[] indices)

Parameters

indices int[]

The array to shuffle in-place.
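
For reference, a textbook Fisher-Yates shuffle can be sketched as follows. This is not the library's source: `ShuffleSketch` is a hypothetical name, and the generator is passed in explicitly here, whereas the documented method presumably uses the protected Random field.

```csharp
using System;

public static class ShuffleSketch
{
    // Walks the array from the last position down to 1, swapping each element
    // with a uniformly chosen element at or before it.
    public static void Shuffle(int[] indices, Random rng)
    {
        for (int i = indices.Length - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1); // 0 <= j <= i
            (indices[i], indices[j]) = (indices[j], indices[i]);
        }
    }
}
```

The array is modified in place, so callers that still need the original order should copy it before shuffling.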