Class DataLoaderBase<T>

Namespace
AiDotNet.Data.Loaders
Assembly
AiDotNet.dll

Abstract base class providing common functionality for all data loaders.

public abstract class DataLoaderBase<T> : IDataLoader<T>, IResettable, ICountable

Type Parameters

T

The numeric type used for calculations, typically float or double.

Inheritance
DataLoaderBase<T>
Implements
IDataLoader<T>
IResettable
ICountable

Remarks

DataLoaderBase implements shared functionality for all data loaders, including:
- State management (loaded/unloaded)
- Iteration tracking (current index, progress)
- Reset functionality
- Thread-safe operations where needed

Domain-specific base classes (GraphDataLoaderBase, InputOutputDataLoaderBase) extend this class with specialized functionality.

For Beginners: This class handles the "boring but important" stuff that all data loaders need to do: tracking where you are in the data, resetting to start over, and making sure data is loaded before you try to use it.

When you create a custom data loader, you extend one of the domain-specific base classes (like GraphDataLoaderBase) which in turn extends this class, so you get all this functionality for free.

Constructors

DataLoaderBase(int)

Initializes a new instance of the DataLoaderBase class.

protected DataLoaderBase(int batchSize = 32)

Parameters

batchSize int

The batch size for iteration. Default is 32.

Properties

BatchCount

Gets the total number of batches based on current batch size.

public int BatchCount { get; }

Property Value

int

Remarks

This is typically ceil(TotalCount / BatchSize).
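As a rough illustration of the formula above, the sketch below computes a ceiling division on integers. In C#, dividing two int values truncates, so the rounding up has to be done explicitly. CountBatches is a hypothetical helper written for this example, not a member of DataLoaderBase<T>, and the actual property may handle edge cases differently.

```csharp
using System;

// Hypothetical helper (not part of AiDotNet): ceil(totalCount / batchSize) on ints.
static int CountBatches(int totalCount, int batchSize)
{
    if (batchSize <= 0)
        throw new ArgumentOutOfRangeException(nameof(batchSize), "Batch size must be positive.");
    if (totalCount < 0)
        throw new ArgumentOutOfRangeException(nameof(totalCount), "Total count cannot be negative.");

    // Integer division truncates, so add one batch when there is a remainder.
    return totalCount / batchSize + (totalCount % batchSize == 0 ? 0 : 1);
}

Console.WriteLine(CountBatches(100, 32)); // 4 (three full batches plus one of 4 samples)
Console.WriteLine(CountBatches(96, 32));  // 3 (divides evenly)
Console.WriteLine(CountBatches(0, 32));   // 0 (empty dataset)
```

The remainder form avoids the overflow that (totalCount + batchSize - 1) / batchSize can hit when both values are close to int.MaxValue.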

BatchSize

Gets or sets the batch size for iteration.

public virtual int BatchSize { get; set; }

Property Value

int

CurrentBatchIndex

Gets the current batch index in the iteration (0-based).

public int CurrentBatchIndex { get; protected set; }

Property Value

int

CurrentIndex

Gets the current sample index in the iteration (0-based).

public int CurrentIndex { get; protected set; }

Property Value

int

Description

Gets a description of the dataset and its intended use.

public virtual string Description { get; }

Property Value

string

IsLoaded

Gets whether the data has been loaded and is ready for iteration.

public bool IsLoaded { get; protected set; }

Property Value

bool

Name

Gets the human-readable name of this data loader.

public abstract string Name { get; }

Property Value

string

Remarks

Examples: "MNIST", "Cora Citation Network", "IMDB Reviews"

Progress

Gets the progress through the current epoch as a value from 0.0 to 1.0.

public double Progress { get; }

Property Value

double

Remarks

Returns CurrentIndex / TotalCount, useful for progress displays.

For Beginners: This tells you how far through the data you are, as a fraction between 0 and 1:
- 0.0 = just started
- 0.5 = halfway through
- 1.0 = finished (time to reset for the next epoch)
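A minimal sketch of how such a fraction could be computed is shown below. ComputeProgress is a hypothetical helper, and both the handling of an empty dataset and the clamping to the 0.0 to 1.0 range are assumptions rather than documented behavior of the Progress property.

```csharp
using System;

// Hypothetical helper (not part of AiDotNet): fraction of the epoch completed.
static double ComputeProgress(int currentIndex, int totalCount)
{
    // Assumption: an empty dataset reports no progress instead of dividing by zero.
    if (totalCount <= 0)
        return 0.0;

    // Cast before dividing; int / int would truncate to 0 for every partial epoch.
    double fraction = (double)currentIndex / totalCount;

    // Assumption: clamp so an overshooting index never reports more than 1.0.
    return Math.Min(1.0, Math.Max(0.0, fraction));
}

double start = ComputeProgress(0, 200);     // 0.0, just started
double halfway = ComputeProgress(100, 200); // 0.5, halfway through
double done = ComputeProgress(200, 200);    // 1.0, epoch finished

Console.WriteLine(start == 0.0 && halfway == 0.5 && done == 1.0); // True
```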

TotalCount

Gets the total number of samples in the dataset.

public abstract int TotalCount { get; }

Property Value

int

Methods

AdvanceBatchIndex()

Advances the batch index by one.

protected void AdvanceBatchIndex()

AdvanceIndex(int)

Advances the current index by the specified amount.

protected void AdvanceIndex(int count)

Parameters

count int

Number of samples to advance by.

EnsureLoaded()

Ensures data is loaded before operations that require it.

protected void EnsureLoaded()

Exceptions

InvalidOperationException

Thrown when data is not loaded.
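The guard described above can be sketched as follows. SketchLoader, MarkLoaded, and FirstSample are hypothetical stand-ins written for this example, and the exception message is an assumption; only the exception type comes from the documentation.

```csharp
using System;

var loader = new SketchLoader();

try
{
    loader.FirstSample(); // not loaded yet, so the guard throws
}
catch (InvalidOperationException ex)
{
    Console.WriteLine($"Guard rejected the call: {ex.Message}");
}

loader.MarkLoaded();
Console.WriteLine(loader.FirstSample()); // 10, the call succeeds once loaded

// Hypothetical stand-in class, not the AiDotNet implementation.
class SketchLoader
{
    private readonly int[] _samples = { 10, 20, 30 };

    public bool IsLoaded { get; private set; }

    // Stands in for a completed LoadAsync() call.
    public void MarkLoaded() => IsLoaded = true;

    // Assumed guard logic: refuse to continue while the data is not loaded.
    protected void EnsureLoaded()
    {
        if (!IsLoaded)
            throw new InvalidOperationException("Data is not loaded. Call LoadAsync() first.");
    }

    public int FirstSample()
    {
        EnsureLoaded(); // every data-access member starts with the guard
        return _samples[0];
    }
}
```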

LoadAsync(CancellationToken)

Loads the data asynchronously, preparing it for iteration.

public Task LoadAsync(CancellationToken cancellationToken = default)

Parameters

cancellationToken CancellationToken

Token to cancel the loading operation.

Returns

Task

A task that completes when loading is finished.

Remarks

This method should be called before attempting to iterate through data. It may perform operations like:
- Reading files from disk
- Downloading data if implementing IDownloadable
- Parsing and preprocessing data
- Building indices for efficient access

For Beginners: Call this once at the start to prepare your data. It's async so your program stays responsive while loading large datasets.

LoadDataCoreAsync(CancellationToken)

Core data loading implementation to be provided by derived classes.

protected abstract Task LoadDataCoreAsync(CancellationToken cancellationToken)

Parameters

cancellationToken CancellationToken

Cancellation token for async operation.

Returns

Task

A task that completes when loading is finished.

Remarks

Derived classes must implement this to perform actual data loading:
- Load from files, databases, or remote sources
- Parse and validate data format
- Store in appropriate internal structures
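The split between the public LoadAsync and the protected LoadDataCoreAsync resembles the template method pattern. The sketch below shows one way that split could work. SketchLoaderBase and InMemoryLoader are hypothetical stand-ins, and details such as skipping a second load are assumptions; the real DataLoaderBase<T> may behave differently.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

var loader = new InMemoryLoader();
Console.WriteLine(loader.IsLoaded);   // False
await loader.LoadAsync();
Console.WriteLine(loader.IsLoaded);   // True
Console.WriteLine(loader.TotalCount); // 3

// Hypothetical stand-in for the base class, not the AiDotNet implementation.
abstract class SketchLoaderBase
{
    public bool IsLoaded { get; protected set; }

    // Public entry point: runs the derived loading step, then records the state.
    public async Task LoadAsync(CancellationToken cancellationToken = default)
    {
        if (IsLoaded)
            return; // assumption: loading an already loaded dataset is a no-op

        cancellationToken.ThrowIfCancellationRequested();
        await LoadDataCoreAsync(cancellationToken);
        IsLoaded = true;
    }

    // Derived classes supply the actual loading work.
    protected abstract Task LoadDataCoreAsync(CancellationToken cancellationToken);
}

// Hypothetical derived loader that "loads" three in-memory samples.
sealed class InMemoryLoader : SketchLoaderBase
{
    private double[] _samples = Array.Empty<double>();

    public int TotalCount => _samples.Length;

    protected override async Task LoadDataCoreAsync(CancellationToken cancellationToken)
    {
        await Task.Delay(10, cancellationToken); // stands in for file or network I/O
        _samples = new[] { 1.0, 2.0, 3.0 };
    }
}
```

Keeping the state flag in the base class means a derived loader only has to describe how to fetch and store its data.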

OnReset()

Called after Reset() to allow derived classes to perform additional reset operations.

protected virtual void OnReset()

Remarks

Override this to reset any domain-specific state. The base indices are already reset when this is called.
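The ordering described above, where base indices are cleared before the hook runs, might look like the sketch below. SketchLoaderBase, CountingLoader, Step, and the two tracking properties are hypothetical names used only for illustration; the body of the real Reset() is not shown in this documentation.

```csharp
using System;

var loader = new CountingLoader();
loader.Step(5);
Console.WriteLine(loader.CurrentIndex);    // 5
loader.Reset();
Console.WriteLine(loader.CurrentIndex);    // 0
Console.WriteLine(loader.IndexSeenByHook); // 0, the hook ran after the indices were cleared
Console.WriteLine(loader.ResetCount);      // 1

// Hypothetical stand-in for the base class, not the AiDotNet implementation.
abstract class SketchLoaderBase
{
    public int CurrentIndex { get; protected set; }
    public int CurrentBatchIndex { get; protected set; }

    public virtual void Reset()
    {
        // Assumed order: clear the shared iteration state first...
        CurrentIndex = 0;
        CurrentBatchIndex = 0;

        // ...then give derived classes a chance to reset their own state.
        OnReset();
    }

    protected virtual void OnReset() { }

    protected void AdvanceIndex(int count) => CurrentIndex += count;
}

// Hypothetical derived loader with its own domain-specific state.
sealed class CountingLoader : SketchLoaderBase
{
    public int ResetCount { get; private set; }
    public int IndexSeenByHook { get; private set; } = -1;

    public void Step(int count) => AdvanceIndex(count);

    protected override void OnReset()
    {
        IndexSeenByHook = CurrentIndex; // already 0 at this point
        ResetCount++;
    }
}
```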

Reset()

Resets the iteration state to the beginning of the data.

public virtual void Reset()

Remarks

After calling Reset(), the next call to GetNextBatch() will return the first batch from the beginning of the dataset.

Unload()

Unloads the data and releases associated resources.

public virtual void Unload()

Remarks

Call this when you're done with the dataset to free memory. The loader can be loaded again by calling LoadAsync().

UnloadDataCore()

Core data unloading implementation to be provided by derived classes.

protected abstract void UnloadDataCore()

Remarks

Derived classes should implement this to release resources:
- Clear internal data structures
- Release file handles or connections
- Allow garbage collection of loaded data