Class DataLoaderBase<T>
Abstract base class providing common functionality for all data loaders.
public abstract class DataLoaderBase<T> : IDataLoader<T>, IResettable, ICountable
Type Parameters
- T
The numeric type used for calculations, typically float or double.
- Inheritance
object
DataLoaderBase<T>
- Implements
IDataLoader<T>
IResettable
ICountable
Remarks
DataLoaderBase implements functionality shared by all data loaders, including:
- State management (loaded/unloaded)
- Iteration tracking (current index, progress)
- Reset functionality
- Thread-safe operations where needed
Domain-specific base classes (GraphDataLoaderBase, InputOutputDataLoaderBase) extend this class with specialized functionality.
For Beginners: This class handles the "boring but important" stuff that all data loaders need to do: tracking where you are in the data, resetting to start over, and making sure data is loaded before you try to use it.
When you create a custom data loader, you extend one of the domain-specific base classes (like GraphDataLoaderBase) which in turn extends this class, so you get all this functionality for free.
Constructors
DataLoaderBase(int)
Initializes a new instance of the DataLoaderBase class.
protected DataLoaderBase(int batchSize = 32)
Parameters
- batchSize int
The batch size for iteration. Default is 32.
Properties
BatchCount
Gets the total number of batches based on current batch size.
public int BatchCount { get; }
Property Value
- int
Remarks
This is typically ceil(TotalCount / BatchSize).
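In C#, dividing two int values truncates, so the ceiling is usually computed with integer arithmetic. The following is a minimal sketch of that calculation, assuming a hypothetical helper named ComputeBatchCount; it is not part of the documented API, and how the library itself computes BatchCount is not shown on this page.

```csharp
using System;

// Hypothetical helper: integer form of ceil(totalCount / batchSize).
int ComputeBatchCount(int totalCount, int batchSize)
{
    if (totalCount < 0)
        throw new ArgumentOutOfRangeException(nameof(totalCount), "Total count cannot be negative.");
    if (batchSize <= 0)
        throw new ArgumentOutOfRangeException(nameof(batchSize), "Batch size must be positive.");

    // (a + b - 1) / b rounds up for non-negative a and positive b.
    // Note: the sum can overflow for values close to int.MaxValue.
    return (totalCount + batchSize - 1) / batchSize;
}

Console.WriteLine(ComputeBatchCount(100, 32)); // 4: three full batches plus one batch of 4
Console.WriteLine(ComputeBatchCount(96, 32));  // 3: divides evenly
Console.WriteLine(ComputeBatchCount(0, 32));   // 0: empty dataset
```

The last batch may be smaller than BatchSize, which is why the count rounds up rather than down.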
BatchSize
Gets or sets the batch size for iteration.
public virtual int BatchSize { get; set; }
Property Value
- int
CurrentBatchIndex
Gets the current batch index in the iteration (0-based).
public int CurrentBatchIndex { get; protected set; }
Property Value
- int
CurrentIndex
Gets the current sample index in the iteration (0-based).
public int CurrentIndex { get; protected set; }
Property Value
- int
Description
Gets a description of the dataset and its intended use.
public virtual string Description { get; }
Property Value
- string
IsLoaded
Gets whether the data has been loaded and is ready for iteration.
public bool IsLoaded { get; protected set; }
Property Value
- bool
Name
Gets the human-readable name of this data loader.
public abstract string Name { get; }
Property Value
- string
Remarks
Examples: "MNIST", "Cora Citation Network", "IMDB Reviews"
Progress
Gets the progress through the current epoch as a value from 0.0 to 1.0.
public double Progress { get; }
Property Value
- double
Remarks
Returns CurrentIndex / TotalCount, useful for progress displays.
For Beginners: This gives you the fraction of the data you have worked through so far:
- 0.0 = just started
- 0.5 = halfway through
- 1.0 = finished (time to reset for the next epoch)
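Because CurrentIndex and TotalCount are both int, the division has to be done in floating point to produce a fraction. The sketch below shows one way to compute such a value, assuming a hypothetical helper named ComputeProgress; the zero-count guard and the clamping are assumptions for illustration, not documented behavior.

```csharp
using System;

// Hypothetical helper: fraction of the epoch completed, in the range 0.0 to 1.0.
double ComputeProgress(int currentIndex, int totalCount)
{
    // Assumption: an empty dataset reports no progress instead of dividing by zero.
    if (totalCount <= 0)
        return 0.0;

    // Cast before dividing; int / int would truncate to 0 for every partial epoch.
    double fraction = (double)currentIndex / totalCount;

    // Assumption: keep the result inside the documented 0.0 to 1.0 range.
    return Math.Clamp(fraction, 0.0, 1.0);
}

Console.WriteLine(ComputeProgress(0, 200) == 0.0);   // True: just started
Console.WriteLine(ComputeProgress(100, 200) == 0.5); // True: halfway through
Console.WriteLine(ComputeProgress(200, 200) == 1.0); // True: finished
```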
TotalCount
Gets the total number of samples in the dataset.
public abstract int TotalCount { get; }
Property Value
- int
Methods
AdvanceBatchIndex()
Advances the batch index by one.
protected void AdvanceBatchIndex()
AdvanceIndex(int)
Advances the current index by the specified amount.
protected void AdvanceIndex(int count)
Parameters
- count int
The number of samples to advance by.
EnsureLoaded()
Ensures data is loaded before operations that require it.
protected void EnsureLoaded()
Exceptions
- InvalidOperationException
Thrown when data is not loaded.
LoadAsync(CancellationToken)
Loads the data asynchronously, preparing it for iteration.
public Task LoadAsync(CancellationToken cancellationToken = default)
Parameters
- cancellationToken CancellationToken
Token to cancel the loading operation.
Returns
- Task
A task that completes when loading is finished.
Remarks
This method should be called before attempting to iterate through data. It may perform operations like:
- Reading files from disk
- Downloading data if implementing IDownloadable
- Parsing and preprocessing data
- Building indices for efficient access
For Beginners: Call this once at the start to prepare your data. It's async so your program stays responsive while loading large datasets.
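The cancellation token lets a caller abandon a slow load. The sketch below illustrates the calling pattern only: SimulatedLoadAsync is a hypothetical stand-in for a real loader's LoadAsync, and TryLoadAsync is an illustrative wrapper, neither of which is part of the documented API.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for loader.LoadAsync(token): simulates slow I/O that honors cancellation.
async Task SimulatedLoadAsync(CancellationToken cancellationToken = default)
{
    await Task.Delay(TimeSpan.FromMilliseconds(50), cancellationToken);
}

// Returns true if loading finished, false if it was cancelled first.
async Task<bool> TryLoadAsync(CancellationToken cancellationToken)
{
    try
    {
        await SimulatedLoadAsync(cancellationToken);
        return true;
    }
    catch (OperationCanceledException)
    {
        // Task.Delay reports cancellation with TaskCanceledException, a subclass of this type.
        return false;
    }
}

// A generous timeout: the simulated 50 ms load completes well within 5 seconds.
using var timeout = new CancellationTokenSource(TimeSpan.FromSeconds(5));
Console.WriteLine(await TryLoadAsync(timeout.Token)); // True

// A token that is already cancelled: the load is abandoned immediately.
Console.WriteLine(await TryLoadAsync(new CancellationToken(canceled: true))); // False
```

With a real loader, the same token would typically be forwarded to file or network calls inside the loading logic so that cancellation takes effect promptly.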
LoadDataCoreAsync(CancellationToken)
Core data loading implementation to be provided by derived classes.
protected abstract Task LoadDataCoreAsync(CancellationToken cancellationToken)
Parameters
- cancellationToken CancellationToken
Cancellation token for the async operation.
Returns
- Task
A task that completes when loading is finished.
Remarks
Derived classes must implement this to perform actual data loading:
- Load from files, databases, or remote sources
- Parse and validate data format
- Store in appropriate internal structures
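The split between the public LoadAsync and the protected LoadDataCoreAsync resembles the template method pattern. The sketch below illustrates that arrangement under stated assumptions: SketchLoaderBase is a simplified, non-generic stand-in written for this example, and its method bodies are guesses at plausible behavior, not the library's implementation. InMemoryLoader and GetSample are likewise hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the documented base class; internals are assumed.
abstract class SketchLoaderBase
{
    public bool IsLoaded { get; protected set; }

    // Public entry point: runs the derived loading logic, then marks the loader as loaded.
    public async Task LoadAsync(CancellationToken cancellationToken = default)
    {
        if (IsLoaded) return; // assumption: loading twice is a no-op
        await LoadDataCoreAsync(cancellationToken);
        IsLoaded = true;
    }

    // Public entry point: releases derived resources, then marks the loader as unloaded.
    public void Unload()
    {
        if (!IsLoaded) return;
        UnloadDataCore();
        IsLoaded = false;
    }

    // Guard used by members that need data to be present.
    protected void EnsureLoaded()
    {
        if (!IsLoaded)
            throw new InvalidOperationException("Data is not loaded. Call LoadAsync() first.");
    }

    protected abstract Task LoadDataCoreAsync(CancellationToken cancellationToken);
    protected abstract void UnloadDataCore();
}

// Hypothetical derived loader that keeps a few numbers in memory.
class InMemoryLoader : SketchLoaderBase
{
    private double[] _samples = Array.Empty<double>();

    public int TotalCount => _samples.Length;

    public double GetSample(int index)
    {
        EnsureLoaded();
        return _samples[index];
    }

    protected override Task LoadDataCoreAsync(CancellationToken cancellationToken)
    {
        cancellationToken.ThrowIfCancellationRequested();
        _samples = new[] { 1.0, 2.0, 3.0 }; // a real loader would read and parse a source here
        return Task.CompletedTask;
    }

    protected override void UnloadDataCore() => _samples = Array.Empty<double>();
}

static class Program
{
    static async Task Main()
    {
        var loader = new InMemoryLoader();
        try { loader.GetSample(0); }
        catch (InvalidOperationException ex) { Console.WriteLine(ex.Message); } // guard fires before loading

        await loader.LoadAsync();
        Console.WriteLine(loader.GetSample(1)); // 2

        loader.Unload();
        Console.WriteLine(loader.IsLoaded); // False
    }
}
```

Keeping the state flag in the base class means a derived loader only describes how to fetch and release its data, which matches the division of responsibilities described in the remarks.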
OnReset()
Called by Reset() after the base indices have been reset, so that derived classes can perform additional reset operations.
protected virtual void OnReset()
Remarks
Override this to reset any domain-specific state. The base indices are already reset when this is called.
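The sketch below shows how such a hook might be used. SketchIterationBase is a simplified stand-in written for this example, and the body of its Reset() is an assumption based on the remarks above; TrackingLoader, ConsumeBatch, ServedBatches, and IndexSeenByHook are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the iteration-tracking part of the base class.
abstract class SketchIterationBase
{
    public int CurrentIndex { get; protected set; }
    public int CurrentBatchIndex { get; protected set; }

    protected void AdvanceIndex(int count) => CurrentIndex += count;
    protected void AdvanceBatchIndex() => CurrentBatchIndex++;

    public virtual void Reset()
    {
        // Base state is cleared first, so the hook sees zeroed indices.
        CurrentIndex = 0;
        CurrentBatchIndex = 0;
        OnReset();
    }

    protected virtual void OnReset() { }
}

// Hypothetical derived loader with its own per-epoch state.
class TrackingLoader : SketchIterationBase
{
    private readonly List<int> _servedBatchSizes = new List<int>();

    public int ServedBatches => _servedBatchSizes.Count;
    public int IndexSeenByHook { get; private set; } = -1;

    public void ConsumeBatch(int size)
    {
        AdvanceIndex(size);
        AdvanceBatchIndex();
        _servedBatchSizes.Add(size);
    }

    protected override void OnReset()
    {
        IndexSeenByHook = CurrentIndex; // records that the index is already 0 here
        _servedBatchSizes.Clear();      // domain-specific state owned by this class
    }
}

static class Program
{
    static void Main()
    {
        var loader = new TrackingLoader();
        loader.ConsumeBatch(32);
        loader.ConsumeBatch(32);
        Console.WriteLine($"{loader.CurrentIndex} {loader.CurrentBatchIndex} {loader.ServedBatches}"); // 64 2 2

        loader.Reset();
        Console.WriteLine($"{loader.CurrentIndex} {loader.CurrentBatchIndex} {loader.ServedBatches}"); // 0 0 0
        Console.WriteLine(loader.IndexSeenByHook); // 0
    }
}
```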
Reset()
Resets the iteration state to the beginning of the data.
public virtual void Reset()
Remarks
After calling Reset(), the next call to GetNextBatch() will return the first batch from the beginning of the dataset.
Unload()
Unloads the data and releases associated resources.
public virtual void Unload()
Remarks
Call this when you're done with the dataset to free memory. The loader can be loaded again by calling LoadAsync().
UnloadDataCore()
Core data unloading implementation to be provided by derived classes.
protected abstract void UnloadDataCore()
Remarks
Derived classes should implement this to release resources:
- Clear internal data structures
- Release file handles or connections
- Allow garbage collection of loaded data