
Interface IInputOutputDataLoader<T, TInput, TOutput>

Namespace
AiDotNet.Interfaces
Assembly
AiDotNet.dll

Interface for data loaders that provide standard input-output (X, Y) data for supervised learning.

public interface IInputOutputDataLoader<T, TInput, TOutput> : IDataLoader<T>, IResettable, ICountable, IBatchIterable<(TInput Features, TOutput Labels)>, IShuffleable

Type Parameters

T

The numeric type used for calculations, typically float or double.

TInput

The input data type (e.g., Matrix<T>, Tensor<T>).

TOutput

The output data type (e.g., Vector<T>, Tensor<T>).


Remarks

This interface is for standard supervised learning scenarios where you have:

  • Input features (X): The data used to make predictions
  • Output labels (Y): The correct answers the model should learn to predict

For Beginners: Most machine learning tasks fall into this pattern:

Example: House Price Prediction

  • X (inputs): Square footage, number of bedrooms, location, age
  • Y (outputs): The actual house price

Example: Email Spam Detection

  • X (inputs): Email text features (word counts, sender info, etc.)
  • Y (outputs): Label (spam=1, not spam=0)

The data loader loads this data from files, databases, or other sources and provides it in the format your model needs for training.
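To make the X/Y pairing concrete, here is a minimal sketch of the house price example using plain arrays. The values and variable names are hypothetical and the code does not use any AiDotNet type; it only illustrates that each row of X lines up with one entry of Y, and that the number of columns is what a property such as FeatureCount would conceptually report.

```csharp
using System;

// Hypothetical toy data, not taken from AiDotNet. Each row of x is one house;
// the columns are square footage, number of bedrooms, and age in years.
double[][] x =
{
    new[] { 1400.0, 3.0, 20.0 },
    new[] { 2100.0, 4.0, 5.0 },
    new[] { 850.0, 2.0, 42.0 },
};

// y holds one label per row of x: the actual sale price of that house.
double[] y = { 240_000.0, 410_000.0, 150_000.0 };

int sampleCount = x.Length;      // number of samples (rows)
int featureCount = x[0].Length;  // number of features per sample (columns)

Console.WriteLine($"{sampleCount} samples, {featureCount} features each");
```

A real loader would typically expose the same data as Matrix<T> and Vector<T> (or tensors) rather than raw arrays.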

Properties

FeatureCount

Gets the number of features per sample.

int FeatureCount { get; }

Property Value

int

Features

Gets all input features as a single data structure.

TInput Features { get; }

Property Value

TInput

Remarks

This provides access to the complete feature set. For large datasets, prefer using batch iteration methods instead of loading everything at once.

Labels

Gets all output labels as a single data structure.

TOutput Labels { get; }

Property Value

TOutput

Remarks

This provides access to all labels. For large datasets, prefer using batch iteration methods instead of loading everything at once.

OutputDimension

Gets the number of output dimensions (1 for regression/binary classification, N for multi-class with N classes).

int OutputDimension { get; }

Property Value

int
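The sketch below illustrates how the output dimension could differ between a regression target and a multi-class target. It assumes multi-class labels are one-hot encoded, which this page does not confirm; the OneHot helper is hypothetical and not part of AiDotNet.

```csharp
using System;

// Hypothetical helper (an assumption, not an AiDotNet API): turns a class
// index into a one-hot vector whose length equals the number of classes.
double[] OneHot(int classIndex, int classCount)
{
    if (classIndex < 0 || classIndex >= classCount)
        throw new ArgumentOutOfRangeException(nameof(classIndex));

    var vector = new double[classCount];
    vector[classIndex] = 1.0;
    return vector;
}

double[] regressionTarget = { 310_000.0 };  // one value per sample: dimension 1
double[] multiClassTarget = OneHot(2, 4);   // 4 classes: dimension 4

Console.WriteLine($"{regressionTarget.Length} vs {multiClassTarget.Length}");
```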

Methods

Split(double, double, int?)

Creates a train/validation/test split of the data.

(IInputOutputDataLoader<T, TInput, TOutput> Train, IInputOutputDataLoader<T, TInput, TOutput> Validation, IInputOutputDataLoader<T, TInput, TOutput> Test) Split(double trainRatio = 0.7, double validationRatio = 0.15, int? seed = null)

Parameters

trainRatio double

Fraction of data for training (0.0 to 1.0).

validationRatio double

Fraction of data for validation (0.0 to 1.0).

seed int?

Optional random seed for reproducible splits.

Returns

(IInputOutputDataLoader<T, TInput, TOutput> Train, IInputOutputDataLoader<T, TInput, TOutput> Validation, IInputOutputDataLoader<T, TInput, TOutput> Test)

A tuple containing three data loaders: (train, validation, test).

Remarks

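The test ratio is implicitly 1 - trainRatio - validationRatio, so trainRatio + validationRatio should not exceed 1.0. With the default values (0.7 and 0.15), the test set receives the remaining 0.15 of the data.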

For Beginners: Splitting data is crucial for evaluating your model:

  • Training set: Data the model learns from
  • Validation set: Data used to tune hyperparameters and prevent overfitting
  • Test set: Data used only once at the end to get an unbiased performance estimate

Common splits are 60/20/20 or 70/15/15 (train/validation/test).
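As an illustration of how the ratios and the seed might translate into three disjoint subsets, here is a minimal sketch that splits sample indices rather than a real data loader. SplitIndices is a hypothetical helper, not AiDotNet's implementation; the shuffle-then-slice approach and the truncating rounding are assumptions made for the example.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not AiDotNet's implementation): shuffles the indices
// 0..count-1 and cuts them into train/validation/test groups.
(int[] Train, int[] Validation, int[] Test) SplitIndices(
    int count, double trainRatio = 0.7, double validationRatio = 0.15, int? seed = null)
{
    if (trainRatio < 0 || validationRatio < 0 || trainRatio + validationRatio > 1.0)
        throw new ArgumentException("Ratios must be non-negative and sum to at most 1.0.");

    // A fixed seed makes the shuffle, and therefore the split, reproducible.
    var rng = seed.HasValue ? new Random(seed.Value) : new Random();
    int[] indices = Enumerable.Range(0, count).ToArray();

    // Fisher-Yates shuffle.
    for (int i = indices.Length - 1; i > 0; i--)
    {
        int j = rng.Next(i + 1);
        (indices[i], indices[j]) = (indices[j], indices[i]);
    }

    // Truncate toward zero; the test set takes whatever remains.
    int trainCount = (int)(count * trainRatio);
    int validationCount = Math.Min((int)(count * validationRatio), count - trainCount);

    return (
        indices[..trainCount],
        indices[trainCount..(trainCount + validationCount)],
        indices[(trainCount + validationCount)..]);
}

var (train, validation, test) = SplitIndices(100, seed: 42);
Console.WriteLine($"{train.Length} / {validation.Length} / {test.Length}");
```

With 100 samples and the default ratios, the three groups contain 70, 15, and 15 indices. Calling the helper again with the same seed yields the same assignment, which is the purpose of the optional seed parameter.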