Class LeafRedditFederatedDatasetLoader

Namespace
AiDotNet.FederatedLearning.Benchmarks.Leaf
Assembly
AiDotNet.dll

Loads the LEAF Reddit benchmark JSON files into per-client token-sequence datasets.

public sealed class LeafRedditFederatedDatasetLoader
Inheritance
LeafRedditFederatedDatasetLoader

Remarks

The LEAF Reddit preprocessing pipeline stores each sample as a list of token chunks (x) and a metadata object (y). The metadata object contains target_tokens (the shifted next-token targets) and, optionally, count_tokens. This loader converts each sample into a single fixed-length token sequence paired with a single next-token label. In v1, that label is the last non-pad target token.
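The v1 label rule described above can be illustrated with a minimal sketch. This is not the loader's actual implementation: the class and method names are hypothetical, the `<PAD>` padding token is an assumption about the preprocessed data, and target_tokens is assumed to be chunked in the same way as x.

```csharp
#nullable enable
using System;

public static class LastNonPadLabelSketch
{
    // Assumption: "<PAD>" is the padding token used by the preprocessed data.
    public const string DefaultPadToken = "<PAD>";

    // Scans the chunked targets from the end and returns the first token
    // that is not padding, or null when every token is padding.
    public static string? LastNonPadToken(string[][] targetTokens, string padToken = DefaultPadToken)
    {
        for (int chunk = targetTokens.Length - 1; chunk >= 0; chunk--)
        {
            string[] tokens = targetTokens[chunk];
            for (int i = tokens.Length - 1; i >= 0; i--)
            {
                if (!string.Equals(tokens[i], padToken, StringComparison.Ordinal))
                {
                    return tokens[i];
                }
            }
        }

        return null;
    }
}
```

For example, targets of `[["b", "c", "<EOS>"], ["<PAD>", "<PAD>", "<PAD>"]]` would yield `<EOS>` under this rule, because the trailing chunk consists only of padding.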

For Beginners: The Reddit dataset is very large. This loader can load only a subset of users and sample per user, which keeps benchmark checks small enough to run in CI.
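To make the idea of "a subset of users, sampled per user" concrete, here is a small standalone sketch. It does not use the loader or LeafFederatedDatasetLoadOptions, whose members are not listed on this page. The class name, the parameter names (maxUsers, maxSamplesPerUser, seed) and the selection strategy are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubsetSamplingSketch
{
    // Keeps at most maxUsers users (in ordinal key order) and, for each kept user,
    // at most maxSamplesPerUser samples chosen by a seeded shuffle.
    public static Dictionary<string, List<T>> Subsample<T>(
        IReadOnlyDictionary<string, List<T>> perUser,
        int maxUsers,
        int maxSamplesPerUser,
        int seed)
    {
        var rng = new Random(seed);
        var result = new Dictionary<string, List<T>>();

        foreach (string user in perUser.Keys.OrderBy(k => k, StringComparer.Ordinal).Take(maxUsers))
        {
            // OrderBy evaluates the random key once per element, so this acts as a shuffle.
            List<T> shuffled = perUser[user].OrderBy(_ => rng.Next()).ToList();
            result[user] = shuffled.Take(maxSamplesPerUser).ToList();
        }

        return result;
    }
}
```

Fixing the seed makes the reduced benchmark repeatable within a given runtime, which is usually what a CI check needs.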

Methods

LoadDatasetFromFiles(string, string?, LeafFederatedDatasetLoadOptions?)

Loads a LEAF Reddit train dataset and optional test dataset from files.

public LeafFederatedDataset<string[][], string[]> LoadDatasetFromFiles(string trainFilePath, string? testFilePath = null, LeafFederatedDatasetLoadOptions? options = null)

Parameters

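trainFilePath string
Path to the LEAF Reddit train JSON file.
testFilePath string?
Optional path to the LEAF Reddit test JSON file. Defaults to null.
options LeafFederatedDatasetLoadOptions?
Optional load options. Defaults to null.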

Returns

LeafFederatedDataset<string[][], string[]>

LoadSplitFromFile(string, LeafFederatedDatasetLoadOptions?)

Loads a LEAF Reddit split (train/test) from a JSON file.

public LeafFederatedSplit<string[][], string[]> LoadSplitFromFile(string filePath, LeafFederatedDatasetLoadOptions? options = null)

Parameters

filePath string
Path to the JSON file containing the split.
options LeafFederatedDatasetLoadOptions?
Optional load options. Defaults to null.

Returns

LeafFederatedSplit<string[][], string[]>

LoadSplitFromJson(string, LeafFederatedDatasetLoadOptions?)

Loads a LEAF Reddit split (train/test) from a JSON string.

public LeafFederatedSplit<string[][], string[]> LoadSplitFromJson(string json, LeafFederatedDatasetLoadOptions? options = null)

Parameters

json string
JSON string containing the split.
options LeafFederatedDatasetLoadOptions?
Optional load options. Defaults to null.

Returns

LeafFederatedSplit<string[][], string[]>
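As a rough illustration of the kind of JSON a LEAF Reddit split contains, the sketch below builds a tiny document and walks it with System.Text.Json. The top-level keys (users, num_samples, user_data) follow the general LEAF convention and are an assumption here, as are the special tokens. The x and target_tokens fields follow the Remarks above. The class and method names are hypothetical, and the code does not call the loader.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class LeafRedditJsonSketch
{
    // One user with one sample; x and target_tokens are lists of token chunks.
    public const string SampleJson = @"{
      ""users"": [""user_a""],
      ""num_samples"": [1],
      ""user_data"": {
        ""user_a"": {
          ""x"": [ [ [""<BOS>"", ""hello"", ""world""], [""<PAD>"", ""<PAD>"", ""<PAD>""] ] ],
          ""y"": [ { ""target_tokens"": [ [""hello"", ""world"", ""<EOS>""], [""<PAD>"", ""<PAD>"", ""<PAD>""] ] } ]
        }
      }
    }";

    // Counts the samples stored under each user in user_data.
    public static Dictionary<string, int> CountSamplesPerUser(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        var counts = new Dictionary<string, int>();

        foreach (JsonProperty user in doc.RootElement.GetProperty("user_data").EnumerateObject())
        {
            counts[user.Name] = user.Value.GetProperty("x").GetArrayLength();
        }

        return counts;
    }
}
```

A string shaped like SampleJson is the sort of input one would expect to pass to LoadSplitFromJson, although the exact fields the loader requires are not specified on this page.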