Class LeafTokenSequenceFederatedDatasetLoader
- Namespace
- AiDotNet.FederatedLearning.Benchmarks.Leaf
- Assembly
- AiDotNet.dll
Loads LEAF-style JSON files that store token sequences (x) and next-token labels (y).
public sealed class LeafTokenSequenceFederatedDatasetLoader
- Inheritance
- object
LeafTokenSequenceFederatedDatasetLoader
Remarks
This loader is intentionally generic: it expects the standard LEAF container shape
(users/num_samples/user_data) where each user's x is a list of token sequences
(arrays of strings) and each y is a list of label tokens (strings).
For Beginners: Some federated text benchmarks are easiest to represent as "predict the next token". This loader keeps the per-user splits intact so federated simulations match the benchmark partitioning.
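The container shape described above can be illustrated without the loader itself. The following is a minimal sketch that reads a hand-written sample with System.Text.Json (it assumes .NET 6 or later and C# 11 for the raw string literal). The user id, tokens, and labels are hypothetical, and pairing x[i] with y[i] is an assumption based on the usual LEAF convention; this is not the loader's actual implementation.

```csharp
using System;
using System.Text.Json;

// Hypothetical sample: one user with two token sequences and two label tokens.
const string json = """
{
  "users": ["user_0"],
  "num_samples": [2],
  "user_data": {
    "user_0": {
      "x": [["the", "quick", "brown"], ["quick", "brown", "fox"]],
      "y": ["fox", "jumps"]
    }
  }
}
""";

using JsonDocument doc = JsonDocument.Parse(json);
JsonElement root = doc.RootElement;

foreach (JsonElement user in root.GetProperty("users").EnumerateArray())
{
    string userId = user.GetString()!;
    JsonElement data = root.GetProperty("user_data").GetProperty(userId);

    // x is a list of token sequences (string[][]); y is a list of label tokens (string[]).
    string[][] x = data.GetProperty("x").Deserialize<string[][]>()!;
    string[] y = data.GetProperty("y").Deserialize<string[]>()!;

    // Assumption: the i-th label belongs to the i-th sequence.
    for (int i = 0; i < x.Length; i++)
    {
        Console.WriteLine($"{userId}: [{string.Join(" ", x[i])}] -> {y[i]}");
    }
}
```

Each user keeps its own x and y lists, which is what lets a federated simulation treat every user as a separate client.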
Methods
LoadDatasetFromFiles(string, string?, LeafFederatedDatasetLoadOptions?)
Loads a LEAF token-sequence training dataset and, optionally, a test dataset from files.
public LeafFederatedDataset<string[][], string[]> LoadDatasetFromFiles(string trainFilePath, string? testFilePath = null, LeafFederatedDatasetLoadOptions? options = null)
Parameters
- trainFilePath (string)
- testFilePath (string?): optional; defaults to null
- options (LeafFederatedDatasetLoadOptions?): optional; defaults to null
Returns
- LeafFederatedDataset<string[][], string[]>
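A possible call, sketched from the signature above. The file paths are hypothetical, and the sketch assumes the loader exposes a public parameterless constructor, which this page does not confirm.

```csharp
using AiDotNet.FederatedLearning.Benchmarks.Leaf;

// Assumption: a public parameterless constructor is available.
var loader = new LeafTokenSequenceFederatedDatasetLoader();

// Hypothetical paths; point these at your own LEAF JSON files.
var dataset = loader.LoadDatasetFromFiles(
    trainFilePath: "data/train.json",
    testFilePath: "data/test.json");

// Train-only load: testFilePath and options keep their default value of null.
var trainOnly = loader.LoadDatasetFromFiles("data/train.json");
```

`var` is used so the sketch does not need to guess which namespace declares LeafFederatedDataset.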
LoadSplitFromFile(string, LeafFederatedDatasetLoadOptions?)
Loads a LEAF token-sequence split (train/test) from a JSON file.
public LeafFederatedSplit<string[][], string[]> LoadSplitFromFile(string filePath, LeafFederatedDatasetLoadOptions? options = null)
Parameters
- filePath (string)
- options (LeafFederatedDatasetLoadOptions?): optional; defaults to null
Returns
- LeafFederatedSplit<string[][], string[]>
LoadSplitFromJson(string, LeafFederatedDatasetLoadOptions?)
Loads a LEAF token-sequence split (train/test) from a JSON string.
public LeafFederatedSplit<string[][], string[]> LoadSplitFromJson(string json, LeafFederatedDatasetLoadOptions? options = null)
Parameters
- json (string)
- options (LeafFederatedDatasetLoadOptions?): optional; defaults to null
Returns
- LeafFederatedSplit<string[][], string[]>
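A short sketch of loading a split from an in-memory string, based only on the signature above. The JSON payload is a hypothetical one-user sample, and the parameterless constructor is an assumption. The raw string literal requires C# 11.

```csharp
using AiDotNet.FederatedLearning.Benchmarks.Leaf;

// Hypothetical one-user payload in the users/num_samples/user_data shape.
string json = """
{
  "users": ["user_0"],
  "num_samples": [1],
  "user_data": {
    "user_0": { "x": [["to", "be", "or"]], "y": ["not"] }
  }
}
""";

// Assumption: a public parameterless constructor is available.
var loader = new LeafTokenSequenceFederatedDatasetLoader();

// options is omitted, so it takes its default value of null.
var split = loader.LoadSplitFromJson(json);
```

Loading from a string can be convenient in unit tests, where a small fixture can be embedded directly instead of being read from disk.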