Class RedditFederatedBenchmarkOptions
- Namespace
- AiDotNet.Configuration
- Assembly
- AiDotNet.dll
Configuration options for running the Reddit federated benchmark suite.
public sealed class RedditFederatedBenchmarkOptions
- Inheritance
- object
RedditFederatedBenchmarkOptions
Remarks
Reddit is a large-scale federated text benchmark. This suite uses a token-sequence formulation (next-token prediction) and evaluates models without exposing model internals.
For Beginners: You provide the train/test JSON files, and AiDotNet loads the per-user partitions, builds a vocabulary (with safe defaults), and evaluates your model on a standardized next-token task.
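As an illustration of the workflow described in these remarks, the following is a minimal sketch of filling in the options with an object initializer. It assumes the class exposes a public parameterless constructor, which this page does not list. The file paths and numeric values are placeholders, not recommended settings, and the sketch leaves LoadOptions untouched because the members of LeafFederatedDatasetLoadOptions are not documented here.

```csharp
using AiDotNet.Configuration;

// Assumption: a public parameterless constructor exists (not shown on this page).
// All paths and numbers below are placeholders for illustration only.
var options = new RedditFederatedBenchmarkOptions
{
    TrainFilePath = "data/reddit/train.json",  // placeholder path to the train split
    TestFilePath = "data/reddit/test.json",    // placeholder path; the test split is optional
    MaxSamplesPerUser = 200,                   // null would use all available samples
    MaxVocabularySize = 10_000,                // null would fall back to the library default
    SequenceLength = null,                     // null: length is inferred from the dataset
    VocabularyTrainingSampleCount = null       // null: library default
};
```

Leaving a nullable property as null defers to the behavior described in that property's remarks.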
Properties
LoadOptions
Gets or sets the load options that control how many users/clients are loaded.
public LeafFederatedDatasetLoadOptions LoadOptions { get; set; }
Property Value
- LeafFederatedDatasetLoadOptions
MaxSamplesPerUser
Gets or sets the maximum number of samples to use per user/client (null uses all available).
public int? MaxSamplesPerUser { get; set; }
Property Value
- int?
MaxVocabularySize
Gets or sets the maximum vocabulary size used for token-to-ID mapping.
public int? MaxVocabularySize { get; set; }
Property Value
- int?
Remarks
If null, AiDotNet uses a sensible default (smaller in CI mode).
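To make the effect of a vocabulary cap concrete, here is a minimal sketch of one common way to build a size-limited token-to-ID mapping: count token frequencies, keep the most frequent tokens, and map everything else to an unknown-token ID. This is an assumption for illustration only. The page does not say how AiDotNet selects tokens, and `VocabularySketch`, `Build`, and the `<unk>` entry are hypothetical names, not part of the library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not an AiDotNet API.
public static class VocabularySketch
{
    public static Dictionary<string, int> Build(
        IEnumerable<string[]> sequences, int maxVocabularySize)
    {
        if (maxVocabularySize < 1)
            throw new ArgumentOutOfRangeException(nameof(maxVocabularySize));

        // Count how often each token occurs across all sequences.
        var counts = new Dictionary<string, int>(StringComparer.Ordinal);
        foreach (var sequence in sequences)
            foreach (var token in sequence)
                counts[token] = counts.TryGetValue(token, out var c) ? c + 1 : 1;

        // Assumption: ID 0 is reserved for tokens outside the vocabulary.
        var vocabulary = new Dictionary<string, int>(StringComparer.Ordinal)
        {
            ["<unk>"] = 0
        };

        // Keep the most frequent tokens; ties are broken by ordinal order
        // so the result is deterministic.
        var kept = counts
            .Where(kv => kv.Key != "<unk>")
            .OrderByDescending(kv => kv.Value)
            .ThenBy(kv => kv.Key, StringComparer.Ordinal)
            .Take(maxVocabularySize - 1);

        foreach (var kv in kept)
            vocabulary[kv.Key] = vocabulary.Count; // next free ID

        return vocabulary;
    }
}
```

Under this scheme a smaller cap trades coverage of rare tokens for a smaller model input and output space.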
SequenceLength
Gets or sets the fixed token sequence length used as model input.
public int? SequenceLength { get; set; }
Property Value
- int?
Remarks
If null, AiDotNet infers the length from the dataset and validates consistency.
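The inference and consistency check described above could look roughly like the following sketch. It is a guess at the general shape, not the library's implementation: `SequenceLengthSketch` and `Resolve` are hypothetical names, and throwing on a mismatch is an assumption, since this page does not state how AiDotNet reacts to inconsistent lengths.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not an AiDotNet API.
public static class SequenceLengthSketch
{
    public static int Resolve(IReadOnlyList<int[]> sequences, int? configuredLength)
    {
        if (sequences.Count == 0)
            throw new ArgumentException("At least one sequence is required.", nameof(sequences));

        // Use the configured length if given; otherwise infer it from the first sequence.
        int expected = configuredLength ?? sequences[0].Length;
        if (expected < 1)
            throw new InvalidOperationException("Sequence length must be positive.");

        // Validate that every sequence has the expected length.
        for (int i = 0; i < sequences.Count; i++)
        {
            if (sequences[i].Length != expected)
                throw new InvalidOperationException(
                    $"Sequence {i} has length {sequences[i].Length}, expected {expected}.");
        }

        return expected;
    }
}
```

Setting SequenceLength explicitly can be useful when the model has a fixed input size and a mismatch with the data should be detected early.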
TestFilePath
Gets or sets the optional path to the Reddit test split JSON file.
public string? TestFilePath { get; set; }
Property Value
- string
TrainFilePath
Gets or sets the path to the Reddit train split JSON file.
public string? TrainFilePath { get; set; }
Property Value
- string
VocabularyTrainingSampleCount
Gets or sets the maximum number of sequences used to build the default vocabulary.
public int? VocabularyTrainingSampleCount { get; set; }
Property Value
- int?
Remarks
If null, AiDotNet uses a sensible default (smaller in CI mode).