Enum BenchmarkSuite
Defines the supported benchmark suites available through the AiDotNet facade.
public enum BenchmarkSuite
Fields
ARCAGI = 13
ARC-AGI - abstract reasoning puzzles.
BoolQ = 15
BoolQ - yes/no question answering.
CIFAR10 = 6
CIFAR-10 - federated image classification (synthetic partitioning of standard CIFAR-10).
CIFAR100 = 7
CIFAR-100 - federated image classification (synthetic partitioning of standard CIFAR-100).
CommonsenseQA = 17
CommonsenseQA - commonsense multiple-choice QA.
DROP = 14
DROP - reading comprehension with discrete reasoning over paragraphs.
FEMNIST = 1
FEMNIST - LEAF federated handwritten character classification (per-writer partitioning).
GSM8K = 9
Grade School Math 8K (GSM8K) - multi-step math word problems.
HellaSwag = 19
HellaSwag - commonsense inference in narrative completion.
HumanEval = 20
HumanEval - code generation / program synthesis evaluation.
LEAF = 0
LEAF - federated benchmark suite (JSON-based train/test splits).
LogiQA = 22
LogiQA - logical reasoning benchmark.
MATH = 10
MATH - competition-style mathematics problems.
MBPP = 21
MBPP - mostly basic programming problems.
MMLU = 11
MMLU - broad multi-subject multiple-choice benchmark.
PIQA = 16
PIQA - physical commonsense reasoning.
Reddit = 4
Reddit - federated next-token prediction benchmark (LEAF Reddit dataset).
Sent140 = 2
Sent140 - LEAF federated sentiment classification benchmark based on tweets.
Shakespeare = 3
Shakespeare - LEAF federated next-character prediction benchmark.
StackOverflow = 5
StackOverflow - federated next-token prediction benchmark (StackOverflow corpus).
TabularNonIID = 8
Generic tabular suite with synthetic non-IID client partitions.
TruthfulQA = 12
TruthfulQA - evaluates truthfulness and resistance to hallucination.
WinoGrande = 18
WinoGrande - pronoun resolution / commonsense reasoning.
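The fields above are listed alphabetically, which hides how the numeric values are laid out. The sketch below re-declares the enum locally in numeric order, using only the names and values from the list, and shows that a value converts to and from its integer. It is a stand-alone mirror for illustration, not the library's source: the underlying type is assumed to be the C# default (int), and the grouping comments are a reading of the field descriptions rather than something the page states.

```csharp
using System;

// Casting an integer yields the member with that value.
BenchmarkSuite suite = (BenchmarkSuite)9;
Console.WriteLine(suite);                           // GSM8K
Console.WriteLine((int)BenchmarkSuite.WinoGrande);  // 18

// Local mirror of the documented fields (illustrative, not the AiDotNet source).
public enum BenchmarkSuite
{
    // Values 0-8: federated and partitioned-data suites (grouping is an assumption).
    LEAF = 0,
    FEMNIST = 1,
    Sent140 = 2,
    Shakespeare = 3,
    Reddit = 4,
    StackOverflow = 5,
    CIFAR10 = 6,
    CIFAR100 = 7,
    TabularNonIID = 8,

    // Values 9-22: reasoning, question-answering, and code suites (assumption).
    GSM8K = 9,
    MATH = 10,
    MMLU = 11,
    TruthfulQA = 12,
    ARCAGI = 13,
    DROP = 14,
    BoolQ = 15,
    PIQA = 16,
    CommonsenseQA = 17,
    WinoGrande = 18,
    HellaSwag = 19,
    HumanEval = 20,
    MBPP = 21,
    LogiQA = 22
}
```

If suite identifiers are persisted, storing the member name rather than the integer is usually the safer choice, since names stay readable even if values are later reassigned.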
Remarks
For Beginners: A benchmark suite is like a standardized test you can run to measure how well your model performs on a specific family of problems. You select a suite using this enum, and AiDotNet runs the benchmark and returns a structured report.
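Selecting a suite often starts from text, such as a command-line argument or a config value. The following is a minimal sketch of turning such a string into a BenchmarkSuite value with the standard Enum.TryParse and Enum.IsDefined methods. The SuiteParser helper is hypothetical, and the enum is a trimmed local mirror containing only a few of the documented members. The facade call that actually runs the benchmark is not shown on this page, so it is deliberately left out.

```csharp
using System;

foreach (string name in new[] { "gsm8k", "FEMNIST", "NotASuite", "99" })
{
    if (SuiteParser.TryParseSuite(name, out BenchmarkSuite suite))
        Console.WriteLine($"{name} -> {suite} ({(int)suite})");
    else
        Console.WriteLine($"{name} -> unknown suite");
}

// Hypothetical helper, not part of AiDotNet.
static class SuiteParser
{
    // Case-insensitive parse. Enum.TryParse also accepts numeric strings such
    // as "99", so Enum.IsDefined rejects values that match no declared member.
    public static bool TryParseSuite(string text, out BenchmarkSuite suite) =>
        Enum.TryParse<BenchmarkSuite>(text, true, out suite)
        && Enum.IsDefined(typeof(BenchmarkSuite), suite);
}

// Trimmed local mirror of a few documented members (illustrative only).
enum BenchmarkSuite
{
    LEAF = 0,
    FEMNIST = 1,
    GSM8K = 9,
    MMLU = 11,
    HumanEval = 20
}
```

The IsDefined check matters because a numeric string parses successfully even when no member has that value, which would otherwise pass an undefined suite downstream.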