Class AdversarialRobustnessConfiguration<T, TInput, TOutput>

Namespace
AiDotNet.Models.Options
Assembly
AiDotNet.dll

Configuration for adversarial robustness and AI safety during model building and inference.

public class AdversarialRobustnessConfiguration<T, TInput, TOutput>

Type Parameters

T

The numeric data type used for calculations.

TInput

The input data type for the model.

TOutput

The output data type for the model.

Inheritance
AdversarialRobustnessConfiguration<T, TInput, TOutput>

Remarks

This configuration controls all aspects of adversarial robustness and AI safety, replacing the previous SafetyFilterConfiguration with a unified approach that includes:

- Safety filtering (input/output validation)
- Adversarial attacks and defenses
- Certified robustness
- Content moderation
- Red teaming

For Beginners: This is your complete safety and robustness configuration. You can enable/disable features, customize options, or provide your own implementations. All settings have sensible defaults based on industry best practices.
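
As a rough sketch of the workflow described above, the snippet below starts from one of the documented presets and then adjusts individual properties. The type arguments (double, double[], double[]) are placeholders chosen for illustration; substitute the types your model actually uses. The property values are arbitrary and are not the library's defaults.

```csharp
using AiDotNet.Models.Options;

// Placeholder type arguments: numeric type, model input type, model output type.
var config = AdversarialRobustnessConfiguration<double, double[], double[]>.Comprehensive();

// Enable or disable individual features on top of the preset.
config.IncludeRobustnessMetrics = true;
config.UseCertifiedInference = false;

// Leaving a custom hook null keeps the default implementation;
// assign your own ISafetyFilter<double> here to replace it.
config.CustomSafetyFilter = null;
```

Starting from a preset and overriding a few properties keeps the remaining settings at whatever the preset chose, which is usually less error-prone than setting every property by hand.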

Properties

AutoGenerateModelCardSections

Gets or sets whether to auto-generate robustness sections in the model card.

public bool AutoGenerateModelCardSections { get; set; }

Property Value

bool

CustomAttacks

Gets or sets custom adversarial attack implementations.

public IAdversarialAttack<T, TInput, TOutput>[]? CustomAttacks { get; set; }

Property Value

IAdversarialAttack<T, TInput, TOutput>[]

Remarks

Use this property to supply attacks in addition to the built-in FGSM, PGD, CW, and AutoAttack implementations. Custom attacks can work with any input/output types supported by the model.

CustomCertifiedDefense

Gets or sets a custom certified defense implementation.

public ICertifiedDefense<T, TInput, TOutput>? CustomCertifiedDefense { get; set; }

Property Value

ICertifiedDefense<T, TInput, TOutput>

Remarks

Custom certified defense that provides provable robustness guarantees.

CustomContentClassifier

Gets or sets a custom content classifier implementation.

public IContentClassifier<T>? CustomContentClassifier { get; set; }

Property Value

IContentClassifier<T>

CustomDefense

Gets or sets a custom adversarial defense implementation.

public IAdversarialDefense<T, TInput, TOutput>? CustomDefense { get; set; }

Property Value

IAdversarialDefense<T, TInput, TOutput>

Remarks

Custom defense mechanism that works with the model's input/output types.

CustomSafetyFilter

Gets or sets a custom safety filter implementation.

public ISafetyFilter<T>? CustomSafetyFilter { get; set; }

Property Value

ISafetyFilter<T>

Remarks

When provided, this filter is used instead of the default implementation.

Enabled

Gets or sets whether adversarial robustness features are enabled.

public bool Enabled { get; set; }

Property Value

bool

Remarks

This is the master switch. When false, all robustness features are skipped.
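
A minimal sketch of using the master switch, assuming a deployment flag decides whether robustness features run. The environment variable name ENABLE_ROBUSTNESS is hypothetical and exists only for this illustration; the type arguments are placeholders.

```csharp
using System;
using AiDotNet.Models.Options;

// Hypothetical flag; any boolean source (config file, CLI option) would do.
bool robustnessOn = Environment.GetEnvironmentVariable("ENABLE_ROBUSTNESS") == "1";

var config = AdversarialRobustnessConfiguration<double, double[], double[]>.Comprehensive();

// When false, all robustness features are skipped regardless of the other settings.
config.Enabled = robustnessOn;
```

Because the other properties are left untouched, flipping Enabled back to true restores the full preset without reconfiguring anything.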

EvaluationEpsilons

Gets or sets the epsilon values to test during robustness evaluation.

public double[] EvaluationEpsilons { get; set; }

Property Value

double[]

IncludeRobustnessInEvaluation

Gets or sets whether to include robustness evaluation in model evaluation.

public bool IncludeRobustnessInEvaluation { get; set; }

Property Value

bool

IncludeRobustnessMetrics

Gets or sets whether to include robustness metrics in prediction results.

public bool IncludeRobustnessMetrics { get; set; }

Property Value

bool

MinimumCertifiedRadius

Gets or sets the minimum certified radius required for a prediction to be considered robust.

public double MinimumCertifiedRadius { get; set; }

Property Value

double

ModelCardRobustnessNotes

Gets or sets custom model card robustness notes.

public string? ModelCardRobustnessNotes { get; set; }

Property Value

string
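
The two model card properties can be set together, as in the sketch below. The note text is a placeholder, and how the library combines custom notes with auto-generated sections is not stated on this page, so treat that interaction as an open question. The type arguments are placeholders.

```csharp
using AiDotNet.Models.Options;

var config = AdversarialRobustnessConfiguration<double, double[], double[]>.Comprehensive();

// Ask for robustness sections to be generated in the model card.
config.AutoGenerateModelCardSections = true;

// Placeholder text; describe your own evaluation setup and known limitations here.
config.ModelCardRobustnessNotes = "Describe the attacks, perturbation budgets, and datasets used for evaluation.";
```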

Options

Gets or sets the robustness options.

public AdversarialRobustnessOptions<T> Options { get; set; }

Property Value

AdversarialRobustnessOptions<T>

RejectNonRobustPredictions

Gets or sets whether to reject predictions that don't meet the minimum certified radius.

public bool RejectNonRobustPredictions { get; set; }

Property Value

bool

RobustnessEvaluationSampleRatio

Gets or sets the proportion of test data to use for robustness evaluation.

public double RobustnessEvaluationSampleRatio { get; set; }

Property Value

double

Remarks

Robustness evaluation can be slow, so this allows testing on a subset.
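
The evaluation-related properties are typically set together. The sketch below assumes the sample ratio is expressed as a fraction between 0 and 1 (suggested by the property name, but not confirmed on this page); the epsilon values are illustrative perturbation budgets, not recommendations. The type arguments are placeholders.

```csharp
using AiDotNet.Models.Options;

var config = AdversarialRobustnessConfiguration<double, double[], double[]>.WithAdversarialTraining();

// Run robustness checks as part of model evaluation.
config.IncludeRobustnessInEvaluation = true;

// Illustrative perturbation sizes to test, from small to large.
config.EvaluationEpsilons = new[] { 0.01, 0.05, 0.1 };

// Assumption: 0.2 means "evaluate on 20% of the test data".
config.RobustnessEvaluationSampleRatio = 0.2;
```

Each additional epsilon adds another evaluation pass, so a small sample ratio is a reasonable way to keep the total cost down while still covering several budgets.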

UseCertifiedInference

Gets or sets whether to apply certified inference during prediction.

public bool UseCertifiedInference { get; set; }

Property Value

bool

Remarks

When enabled, predictions include certified robustness guarantees. This adds computational overhead but provides provable guarantees.
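
UseCertifiedInference, MinimumCertifiedRadius, and RejectNonRobustPredictions appear designed to work together. The following sketch shows one plausible combination; the radius value is arbitrary, and its meaning depends on the certification method and norm in use, which this page does not specify. The type arguments are placeholders.

```csharp
using AiDotNet.Models.Options;

var config = AdversarialRobustnessConfiguration<double, double[], double[]>.WithCertification();

// Attach certified robustness guarantees to predictions (adds computational overhead).
config.UseCertifiedInference = true;

// Illustrative threshold: predictions certified below this radius count as non-robust.
config.MinimumCertifiedRadius = 0.25;

// Reject, rather than return, predictions that miss the threshold.
config.RejectNonRobustPredictions = true;
```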

Methods

BasicSafety()

Creates a configuration with basic safety filtering only.

public static AdversarialRobustnessConfiguration<T, TInput, TOutput> BasicSafety()

Returns

AdversarialRobustnessConfiguration<T, TInput, TOutput>

Comprehensive()

Creates a configuration with comprehensive robustness features.

public static AdversarialRobustnessConfiguration<T, TInput, TOutput> Comprehensive()

Returns

AdversarialRobustnessConfiguration<T, TInput, TOutput>

Disabled()

Creates a disabled configuration (no robustness features).

public static AdversarialRobustnessConfiguration<T, TInput, TOutput> Disabled()

Returns

AdversarialRobustnessConfiguration<T, TInput, TOutput>

ForLLM()

Creates a configuration optimized for LLM safety.

public static AdversarialRobustnessConfiguration<T, TInput, TOutput> ForLLM()

Returns

AdversarialRobustnessConfiguration<T, TInput, TOutput>

WithAdversarialTraining()

Creates a configuration focused on adversarial training.

public static AdversarialRobustnessConfiguration<T, TInput, TOutput> WithAdversarialTraining()

Returns

AdversarialRobustnessConfiguration<T, TInput, TOutput>
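
One way to choose among the parameterless presets is to map an application-level scenario to a factory method. In this sketch the Scenario enum and the Presets helper are hypothetical names introduced for illustration and are not part of AiDotNet; only the factory methods come from this page. The type arguments are placeholders.

```csharp
using System;
using AiDotNet.Models.Options;

var config = Presets.For(Scenario.ChatModel);

// Hypothetical application-side enum, not a library type.
enum Scenario { Off, FilteringOnly, ChatModel, RobustTraining, Everything }

// Hypothetical helper that maps each scenario to a documented preset.
static class Presets
{
    public static AdversarialRobustnessConfiguration<double, double[], double[]> For(Scenario scenario) =>
        scenario switch
        {
            Scenario.Off => AdversarialRobustnessConfiguration<double, double[], double[]>.Disabled(),
            Scenario.FilteringOnly => AdversarialRobustnessConfiguration<double, double[], double[]>.BasicSafety(),
            Scenario.ChatModel => AdversarialRobustnessConfiguration<double, double[], double[]>.ForLLM(),
            Scenario.RobustTraining => AdversarialRobustnessConfiguration<double, double[], double[]>.WithAdversarialTraining(),
            Scenario.Everything => AdversarialRobustnessConfiguration<double, double[], double[]>.Comprehensive(),
            _ => throw new ArgumentOutOfRangeException(nameof(scenario))
        };
}
```

Centralizing the choice in one place makes it easy to see which preset each part of an application uses and to change it later.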

WithCertification(string)

Creates a configuration with certified robustness guarantees.

public static AdversarialRobustnessConfiguration<T, TInput, TOutput> WithCertification(string certificationMethod = "RandomizedSmoothing")

Parameters

certificationMethod string

The certification method: "RandomizedSmoothing", "IBP", or "CROWN".

Returns

AdversarialRobustnessConfiguration<T, TInput, TOutput>
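
Because certificationMethod is a plain string, a typo would not be caught at compile time, and this page does not say how the library handles unrecognized values. The sketch below wraps the call in a hypothetical helper (CertificationPresets, not part of AiDotNet) that checks the argument against the three documented method names first. The type arguments are placeholders.

```csharp
using System;
using AiDotNet.Models.Options;

var config = CertificationPresets.Create("IBP");

// Hypothetical helper, introduced only for this example.
static class CertificationPresets
{
    // The three method names documented for WithCertification.
    private static readonly string[] KnownMethods = { "RandomizedSmoothing", "IBP", "CROWN" };

    public static AdversarialRobustnessConfiguration<double, double[], double[]> Create(string method)
    {
        // Case-sensitive check; fail fast on anything not documented.
        if (Array.IndexOf(KnownMethods, method) < 0)
        {
            throw new ArgumentException($"Unknown certification method '{method}'.", nameof(method));
        }

        return AdversarialRobustnessConfiguration<double, double[], double[]>.WithCertification(method);
    }
}
```

Calling WithCertification() with no argument uses the documented default, "RandomizedSmoothing".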