Class AdversarialRobustnessConfiguration<T, TInput, TOutput>

Namespace: AiDotNet.Models.Options

Assembly: AiDotNet.dll

Configuration for adversarial robustness and AI safety during model building and inference.

public class AdversarialRobustnessConfiguration<T, TInput, TOutput>

Type Parameters

T: The numeric data type used for calculations.
TInput: The input data type for the model.
TOutput: The output data type for the model.

Inheritance: object → AdversarialRobustnessConfiguration<T, TInput, TOutput>

Derived: AdversarialRobustnessConfiguration<T>

Inherited Members: object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Remarks

This configuration controls all aspects of adversarial robustness and AI safety. It replaces the previous SafetyFilterConfiguration with a unified approach that includes:

- Safety filtering (input/output validation)
- Adversarial attacks and defenses
- Certified robustness
- Content moderation
- Red teaming

For Beginners: This is your complete safety and robustness configuration. You can enable/disable features, customize options, or provide your own implementations. All settings have sensible defaults based on industry best practices.
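A minimal sketch of building a configuration by hand, using only properties documented on this page. It assumes the class exposes a public parameterless constructor (none is listed here) and that `double` and `double[]` are acceptable type arguments; both are assumptions, and the string value is a placeholder.

```csharp
using AiDotNet.Models.Options;

// Assumption: double / double[] are placeholder type arguments.
// Substitute the numeric, input, and output types your model actually uses.
var config = new AdversarialRobustnessConfiguration<double, double[], double[]>
{
    // Master switch: when false, all robustness features are skipped.
    Enabled = true,

    // Attach robustness metrics to prediction results.
    IncludeRobustnessMetrics = true,

    // Let the library add robustness sections to the model card,
    // plus a free-form note (placeholder text).
    AutoGenerateModelCardSections = true,
    ModelCardRobustnessNotes = "Placeholder note describing how robustness was assessed."
};
```

Properties that are not set keep whatever defaults the library assigns.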

Properties

AutoGenerateModelCardSections

Gets or sets whether to auto-generate robustness sections in the model card.

public bool AutoGenerateModelCardSections { get; set; }

Property Value

bool

CustomAttacks

Gets or sets custom adversarial attack implementations.

public IAdversarialAttack<T, TInput, TOutput>[]? CustomAttacks { get; set; }

Property Value

IAdversarialAttack<T, TInput, TOutput>[]

Remarks

Additional attacks beyond the built-in FGSM, PGD, CW, and AutoAttack. These attacks can work with any input/output types supported by the model.

CustomCertifiedDefense

Gets or sets a custom certified defense implementation.

public ICertifiedDefense<T, TInput, TOutput>? CustomCertifiedDefense { get; set; }

Property Value

ICertifiedDefense<T, TInput, TOutput>

Remarks

Custom certified defense that provides provable robustness guarantees.

CustomContentClassifier

Gets or sets a custom content classifier implementation.

public IContentClassifier<T>? CustomContentClassifier { get; set; }

Property Value

IContentClassifier<T>

CustomDefense

Gets or sets a custom adversarial defense implementation.

public IAdversarialDefense<T, TInput, TOutput>? CustomDefense { get; set; }

Property Value

IAdversarialDefense<T, TInput, TOutput>

Remarks

Custom defense mechanism that works with the model's input/output types.

CustomSafetyFilter

Gets or sets a custom safety filter implementation.

public ISafetyFilter<T>? CustomSafetyFilter { get; set; }

Property Value

ISafetyFilter<T>

Remarks

When provided, this filter is used instead of the default implementation.

Enabled

Gets or sets whether adversarial robustness features are enabled.

public bool Enabled { get; set; }

Property Value

bool

Remarks

This is the master switch. When false, all robustness features are skipped.

EvaluationEpsilons

Gets or sets the epsilon values to test during robustness evaluation.

public double[] EvaluationEpsilons { get; set; }

Property Value

double[]

IncludeRobustnessInEvaluation

Gets or sets whether to include robustness evaluation in model evaluation.

public bool IncludeRobustnessInEvaluation { get; set; }

Property Value

bool

IncludeRobustnessMetrics

Gets or sets whether to include robustness metrics in prediction results.

public bool IncludeRobustnessMetrics { get; set; }

Property Value

bool

MinimumCertifiedRadius

Gets or sets the minimum certified radius required for a prediction to be considered robust.

public double MinimumCertifiedRadius { get; set; }

Property Value

double

ModelCardRobustnessNotes

Gets or sets custom model card robustness notes.

public string? ModelCardRobustnessNotes { get; set; }

Property Value

string

Options

Gets or sets the robustness options.

public AdversarialRobustnessOptions<T> Options { get; set; }

Property Value

AdversarialRobustnessOptions<T>

RejectNonRobustPredictions

Gets or sets whether to reject predictions that don't meet the minimum certified radius.

public bool RejectNonRobustPredictions { get; set; }

Property Value

bool

RobustnessEvaluationSampleRatio

Gets or sets the proportion of the test data to use for robustness evaluation.

public double RobustnessEvaluationSampleRatio { get; set; }

Property Value

double

Remarks

Robustness evaluation can be slow, so this allows testing on a subset.
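The evaluation-related properties can be combined as in the sketch below. The numeric values are illustrative only, and the sketch assumes that RobustnessEvaluationSampleRatio is expressed as a fraction between 0 and 1 (suggested by the property name, but not confirmed on this page). The type arguments and the parameterless constructor are likewise assumptions.

```csharp
using AiDotNet.Models.Options;

// Assumption: double / double[] are placeholder type arguments.
var config = new AdversarialRobustnessConfiguration<double, double[], double[]>
{
    Enabled = true,

    // Run robustness evaluation as part of model evaluation.
    IncludeRobustnessInEvaluation = true,

    // Perturbation budgets to test (illustrative values).
    EvaluationEpsilons = new[] { 0.01, 0.05, 0.1 },

    // Evaluate on a subset of the test data to keep evaluation time down.
    // Assumption: 0.2 means 20% of the test set.
    RobustnessEvaluationSampleRatio = 0.2
};
```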

UseCertifiedInference

Gets or sets whether to apply certified inference during prediction.

public bool UseCertifiedInference { get; set; }

Property Value

bool

Remarks

When enabled, each prediction carries a certified robustness guarantee. The guarantee is provable, but computing it adds overhead to every prediction.
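A sketch of how certified inference might be combined with MinimumCertifiedRadius and RejectNonRobustPredictions. The radius value is illustrative, and the type arguments and the parameterless constructor are assumptions rather than details confirmed on this page.

```csharp
using AiDotNet.Models.Options;

// Assumption: double / double[] are placeholder type arguments.
var config = new AdversarialRobustnessConfiguration<double, double[], double[]>
{
    Enabled = true,

    // Compute a certified guarantee for each prediction (adds overhead).
    UseCertifiedInference = true,

    // Smallest certified radius for a prediction to count as robust
    // (illustrative value).
    MinimumCertifiedRadius = 0.1,

    // Reject predictions whose certified radius falls below the minimum.
    RejectNonRobustPredictions = true
};
```

How a rejected prediction is surfaced to the caller is not described on this page, so check the prediction API before relying on this setting.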

Methods

BasicSafety()

Creates a configuration with basic safety filtering only.

public static AdversarialRobustnessConfiguration<T, TInput, TOutput> BasicSafety()

Returns

AdversarialRobustnessConfiguration<T, TInput, TOutput>

Comprehensive()

Creates a configuration with comprehensive robustness features.

public static AdversarialRobustnessConfiguration<T, TInput, TOutput> Comprehensive()

Returns

AdversarialRobustnessConfiguration<T, TInput, TOutput>

Disabled()

Creates a disabled configuration (no robustness features).

public static AdversarialRobustnessConfiguration<T, TInput, TOutput> Disabled()

Returns

AdversarialRobustnessConfiguration<T, TInput, TOutput>

ForLLM()

Creates a configuration optimized for LLM safety.

public static AdversarialRobustnessConfiguration<T, TInput, TOutput> ForLLM()

Returns

AdversarialRobustnessConfiguration<T, TInput, TOutput>

WithAdversarialTraining()

Creates a configuration focused on adversarial training.

public static AdversarialRobustnessConfiguration<T, TInput, TOutput> WithAdversarialTraining()

Returns

AdversarialRobustnessConfiguration<T, TInput, TOutput>

WithCertification(string)

Creates a configuration with certified robustness guarantees.

public static AdversarialRobustnessConfiguration<T, TInput, TOutput> WithCertification(string certificationMethod = "RandomizedSmoothing")

Parameters

certificationMethod (string): The certification method to use: "RandomizedSmoothing", "IBP", or "CROWN". Defaults to "RandomizedSmoothing".

Returns

AdversarialRobustnessConfiguration<T, TInput, TOutput>
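The static factory methods above can be used instead of setting properties one by one. The sketch below selects a preset from a profile name. The helper `ChoosePreset` and the profile names are hypothetical, introduced only for illustration, and the type arguments are placeholder assumptions.

```csharp
using System;
using AiDotNet.Models.Options;

// Hypothetical helper: maps an invented profile name to a documented preset.
static AdversarialRobustnessConfiguration<double, double[], double[]> ChoosePreset(string profile) =>
    profile switch
    {
        "off"       => AdversarialRobustnessConfiguration<double, double[], double[]>.Disabled(),
        "basic"     => AdversarialRobustnessConfiguration<double, double[], double[]>.BasicSafety(),
        "llm"       => AdversarialRobustnessConfiguration<double, double[], double[]>.ForLLM(),
        "training"  => AdversarialRobustnessConfiguration<double, double[], double[]>.WithAdversarialTraining(),
        // Pass one of the documented method names; omitting the argument
        // uses the default, "RandomizedSmoothing".
        "certified" => AdversarialRobustnessConfiguration<double, double[], double[]>.WithCertification("IBP"),
        "full"      => AdversarialRobustnessConfiguration<double, double[], double[]>.Comprehensive(),
        _ => throw new ArgumentException($"Unknown profile: {profile}", nameof(profile))
    };

var config = ChoosePreset("certified");

// A preset is an ordinary instance, so individual properties can still be adjusted.
config.IncludeRobustnessMetrics = true;
```

Which properties each preset sets is not listed on this page, so inspect the returned instance before overriding values.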
