Class AdversarialRobustnessConfiguration<T, TInput, TOutput>
Configuration for adversarial robustness and AI safety during model building and inference.
public class AdversarialRobustnessConfiguration<T, TInput, TOutput>
Type Parameters
- T: The numeric data type used for calculations.
- TInput: The input data type for the model.
- TOutput: The output data type for the model.
Inheritance
- AdversarialRobustnessConfiguration<T, TInput, TOutput>
Remarks
This configuration controls all aspects of adversarial robustness and AI safety. It replaces the previous SafetyFilterConfiguration with a unified approach that includes:
- Safety filtering (input/output validation)
- Adversarial attacks and defenses
- Certified robustness
- Content moderation
- Red teaming
For Beginners: This is your complete safety and robustness configuration. You can enable/disable features, customize options, or provide your own implementations. All settings have sensible defaults based on industry best practices.
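As a rough sketch of the workflow described above, the snippet below starts from one of the documented presets and then adjusts a few documented properties. The type arguments (`double`, `double[]`, `double[]`) and the notes string are illustrative assumptions. The `using` directive for the library namespace is omitted because this page does not name it.

```csharp
// Assumption: double / double[] / double[] are placeholder type arguments;
// substitute the numeric, input, and output types your model actually uses.
var config = AdversarialRobustnessConfiguration<double, double[], double[]>.BasicSafety();

// Master switch: when false, all robustness features are skipped.
config.Enabled = true;

// Opt in to auto-generated robustness sections in the model card,
// and attach a free-form note (the text here is only an example).
config.AutoGenerateModelCardSections = true;
config.ModelCardRobustnessNotes = "Robustness settings reviewed by the project team.";
```

Starting from a preset rather than a bare instance keeps the remaining settings at whatever the preset chooses, so only the overridden properties need to be reasoned about.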
Properties
AutoGenerateModelCardSections
Gets or sets whether to auto-generate robustness sections in the model card.
public bool AutoGenerateModelCardSections { get; set; }
Property Value
- bool
CustomAttacks
Gets or sets custom adversarial attack implementations.
public IAdversarialAttack<T, TInput, TOutput>[]? CustomAttacks { get; set; }
Property Value
- IAdversarialAttack<T, TInput, TOutput>[]
Remarks
Additional attacks beyond the built-in FGSM, PGD, CW, and AutoAttack. These attacks can work with any input/output types supported by the model.
CustomCertifiedDefense
Gets or sets a custom certified defense implementation.
public ICertifiedDefense<T, TInput, TOutput>? CustomCertifiedDefense { get; set; }
Property Value
- ICertifiedDefense<T, TInput, TOutput>
Remarks
Custom certified defense that provides provable robustness guarantees.
CustomContentClassifier
Gets or sets a custom content classifier implementation.
public IContentClassifier<T>? CustomContentClassifier { get; set; }
Property Value
- IContentClassifier<T>
CustomDefense
Gets or sets a custom adversarial defense implementation.
public IAdversarialDefense<T, TInput, TOutput>? CustomDefense { get; set; }
Property Value
- IAdversarialDefense<T, TInput, TOutput>
Remarks
Custom defense mechanism that works with the model's input/output types.
CustomSafetyFilter
Gets or sets a custom safety filter implementation.
public ISafetyFilter<T>? CustomSafetyFilter { get; set; }
Property Value
- ISafetyFilter<T>
Remarks
When provided, this filter is used instead of the default implementation.
Enabled
Gets or sets whether adversarial robustness features are enabled.
public bool Enabled { get; set; }
Property Value
- bool
Remarks
This is the master switch. When false, all robustness features are skipped.
EvaluationEpsilons
Gets or sets the epsilon values to test during robustness evaluation.
public double[] EvaluationEpsilons { get; set; }
Property Value
- double[]
IncludeRobustnessInEvaluation
Gets or sets whether to include robustness evaluation in model evaluation.
public bool IncludeRobustnessInEvaluation { get; set; }
Property Value
- bool
IncludeRobustnessMetrics
Gets or sets whether to include robustness metrics in prediction results.
public bool IncludeRobustnessMetrics { get; set; }
Property Value
- bool
MinimumCertifiedRadius
Gets or sets the minimum certified radius required for a prediction to be considered robust.
public double MinimumCertifiedRadius { get; set; }
Property Value
- double
ModelCardRobustnessNotes
Gets or sets custom model card robustness notes.
public string? ModelCardRobustnessNotes { get; set; }
Property Value
- string
Options
Gets or sets the robustness options.
public AdversarialRobustnessOptions<T> Options { get; set; }
Property Value
- AdversarialRobustnessOptions<T>
RejectNonRobustPredictions
Gets or sets whether to reject predictions that don't meet the minimum certified radius.
public bool RejectNonRobustPredictions { get; set; }
Property Value
- bool
RobustnessEvaluationSampleRatio
Gets or sets the percentage of test data to use for robustness evaluation.
public double RobustnessEvaluationSampleRatio { get; set; }
Property Value
- double
Remarks
Robustness evaluation can be slow, so this allows testing on a subset.
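A minimal sketch of how the evaluation-related properties might be combined to keep evaluation time manageable. The epsilon values and type arguments are illustrative assumptions, not library defaults. This page does not say whether `RobustnessEvaluationSampleRatio` expects a 0 to 1 fraction or a 0 to 100 percentage; the sketch assumes a fraction, so confirm the expected scale before relying on it.

```csharp
// Assumption: placeholder type arguments; library namespace import omitted.
var config = AdversarialRobustnessConfiguration<double, double[], double[]>.Comprehensive();

// Run robustness checks as part of model evaluation.
config.IncludeRobustnessInEvaluation = true;

// Illustrative perturbation budgets to test (not the library's defaults).
config.EvaluationEpsilons = new[] { 0.01, 0.05, 0.1 };

// Assumption: 0.1 is read as "10% of the test data".
// The documentation above does not state the scale explicitly.
config.RobustnessEvaluationSampleRatio = 0.1;
```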
UseCertifiedInference
Gets or sets whether to apply certified inference during prediction.
public bool UseCertifiedInference { get; set; }
Property Value
- bool
Remarks
When enabled, predictions include certified robustness guarantees. This adds computational overhead but provides provable guarantees.
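The certified-inference properties are likely to be used together. The sketch below shows one plausible combination using only members documented on this page. The radius value and type arguments are illustrative assumptions, and how a rejected prediction is surfaced to the caller is not described here.

```csharp
// Assumption: placeholder type arguments; library namespace import omitted.
var config = AdversarialRobustnessConfiguration<double, double[], double[]>.WithCertification();

// Attach certified robustness guarantees to each prediction
// (adds computational overhead, per the remarks above).
config.UseCertifiedInference = true;

// Illustrative threshold: predictions certified below this radius
// are not considered robust.
config.MinimumCertifiedRadius = 0.1;

// Reject predictions that fall short of the minimum certified radius.
config.RejectNonRobustPredictions = true;
```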
Methods
BasicSafety()
Creates a configuration with basic safety filtering only.
public static AdversarialRobustnessConfiguration<T, TInput, TOutput> BasicSafety()
Returns
- AdversarialRobustnessConfiguration<T, TInput, TOutput>
Comprehensive()
Creates a configuration with comprehensive robustness features.
public static AdversarialRobustnessConfiguration<T, TInput, TOutput> Comprehensive()
Returns
- AdversarialRobustnessConfiguration<T, TInput, TOutput>
Disabled()
Creates a disabled configuration (no robustness features).
public static AdversarialRobustnessConfiguration<T, TInput, TOutput> Disabled()
Returns
- AdversarialRobustnessConfiguration<T, TInput, TOutput>
ForLLM()
Creates a configuration optimized for LLM safety.
public static AdversarialRobustnessConfiguration<T, TInput, TOutput> ForLLM()
Returns
- AdversarialRobustnessConfiguration<T, TInput, TOutput>
WithAdversarialTraining()
Creates a configuration focused on adversarial training.
public static AdversarialRobustnessConfiguration<T, TInput, TOutput> WithAdversarialTraining()
Returns
- AdversarialRobustnessConfiguration<T, TInput, TOutput>
WithCertification(string)
Creates a configuration with certified robustness guarantees.
public static AdversarialRobustnessConfiguration<T, TInput, TOutput> WithCertification(string certificationMethod = "RandomizedSmoothing")
Parameters
- certificationMethod (string): The certification method: "RandomizedSmoothing", "IBP", or "CROWN". Defaults to "RandomizedSmoothing".
Returns
- AdversarialRobustnessConfiguration<T, TInput, TOutput>
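To illustrate how the factory methods above might be selected in application code, here is a small sketch. `PickPreset` and its profile names are hypothetical helpers invented for this example, the type arguments are placeholders, and the library namespace import is omitted because this page does not name it.

```csharp
using System;

// Hypothetical helper: maps an application-defined profile name
// to one of the documented presets.
static AdversarialRobustnessConfiguration<double, double[], double[]> PickPreset(string profile) =>
    profile switch
    {
        "off"       => AdversarialRobustnessConfiguration<double, double[], double[]>.Disabled(),
        "basic"     => AdversarialRobustnessConfiguration<double, double[], double[]>.BasicSafety(),
        "llm"       => AdversarialRobustnessConfiguration<double, double[], double[]>.ForLLM(),
        "training"  => AdversarialRobustnessConfiguration<double, double[], double[]>.WithAdversarialTraining(),
        // Explicit certification method; omitting the argument uses "RandomizedSmoothing".
        "certified" => AdversarialRobustnessConfiguration<double, double[], double[]>.WithCertification("IBP"),
        "full"      => AdversarialRobustnessConfiguration<double, double[], double[]>.Comprehensive(),
        _           => throw new ArgumentOutOfRangeException(nameof(profile)),
    };

var config = PickPreset("certified");
```

Because each preset returns an ordinary configuration instance, individual properties can still be overridden after the call.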