Interface IAdversarialAttack<T, TInput, TOutput>

Namespace
AiDotNet.Interfaces
Assembly
AiDotNet.dll

Defines the contract for adversarial attack algorithms that generate adversarial examples.

public interface IAdversarialAttack<T, TInput, TOutput> : IModelSerializer

Type Parameters

T

The numeric data type used for calculations (e.g., float, double).

TInput

The input data type for the model (e.g., Vector<T>, string).

TOutput

The output data type for the model (e.g., Vector<T>, int).
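
For example, an attack on a numeric classifier might close the type parameters like this (a sketch only; FgsmAttack is a hypothetical implementing class used purely to illustrate the generic arguments):

// Hypothetical: FgsmAttack<double> is an assumed implementation of this interface.
IAdversarialAttack<double, Vector<double>, Vector<double>> attack =
    new FgsmAttack<double>();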

Remarks

An adversarial attack crafts inputs that cause machine learning models to make mistakes. Such attacks are used for robustness testing and for improving model security.

For Beginners: Think of an adversarial attack as a "stress test" for your AI model. Just like testing if a building can withstand an earthquake, these attacks test if your model can handle tricky inputs that are designed to fool it.

Common examples of adversarial attacks include:

  • FGSM (Fast Gradient Sign Method): Quick attacks using gradient information
  • PGD (Projected Gradient Descent): More powerful iterative attacks
  • C&W (Carlini & Wagner): Sophisticated optimization-based attacks
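
To make the FGSM entry above concrete: its core is a single gradient-sign step. The sketch below operates on a plain double array and assumes the gradient of the loss with respect to the input has already been computed elsewhere; this interface itself does not expose gradients.

using System;

// Minimal FGSM-style step: x_adv = x + epsilon * sign(dLoss/dx).
// 'gradient' is assumed to be supplied by the caller.
static double[] FgsmStep(double[] input, double[] gradient, double epsilon)
{
    var adversarial = new double[input.Length];
    for (int i = 0; i < input.Length; i++)
        adversarial[i] = input[i] + epsilon * Math.Sign(gradient[i]);
    return adversarial;
}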

Why adversarial attacks matter:

  • They reveal vulnerabilities in models before deployment
  • They help create more robust models through adversarial training
  • They're essential for safety-critical applications (self-driving cars, medical diagnosis)
  • They demonstrate potential security risks

Methods

CalculatePerturbation(TInput, TInput)

Calculates the perturbation added to create an adversarial example.

TInput CalculatePerturbation(TInput original, TInput adversarial)

Parameters

original TInput

The original clean input.

adversarial TInput

The generated adversarial example.

Returns

TInput

The perturbation representation (difference between adversarial and original).

Remarks

For Beginners: This shows you what changes were made to fool the model. By comparing the original input with the adversarial example, you can see exactly what the attack changed, which helps you understand how the attack works.

Note: For non-vector inputs (e.g., strings), this returns a representation of the difference that is appropriate for the input type.
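
A minimal usage sketch, assuming Vector<double> inputs and that the attack and model instances were created elsewhere:

// Assumes 'attack', 'model', 'original', and 'trueLabel' already exist.
Vector<double> adversarial  = attack.GenerateAdversarialExample(original, trueLabel, model);
Vector<double> perturbation = attack.CalculatePerturbation(original, adversarial);
// For vector inputs the perturbation is simply (adversarial - original).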

GenerateAdversarialBatch(TInput[], TOutput[], IFullModel<T, TInput, TOutput>)

Generates a batch of adversarial examples from multiple clean inputs.

TInput[] GenerateAdversarialBatch(TInput[] inputs, TOutput[] trueLabels, IFullModel<T, TInput, TOutput> targetModel)

Parameters

inputs TInput[]

The batch of clean input data.

trueLabels TOutput[]

The correct labels for each input.

targetModel IFullModel<T, TInput, TOutput>

The model to attack.

Returns

TInput[]

The batch of generated adversarial examples.

Remarks

For Beginners: This is the same as GenerateAdversarialExample, but it processes multiple inputs at once for efficiency. It's like batch processing - instead of attacking one image at a time, you attack many images together.
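
A sketch of batch use, assuming Vector<double> inputs and labels; the helper methods and the attack and model instances are hypothetical:

// Attack a whole batch in one call instead of looping over GenerateAdversarialExample.
// 'LoadInputs' and 'LoadLabels' are hypothetical helpers.
Vector<double>[] cleanInputs = LoadInputs();
Vector<double>[] labels      = LoadLabels();
Vector<double>[] adversarialBatch =
    attack.GenerateAdversarialBatch(cleanInputs, labels, model);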

GenerateAdversarialExample(TInput, TOutput, IFullModel<T, TInput, TOutput>)

Generates an adversarial example from clean input data.

TInput GenerateAdversarialExample(TInput input, TOutput trueLabel, IFullModel<T, TInput, TOutput> targetModel)

Parameters

input TInput

The clean input data to be perturbed.

trueLabel TOutput

The correct label for the input.

targetModel IFullModel<T, TInput, TOutput>

The model to attack.

Returns

TInput

The generated adversarial example.

Remarks

This method takes normal inputs and perturbs them slightly to create adversarial examples that fool the target model while appearing similar to the original inputs.

For Beginners: This is like creating optical illusions for AI. You make tiny changes to an image or input that a human wouldn't notice, but these changes trick the AI into making wrong predictions.

The process typically involves:

  1. Taking a clean input (e.g., an image of a cat)
  2. Calculating how to modify it to fool the model
  3. Creating a modified version (adversarial example)
  4. Testing the result: the model may now think the cat is a dog, even though it looks the same to humans
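
Putting these steps together in code (a sketch; the Predict call is an assumed member of IFullModel, which is not documented on this page):

// Steps 1-3: craft the adversarial example from a clean input.
Vector<double> adversarial = attack.GenerateAdversarialExample(catImage, catLabel, model);

// Step 4: compare predictions on the clean and perturbed inputs.
// NOTE: 'Predict' is an assumed member of IFullModel, used only for illustration.
Vector<double> cleanPrediction       = model.Predict(catImage);
Vector<double> adversarialPrediction = model.Predict(adversarial);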

GetOptions()

Gets the configuration options for the adversarial attack.

AdversarialAttackOptions<T> GetOptions()

Returns

AdversarialAttackOptions<T>

The configuration options for the attack.

Remarks

For Beginners: These are the "settings" for the attack, like:

  • How strong the attack should be (perturbation budget)
  • How many steps to take when crafting the adversarial example
  • Which norm constrains the perturbation (L2, L-infinity, etc.)
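
A sketch of reading the options back; the property names in the comment are hypothetical, and the real members are documented on the AdversarialAttackOptions<T> page:

AdversarialAttackOptions<double> options = attack.GetOptions();
// The property names below are hypothetical; consult AdversarialAttackOptions<T> for the real members.
// Console.WriteLine($"Budget: {options.Epsilon}, Steps: {options.Iterations}");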

Reset()

Resets the attack state to prepare for a fresh attack run.

void Reset()

Remarks

For Beginners: This clears any saved state from previous attacks, ensuring each new attack starts fresh without being influenced by previous runs.
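
For example, when attacking more than one model with the same attack object, it can be safest to reset between runs (a sketch; the models, inputs, and labels are assumed to exist):

attack.Reset();   // clear any state left over from a previous run
Vector<double>[] advA = attack.GenerateAdversarialBatch(inputs, labels, modelA);

attack.Reset();   // start fresh before attacking a second model
Vector<double>[] advB = attack.GenerateAdversarialBatch(inputs, labels, modelB);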