Interface IAdversarialAttack<T, TInput, TOutput>
Namespace: AiDotNet.Interfaces
Assembly: AiDotNet.dll
Defines the contract for adversarial attack algorithms that generate adversarial examples.
public interface IAdversarialAttack<T, TInput, TOutput> : IModelSerializer
Type Parameters
T: The numeric data type used for calculations (e.g., float, double).
TInput: The input data type for the model (e.g., Vector<T>, string).
TOutput: The output data type for the model (e.g., Vector<T>, int).
Remarks
An adversarial attack crafts inputs that cause machine learning models to make mistakes, used for robustness testing and improving model security.
For Beginners: Think of an adversarial attack as a "stress test" for your AI model. Just like testing if a building can withstand an earthquake, these attacks test if your model can handle tricky inputs that are designed to fool it.
Common examples of adversarial attacks include:
- FGSM (Fast Gradient Sign Method): Quick attacks using gradient information
- PGD (Projected Gradient Descent): More powerful iterative attacks
- C&W (Carlini & Wagner): Sophisticated optimization-based attacks
Why adversarial attacks matter:
- They reveal vulnerabilities in models before deployment
- They help create more robust models through adversarial training
- They're essential for safety-critical applications (self-driving cars, medical diagnosis)
- They demonstrate potential security risks
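For example, a typical workflow pairs an attack with a trained model. The sketch below is illustrative only: FgsmAttack and the helper methods are hypothetical placeholders, and only the members documented on this page are taken from the interface itself.
// Hypothetical sketch: FgsmAttack and LoadTrainedModel are placeholders, not confirmed AiDotNet API.
IAdversarialAttack<double, Vector<double>, Vector<double>> attack =
    new FgsmAttack<double, Vector<double>, Vector<double>>();
IFullModel<double, Vector<double>, Vector<double>> targetModel = LoadTrainedModel();

// cleanInput and trueLabel are assumed to already exist.
Vector<double> adversarial = attack.GenerateAdversarialExample(cleanInput, trueLabel, targetModel);
Vector<double> perturbation = attack.CalculatePerturbation(cleanInput, adversarial);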
Methods
CalculatePerturbation(TInput, TInput)
Calculates the perturbation added to create an adversarial example.
TInput CalculatePerturbation(TInput original, TInput adversarial)
Parameters
original (TInput): The original clean input.
adversarial (TInput): The generated adversarial example.
Returns
- TInput
The perturbation representation (difference between adversarial and original).
Remarks
For Beginners: This shows you what changes were made to fool the model. By comparing the original input with the adversarial example, you can see exactly what the attack changed. This helps understand how the attack works.
Note: For non-vector inputs (e.g., strings), this returns a representation of the difference that is appropriate for the input type.
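As an illustrative sketch, for vector inputs the perturbation is simply the elementwise difference, so its largest absolute entry approximates the attack's L-infinity budget. The indexer and Length property on Vector<T> are assumptions here, not confirmed API.
// Illustrative only: inspect how much an attack changed the input.
Vector<double> delta = attack.CalculatePerturbation(original, adversarial);
double maxChange = 0.0;
for (int i = 0; i < delta.Length; i++)
{
    maxChange = Math.Max(maxChange, Math.Abs(delta[i]));
}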
GenerateAdversarialBatch(TInput[], TOutput[], IFullModel<T, TInput, TOutput>)
Generates a batch of adversarial examples from multiple clean inputs.
TInput[] GenerateAdversarialBatch(TInput[] inputs, TOutput[] trueLabels, IFullModel<T, TInput, TOutput> targetModel)
Parameters
inputs (TInput[]): The batch of clean input data.
trueLabels (TOutput[]): The correct labels for each input.
targetModel (IFullModel<T, TInput, TOutput>): The model to attack.
Returns
- TInput[]
The batch of generated adversarial examples.
Remarks
For Beginners: This is the same as GenerateAdversarialExample, but it processes multiple inputs at once for efficiency. It's like batch processing - instead of attacking one image at a time, you attack many images together.
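For example, a batch call looks like the single-input call but with arrays; element i of the result corresponds to element i of the inputs. The loader helpers below are hypothetical placeholders.
// Illustrative batch attack.
Vector<double>[] cleanBatch = LoadBatchInputs();    // hypothetical helper
Vector<double>[] labelBatch = LoadBatchLabels();    // hypothetical helper

Vector<double>[] adversarialBatch =
    attack.GenerateAdversarialBatch(cleanBatch, labelBatch, targetModel);
// adversarialBatch[i] is the adversarial version of cleanBatch[i].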
GenerateAdversarialExample(TInput, TOutput, IFullModel<T, TInput, TOutput>)
Generates an adversarial example from clean input data.
TInput GenerateAdversarialExample(TInput input, TOutput trueLabel, IFullModel<T, TInput, TOutput> targetModel)
Parameters
input (TInput): The clean input data to be perturbed.
trueLabel (TOutput): The correct label for the input.
targetModel (IFullModel<T, TInput, TOutput>): The model to attack.
Returns
- TInput
The generated adversarial example.
Remarks
This method takes normal inputs and perturbs them slightly to create adversarial examples that fool the target model while appearing similar to the original inputs.
For Beginners: This is like creating optical illusions for AI. You make tiny changes to an image or input that a human wouldn't notice, but these changes trick the AI into making wrong predictions.
The process typically involves:
- Taking a clean input (e.g., an image of a cat)
- Calculating how to modify it to fool the model
- Creating a modified version (adversarial example)
- The model might now think the cat is a dog, even though it looks the same to humans
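A minimal sketch of this single-input flow, reusing the attack and model from the earlier example; the image and label helpers are hypothetical placeholders.
// Single-input attack flow (sketch).
Vector<double> catImage = LoadImageAsVector("cat.png");   // hypothetical helper
Vector<double> catLabel = EncodeLabel("cat");              // hypothetical helper

Vector<double> adversarialCat =
    attack.GenerateAdversarialExample(catImage, catLabel, targetModel);
// If the attack succeeds, the model's prediction on adversarialCat differs
// from catLabel even though the two images look alike to a human.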
GetOptions()
Gets the configuration options for the adversarial attack.
AdversarialAttackOptions<T> GetOptions()
Returns
- AdversarialAttackOptions<T>
The configuration options for the attack.
Remarks
For Beginners: These are the "settings" for the attack, like:
- How strong the attack should be (perturbation budget)
- How many steps to take when crafting the adversarial example
- What type of perturbation to use (L2, L-infinity, etc.)
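For example, callers can read the configuration before launching an attack; the exact properties exposed by AdversarialAttackOptions<T> are not shown on this page and are not assumed here.
// Inspect the attack's configuration (sketch).
AdversarialAttackOptions<double> options = attack.GetOptions();
// The options object carries settings such as the perturbation budget,
// step count, and norm type.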
Reset()
Resets the attack state to prepare for a fresh attack run.
void Reset()
Remarks
For Beginners: This clears any saved state from previous attacks, ensuring each new attack starts fresh without being influenced by previous runs.
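For example, calling Reset between independent runs keeps them from influencing each other; inputA, inputB, and their labels are assumed to already exist in this sketch.
// Two independent attack runs against the same model.
Vector<double> firstAdv = attack.GenerateAdversarialExample(inputA, labelA, targetModel);
attack.Reset();
Vector<double> secondAdv = attack.GenerateAdversarialExample(inputB, labelB, targetModel);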