Class RedTeamingResults<T>
Contains the results of red-teaming (adversarial) testing.
public class RedTeamingResults<T>
Type Parameters
T
The numeric data type used for calculations.
- Inheritance
- RedTeamingResults<T>
Properties
AdversarialPrompts
Gets or sets the adversarial prompts that were tested.
public Matrix<T> AdversarialPrompts { get; set; }
Property Value
- Matrix<T>
AverageSeverity
Gets or sets the average severity of successful attacks.
public double AverageSeverity { get; set; }
Property Value
- double
ModelResponses
Gets or sets the model's responses to adversarial prompts.
public Matrix<T> ModelResponses { get; set; }
Property Value
- Matrix<T>
SeverityScores
Gets or sets the severity score for each vulnerability, in the range 0 to 1.
public double[] SeverityScores { get; set; }
Property Value
- double[]
SuccessRate
Gets or sets the overall red teaming success rate.
public double SuccessRate { get; set; }
Property Value
- double
SuccessfulAttacks
Gets or sets which prompts successfully caused misaligned behavior.
public bool[] SuccessfulAttacks { get; set; }
Property Value
- bool[]
Vulnerabilities
Gets or sets detailed descriptions of vulnerabilities found.
public List<VulnerabilityReport> Vulnerabilities { get; set; }
Property Value
- List<VulnerabilityReport>
VulnerabilityTypes
Gets or sets the types of vulnerabilities found.
public string[] VulnerabilityTypes { get; set; }
Property Value
- string[]
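Example
The reference above does not say how SuccessRate and AverageSeverity are derived from the per-prompt arrays. The following is a minimal sketch of one plausible relationship, not the library's implementation. It assumes that SuccessRate is the fraction of true entries in SuccessfulAttacks, and that SeverityScores[i] corresponds to the same prompt as SuccessfulAttacks[i]. The RedTeamingSummary class and its methods are hypothetical helpers introduced only for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the documented library.
public static class RedTeamingSummary
{
    // Assumed definition: fraction of prompts flagged as successful attacks.
    public static double SuccessRate(bool[] successfulAttacks)
    {
        if (successfulAttacks == null || successfulAttacks.Length == 0)
            return 0.0;

        return successfulAttacks.Count(s => s) / (double)successfulAttacks.Length;
    }

    // Mean severity over successful attacks only.
    // Assumes severityScores[i] belongs to the same prompt as successfulAttacks[i].
    public static double AverageSeverity(bool[] successfulAttacks, double[] severityScores)
    {
        if (successfulAttacks == null || severityScores == null)
            return 0.0;
        if (successfulAttacks.Length != severityScores.Length)
            throw new ArgumentException("Arrays must have the same length.");

        double sum = 0.0;
        int count = 0;
        for (int i = 0; i < successfulAttacks.Length; i++)
        {
            if (successfulAttacks[i])
            {
                sum += severityScores[i];
                count++;
            }
        }

        // No successful attacks: report zero rather than dividing by zero.
        return count == 0 ? 0.0 : sum / count;
    }
}
```

Under these assumptions, the flags { true, false, true, false } with scores { 0.8, 0.1, 0.4, 0.0 } give a success rate of 0.5 and an average severity of approximately 0.6, since only the first and third scores are counted.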