Class AlignmentMetrics<T>
Contains metrics for evaluating AI alignment with human values.
public class AlignmentMetrics<T>
Type Parameters
T: The numeric data type used for calculations.
Inheritance
object
AlignmentMetrics<T>
Properties
AdditionalMetrics
Gets or sets additional alignment metrics.
public Dictionary<string, double> AdditionalMetrics { get; set; }
Property Value
Dictionary<string, double>
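As an illustration of how AdditionalMetrics might carry values that have no dedicated property, here is a minimal sketch. The stand-in class mirrors only the documented property, and the helper name and metric keys (refusal_rate, toxicity_rate) are hypothetical. Whether the library initializes the dictionary by default is not stated here, so the sketch assigns one explicitly.

```csharp
using System.Collections.Generic;

// Stand-in that mirrors only the documented property used here;
// the real AlignmentMetrics<T> ships with the library.
public class AlignmentMetrics<T>
{
    public Dictionary<string, double> AdditionalMetrics { get; set; }
}

// Hypothetical helper, not part of the documented API.
public static class AdditionalMetricsSketch
{
    public static AlignmentMetrics<T> WithCustomMetrics<T>()
    {
        var metrics = new AlignmentMetrics<T>();

        // Assumption: assign a dictionary explicitly rather than rely on a default.
        metrics.AdditionalMetrics = new Dictionary<string, double>
        {
            ["refusal_rate"] = 0.04,   // hypothetical metric name and value
            ["toxicity_rate"] = 0.01   // hypothetical metric name and value
        };

        return metrics;
    }
}
```

Callers could then read a value back with TryGetValue so that a missing key does not throw.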
ConstitutionalComplianceScore
Gets or sets the constitutional compliance score.
public double ConstitutionalComplianceScore { get; set; }
Property Value
double
Remarks
Measures how well the model follows constitutional principles.
HarmlessnessScore
Gets or sets the harmlessness score (0-1).
public double HarmlessnessScore { get; set; }
Property Value
double
Remarks
Measures how safe the model is and whether it avoids harmful outputs.
HelpfulnessScore
Gets or sets the helpfulness score (0-1).
public double HelpfulnessScore { get; set; }
Property Value
double
Remarks
Measures how helpful and informative the model's responses are.
HonestyScore
Gets or sets the honesty score (0-1).
public double HonestyScore { get; set; }
Property Value
double
Remarks
Measures whether the model is truthful and avoids fabricating information.
OverallAlignmentScore
Gets or sets the overall alignment score (0-1).
public double OverallAlignmentScore { get; set; }
Property Value
double
Remarks
Combines helpfulness, harmlessness, and honesty into a single metric.
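The combining rule is not specified on this page. As one possible reading, the sketch below takes an equal-weight arithmetic mean of the three scores and clamps the result to the 0-1 range. The stand-in class, the AlignmentMetricsSketch helper, and the equal weighting are all assumptions made for illustration; the library may use a different formula.

```csharp
using System;

// Stand-in that mirrors only the documented properties used here;
// the real AlignmentMetrics<T> ships with the library.
public class AlignmentMetrics<T>
{
    public double HelpfulnessScore { get; set; }
    public double HarmlessnessScore { get; set; }
    public double HonestyScore { get; set; }
    public double OverallAlignmentScore { get; set; }
}

// Hypothetical helper, not part of the documented API.
public static class AlignmentMetricsSketch
{
    // Assumption: equal-weight arithmetic mean of the three scores.
    public static double CombineEqualWeights<T>(AlignmentMetrics<T> m)
    {
        double mean = (m.HelpfulnessScore + m.HarmlessnessScore + m.HonestyScore) / 3.0;

        // Keep the result inside the documented 0-1 range even if inputs drift.
        return Math.Clamp(mean, 0.0, 1.0);
    }
}
```

A caller could store the result with metrics.OverallAlignmentScore = AlignmentMetricsSketch.CombineEqualWeights(metrics). A weighted mean, or the minimum of the three scores, would be equally plausible if one dimension should dominate.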
PreferenceMatchRate
Gets or sets the preference match rate.
public double PreferenceMatchRate { get; set; }
Property Value
double
Remarks
The percentage of model outputs that match human preferences.
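One way such a rate could be computed is to count the outputs that agree with the human-preferred choice and divide by the total. The helper below is hypothetical. It assumes the rate is stored as a fraction between 0 and 1, like the other scores on this page, and that an empty input yields 0; multiply by 100 if a literal percentage is expected.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the documented API.
public static class PreferenceMatchSketch
{
    // matches[i] is true when output i agreed with the human-preferred choice.
    // Assumption: the rate is reported as a fraction in [0, 1].
    public static double MatchRate(IReadOnlyCollection<bool> matches)
    {
        if (matches.Count == 0)
        {
            return 0.0; // assumption: no comparisons means a rate of 0
        }

        int matched = matches.Count(isMatch => isMatch);
        return (double)matched / matches.Count;
    }
}
```

The result can be assigned directly to PreferenceMatchRate on an AlignmentMetrics<T> instance.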