Class AlignmentMethodOptions<T>
Configuration options for AI alignment methods.
public class AlignmentMethodOptions<T>
Type Parameters
- T
The numeric data type used for calculations (e.g., float, double).
- Inheritance
- object
AlignmentMethodOptions<T>
Remarks
These options control how models are aligned with human values and intentions through techniques like RLHF, constitutional AI, and red teaming.
For Beginners: These settings control how your AI learns to behave according to human values. You can adjust how much to weight human feedback, what principles to follow, and how thoroughly to test for problems.
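Example: a minimal sketch of overriding a few options while leaving the rest at their documented defaults. The library's namespace is not shown on this page, so the snippet declares a stand-in class that mirrors the documented properties and defaults. With the real library you would reference its type and drop the stand-in. The override values are illustrative, not recommendations.

```csharp
using System;

// Override only what differs from the documented defaults.
var options = new AlignmentMethodOptions<double>
{
    LearningRate = 0.000005,     // illustrative: half the documented default
    KLCoefficient = 0.2,         // illustrative: stronger pull toward the original model
    TrainingIterations = 500,    // illustrative: shorter RLHF run
    RedTeamingAttempts = 250     // illustrative: more thorough red teaming
};

Console.WriteLine($"LearningRate={options.LearningRate}, Gamma={options.Gamma}");
Console.WriteLine($"UseConstitutionalAI={options.UseConstitutionalAI}, CritiqueIterations={options.CritiqueIterations}");

// Stand-in for the documented class (assumption: the real type lives in the
// library; only property names, types, and defaults are taken from this page).
public class AlignmentMethodOptions<T>
{
    public int CritiqueIterations { get; set; } = 3;
    public bool EnableRedTeaming { get; set; } = true;
    public double Gamma { get; set; } = 0.99;
    public double KLCoefficient { get; set; } = 0.1;
    public double LearningRate { get; set; } = 0.00001;
    public int RedTeamingAttempts { get; set; } = 100;
    public string RewardModelArchitecture { get; set; } = "Transformer";
    public int TrainingIterations { get; set; } = 1000;
    public bool UseConstitutionalAI { get; set; } = true;
}
```

Properties that are not set in the initializer, such as Gamma and CritiqueIterations here, keep the defaults listed under Properties.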
Properties
CritiqueIterations
Gets or sets the number of critique iterations for constitutional AI.
public int CritiqueIterations { get; set; }
Property Value
- int
The number of critique iterations, defaulting to 3.
Remarks
For Beginners: The model critiques and improves its own outputs this many times using the constitutional principles.
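Example: a rough sketch of what a critique-and-revise loop bounded by CritiqueIterations might look like. This page does not describe the library's internal loop, so the critique and revise delegates below are hypothetical placeholders for model calls, not library APIs.

```csharp
using System;

// Hypothetical stand-ins for model calls; a real system would query the model.
Func<string, string> critique = draft => $"critique of [{draft}]";
Func<string, string, string> revise = (draft, notes) => draft + " (revised)";

int critiqueIterations = 3;          // documented default of CritiqueIterations
string response = "initial draft";

// Each pass critiques the current response, then revises it using the critique.
for (int i = 0; i < critiqueIterations; i++)
{
    string notes = critique(response);
    response = revise(response, notes);
}

Console.WriteLine(response);
```

Each additional iteration adds one more critique call and one more revision call, so cost grows linearly with this setting.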
EnableRedTeaming
Gets or sets whether to perform red teaming.
public bool EnableRedTeaming { get; set; }
Property Value
- bool
True to enable red teaming, false otherwise (default: true).
Remarks
For Beginners: Red teaming tries to find ways to make the model misbehave, helping you discover and fix problems before deployment.
Gamma
Gets or sets the discount factor for reward modeling.
public double Gamma { get; set; }
Property Value
- double
The gamma value, between 0 and 1, defaulting to 0.99.
Remarks
For Beginners: This controls how much the model values long-term rewards vs. immediate rewards. Higher values make it consider future consequences more.
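For reference, the standard reinforcement learning definition of a discounted return is shown below. This page does not say exactly how the library applies Gamma, so treat this as the conventional meaning of a discount factor; the symbols G_t (return from step t) and r (per-step reward) are notation introduced here.

```latex
G_t = \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1}
```

Under this definition a reward k steps in the future is weighted by gamma to the power k. With the default of 0.99, a reward 100 steps ahead still carries a weight of about 0.37, and for gamma below 1 the weights sum to 1 / (1 - gamma), which is 100 at the default.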
KLCoefficient
Gets or sets the KL divergence penalty coefficient.
public double KLCoefficient { get; set; }
Property Value
- double
The KL coefficient, defaulting to 0.1.
Remarks
For Beginners: This prevents the model from changing too much during alignment. It's like a leash that keeps the aligned model close to the original model's behavior.
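For reference, a common way to write a KL-penalized RLHF objective is shown below. This page does not state how the library applies KLCoefficient, so this is the conventional formulation under assumed notation: beta stands for KLCoefficient, pi_theta for the model being aligned, pi_ref for the original (reference) model, r_phi for the reward model, and D for the prompt distribution.

```latex
\max_{\theta}\;
\mathbb{E}_{x \sim \mathcal{D},\; y \sim \pi_{\theta}(\cdot \mid x)}
\big[\, r_{\phi}(x, y) \,\big]
\;-\;
\beta\, \mathbb{E}_{x \sim \mathcal{D}}
\Big[ D_{\mathrm{KL}}\big( \pi_{\theta}(\cdot \mid x) \,\big\|\, \pi_{\mathrm{ref}}(\cdot \mid x) \big) \Big]
```

In this form a larger beta penalizes divergence from the reference model more heavily, which is the shorter leash described above. Setting beta to 0 removes the constraint entirely.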
LearningRate
Gets or sets the learning rate for alignment training.
public double LearningRate { get; set; }
Property Value
- double
The learning rate, defaulting to 0.00001 (1e-5).
Remarks
For Beginners: This controls how quickly the model adapts to human feedback. Smaller values make training more stable but slower.
RedTeamingAttempts
Gets or sets the number of red teaming attempts.
public int RedTeamingAttempts { get; set; }
Property Value
- int
The number of attempts, defaulting to 100.
Remarks
For Beginners: More attempts mean more thorough testing but take longer.
RewardModelArchitecture
Gets or sets the reward model architecture.
public string RewardModelArchitecture { get; set; }
Property Value
- string
The model architecture name, defaulting to "Transformer".
TrainingIterations
Gets or sets the number of training iterations for RLHF.
public int TrainingIterations { get; set; }
Property Value
- int
The number of iterations, defaulting to 1000.
Remarks
For Beginners: More iterations allow the model to learn better from feedback but take longer to train.
UseConstitutionalAI
Gets or sets whether to use constitutional AI principles.
public bool UseConstitutionalAI { get; set; }
Property Value
- bool
True to use constitutional AI, false otherwise (default: true).
Remarks
For Beginners: Constitutional AI teaches the model explicit principles to guide its behavior, like a set of rules to follow.