Class CQLOptions<T>
Configuration options for Conservative Q-Learning (CQL) agent.
public class CQLOptions<T>
Type Parameters
T: The numeric type used for calculations.
Inheritance
- CQLOptions<T>
Remarks
CQL is an offline reinforcement learning (RL) algorithm that learns from a fixed dataset without interacting with the environment. It counters Q-value overestimation on actions that are missing from the data by adding a conservative penalty to the Q-function loss.
For Beginners: CQL is designed for learning from logged data without trying new actions. This is useful when you have historical data but can't experiment in the real environment (e.g., medical treatment, autonomous driving).
Key ideas:
- Conservative Q-Learning: Penalizes Q-values for unseen actions to prevent overoptimistic estimates
- Offline Learning: No environment interaction during training
Think of it like learning to drive from dashcam footage: you can't try new maneuvers, so you have to be conservative about anything you haven't seen.
The agent builds on the Soft Actor-Critic (SAC) architecture and adds conservative regularization.
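The conservative penalty described above can be sketched for a single state as follows. This is a minimal illustration, not the library's implementation: the class and method names (CqlPenaltySketch, LogSumExp, ConservativePenalty) are hypothetical, and reading alpha as CQLAlpha and the number of sampled Q-values as CQLNumActions is an assumption based on the property names alone. The sketch also leaves out the importance-sampling corrections that continuous-action CQL variants typically apply.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the documented API.
public static class CqlPenaltySketch
{
    // Numerically stable log(sum_i exp(q_i)): subtract the max before exponentiating.
    public static double LogSumExp(double[] qValues)
    {
        double max = qValues.Max();
        return max + Math.Log(qValues.Sum(q => Math.Exp(q - max)));
    }

    // Conservative penalty for one state:
    //   alpha * (logsumexp of Q over sampled actions - Q of the logged dataset action).
    // Minimizing it pushes Q down on sampled (possibly unseen) actions
    // and up on the action that actually appears in the data.
    public static double ConservativePenalty(double[] sampledQ, double datasetQ, double alpha)
    {
        return alpha * (LogSumExp(sampledQ) - datasetQ);
    }
}
```

For two sampled actions whose Q-values are both 0 and a logged-action Q-value of 0, the penalty is alpha * ln 2, so it stays positive until the logged action's Q-value is high relative to the sampled ones.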
Constructors
CQLOptions()
public CQLOptions()
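A configuration might look like the sketch below. It uses only the property names listed on this page. The values are illustrative and are not the library's defaults. The choice of double for T and the meaning attributed to each property in the comments are assumptions inferred from the names. The namespace import for CQLOptions<T> is omitted because this page does not show it.

```csharp
using System.Collections.Generic;

// Illustrative values only; the defaults set by CQLOptions() are not shown on this page.
var options = new CQLOptions<double>
{
    StateSize = 17,                              // assumed: observation vector length
    ActionSize = 6,                              // assumed: action vector length
    BatchSize = 256,
    DiscountFactor = 0.99,
    PolicyHiddenLayers = new List<int> { 256, 256 },
    QHiddenLayers = new List<int> { 256, 256 },
    CQLAlpha = 1.0,                              // assumed: weight of the conservative penalty
    CQLNumActions = 10,                          // assumed: actions sampled per state for the penalty
    Seed = 42                                    // fixed seed for repeatable runs
};
```

Properties that are not set here keep whatever values the constructor assigns.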
Properties
ActionSize
public int ActionSize { get; set; }
Property Value
- int
AlphaLearningRate
public T AlphaLearningRate { get; set; }
Property Value
- T
AutoTuneTemperature
public bool AutoTuneTemperature { get; set; }
Property Value
- bool
BatchSize
public int BatchSize { get; set; }
Property Value
- int
BufferSize
public int BufferSize { get; set; }
Property Value
- int
CQLAlpha
public T CQLAlpha { get; set; }
Property Value
- T
CQLLagrange
public bool CQLLagrange { get; set; }
Property Value
- bool
CQLNumActions
public int CQLNumActions { get; set; }
Property Value
- int
CQLTargetActionGap
public T CQLTargetActionGap { get; set; }
Property Value
- T
DiscountFactor
public T DiscountFactor { get; set; }
Property Value
- T
GradientSteps
public int GradientSteps { get; set; }
Property Value
- int
InitialTemperature
public T InitialTemperature { get; set; }
Property Value
- T
PolicyHiddenLayers
public List<int> PolicyHiddenLayers { get; set; }
Property Value
- List<int>
PolicyLearningRate
public T PolicyLearningRate { get; set; }
Property Value
- T
QHiddenLayers
public List<int> QHiddenLayers { get; set; }
Property Value
- List<int>
QLearningRate
public T QLearningRate { get; set; }
Property Value
- T
QLossFunction
public ILossFunction<T> QLossFunction { get; set; }
Property Value
- ILossFunction<T>
Seed
public int? Seed { get; set; }
Property Value
- int?
StateSize
public int StateSize { get; set; }
Property Value
- int
TargetEntropy
public T? TargetEntropy { get; set; }
Property Value
- T
TargetUpdateTau
public T TargetUpdateTau { get; set; }
Property Value
- T