Class IQLOptions<T>
Configuration options for Implicit Q-Learning (IQL) agent.
public class IQLOptions<T>
Type Parameters
T: The numeric type used for calculations.
Inheritance
- IQLOptions<T>
Remarks
IQL is an offline RL algorithm that needs neither explicit policy constraints nor conservative regularization. Instead, it fits a value function with expectile regression and then extracts a policy from the learned values.
For Beginners: IQL is designed for offline learning, that is, learning from a fixed dataset. Unlike CQL, which adds penalties, IQL uses a technique called "expectile regression" to avoid overestimation.
Key innovations:
- Expectile Regression: Focuses on the upper expectiles of the value distribution
- Implicit Policy Extraction: No explicit max over actions
- Simpler than CQL: Fewer hyperparameters to tune
Think of it as learning the "typical good outcome" rather than the "best possible outcome", which helps avoid being too optimistic about unseen situations.
Advantages: Simpler and, in many cases, more stable than CQL.
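The expectile regression mentioned above can be sketched as an asymmetric squared loss. This is a minimal illustration, not this library's implementation. It assumes the usual IQL formulation, in which the residual is a Q-value minus a value estimate, positive residuals are weighted by the expectile, and negative residuals by one minus the expectile. The ExpectileSketch class and its Loss method are hypothetical names, and double stands in for T to keep the example short.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the IQLOptions<T> API.
public static class ExpectileSketch
{
    // Mean of |expectile - 1(diff < 0)| * diff^2 over the batch,
    // where diff = qValues[i] - vValues[i].
    public static double Loss(double[] qValues, double[] vValues, double expectile)
    {
        if (qValues.Length != vValues.Length)
            throw new ArgumentException("Inputs must have the same length.");
        if (qValues.Length == 0)
            throw new ArgumentException("Inputs must not be empty.");

        double total = 0.0;
        for (int i = 0; i < qValues.Length; i++)
        {
            double diff = qValues[i] - vValues[i];
            // Underestimates (diff >= 0) get weight expectile,
            // overestimates (diff < 0) get weight 1 - expectile.
            double weight = diff < 0 ? 1.0 - expectile : expectile;
            total += weight * diff * diff;
        }
        return total / qValues.Length;
    }
}
```

With an expectile of 0.5 this reduces to half the mean squared error. Values above 0.5 penalize underestimates more heavily, which pulls the value estimate toward the better actions present in the dataset.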
Constructors
IQLOptions()
public IQLOptions()
Properties
ActionSize
public int ActionSize { get; set; }
Property Value
- int
BatchSize
public int BatchSize { get; set; }
Property Value
- int
BufferSize
public int BufferSize { get; set; }
Property Value
- int
DiscountFactor
public T DiscountFactor { get; set; }
Property Value
- T
Expectile
public double Expectile { get; set; }
Property Value
- double
PolicyHiddenLayers
public List<int> PolicyHiddenLayers { get; set; }
Property Value
- List<int>
PolicyLearningRate
public T PolicyLearningRate { get; set; }
Property Value
- T
QHiddenLayers
public List<int> QHiddenLayers { get; set; }
Property Value
- List<int>
QLearningRate
public T QLearningRate { get; set; }
Property Value
- T
QLossFunction
public ILossFunction<T> QLossFunction { get; set; }
Property Value
- ILossFunction<T>
Seed
public int? Seed { get; set; }
Property Value
- int?
StateSize
public int StateSize { get; set; }
Property Value
- int
TargetUpdateTau
public T TargetUpdateTau { get; set; }
Property Value
- T
Temperature
public T Temperature { get; set; }
Property Value
- T
ValueHiddenLayers
public List<int> ValueHiddenLayers { get; set; }
Property Value
- List<int>
ValueLearningRate
public T ValueLearningRate { get; set; }
Property Value
- T
Methods
Validate()
Validates that required properties are set.
public void Validate()
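A configuration might be assembled and checked as in the fragment below. This is a sketch rather than a complete program. It uses only the members listed on this page, the numeric values are illustrative and are not documented defaults, T is assumed to be double, and the using directive for the library namespace is omitted because this page does not state it.

```csharp
// Illustrative values only; none of these are documented defaults.
var options = new IQLOptions<double>
{
    StateSize = 17,          // size of the observation vector (assumed example value)
    ActionSize = 6,          // size of the action vector (assumed example value)
    Expectile = 0.7,
    DiscountFactor = 0.99,
    TargetUpdateTau = 0.005,
    BatchSize = 256,
    Seed = 42
};

// Check the configuration before handing it to an agent.
options.Validate();
```

How Validate() reports a missing or invalid property is not specified on this page, so callers may want to confirm that behavior before relying on it.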