Class A2COptions<T>
Configuration options for Advantage Actor-Critic (A2C) agents.
public class A2COptions<T>
Type Parameters
T: The numeric type used for calculations.
- Inheritance
-
A2COptions<T>
Remarks
A2C is a synchronous variant of A3C (Asynchronous Advantage Actor-Critic) that is simpler and often more sample-efficient. It combines policy gradients with value function learning for stable, efficient training.
For Beginners: A2C learns two things simultaneously:
- **Actor (Policy)**: What action to take in each state
- **Critic (Value Function)**: How good each state is
The critic helps the actor learn faster by providing better feedback than just rewards alone. Think of the critic as a coach giving targeted advice rather than just "good" or "bad".
A2C is the foundation for many modern RL algorithms including PPO.
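The coaching analogy can be made concrete with the advantage: the discounted n-step return minus the critic's estimate for that state. The following is a minimal, self-contained sketch of that calculation in plain C#, not this library's implementation. The class name `AdvantageSketch`, the method `Advantages`, and its parameters are hypothetical, and `gamma` stands in for the role that `DiscountFactor` is assumed to play.

```csharp
using System;

public static class AdvantageSketch
{
    // Computes bootstrapped n-step returns and advantages for one rollout.
    // rewards[t] : reward received at step t
    // values[t]  : critic's estimate V(s_t)
    // bootstrap  : critic's estimate for the state after the last step
    //              (use 0 if the episode ended there)
    // gamma      : discount factor (assumed counterpart of DiscountFactor)
    public static double[] Advantages(
        double[] rewards, double[] values, double bootstrap, double gamma)
    {
        if (rewards.Length != values.Length)
            throw new ArgumentException("rewards and values must have the same length.");

        var advantages = new double[rewards.Length];
        double ret = bootstrap;

        // Walk backwards so each return reuses the one computed after it.
        for (int t = rewards.Length - 1; t >= 0; t--)
        {
            ret = rewards[t] + gamma * ret;   // discounted return from step t
            advantages[t] = ret - values[t];  // how much better than the critic expected
        }
        return advantages;
    }
}
```

A positive advantage means the action turned out better than the critic predicted, so the actor is pushed toward it; a negative one pushes the actor away. For example, with three rewards of 1, a critic estimate of 0.5 at every step, a bootstrap value of 0, and `gamma = 0.5`, the advantages are 1.25, 1.0, and 0.5.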
Constructors
A2COptions()
public A2COptions()
Properties
ActionSize
public int ActionSize { get; set; }
Property Value
- int
DiscountFactor
public T DiscountFactor { get; set; }
Property Value
- T
EntropyCoefficient
public T EntropyCoefficient { get; set; }
Property Value
- T
IsContinuous
public bool IsContinuous { get; set; }
Property Value
- bool
PolicyHiddenLayers
public List<int> PolicyHiddenLayers { get; set; }
Property Value
- List<int>
PolicyLearningRate
public T PolicyLearningRate { get; set; }
Property Value
- T
Seed
public int? Seed { get; set; }
Property Value
- int?
StateSize
public int StateSize { get; set; }
Property Value
- int
StepsPerUpdate
public int StepsPerUpdate { get; set; }
Property Value
- int
ValueHiddenLayers
public List<int> ValueHiddenLayers { get; set; }
Property Value
- List<int>
ValueLearningRate
public T ValueLearningRate { get; set; }
Property Value
- T
ValueLossCoefficient
public T ValueLossCoefficient { get; set; }
Property Value
- T
ValueLossFunction
public ILossFunction<T> ValueLossFunction { get; set; }
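
As an illustration of how the properties above fit together, the fragment below sets them with a C# object initializer. It is a configuration sketch only: every value is a hypothetical placeholder (this page states no defaults or recommended settings), the meaning of each property is inferred from its name, and the namespace import for `A2COptions<T>` is left out because this page does not give it. `ValueLossFunction` is left unset on the assumption that the class supplies a default.

```csharp
using System.Collections.Generic;

// Hypothetical settings for a small task with 4 state features and 2 discrete actions.
var options = new A2COptions<double>
{
    StateSize = 4,                                  // size of the observation vector (assumed meaning)
    ActionSize = 2,                                 // number of actions (assumed meaning)
    IsContinuous = false,                           // discrete action space
    PolicyHiddenLayers = new List<int> { 64, 64 },  // actor network layer widths
    ValueHiddenLayers = new List<int> { 64, 64 },   // critic network layer widths
    PolicyLearningRate = 0.0007,
    ValueLearningRate = 0.001,
    DiscountFactor = 0.99,
    EntropyCoefficient = 0.01,
    ValueLossCoefficient = 0.5,
    StepsPerUpdate = 5,
    Seed = 42                                       // int?; leave null for a non-fixed seed (assumed)
};
```

Choosing `double` for `T` keeps the literals simple; another numeric type would need values of that type instead.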