Class A3COptions<T>
Configuration options for Asynchronous Advantage Actor-Critic (A3C) agents.
public class A3COptions<T> : ReinforcementLearningOptions<T>
Type Parameters
- T: The numeric type used for calculations.
- Inheritance
- ReinforcementLearningOptions<T>
- A3COptions<T>
Remarks
A3C runs multiple agents in parallel, each learning from different experiences. The parallel exploration provides diverse training data and stabilizes learning.
For Beginners: A3C is like having multiple students learn the same subject simultaneously, each with different experiences. They periodically share what they learned with a central "teacher" (global network), and everyone benefits from the combined knowledge.
Key features:
- Asynchronous: Multiple agents run in parallel
- Actor-Critic: Learns both policy and value function
- No Replay Buffer: Uses on-policy learning
- Diverse Exploration: Different agents explore different strategies
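Famous for: Introduced in DeepMind's 2016 paper "Asynchronous Methods for Deep Reinforcement Learning" (Mnih et al.), which showed that parallel workers on a multi-core CPU can train agents effectively without a GPU.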
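The worker/global-network pattern described above can be sketched with a toy problem. This is only an illustration under stated assumptions: `AsyncWorkersSketch`, its `Train` method, and the one-parameter quadratic loss are hypothetical and are not part of the library. A single `double` stands in for the global network's weights, and each worker computes its gradient from a possibly stale local copy before applying it to the shared value.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical illustration; none of these types come from the library.
public static class AsyncWorkersSketch
{
    // Runs several workers in parallel. Each worker repeatedly copies the shared
    // parameter, computes a gradient locally, and applies it to the shared value.
    public static double Train(int numWorkers, int updatesPerWorker, double learningRate)
    {
        double globalParam = 0.0;      // stands in for the global network's weights
        const double target = 3.0;     // optimum of the toy loss (param - target)^2
        object gate = new object();    // protects reads and writes of globalParam

        Parallel.For(0, numWorkers, workerId =>
        {
            for (int step = 0; step < updatesPerWorker; step++)
            {
                // 1. Synchronize the local copy with the global parameter.
                double localParam;
                lock (gate) { localParam = globalParam; }

                // 2. Compute the gradient locally; other workers may update
                //    globalParam in the meantime, so this gradient can be stale.
                double gradient = 2.0 * (localParam - target);

                // 3. Apply the update to the shared parameter.
                lock (gate) { globalParam -= learningRate * gradient; }
            }
        });

        return globalParam;
    }
}
```

Calling `AsyncWorkersSketch.Train(4, 500, 0.01)` should drive the shared parameter close to the optimum of 3.0 even though the workers never wait for each other between updates. A real A3C worker would compute its gradient from its own environment rollout rather than from a fixed loss.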
Constructors
A3COptions()
public A3COptions()
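Because every property listed on this page has an `init` accessor, the options can be populated with an object initializer. The fragment below is a minimal sketch, not a recommended configuration: all values are illustrative assumptions, the namespace that declares `A3COptions<T>` is not shown on this page and must be imported separately, and the page does not say which properties have defaults.

```csharp
using System.Collections.Generic;
// Also import the namespace that declares A3COptions<T>; it is not listed on this page.

var options = new A3COptions<double>
{
    StateSize = 4,                               // assumed: length of the observation vector
    ActionSize = 2,                              // assumed: number of actions (or action dimensions)
    IsContinuous = false,                        // assumed: false selects a discrete action space
    NumWorkers = 4,                              // illustrative worker count
    TMax = 5,                                    // illustrative value
    PolicyHiddenLayers = new List<int> { 64, 64 },
    ValueHiddenLayers = new List<int> { 64, 64 },
    PolicyLearningRate = 1e-4,                   // illustrative value
    ValueLearningRate = 1e-3,                    // illustrative value
    EntropyCoefficient = 0.01,                   // illustrative value
    ValueLossCoefficient = 0.5                   // illustrative value
    // Optimizer is left unset; per this page, Adam is used when it is null.
};
```

Properties omitted from the initializer keep whatever the parameterless constructor assigns.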
Properties
ActionSize
public int ActionSize { get; init; }
Property Value
- int
EntropyCoefficient
public T EntropyCoefficient { get; init; }
Property Value
- T
IsContinuous
public bool IsContinuous { get; init; }
Property Value
- bool
NumWorkers
public int NumWorkers { get; init; }
Property Value
- int
Optimizer
The optimizer used to update network parameters. If null, the Adam optimizer is used by default.
public IOptimizer<T, Vector<T>, Vector<T>>? Optimizer { get; init; }
Property Value
- IOptimizer<T, Vector<T>, Vector<T>>
PolicyHiddenLayers
public List<int> PolicyHiddenLayers { get; init; }
Property Value
- List<int>
PolicyLearningRate
public T PolicyLearningRate { get; init; }
Property Value
- T
StateSize
public int StateSize { get; init; }
Property Value
- int
TMax
public int TMax { get; init; }
Property Value
- int
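This page does not describe `TMax`. In the original A3C algorithm, t_max is the maximum number of steps a worker collects before computing an update, and the targets for those steps are n-step discounted returns bootstrapped from the critic's estimate of the last state. Assuming `TMax` plays the same role here, the return computation can be sketched as follows. `NStepReturnSketch`, `Compute`, and the `gamma` parameter are hypothetical names, not library members.

```csharp
// Hypothetical illustration; not part of the library.
public static class NStepReturnSketch
{
    // rewards:        rewards collected over one rollout (at most t_max steps)
    // bootstrapValue: critic's value estimate for the state after the last step
    //                 (0 if the episode ended)
    // gamma:          discount factor
    public static double[] Compute(double[] rewards, double bootstrapValue, double gamma)
    {
        var returns = new double[rewards.Length];
        double running = bootstrapValue;

        // Work backwards so each return reuses the one after it:
        // R_t = r_t + gamma * R_{t+1}
        for (int t = rewards.Length - 1; t >= 0; t--)
        {
            running = rewards[t] + gamma * running;
            returns[t] = running;
        }

        return returns;
    }
}
```

For rewards `{1, 1, 1}`, a bootstrap value of 0, and `gamma = 0.5`, this yields `{1.75, 1.5, 1.0}`. Subtracting the critic's value estimate from each return gives the advantage used in the policy update.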
ValueHiddenLayers
public List<int> ValueHiddenLayers { get; init; }
Property Value
- List<int>
ValueLearningRate
public T ValueLearningRate { get; init; }
Property Value
- T
ValueLossCoefficient
public T ValueLossCoefficient { get; init; }
Property Value
- T
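The page does not show how `EntropyCoefficient` and `ValueLossCoefficient` enter the training objective. A common actor-critic formulation weights the value loss and subtracts an entropy bonus from the policy loss; the sketch below assumes that form. `A3CLossSketch` and its methods are hypothetical, and the library may combine these terms differently, for example by optimizing the policy and value networks separately.

```csharp
using System;

// Hypothetical illustration of a common actor-critic objective; not the library's code.
public static class A3CLossSketch
{
    // Shannon entropy of a discrete action distribution, in nats.
    public static double Entropy(double[] probabilities)
    {
        double entropy = 0.0;
        foreach (double p in probabilities)
        {
            if (p > 0.0) entropy -= p * Math.Log(p);   // skip p = 0, whose term is taken as 0
        }
        return entropy;
    }

    // total = policyLoss + valueLossCoefficient * valueLoss - entropyCoefficient * entropy
    public static double Total(
        double policyLoss,
        double valueLoss,
        double entropy,
        double valueLossCoefficient,
        double entropyCoefficient)
    {
        return policyLoss
             + valueLossCoefficient * valueLoss
             - entropyCoefficient * entropy;
    }
}
```

In this form, a larger entropy coefficient rewards more uniform action distributions, which encourages exploration, while the value-loss coefficient sets how strongly critic errors influence the update relative to the policy term.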
ValueLossFunction
public ILossFunction<T> ValueLossFunction { get; init; }
Property Value
- ILossFunction<T>