Class REINFORCEOptions<T>
Configuration options for REINFORCE agents.
public class REINFORCEOptions<T>
Type Parameters
T: The numeric type used for calculations.
- Inheritance
-
REINFORCEOptions<T>
Remarks
REINFORCE is the simplest policy gradient algorithm. It directly optimizes the policy by following the gradient of expected returns.
For Beginners: REINFORCE is the "hello world" of policy gradient methods. It's simple but powerful:
- Play an entire episode
- See which actions led to good rewards
- Make those actions more likely in the future
Think of it like learning to play a game: you play a round, see your score, then adjust your strategy to do better next time.
REINFORCE is simple, but it can be slow to learn, and its gradient estimates have high variance. Modern algorithms such as PPO build on REINFORCE's ideas.
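As a sketch, the steps above correspond to the standard textbook form of the REINFORCE update. This is not necessarily the exact computation this library performs. The symbols are assumptions introduced for illustration: N is the episode length, r_k the reward at step k, \pi_\theta the policy with parameters \theta, \gamma the discount (presumably DiscountFactor), and \alpha the step size (presumably LearningRate).

$$
G_t = \sum_{k=t}^{N-1} \gamma^{\,k-t}\, r_k
$$

$$
\theta \leftarrow \theta + \alpha \sum_{t=0}^{N-1} G_t \, \nabla_\theta \log \pi_\theta(a_t \mid s_t)
$$

For example, with rewards (1, 0, 2) and \gamma = 0.5, the returns are G_2 = 2, G_1 = 0 + 0.5 \cdot 2 = 1, and G_0 = 1 + 0.5 \cdot 1 = 1.5. Actions taken at steps with larger G_t have their log-probability pushed up more strongly.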
Constructors
REINFORCEOptions()
public REINFORCEOptions()
Properties
ActionSize
public int ActionSize { get; set; }
Property Value
- int
DiscountFactor
public T DiscountFactor { get; set; }
Property Value
- T
HiddenLayers
public List<int> HiddenLayers { get; set; }
Property Value
- List<int>
IsContinuous
public bool IsContinuous { get; set; }
Property Value
- bool
LearningRate
public T LearningRate { get; set; }
Property Value
- T
Seed
public int? Seed { get; set; }
Property Value
- int?
StateSize
public int StateSize { get; set; }
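
A minimal configuration sketch using only the properties listed above, with T set to double. The values are illustrative and are not the library's defaults. The property meanings in the comments are inferred from the names, since this page gives no descriptions. The library's namespace import is omitted because this page does not name it, and the way the options are handed to an agent is not shown here.

```csharp
using System.Collections.Generic;
// A using directive for the library's namespace is also needed;
// it is left out because the namespace is not stated on this page.

// Illustrative values only; none of these are documented defaults.
var options = new REINFORCEOptions<double>
{
    StateSize = 4,                           // assumed: length of the observation vector
    ActionSize = 2,                          // assumed: number of actions (or action dimensions)
    IsContinuous = false,                    // assumed: false selects a discrete action space
    HiddenLayers = new List<int> { 64, 64 }, // assumed: hidden-layer widths of the policy network
    LearningRate = 0.001,                    // step size for the policy update
    DiscountFactor = 0.99,                   // weight applied to future rewards
    Seed = 42                                // fixed seed for repeatable runs; null presumably leaves it unseeded
};
```

Because LearningRate and DiscountFactor are of type T, the literals must match the chosen numeric type: for REINFORCEOptions<float>, write 0.001f and 0.99f.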