Class SARSAOptions<T>
Configuration options for SARSA agents.
public class SARSAOptions<T> : ReinforcementLearningOptions<T>
Type Parameters
T
The numeric type used for calculations.
- Inheritance
ReinforcementLearningOptions<T> → SARSAOptions<T>
- Inherited Members
Remarks
SARSA (State-Action-Reward-State-Action) is an on-policy temporal-difference (TD) control algorithm. Unlike Q-Learning, which bootstraps from the best available next action, SARSA updates based on the action actually taken by its behavior policy.
For Beginners: SARSA is more conservative than Q-Learning because it learns from actions it actually takes (including exploratory ones). This makes it safer in environments where bad actions can be catastrophic.
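The difference can be written as the two TD update rules (standard notation from the RL literature, not taken from this page):

```latex
% SARSA (on-policy): a' is the action the agent actually takes next
Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma\, Q(s',a') - Q(s,a) \right]

% Q-Learning (off-policy): bootstraps from the greedy next action,
% regardless of which action the behavior policy actually takes
Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right]
```

Because SARSA's target includes the occasional exploratory action a', the learned values reflect the risk of exploration itself, which is why its policies tend to be more conservative.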
Classic example: Cliff Walking
- Q-Learning learns the shortest path (risky, close to cliff)
- SARSA learns a safer path (further from cliff)
Use SARSA when:
- Safety matters during training
- You want to learn a safe policy
- Environment has dangerous states
Use Q-Learning when:
- You want the optimal policy
- Safety during training doesn't matter
- You can afford exploratory mistakes
Properties
ActionSize
Size of the action space (number of possible actions).
public int ActionSize { get; init; }
Property Value
int
StateSize
Size of the state space (number of state features).
public int StateSize { get; init; }
Property Value
int
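A minimal configuration sketch using the two properties documented above. The numeric type argument and the property values are illustrative assumptions; any other options inherited from ReinforcementLearningOptions<T> are omitted here because they are not shown on this page.

```csharp
// Hypothetical usage sketch — only ActionSize and StateSize are
// confirmed by this page; the chosen values are examples.
var options = new SARSAOptions<double>
{
    StateSize = 4,   // e.g., a 4-feature observation vector
    ActionSize = 2   // e.g., two discrete actions
};
```

Both properties use `init` accessors, so they must be set in the object initializer and cannot be reassigned afterward.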