Class Experience<T, TState, TAction>
Namespace: AiDotNet.ReinforcementLearning.ReplayBuffers
Assembly: AiDotNet.dll
Represents a single experience tuple (s, a, r, s', done) for reinforcement learning.
public record Experience<T, TState, TAction> : IEquatable<Experience<T, TState, TAction>>
Type Parameters
T: The numeric type used for calculations (e.g., float, double).
TState: The type representing the state observation (e.g., Vector<T>, Tensor<T>).
TAction: The type representing the action (e.g., Vector<T> for continuous actions, int for discrete actions).
Inheritance: object → Experience<T, TState, TAction>
Implements: IEquatable<Experience<T, TState, TAction>>
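Because Experience is a record, equality is value-based: two instances are equal when all of their components match (compared with EqualityComparer<T>.Default per member). A minimal sketch using value-type members; note that reference-type states such as arrays would compare by reference, not by contents:

using System;
using AiDotNet.ReinforcementLearning.ReplayBuffers;

var a = new Experience<double, int, int>(State: 0, Action: 1, Reward: 5.0, NextState: 2, Done: false);
var b = new Experience<double, int, int>(State: 0, Action: 1, Reward: 5.0, NextState: 2, Done: false);

Console.WriteLine(a == b);      // True: all components (and the default Priority) match
Console.WriteLine(a.Equals(b)); // True: the generated IEquatable implementation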
Remarks
An Experience is a fundamental data structure in reinforcement learning that captures a single interaction between an agent and its environment. It consists of five components: the current state, the action taken, the reward received, the resulting next state, and a flag indicating whether the episode has ended. This tuple is used to train reinforcement learning agents in algorithms like Q-learning, Deep Q-Networks (DQN), PPO, and many others.
For Beginners: An experience is one step of interaction with the environment. It contains everything the agent needs to learn from that step:
- State: What the situation looked like before the agent acted (like a snapshot)
- Action: What the agent decided to do
- Reward: The feedback received (positive = good, negative = bad, zero = neutral)
- NextState: What the situation looks like after the action
- Done: Whether this action ended the episode (game over, goal reached, etc.)
For example, in a maze-solving robot:
- State: Robot's current position and sensor readings
- Action: "move forward" or "turn left"
- Reward: +10 for reaching the exit, -1 for hitting a wall, 0 otherwise
- NextState: Robot's new position after the action
- Done: True if robot reached the exit or got stuck
Common Type Combinations:
- Experience<double, Vector<double>, Vector<double>> - For continuous actions (e.g., robotic control)
- Experience<double, Vector<double>, int> - For discrete actions (e.g., game playing)
- Experience<float, Tensor<float>, int> - For image-based states (e.g., Atari games)
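For illustration, a minimal sketch of one maze step using the discrete-action combination; a plain double[] stands in for the state type here, though Vector<double> or Tensor<float> from the list above would slot in the same way:

using AiDotNet.ReinforcementLearning.ReplayBuffers;

// One step of the maze robot: it moved forward, hit a wall, and the episode continues.
var step = new Experience<double, double[], int>(
    State: new[] { 1.0, 2.0 },     // position before acting
    Action: 0,                     // e.g., 0 = "move forward"
    Reward: -1.0,                  // -1 for hitting a wall
    NextState: new[] { 1.0, 2.0 }, // bounced back to the same cell
    Done: false);                  // exit not reached yet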
Constructors
Experience(TState, TAction, T, TState, bool)
Creates a new experience tuple from its five components.
public Experience(TState State, TAction Action, T Reward, TState NextState, bool Done)
Parameters
State (TState): The state observed before the agent acted.
Action (TAction): The action the agent took in that state.
Reward (T): The reward received for taking the action.
NextState (TState): The state observed after the action.
Done (bool): Whether this step ended the episode.
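Because the constructor is positional, the compiler also generates a matching Deconstruct method, and the init-only properties allow non-destructive copies via with expressions; a brief sketch:

var exp = new Experience<double, double[], int>(
    State: new[] { 0.0 }, Action: 1, Reward: 10.0, NextState: new[] { 1.0 }, Done: true);

// Unpack the five components in declaration order.
var (state, action, reward, nextState, done) = exp;

// Copy with one component changed; the original is untouched.
var notTerminal = exp with { Done = false };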
Properties
Action
public TAction Action { get; init; }
Property Value
- TAction
Done
public bool Done { get; init; }
Property Value
- bool
NextState
public TState NextState { get; init; }
Property Value
- TState
Priority
Gets or sets the priority for prioritized experience replay.
public double Priority { get; set; }
Property Value
- double
A double representing the experience's sampling priority. Default is 1.0.
Remarks
In prioritized experience replay, experiences with higher priority are sampled more frequently. The priority is typically based on the TD-error (temporal difference error), meaning experiences that surprise the agent (large prediction errors) are replayed more often.
For Beginners: Priority determines how often this experience gets picked for learning.
Think of it like highlighting important notes in a textbook:
- Higher priority = more important = reviewed more often
- Experiences where the agent made big mistakes get higher priority
- This helps the agent learn from its most surprising or educational moments
Default is 1.0 (all experiences equal). Values greater than 1.0 mean "sample this more often."
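As a sketch of how a prioritized buffer might set this value: proportional prioritization raises the absolute TD-error to an exponent alpha and adds a small epsilon so no experience becomes unsampleable. The tdError, alpha, and epsilon values below are placeholders, not part of this API; in a real agent the TD-error comes from the learner's value estimates.

using System;
using AiDotNet.ReinforcementLearning.ReplayBuffers;

var exp = new Experience<double, double[], int>(
    State: new[] { 0.0 }, Action: 1, Reward: -1.0,
    NextState: new[] { 0.5 }, Done: false);

double tdError = 2.5;   // placeholder: |TD target - predicted value|
double alpha = 0.6;     // commonly used prioritization exponent (an assumption here)
double epsilon = 1e-6;  // keeps every priority strictly positive

exp.Priority = Math.Pow(Math.Abs(tdError) + epsilon, alpha);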
Reward
public T Reward { get; init; }
Property Value
- T
State
public TState State { get; init; }
Property Value
- TState