Class Experience<T, TState, TAction>
Namespace: AiDotNet.ReinforcementLearning.ReplayBuffers
Assembly: AiDotNet.dll
Represents a single experience tuple (s, a, r, s', done) for reinforcement learning.
public record Experience<T, TState, TAction> : IEquatable<Experience<T, TState, TAction>>
Type Parameters
T: The numeric type used for calculations (e.g., float, double).
TState: The type representing the state observation (e.g., Vector<T>, Tensor<T>).
TAction: The type representing the action (e.g., Vector<T> for continuous actions, int for discrete actions).
Inheritance: object → Experience<T, TState, TAction>
Implements: IEquatable<Experience<T, TState, TAction>>
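Because Experience is a record, equality is value-based: two instances are equal when all of their components match (compared with EqualityComparer<T>.Default per member). A minimal sketch using value-type members; note that reference-type states such as arrays would compare by reference, not by contents:

using System;
using AiDotNet.ReinforcementLearning.ReplayBuffers;

var a = new Experience<double, int, int>(State: 0, Action: 1, Reward: 5.0, NextState: 2, Done: false);
var b = new Experience<double, int, int>(State: 0, Action: 1, Reward: 5.0, NextState: 2, Done: false);

Console.WriteLine(a == b);      // True: all components (and the default Priority) match
Console.WriteLine(a.Equals(b)); // True: the generated IEquatable implementation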
Remarks
An Experience is a fundamental data structure in reinforcement learning that captures a single interaction between an agent and its environment. It consists of five components: the current state, the action taken, the reward received, the resulting next state, and a flag indicating whether the episode has ended. This tuple is used to train reinforcement learning agents in algorithms like Q-learning, Deep Q-Networks (DQN), PPO, and many others.
For Beginners: An experience is one step of interaction with the environment. It contains everything the agent needs to learn from that step:
- State: What the situation looked like before the agent acted (like a snapshot)
- Action: What the agent decided to do
- Reward: The feedback received (positive = good, negative = bad, zero = neutral)
- NextState: What the situation looks like after the action
- Done: Whether this action ended the episode (game over, goal reached, etc.)
For example, in a maze-solving robot:
- State: Robot's current position and sensor readings
- Action: "move forward" or "turn left"
- Reward: +10 for reaching the exit, -1 for hitting a wall, 0 otherwise
- NextState: Robot's new position after the action
- Done: True if robot reached the exit or got stuck
Common Type Combinations:
- Experience<double, Vector<double>, Vector<double>> - For continuous actions (e.g., robotic control)
- Experience<double, Vector<double>, int> - For discrete actions (e.g., game playing)
- Experience<float, Tensor<float>, int> - For image-based states (e.g., Atari games)
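For illustration, a minimal sketch of one maze step using the discrete-action combination; a plain double[] stands in for the state type here, though Vector<double> or Tensor<float> from the list above would slot in the same way:

using AiDotNet.ReinforcementLearning.ReplayBuffers;

// One step of the maze robot: it moved forward, hit a wall, and the episode continues.
var step = new Experience<double, double[], int>(
    State: new[] { 1.0, 2.0 },     // position before acting
    Action: 0,                     // e.g., 0 = "move forward"
    Reward: -1.0,                  // -1 for hitting a wall
    NextState: new[] { 1.0, 2.0 }, // bounced back to the same cell
    Done: false);                  // exit not reached yet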
Constructors
Experience(TState, TAction, T, TState, bool)
Creates a new experience tuple from its five components.
public Experience(TState State, TAction Action, T Reward, TState NextState, bool Done)
Parameters
State (TState): The state observed before the agent acted.
Action (TAction): The action the agent took in that state.
Reward (T): The reward received for taking the action.
NextState (TState): The state observed after the action.
Done (bool): Whether this step ended the episode.
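Because the constructor is positional, the compiler also generates a matching Deconstruct method, and the init-only properties allow non-destructive copies via with expressions; a brief sketch:

var exp = new Experience<double, double[], int>(
    State: new[] { 0.0 }, Action: 1, Reward: 10.0, NextState: new[] { 1.0 }, Done: true);

// Unpack the five components in declaration order.
var (state, action, reward, nextState, done) = exp;

// Copy with one component changed; the original is untouched.
var notTerminal = exp with { Done = false };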
Properties
Action
public TAction Action { get; init; }
Property Value
- TAction
Done
public bool Done { get; init; }
Property Value
- bool
NextState
public TState NextState { get; init; }
Property Value
- TState
Priority
Gets or sets the priority for prioritized experience replay.
public double Priority { get; set; }
Property Value
- double
A double representing the experience's sampling priority. Default is 1.0.
Remarks
In prioritized experience replay, experiences with higher priority are sampled more frequently. The priority is typically based on the TD-error (temporal difference error), meaning experiences that surprise the agent (large prediction errors) are replayed more often.
For Beginners: Priority determines how often this experience gets picked for learning.
Think of it like highlighting important notes in a textbook:
- Higher priority = more important = reviewed more often
- Experiences where the agent made big mistakes get higher priority
- This helps the agent learn from its most surprising or educational moments
Default is 1.0 (all experiences equal). Values greater than 1.0 mean "sample this more often."
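As a sketch of how a prioritized buffer might set this value: proportional prioritization raises the absolute TD-error to an exponent alpha and adds a small epsilon so no experience becomes unsampleable. The tdError, alpha, and epsilon values below are placeholders, not part of this API; in a real agent the TD-error comes from the learner's value estimates.

using System;
using AiDotNet.ReinforcementLearning.ReplayBuffers;

var exp = new Experience<double, double[], int>(
    State: new[] { 0.0 }, Action: 1, Reward: -1.0,
    NextState: new[] { 0.5 }, Done: false);

double tdError = 2.5;   // placeholder: |TD target - predicted value|
double alpha = 0.6;     // commonly used prioritization exponent (an assumption here)
double epsilon = 1e-6;  // keeps every priority strictly positive

exp.Priority = Math.Pow(Math.Abs(tdError) + epsilon, alpha);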
Reward
public T Reward { get; init; }
Property Value
- T
State
public TState State { get; init; }
Property Value
- TState