Interface IReplayBuffer<T, TState, TAction>

Namespace: AiDotNet.ReinforcementLearning.ReplayBuffers
Assembly: AiDotNet.dll

Interface for experience replay buffers used in reinforcement learning.

public interface IReplayBuffer<T, TState, TAction>

Type Parameters

T

The numeric type used for calculations (e.g., float, double).

TState

The type representing the state observation (e.g., Vector<T>, Tensor<T>).

TAction

The type representing the action (e.g., Vector<T> for continuous, int for discrete).

Remarks

Experience replay is a technique where the agent stores past experiences and learns from them multiple times. This breaks temporal correlations and improves sample efficiency. The replay buffer stores experience tuples (state, action, reward, next_state, done) and provides random sampling for training.

For Beginners: A replay buffer is like a memory bank for the agent. Instead of learning only from the most recent experience, the agent stores experiences and learns from random samples of past experiences. This makes learning more stable and efficient.

Think of it like studying for an exam:

  • You don't just study the most recent lesson
  • You review random material from throughout the course
  • This helps you learn connections between different topics
  • And prevents forgetting older material

Common Buffer Types:

  • Uniform: All experiences sampled with equal probability
  • Prioritized: Important experiences (big errors) sampled more often
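The sketch below shows how an IReplayBuffer implementation is typically used inside a training loop. The UniformReplayBuffer class, the Experience constructor argument order, and the agent/env helpers are illustrative assumptions, not guarantees about this library's API.

    using AiDotNet.ReinforcementLearning.ReplayBuffers;

    // Hypothetical concrete buffer; any IReplayBuffer<T, TState, TAction> implementation works here.
    IReplayBuffer<double, Vector<double>, int> buffer =
        new UniformReplayBuffer<double, Vector<double>, int>(capacity: 100_000);

    var state = env.Reset();                      // env, agent, batchSize, totalSteps are placeholders
    for (int step = 0; step < totalSteps; step++)
    {
        var action = agent.SelectAction(state);
        var (nextState, reward, done) = env.Step(action);

        // Assumed constructor order: (state, action, reward, nextState, done).
        buffer.Add(new Experience<double, Vector<double>, int>(state, action, reward, nextState, done));

        // Train only once the buffer holds at least one full batch.
        if (buffer.CanSample(batchSize))
        {
            var batch = buffer.Sample(batchSize);
            agent.Train(batch);
        }

        state = done ? env.Reset() : nextState;
    }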

Properties

Capacity

Gets the maximum capacity of the buffer.

int Capacity { get; }

Property Value

int

Remarks

For Beginners: This is the maximum number of experiences the buffer can hold. Once full, old experiences are typically replaced with new ones in first-in, first-out (FIFO) order. Common values range from 10,000 to 1,000,000, depending on available memory.
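As a rough illustration of the FIFO behavior described above (not this library's actual implementation), a fixed-capacity buffer can be backed by an array whose write index wraps around once the buffer is full:

    using System;

    // Illustrative sketch only: one common way to get FIFO replacement at a fixed capacity.
    class RingBufferSketch<T, TState, TAction>
    {
        private readonly Experience<T, TState, TAction>[] _items;
        private int _nextIndex;

        public RingBufferSketch(int capacity) => _items = new Experience<T, TState, TAction>[capacity];

        public int Capacity => _items.Length;
        public int Count { get; private set; }

        public void Add(Experience<T, TState, TAction> experience)
        {
            _items[_nextIndex] = experience;              // overwrites the oldest slot once full
            _nextIndex = (_nextIndex + 1) % Capacity;     // wrap around
            Count = Math.Min(Count + 1, Capacity);
        }
    }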

Count

Gets the current number of experiences in the buffer.

int Count { get; }

Property Value

int

Remarks

For Beginners: How many experiences are currently stored. Training usually starts only when the buffer has enough experiences (e.g., Count >= BatchSize).
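For example, training updates are often gated on a warm-up threshold in addition to the batch size (the threshold and variable names below are illustrative):

    // Skip gradient updates until the buffer has accumulated enough experiences.
    const int warmUpSize = 1_000;                    // illustrative warm-up threshold
    if (buffer.Count >= warmUpSize && buffer.Count >= batchSize)
    {
        var batch = buffer.Sample(batchSize);
        // ... perform a training update with the batch ...
    }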

Methods

Add(Experience<T, TState, TAction>)

Adds an experience to the buffer.

void Add(Experience<T, TState, TAction> experience)

Parameters

experience Experience<T, TState, TAction>

The experience to add.

Remarks

For Beginners: Call this after each step in the environment to store what happened for later learning. If the buffer is full, the oldest experience is typically removed to make room.
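A typical call site, right after one environment step; the Experience constructor argument order is assumed from the tuple described in the interface remarks:

    // Record what just happened so it can be replayed during training later.
    var experience = new Experience<double, Vector<double>, int>(
        state, action, reward, nextState, done);
    buffer.Add(experience);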

CanSample(int)

Checks if the buffer has enough experiences to sample a batch.

bool CanSample(int batchSize)

Parameters

batchSize int

The desired batch size.

Returns

bool

True if the buffer contains at least batchSize experiences; otherwise, false.

Remarks

For Beginners: Check this before sampling to avoid errors. Training should wait until CanSample(batchSize) returns true.
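A minimal guard around sampling might look like this (buffer and batchSize are placeholders from the surrounding training code):

    // Only sample once the buffer holds at least one full batch.
    if (buffer.CanSample(batchSize))
    {
        var batch = buffer.Sample(batchSize);
        // ... train on the batch ...
    }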

Clear()

Clears all experiences from the buffer.

void Clear()

Remarks

For Beginners: Removes all stored experiences. Use this when starting fresh training or when the environment changes significantly.

Sample(int)

Samples a batch of experiences from the buffer.

List<Experience<T, TState, TAction>> Sample(int batchSize)

Parameters

batchSize int

Number of experiences to sample.

Returns

List<Experience<T, TState, TAction>>

List of sampled experiences.

Remarks

For Beginners: Randomly selects experiences for training. Random sampling breaks temporal correlations (consecutive experiences tend to be similar), which helps the neural network learn more stable patterns.
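For instance, a DQN-style update might iterate over a sampled batch as sketched below. The Experience property names (State, Action, Reward, NextState, Done) and the gamma/MaxQ symbols are assumptions for illustration; check the Experience<T, TState, TAction> type for the exact API.

    var batch = buffer.Sample(batchSize);

    foreach (var exp in batch)
    {
        // Bootstrapped TD target: reward plus discounted value of the next state (unless the episode ended).
        double target = exp.Done
            ? exp.Reward
            : exp.Reward + gamma * MaxQ(exp.NextState);   // gamma and MaxQ are placeholders

        // ... use (exp.State, exp.Action, target) to update the Q-network ...
    }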