Interface IReplayBuffer<T, TState, TAction>
Namespace: AiDotNet.ReinforcementLearning.ReplayBuffers
Assembly: AiDotNet.dll
Interface for experience replay buffers used in reinforcement learning.
public interface IReplayBuffer<T, TState, TAction>
Type Parameters
T: The numeric type used for calculations (e.g., float, double).
TState: The type representing the state observation (e.g., Vector<T>, Tensor<T>).
TAction: The type representing the action (e.g., Vector<T> for continuous, int for discrete).
Remarks
Experience replay is a technique where the agent stores past experiences and learns from them multiple times. This breaks temporal correlations and improves sample efficiency. The replay buffer stores experience tuples (state, action, reward, next_state, done) and provides random sampling for training.
For Beginners: A replay buffer is like a memory bank for the agent. Instead of learning only from the most recent experience, the agent stores experiences and learns from random samples of past experiences. This makes learning more stable and efficient.
Think of it like studying for an exam:
- You don't just study the most recent lesson
- You review random material from throughout the course
- This helps you learn connections between different topics
- And prevents forgetting older material
Common Buffer Types:
- Uniform: All experiences are sampled with equal probability
- Prioritized: Important experiences (those with large errors) are sampled more often
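The sketch below shows how these members typically fit together in a training loop. It assumes a hypothetical UniformReplayBuffer implementation, an Experience constructor taking (state, action, reward, nextState, done), and placeholder env and agent objects; only Add, CanSample, Sample, Count, and Capacity come from this interface.

// Hypothetical concrete buffer; the constructor and its capacity parameter are assumptions.
IReplayBuffer<double, Vector<double>, int> buffer =
    new UniformReplayBuffer<double, Vector<double>, int>(capacity: 100_000);

const int batchSize = 64;
var state = env.Reset(); // 'env' and 'agent' are placeholders for your own environment and agent

for (int step = 0; step < 50_000; step++)
{
    int action = agent.SelectAction(state);
    var (nextState, reward, done) = env.Step(action);

    // Store the transition so it can be replayed many times later.
    buffer.Add(new Experience<double, Vector<double>, int>(state, action, reward, nextState, done));

    // Train only once enough experiences have accumulated.
    if (buffer.CanSample(batchSize))
    {
        var batch = buffer.Sample(batchSize);
        agent.Train(batch);
    }

    state = done ? env.Reset() : nextState;
}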
Properties
Capacity
Gets the maximum capacity of the buffer.
int Capacity { get; }
Property Value
- int
Remarks
For Beginners: This is the maximum number of experiences the buffer can hold. Once full, the oldest experiences are typically replaced by new ones (FIFO). Common values range from 10,000 to 1,000,000, depending on available memory.
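A minimal sketch of the FIFO behavior, assuming a buffer created with a capacity of 10,000 and a hypothetical CreateExperience helper:

// Count grows until it reaches Capacity; after that, each Add
// typically evicts the oldest stored experience (FIFO).
for (int i = 0; i < 15_000; i++)
{
    buffer.Add(CreateExperience(i)); // CreateExperience is a hypothetical helper
}
Console.WriteLine(buffer.Count); // prints 10000, i.e. buffer.Capacity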
Count
Gets the current number of experiences in the buffer.
int Count { get; }
Property Value
- int
Remarks
For Beginners: How many experiences are currently stored. Training usually starts only when the buffer has enough experiences (e.g., Count >= BatchSize).
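A minimal sketch of gating training on Count (the CanSample method below expresses the same check):

const int batchSize = 64;
if (buffer.Count >= batchSize)
{
    var batch = buffer.Sample(batchSize);
    // ... run one training update on the batch
}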
Methods
Add(Experience<T, TState, TAction>)
Adds an experience to the buffer.
void Add(Experience<T, TState, TAction> experience)
Parameters
experience (Experience<T, TState, TAction>): The experience to add.
Remarks
For Beginners: Call this after each step in the environment to store what happened for later learning. If the buffer is full, the oldest experience is typically removed to make room.
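A minimal sketch of storing one transition after an environment step; the Experience constructor signature shown here is an assumption:

// After taking one step in the environment:
var experience = new Experience<double, Vector<double>, int>(
    state, action, reward, nextState, done);
buffer.Add(experience); // if the buffer is full, the oldest entry is typically evicted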
CanSample(int)
Checks if the buffer has enough experiences to sample a batch.
bool CanSample(int batchSize)
Parameters
batchSize (int): The desired batch size.
Returns
- bool
True if the buffer contains at least batchSize experiences.
Remarks
For Beginners: Check this before sampling to avoid errors. Training should wait until CanSample(batchSize) returns true.
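A minimal sketch of guarding the sampling call:

// Keep collecting experience until a full batch can be drawn.
if (buffer.CanSample(batchSize))
{
    var batch = buffer.Sample(batchSize);
    // ... perform one gradient update using the batch
}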
Clear()
Clears all experiences from the buffer.
void Clear()
Remarks
For Beginners: Removes all stored experiences. Use this when starting fresh training or when the environment changes significantly.
Sample(int)
Samples a batch of experiences from the buffer.
List<Experience<T, TState, TAction>> Sample(int batchSize)
Parameters
batchSize (int): Number of experiences to sample.
Returns
- List<Experience<T, TState, TAction>>
List of sampled experiences.
Remarks
For Beginners: Randomly selects experiences for training. Random sampling breaks temporal correlations (nearby experiences being similar), which helps the neural network learn more stable patterns.
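A minimal sketch of consuming a sampled batch; the experience field names (State, Action, Reward, NextState, Done) are assumed to mirror the tuple described in the Remarks above:

List<Experience<double, Vector<double>, int>> batch = buffer.Sample(64);
foreach (var exp in batch)
{
    // e.g., build a TD target from exp.Reward, exp.NextState and exp.Done,
    // then fit the value network on exp.State and exp.Action
}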