Class UniformReplayBuffer<T, TState, TAction>
Namespace: AiDotNet.ReinforcementLearning.ReplayBuffers
Assembly: AiDotNet.dll
A replay buffer that samples experiences uniformly at random.
public class UniformReplayBuffer<T, TState, TAction> : IReplayBuffer<T, TState, TAction>
Type Parameters
T: The numeric type used for calculations (e.g., float, double).
TState: The type representing the state observation (e.g., Vector<T>, Tensor<T>).
TAction: The type representing the action (e.g., Vector<T> for continuous actions, int for discrete actions).
Inheritance: object → UniformReplayBuffer<T, TState, TAction>
Implements: IReplayBuffer<T, TState, TAction>
Remarks
This is the standard replay buffer used in algorithms like DQN. Experiences are stored in a circular buffer and sampled uniformly at random for training. All experiences have an equal probability of being selected, regardless of their importance or recency.
For Beginners: This replay buffer treats all experiences equally - it's like having a bag of memories and pulling out random ones to learn from. When the buffer is full, the oldest memories get replaced with new ones.
Key Properties:
- Uniform Sampling: Every experience has an equal chance of being picked
- Circular Buffer: Old experiences are automatically removed when capacity is reached
- No Prioritization: Unlike prioritized experience replay, it doesn't favor "important" experiences
When to Use:
- Good starting point for most RL algorithms
- Works well when all experiences are roughly equally valuable
- Simpler and faster than prioritized variants (see the usage sketch below)
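The sketch below shows the typical lifecycle. It assumes Vector<double> states, int (discrete) actions, and an Experience constructor taking (state, action, reward, nextState, done); the actual Experience<T, TState, TAction> signature may differ.
// Create the buffer once, before training starts.
var buffer = new UniformReplayBuffer<double, Vector<double>, int>(capacity: 100_000, seed: 42);

// After each environment step, store the transition. The Experience
// constructor shape here is an assumption for illustration.
buffer.Add(new Experience<double, Vector<double>, int>(state, action, reward, nextState, done));

// Train only once a full batch is available.
if (buffer.CanSample(64))
{
    var batch = buffer.Sample(64);
    // ... update the network from the batch ...
}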
Constructors
UniformReplayBuffer(int, int?)
Initializes a new instance of the UniformReplayBuffer class.
public UniformReplayBuffer(int capacity, int? seed = null)
Parameters
capacity (int): Maximum number of experiences to store.
seed (int?): Optional random seed for reproducibility.
Remarks
For Beginners: Capacity determines how many experiences the buffer remembers. Larger buffers have more diverse experiences but use more memory. Common values: 10,000 for simple problems, 100,000-1,000,000 for complex ones.
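For example, a buffer for a moderately complex task, with a fixed seed so sampling is reproducible between runs:
// 100,000-capacity buffer; the fixed seed makes Sample() draw the
// same sequence of random batches on every run.
var buffer = new UniformReplayBuffer<float, Vector<float>, int>(capacity: 100_000, seed: 1234);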
Properties
Capacity
Gets the maximum capacity of the buffer.
public int Capacity { get; }
Property Value
- int
Remarks
For Beginners: This is the maximum number of experiences the buffer can hold. Once full, old experiences are typically replaced with new ones (FIFO). Common values: 10,000 to 1,000,000 depending on available memory.
Count
Gets the current number of experiences in the buffer.
public int Count { get; }
Property Value
- int
Remarks
For Beginners: How many experiences are currently stored. Training usually starts only when the buffer has enough experiences (e.g., Count >= BatchSize).
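A common warm-up pattern is sketched below; the threshold of 1,000 is an arbitrary illustrative choice:
// Delay training until the buffer holds a reasonable variety of
// experiences, not just the bare minimum for one batch.
const int warmupSteps = 1_000;
const int batchSize = 64;
if (buffer.Count >= warmupSteps && buffer.CanSample(batchSize))
{
    var batch = buffer.Sample(batchSize);
    // ... gradient update ...
}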
Methods
Add(Experience<T, TState, TAction>)
Adds an experience to the buffer.
public void Add(Experience<T, TState, TAction> experience)
Parameters
experience (Experience<T, TState, TAction>): The experience to add.
Remarks
For Beginners: Call this after each step in the environment to store what happened for later learning. If the buffer is full, the oldest experience is typically removed to make room.
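A per-step storage sketch; env.Step and the Experience constructor shape are hypothetical stand-ins for your actual environment API and Experience signature:
// Hypothetical environment step returning (nextState, reward, done).
var (nextState, reward, done) = env.Step(action);

// Store the transition so it can be replayed during training.
buffer.Add(new Experience<double, Vector<double>, int>(
    state, action, reward, nextState, done));

state = nextState;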
CanSample(int)
Checks if the buffer has enough experiences to sample a batch.
public bool CanSample(int batchSize)
Parameters
batchSize (int): The desired batch size.
Returns
- bool
True if the buffer contains at least batchSize experiences.
Remarks
For Beginners: Check this before sampling to avoid errors. Training should wait until CanSample(batchSize) returns true.
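A minimal guard, assuming batchSize is chosen elsewhere:
// Only sample once the buffer can fill a whole batch.
if (buffer.CanSample(batchSize))
{
    var batch = buffer.Sample(batchSize);
    // ... train on the batch ...
}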
Clear()
Clears all experiences from the buffer.
public void Clear()
Remarks
For Beginners: Removes all stored experiences. Use this when starting fresh training or when the environment changes significantly.
Sample(int)
Samples a batch of experiences from the buffer.
public List<Experience<T, TState, TAction>> Sample(int batchSize)
Parameters
batchSize (int): Number of experiences to sample.
Returns
- List<Experience<T, TState, TAction>>
List of sampled experiences.
Remarks
For Beginners: Randomly selects experiences for training. Random sampling breaks temporal correlations (nearby experiences being similar) which helps the neural network learn more stable patterns.
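Each call draws an independent random batch, which is what breaks the correlation between consecutive environment steps:
// Two successive draws are independent uniform samples, so each
// batch mixes experiences from many different points in time.
var batchA = buffer.Sample(32);
var batchB = buffer.Sample(32); // almost certainly a different mix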
SampleWithIndices(int)
Samples a batch of experiences with their buffer indices.
public (List<Experience<T, TState, TAction>> Experiences, List<int> Indices) SampleWithIndices(int batchSize)
Parameters
batchSize (int): Number of experiences to sample.
Returns
- (List<Experience<T, TState, TAction>> Experiences, List<int> Indices)
A tuple containing the list of sampled experiences and their corresponding buffer indices.
Remarks
This method is useful for multi-agent scenarios where additional per-agent data is stored separately and needs to be retrieved using the buffer index.
For Beginners: Sometimes you need to know where in the buffer each sampled experience came from. This method returns both the experiences and their positions, which is useful for advanced techniques like updating priorities in prioritized replay.
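A sketch of the multi-agent pattern described above; agentData is a hypothetical side structure keyed by buffer index:
// agentData stands in for any per-experience metadata stored
// outside the buffer but indexed the same way.
var (experiences, indices) = buffer.SampleWithIndices(64);
for (int i = 0; i < experiences.Count; i++)
{
    var extra = agentData[indices[i]];
    // ... combine experiences[i] with extra during the update ...
}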