Class UniformReplayBuffer<T, TState, TAction>

Namespace
AiDotNet.ReinforcementLearning.ReplayBuffers
Assembly
AiDotNet.dll

A replay buffer that samples experiences uniformly at random.

public class UniformReplayBuffer<T, TState, TAction> : IReplayBuffer<T, TState, TAction>

Type Parameters

T

The numeric type used for calculations (e.g., float, double).

TState

The type representing the state observation (e.g., Vector<T>, Tensor<T>).

TAction

The type representing the action (e.g., Vector<T> for continuous, int for discrete).

Inheritance
object → UniformReplayBuffer<T, TState, TAction>
Implements
IReplayBuffer<T, TState, TAction>

Remarks

This is the standard replay buffer used in algorithms like DQN. Experiences are stored in a circular buffer and sampled uniformly at random for training. All experiences have an equal probability of being selected, regardless of their importance or recency.

For Beginners: This replay buffer treats all experiences equally - it's like having a bag of memories and pulling out random ones to learn from. When the buffer is full, the oldest memories get replaced with new ones.

Key Properties:

  • Uniform Sampling: Every experience has an equal chance of being picked
  • Circular Buffer: Old experiences are automatically removed when capacity is reached
  • No Prioritization: Unlike prioritized replay, it does not favor "important" experiences

When to Use:

  • Good starting point for most RL algorithms
  • Works well when all experiences are roughly equally valuable
  • Simpler and faster than prioritized variants
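
Examples

A minimal end-to-end sketch. The generic arguments (double, double[], int) are illustrative choices, and the Experience constructor shape and namespace are assumptions to verify against the actual Experience<T, TState, TAction> type:

using System;
using AiDotNet.ReinforcementLearning.ReplayBuffers;

// Numeric type double, states as double[] feature vectors, discrete int actions.
var buffer = new UniformReplayBuffer<double, double[], int>(capacity: 10_000, seed: 42);

// Assumption: Experience exposes a (state, action, reward, nextState, done) constructor.
buffer.Add(new Experience<double, double[], int>(
    new[] { 0.1, 0.2 },   // state
    1,                    // action
    1.0,                  // reward
    new[] { 0.3, 0.4 },   // next state
    false));              // episode not done

if (buffer.CanSample(32))
{
    var batch = buffer.Sample(32);     // 32 uniformly sampled experiences
    Console.WriteLine(batch.Count);    // 32
}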

Constructors

UniformReplayBuffer(int, int?)

Initializes a new instance of the UniformReplayBuffer class.

public UniformReplayBuffer(int capacity, int? seed = null)

Parameters

capacity int

Maximum number of experiences to store.

seed int?

Optional random seed for reproducibility.

Remarks

For Beginners: Capacity determines how many experiences the buffer remembers. Larger buffers have more diverse experiences but use more memory. Common values: 10,000 for simple problems, 100,000-1,000,000 for complex ones.
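
Examples

A sketch of the two common construction patterns (the generic arguments are illustrative):

using AiDotNet.ReinforcementLearning.ReplayBuffers;

// Unseeded: sampling order differs from run to run.
var buffer = new UniformReplayBuffer<double, double[], int>(capacity: 100_000);

// Seeded: the same sampling sequence on every run, useful for debugging and tests.
var reproducible = new UniformReplayBuffer<double, double[], int>(capacity: 100_000, seed: 7);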

Properties

Capacity

Gets the maximum capacity of the buffer.

public int Capacity { get; }

Property Value

int

Remarks

For Beginners: This is the maximum number of experiences the buffer can hold. Once full, the oldest experience is replaced by each new one (FIFO). Common values: 10,000 to 1,000,000 depending on available memory.
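
Examples

A small sketch showing that Capacity is fixed at construction while Count grows up to it:

var buffer = new UniformReplayBuffer<double, double[], int>(capacity: 1_000);
Console.WriteLine(buffer.Capacity);   // 1000, fixed for the buffer's lifetime
// After more than 1,000 calls to Add, Count stays pinned at 1,000
// while the oldest experiences are overwritten.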

Count

Gets the current number of experiences in the buffer.

public int Count { get; }

Property Value

int

Remarks

For Beginners: How many experiences are currently stored. Training usually starts only when the buffer has enough experiences (e.g., Count >= BatchSize).
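
Examples

A sketch of the usual warm-up gate; CollectStep is a hypothetical helper that runs one environment step and stores the result with Add:

const int batchSize = 64;
var buffer = new UniformReplayBuffer<double, double[], int>(capacity: 10_000);

// Warm-up: collect experiences until one full batch can be sampled.
while (buffer.Count < batchSize)
{
    CollectStep(buffer);   // hypothetical: one environment step + buffer.Add
}
// From here on, buffer.Sample(batchSize) is safe.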

Methods

Add(Experience<T, TState, TAction>)

Adds an experience to the buffer.

public void Add(Experience<T, TState, TAction> experience)

Parameters

experience Experience<T, TState, TAction>

The experience to add.

Remarks

For Beginners: Call this after each step in the environment to store what happened for later learning. If the buffer is full, the oldest experience is removed to make room.
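
Examples

A sketch of the per-step storage pattern. Here env, state, and SelectAction are hypothetical stand-ins for your environment loop, and the Experience constructor shape is an assumption:

// One environment step: act, observe, store.
int action = SelectAction(state);                    // hypothetical policy query
var (nextState, reward, done) = env.Step(action);    // hypothetical environment API

// Assumption: Experience takes (state, action, reward, nextState, done).
buffer.Add(new Experience<double, double[], int>(state, action, reward, nextState, done));

state = done ? env.Reset() : nextState;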

CanSample(int)

Checks if the buffer has enough experiences to sample a batch.

public bool CanSample(int batchSize)

Parameters

batchSize int

The desired batch size.

Returns

bool

True if buffer contains at least batchSize experiences.

Remarks

For Beginners: Check this before sampling to avoid errors. Training should wait until CanSample(batchSize) returns true.
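
Examples

A sketch of gating one training update; TrainOnBatch is a hypothetical stand-in for a single gradient step:

var buffer = new UniformReplayBuffer<double, double[], int>(capacity: 10_000);
const int batchSize = 32;

if (buffer.CanSample(batchSize))
{
    var batch = buffer.Sample(batchSize);
    TrainOnBatch(batch);   // hypothetical: one gradient update on the batch
}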

Clear()

Clears all experiences from the buffer.

public void Clear()

Remarks

For Beginners: Removes all stored experiences. Use this when starting fresh training or when the environment changes significantly.
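
Examples

For instance, before restarting training on a changed environment:

buffer.Clear();                      // drop all stored transitions
Console.WriteLine(buffer.Count);     // 0 - the buffer is empty again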

Sample(int)

Samples a batch of experiences from the buffer.

public List<Experience<T, TState, TAction>> Sample(int batchSize)

Parameters

batchSize int

Number of experiences to sample.

Returns

List<Experience<T, TState, TAction>>

List of sampled experiences.

Remarks

For Beginners: Randomly selects experiences for training. Random sampling breaks temporal correlations (consecutive experiences tend to be very similar), which helps the neural network learn more stable patterns.
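
Examples

A sketch of consuming a sampled batch to build DQN-style targets. The Experience field names (Reward, Done, NextState) and MaxQValue are assumptions, and 0.99 is an illustrative discount factor:

var batch = buffer.Sample(32);
foreach (var exp in batch)
{
    // Assumed field names; check the actual Experience<T, TState, TAction> API.
    double target = exp.Done
        ? exp.Reward
        : exp.Reward + 0.99 * MaxQValue(exp.NextState);   // hypothetical Q-value query
}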

SampleWithIndices(int)

Samples a batch of experiences with their buffer indices.

public (List<Experience<T, TState, TAction>> Experiences, List<int> Indices) SampleWithIndices(int batchSize)

Parameters

batchSize int

Number of experiences to sample.

Returns

(List<Experience<T, TState, TAction>> Experiences, List<int> Indices)

A tuple containing the list of sampled experiences and their corresponding buffer indices.

Remarks

This method is useful for multi-agent scenarios where additional per-agent data is stored separately and needs to be retrieved using the buffer index.

For Beginners: Sometimes you need to know where in the buffer each sampled experience came from. This method returns both the experiences and their positions, which is useful for advanced techniques like updating priorities in prioritized replay.
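
Examples

A sketch of joining sampled experiences with per-slot side data kept outside the buffer; agentIds and Process are hypothetical:

var (experiences, indices) = buffer.SampleWithIndices(32);
for (int i = 0; i < experiences.Count; i++)
{
    int slot = indices[i];
    int agentId = agentIds[slot];          // hypothetical parallel array keyed by buffer index
    Process(experiences[i], agentId);      // hypothetical per-agent handling
}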