
Class BetaPolicy<T>

Namespace
AiDotNet.ReinforcementLearning.Policies
Assembly
AiDotNet.dll

A policy that uses Beta distributions for bounded continuous action spaces. The network outputs alpha and beta parameters for each action dimension. Sampled actions are naturally bounded to [0, 1] and can be scaled to any [min, max] range.

public class BetaPolicy<T> : PolicyBase<T>, IPolicy<T>, IDisposable
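To illustrate the core idea, here is a short Python sketch (not the C# implementation) of how a Beta-distributed action is produced and rescaled. The alpha/beta values are hypothetical stand-ins for network outputs; because the Beta distribution's support is exactly [0, 1], no clipping is needed to respect action bounds, unlike a Gaussian policy.

```python
import random

# Hypothetical alpha/beta for one action dimension (stand-ins for
# network outputs). A Beta sample lives in [0, 1] by construction.
alpha, beta = 2.0, 5.0
u = random.betavariate(alpha, beta)

# Affine rescaling onto an arbitrary [actionMin, actionMax] range.
action_min, action_max = -1.0, 1.0
a = action_min + u * (action_max - action_min)
```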

Type Parameters

T

The numeric type used for calculations.

Inheritance
object → PolicyBase<T> → BetaPolicy<T>
Implements
IPolicy<T>
IDisposable

Constructors

BetaPolicy(NeuralNetwork<T>, int, IExplorationStrategy<T>, double, double, Random?)

Initializes a new instance of the BetaPolicy class.

public BetaPolicy(NeuralNetwork<T> policyNetwork, int actionSize, IExplorationStrategy<T> explorationStrategy, double actionMin = 0, double actionMax = 1, Random? random = null)

Parameters

policyNetwork NeuralNetwork<T>

Network that outputs alpha and beta parameters (2 * actionSize outputs).

actionSize int

The size of the action space.

explorationStrategy IExplorationStrategy<T>

The exploration strategy.

actionMin double

Minimum action value (default: 0.0).

actionMax double

Maximum action value (default: 1.0).

random Random?

Optional random number generator.
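The network's 2 * actionSize outputs must be mapped to valid Beta parameters. The sketch below (Python, illustrative only) shows one common parameterization; both the half-split output layout and the softplus + 1 transform, which keeps each parameter above 1 so every marginal is unimodal, are assumptions rather than documented AiDotNet behavior.

```python
import math

def softplus(x):
    return math.log1p(math.exp(x))

def split_params(raw, action_size):
    # Assumed layout: first half of the outputs -> alpha, second half -> beta.
    assert len(raw) == 2 * action_size
    alphas = [softplus(raw[i]) + 1.0 for i in range(action_size)]
    betas = [softplus(raw[action_size + i]) + 1.0 for i in range(action_size)]
    return list(zip(alphas, betas))

# Hypothetical raw network outputs for actionSize = 2.
params = split_params([0.3, -1.2, 0.8, 0.1], action_size=2)
```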

Methods

ComputeLogProb(Vector<T>, Vector<T>)

Computes the log probability of an action under the Beta distribution policy.

public override T ComputeLogProb(Vector<T> state, Vector<T> action)

Parameters

state Vector<T>

The state in which the action was taken.

action Vector<T>

The action whose log probability is computed.

Returns

T

The log probability of the action under the current policy.
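For a Beta variable rescaled from [0, 1] to [min, max], the log density picks up a -log(max - min) change-of-variables term per dimension. A Python sketch of the per-dimension computation (illustrative only, not the library's code):

```python
import math

def beta_log_prob(action, alpha, beta, a_min=0.0, a_max=1.0):
    # Normalize the action back to the Beta's native [0, 1] support.
    u = (action - a_min) / (a_max - a_min)
    # log B(alpha, beta) via log-gamma for numerical stability.
    log_B = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    return ((alpha - 1.0) * math.log(u)
            + (beta - 1.0) * math.log(1.0 - u)
            - log_B
            - math.log(a_max - a_min))  # change-of-variables term
```

For a multi-dimensional action, the per-dimension log probabilities would be summed, since the dimensions are modeled as independent Beta distributions.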

GetNetworks()

Gets the neural networks used by this policy.

public override IReadOnlyList<INeuralNetwork<T>> GetNetworks()

Returns

IReadOnlyList<INeuralNetwork<T>>

A read-only list containing the networks used by this policy.

Reset()

Resets the exploration strategy.

public override void Reset()

SelectAction(Vector<T>, bool)

Selects an action by sampling from Beta distributions.

public override Vector<T> SelectAction(Vector<T> state, bool training = true)

Parameters

state Vector<T>

The current state.

training bool

Whether the policy is in training mode; sampling behavior may differ during evaluation (default: true).

Returns

Vector<T>

The selected action vector, with each component scaled to [actionMin, actionMax].
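Sampling proceeds per dimension, and each result is rescaled to the configured range. In the sketch below (Python, illustrative), evaluation mode uses the distribution mean alpha / (alpha + beta); whether BetaPolicy acts deterministically when training is false is an assumption, not documented behavior.

```python
import random

def select_action(params, a_min, a_max, training=True):
    out = []
    for alpha, beta in params:
        if training:
            u = random.betavariate(alpha, beta)  # stochastic sample in [0, 1]
        else:
            u = alpha / (alpha + beta)           # distribution mean (assumed)
        out.append(a_min + u * (a_max - a_min))  # rescale to [a_min, a_max]
    return out

# Hypothetical per-dimension (alpha, beta) pairs, action range [-1, 1].
a = select_action([(2.0, 5.0), (3.0, 3.0)], -1.0, 1.0, training=False)
```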