Class BetaPolicy<T>
- Namespace
- AiDotNet.ReinforcementLearning.Policies
- Assembly
- AiDotNet.dll
Policy that uses a Beta distribution for bounded continuous action spaces. The network outputs alpha and beta parameters for each action dimension. Sampled actions are naturally bounded to [0, 1] and can be scaled to any [min, max] range.
public class BetaPolicy<T> : PolicyBase<T>, IPolicy<T>, IDisposable
Type Parameters
- T: The numeric type used for calculations.
- Inheritance
- PolicyBase<T> → BetaPolicy<T>
- Implements
- IPolicy<T>
Constructors
BetaPolicy(NeuralNetwork<T>, int, IExplorationStrategy<T>, double, double, Random?)
Initializes a new instance of the BetaPolicy class.
public BetaPolicy(NeuralNetwork<T> policyNetwork, int actionSize, IExplorationStrategy<T> explorationStrategy, double actionMin = 0, double actionMax = 1, Random? random = null)
Parameters
- policyNetwork (NeuralNetwork<T>): Network that outputs alpha and beta parameters (2 * actionSize outputs).
- actionSize (int): The size of the action space.
- explorationStrategy (IExplorationStrategy<T>): The exploration strategy.
- actionMin (double): Minimum action value (default: 0.0).
- actionMax (double): Maximum action value (default: 1.0).
- random (Random?): Optional random number generator.
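Example (a minimal sketch, not taken from the library's documentation): the helper below constructs a policy for a two-dimensional action space rescaled to [-1, 1]. The policyNetwork and explorationStrategy arguments are assumed to be built elsewhere, the network's output layer must emit 2 * actionSize = 4 values (an alpha and a beta per dimension), and the class and method names are hypothetical; using directives for NeuralNetwork<T> and IExplorationStrategy<T> are omitted because their namespaces are not shown on this page.

using AiDotNet.ReinforcementLearning.Policies;

// Sketch only: assumes policyNetwork and explorationStrategy already exist.
public static class BetaPolicyExamples
{
    public static BetaPolicy<double> CreateBoundedPolicy(
        NeuralNetwork<double> policyNetwork,
        IExplorationStrategy<double> explorationStrategy)
    {
        // Samples are drawn in [0, 1] internally and rescaled to [-1, 1] by the policy.
        return new BetaPolicy<double>(
            policyNetwork,
            actionSize: 2,
            explorationStrategy: explorationStrategy,
            actionMin: -1.0,
            actionMax: 1.0);
    }
}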
Methods
ComputeLogProb(Vector<T>, Vector<T>)
Computes the log probability of an action under the Beta distribution policy.
public override T ComputeLogProb(Vector<T> state, Vector<T> action)
Parameters
- state (Vector<T>)
- action (Vector<T>)
Returns
- T: The log probability of the action under the policy.
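For reference, the standard Beta log-density of a single normalized action component x in (0, 1) is given below. This is the textbook form only; whether the implementation adds a change-of-variables (Jacobian) term for rescaling from [0, 1] to [actionMin, actionMax], or simply sums independent per-dimension terms, is an assumption not confirmed by this page.

\log p(x \mid \alpha, \beta) = (\alpha - 1)\ln x + (\beta - 1)\ln(1 - x) - \ln B(\alpha, \beta), \qquad B(\alpha, \beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha + \beta)}

For a multi-dimensional action with independent Beta components, the per-dimension log-density terms are summed.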
GetNetworks()
Gets the neural networks used by this policy.
public override IReadOnlyList<INeuralNetwork<T>> GetNetworks()
Returns
- IReadOnlyList<INeuralNetwork<T>>: The neural networks used by this policy.
Reset()
Resets the exploration strategy.
public override void Reset()
SelectAction(Vector<T>, bool)
Selects an action by sampling from Beta distributions.
public override Vector<T> SelectAction(Vector<T> state, bool training = true)
Parameters
- state (Vector<T>)
- training (bool)
Returns
- Vector<T>: The selected action.
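Usage sketch (same assumptions as the constructor example: the policy was created via the constructor above, state is a Vector<double> built by caller code, and the Act helper name is hypothetical). The comments describe the usual Beta-policy scheme, i.e. sample x ~ Beta(alpha, beta) per dimension and rescale to actionMin + x * (actionMax - actionMin); the exact internals of SelectAction are not documented on this page.

// Sketch only: selects an action and scores it, e.g. for a policy-gradient update.
public static (Vector<double> Action, double LogProb) Act(
    BetaPolicy<double> policy, Vector<double> state)
{
    // With training: true the action is sampled from the per-dimension Beta
    // distributions; conceptually x ~ Beta(alpha, beta), rescaled to
    // actionMin + x * (actionMax - actionMin).
    Vector<double> action = policy.SelectAction(state, training: true);

    // Log probability of that action under the current policy.
    double logProb = policy.ComputeLogProb(state, action);

    return (action, logProb);
}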