
Class DeterministicPolicy<T>

Namespace
AiDotNet.ReinforcementLearning.Policies
Assembly
AiDotNet.dll

Deterministic policy for continuous action spaces. Directly outputs actions without sampling from a distribution. Commonly used in DDPG, TD3, and other deterministic policy gradient methods.

public class DeterministicPolicy<T> : PolicyBase<T>, IPolicy<T>, IDisposable

Type Parameters

T

The numeric type used for calculations.

Inheritance
PolicyBase<T>
DeterministicPolicy<T>
Implements
IPolicy<T>
IDisposable

Constructors

DeterministicPolicy(NeuralNetwork<T>, int, IExplorationStrategy<T>, bool, Random?)

Initializes a new instance of the DeterministicPolicy class.

public DeterministicPolicy(NeuralNetwork<T> policyNetwork, int actionSize, IExplorationStrategy<T> explorationStrategy, bool useTanhSquashing = true, Random? random = null)

Parameters

policyNetwork NeuralNetwork<T>

The neural network that outputs actions.

actionSize int

The size of the action space, i.e. the number of components in each action vector.

explorationStrategy IExplorationStrategy<T>

The exploration strategy for training.

useTanhSquashing bool

Whether to apply tanh squashing to bound actions to [-1, 1].

random Random?

Optional random number generator.
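As a rough illustration of what useTanhSquashing implies, the sketch below applies tanh element-wise to a raw output vector so that every component lies in [-1, 1]. It is a standalone example using plain double arrays, not the library's implementation: rawOutput, low, high, and the rescaling step are hypothetical and are not part of DeterministicPolicy<T>.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the raw (unbounded) output of a policy network.
double[] rawOutput = { -3.0, 0.0, 0.5, 10.0 };

// Tanh squashing: apply tanh to each component so it falls in [-1, 1].
double[] squashed = rawOutput.Select(x => Math.Tanh(x)).ToArray();

// Assumption: an environment with action bounds [low, high] needs the
// squashed action rescaled linearly from [-1, 1] to those bounds.
double low = -2.0, high = 2.0;
double[] scaled = squashed.Select(y => low + (y + 1.0) / 2.0 * (high - low)).ToArray();

Console.WriteLine("squashed: " + string.Join(", ", squashed));
Console.WriteLine("scaled:   " + string.Join(", ", scaled));
```

Large raw values saturate near the bounds rather than being clipped abruptly, which keeps the mapping differentiable everywhere.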

Methods

ComputeLogProb(Vector<T>, Vector<T>)

Computes the log probability of an action under a deterministic policy. Always returns a constant (zero), because a deterministic policy corresponds to a delta distribution concentrated on a single action.

public override T ComputeLogProb(Vector<T> state, Vector<T> action)

Parameters

state Vector<T>
action Vector<T>

Returns

T

A constant zero.

Dispose(bool)

Disposes of policy resources.

protected override void Dispose(bool disposing)

Parameters

disposing bool

GetNetworks()

Gets the neural networks used by this policy.

public override IReadOnlyList<INeuralNetwork<T>> GetNetworks()

Returns

IReadOnlyList<INeuralNetwork<T>>

Reset()

Resets the exploration strategy.

public override void Reset()

SelectAction(Vector<T>, bool)

Selects a deterministic action from the policy network.

public override Vector<T> SelectAction(Vector<T> state, bool training = true)

Parameters

state Vector<T>
training bool

Returns

Vector<T>
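The following standalone sketch illustrates the general pattern behind a deterministic policy's action selection: the same state always maps to the same action, and noise is added only while training. It is written under stated assumptions and does not use the AiDotNet API. PolicyMean, its weights, the Gaussian noise, and noiseStd are all hypothetical; the actual class delegates exploration to the supplied IExplorationStrategy<T>, whose behavior is not shown on this page.

```csharp
using System;

// Hypothetical stand-in for a policy network: a fixed linear map followed by tanh.
double[] PolicyMean(double[] state)
{
    double[] weights = { 0.5, -0.25 };
    double sum = 0.0;
    for (int i = 0; i < state.Length && i < weights.Length; i++)
        sum += weights[i] * state[i];
    return new[] { Math.Tanh(sum) };
}

// Sketch of deterministic action selection with optional exploration noise.
double[] SelectAction(double[] state, bool training, Random rng, double noiseStd)
{
    double[] action = PolicyMean(state);
    if (!training)
        return action; // evaluation: purely deterministic output

    for (int i = 0; i < action.Length; i++)
    {
        // Box-Muller transform: one standard normal sample from two uniforms.
        double u1 = 1.0 - rng.NextDouble(); // in (0, 1], avoids Log(0)
        double u2 = rng.NextDouble();
        double gauss = Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);

        // Assumption: noisy actions are clamped back into [-1, 1].
        action[i] = Math.Clamp(action[i] + noiseStd * gauss, -1.0, 1.0);
    }
    return action;
}

var rng = new Random(42);
double[] observation = { 1.0, 0.5 };

double[] greedy = SelectAction(observation, false, rng, 0.1);
double[] exploring = SelectAction(observation, true, rng, 0.1);

Console.WriteLine("evaluation action: " + greedy[0]);
Console.WriteLine("training action:   " + exploring[0]);
```

Calling the sketch with training set to false is repeatable for a given state, which is the behavior one would typically want when evaluating a trained agent.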