Class BoltzmannExploration<T>

Namespace
AiDotNet.ReinforcementLearning.Policies.Exploration
Assembly
AiDotNet.dll

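Boltzmann (softmax) exploration with temperature-based action selection. The temperature parameter τ controls how much the strategy explores: a higher temperature makes action selection more random, and a lower temperature makes it more greedy. Each action a is selected with probability P(a) = exp(Q(a)/τ) / Σ_a' exp(Q(a')/τ), where Q(a) is the value estimate for action a and the sum runs over all actions a'.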
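The probability formula above can be sketched in a few lines of standalone C#. This is an illustration only, not the library's implementation: the class name `BoltzmannProbabilitySketch`, the method `Probabilities`, and the use of plain `double[]` in place of `Vector<T>` are all assumptions made for the example. Subtracting the largest value before exponentiating is a common numerical safeguard; whether AiDotNet does the same is not stated here.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of AiDotNet.
public static class BoltzmannProbabilitySketch
{
    // Returns P(a) = exp(Q(a)/tau) / sum over a' of exp(Q(a')/tau).
    public static double[] Probabilities(double[] qValues, double temperature)
    {
        if (qValues.Length == 0)
            throw new ArgumentException("At least one action value is required.", nameof(qValues));
        if (temperature <= 0)
            throw new ArgumentOutOfRangeException(nameof(temperature), "Temperature must be positive.");

        // Subtracting the maximum leaves the probabilities unchanged
        // but keeps Math.Exp from overflowing at low temperatures.
        double max = qValues.Max();
        double[] weights = qValues.Select(q => Math.Exp((q - max) / temperature)).ToArray();
        double sum = weights.Sum();
        return weights.Select(w => w / sum).ToArray();
    }
}
```

For Q = [1, 2] and τ = 1, this gives probabilities of about 0.269 and 0.731. At τ = 0.01 the same inputs put almost all of the probability on the second action, which is the near-greedy behavior a low temperature is meant to produce.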

public class BoltzmannExploration<T> : ExplorationStrategyBase<T>, IExplorationStrategy<T>

Type Parameters

T

The numeric type used for calculations.

Inheritance
ExplorationStrategyBase<T>
BoltzmannExploration<T>
Implements
IExplorationStrategy<T>

Constructors

BoltzmannExploration(double, double, double)

Initializes a new instance of the Boltzmann exploration strategy.

public BoltzmannExploration(double temperatureStart = 1, double temperatureEnd = 0.01, double temperatureDecay = 0.995)

Parameters

temperatureStart double

Initial temperature (default: 1.0).

temperatureEnd double

Minimum temperature (default: 0.01).

temperatureDecay double

Temperature decay rate per update (default: 0.995).

Properties

CurrentTemperature

Gets the current temperature value.

public double CurrentTemperature { get; }

Property Value

double

Methods

GetExplorationAction(Vector<T>, Vector<T>, int, Random)

Applies Boltzmann (softmax) exploration to select an action.

public override Vector<T> GetExplorationAction(Vector<T> state, Vector<T> policyAction, int actionSpaceSize, Random random)

Parameters

state Vector<T>
policyAction Vector<T>
actionSpaceSize int
random Random

Returns

Vector<T>
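Once the softmax probabilities are known, selecting an action amounts to sampling from a discrete distribution. The sketch below shows one common way to do that with a `Random` instance, by walking the cumulative sum. It is a standalone illustration under assumptions: `BoltzmannSamplingSketch` and `SampleAction` are hypothetical names, the probabilities are passed in as a `double[]`, and the result is a plain action index. How the library encodes the chosen action in the returned `Vector<T>` is not described on this page.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class BoltzmannSamplingSketch
{
    // Draws an action index from a discrete distribution
    // by walking its cumulative sum.
    public static int SampleAction(double[] probabilities, Random random)
    {
        double threshold = random.NextDouble(); // uniform in [0, 1)
        double cumulative = 0.0;
        for (int i = 0; i < probabilities.Length; i++)
        {
            cumulative += probabilities[i];
            if (threshold < cumulative)
                return i;
        }

        // Rounding can leave the cumulative sum slightly below 1;
        // fall back to the last action in that case.
        return probabilities.Length - 1;
    }
}
```

Passing a seeded `Random` (for example `new Random(42)`) makes the sequence of sampled actions reproducible, which helps when debugging an agent.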

Reset()

Resets the temperature to its initial value.

public override void Reset()

Update()

Updates the temperature using exponential decay.

public override void Update()
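The following sketch shows one plausible reading of the temperature schedule implied by the constructor parameters and the `Update()` and `Reset()` descriptions: each update multiplies the temperature by the decay rate and clamps it at the minimum. The exact update rule is an assumption, as are the class name `TemperatureScheduleSketch` and its `Current` property; the library's actual formula may differ.

```csharp
using System;

// Hypothetical stand-in for the temperature bookkeeping; not part of AiDotNet.
public sealed class TemperatureScheduleSketch
{
    private readonly double _start;
    private readonly double _end;
    private readonly double _decay;

    public double Current { get; private set; }

    public TemperatureScheduleSketch(double start = 1.0, double end = 0.01, double decay = 0.995)
    {
        _start = start;
        _end = end;
        _decay = decay;
        Current = start;
    }

    // Assumed rule: multiply by the decay rate, never dropping below the minimum.
    public void Update() => Current = Math.Max(_end, Current * _decay);

    // Restores the initial temperature.
    public void Reset() => Current = _start;
}
```

Under this assumed rule and the default parameters, the temperature would reach its floor of 0.01 after roughly 919 updates, since 0.995^919 is just below 0.01. A decay rate closer to 1 would keep the strategy exploring for longer.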