Class BoltzmannExploration<T>
- Namespace
- AiDotNet.ReinforcementLearning.Policies.Exploration
- Assembly
- AiDotNet.dll
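Boltzmann (softmax) exploration with temperature-based action selection. The temperature parameter τ controls exploration: a higher temperature makes action selection more random, while a lower temperature concentrates probability on the highest-valued actions. Action probability: P(a) = exp(Q(a)/τ) / Σ_a' exp(Q(a')/τ)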
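The probability formula above can be illustrated with a minimal standalone sketch. It is not the library's implementation: the class name `BoltzmannSketch`, the method `Probabilities`, and the use of plain `double[]` instead of `Vector<T>` are assumptions made for illustration. Subtracting the maximum Q-value before exponentiating is a common numerical-stability step and does not change the resulting probabilities.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class BoltzmannSketch
{
    // Converts Q-values into Boltzmann (softmax) probabilities at temperature tau.
    public static double[] Probabilities(double[] qValues, double tau)
    {
        if (qValues == null || qValues.Length == 0)
            throw new ArgumentException("At least one Q-value is required.", nameof(qValues));
        if (tau <= 0.0)
            throw new ArgumentOutOfRangeException(nameof(tau), "Temperature must be positive.");

        // Shifting by the maximum leaves the ratios unchanged but avoids overflow in Math.Exp.
        double max = qValues.Max();
        double[] weights = qValues.Select(q => Math.Exp((q - max) / tau)).ToArray();
        double sum = weights.Sum();

        // Normalize so the probabilities sum to 1.
        return weights.Select(w => w / sum).ToArray();
    }
}
```

For Q-values {1, 2, 3}, a large τ yields a nearly uniform distribution, while a small τ puts almost all of the probability on the third action.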
public class BoltzmannExploration<T> : ExplorationStrategyBase<T>, IExplorationStrategy<T>
Type Parameters
T: The numeric type used for calculations.
- Inheritance
- ExplorationStrategyBase<T>
- BoltzmannExploration<T>
- Implements
- IExplorationStrategy<T>
Constructors
BoltzmannExploration(double, double, double)
Initializes a new instance of the Boltzmann exploration strategy.
public BoltzmannExploration(double temperatureStart = 1, double temperatureEnd = 0.01, double temperatureDecay = 0.995)
Parameters
temperatureStart (double): Initial temperature (default: 1.0).
temperatureEnd (double): Minimum temperature (default: 0.01).
temperatureDecay (double): Temperature decay rate per update (default: 0.995).
Properties
CurrentTemperature
Gets the current temperature value.
public double CurrentTemperature { get; }
Property Value
- double
Methods
GetExplorationAction(Vector<T>, Vector<T>, int, Random)
Applies Boltzmann (softmax) exploration to select an action.
public override Vector<T> GetExplorationAction(Vector<T> state, Vector<T> policyAction, int actionSpaceSize, Random random)
Parameters
state (Vector<T>)
policyAction (Vector<T>)
actionSpaceSize (int)
random (Random)
Returns
- Vector<T>
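Softmax action selection can be sketched as drawing an index from the Boltzmann distribution by inverse-CDF sampling. This page does not describe how `GetExplorationAction` obtains action values from `policyAction` or how it encodes the chosen action in the returned `Vector<T>`, so the example below works on a plain `double[]` of Q-values and returns an integer index. The names `BoltzmannSamplingSketch` and `SampleAction` are hypothetical.

```csharp
using System;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class BoltzmannSamplingSketch
{
    // Draws an action index with probability proportional to exp(Q(a) / tau).
    public static int SampleAction(double[] qValues, double tau, Random random)
    {
        // Find the maximum so the exponentials below cannot overflow.
        double max = double.NegativeInfinity;
        foreach (double q in qValues) max = Math.Max(max, q);

        double[] weights = new double[qValues.Length];
        double sum = 0.0;
        for (int a = 0; a < qValues.Length; a++)
        {
            weights[a] = Math.Exp((qValues[a] - max) / tau);
            sum += weights[a];
        }

        // Inverse-CDF sampling: walk the cumulative weights until they exceed the threshold.
        double threshold = random.NextDouble() * sum;
        double cumulative = 0.0;
        for (int a = 0; a < weights.Length; a++)
        {
            cumulative += weights[a];
            if (threshold < cumulative) return a;
        }
        return weights.Length - 1; // Guard against floating-point round-off.
    }
}
```

Passing a seeded `Random` makes the sampled sequence reproducible, which is useful when comparing runs at different temperatures.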
Reset()
Resets the temperature to its initial value.
public override void Reset()
Update()
Updates the temperature using exponential decay.
public override void Update()
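A temperature schedule of this kind can be sketched as follows, under two assumptions this page does not confirm: that each update multiplies the temperature by `temperatureDecay`, and that the result is clamped so it never falls below `temperatureEnd`. The class `TemperatureScheduleSketch` is hypothetical and only mirrors the constructor defaults listed above.

```csharp
using System;

// Hypothetical schedule for illustration only; not part of AiDotNet.
public sealed class TemperatureScheduleSketch
{
    private readonly double _start;
    private readonly double _end;
    private readonly double _decay;

    public double Current { get; private set; }

    public TemperatureScheduleSketch(double start = 1.0, double end = 0.01, double decay = 0.995)
    {
        _start = start;
        _end = end;
        _decay = decay;
        Current = start;
    }

    // Assumed rule: multiply by the decay rate, but never drop below the minimum.
    public void Update() => Current = Math.Max(_end, Current * _decay);

    // Restore the initial temperature.
    public void Reset() => Current = _start;
}
```

Under these assumptions and the default values, the temperature would reach its floor of 0.01 after 919 updates, since 0.995^919 is the first power of 0.995 below 0.01.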