Class ThompsonSamplingExploration<T>
- Namespace
- AiDotNet.ReinforcementLearning.Policies.Exploration
- Assembly
- AiDotNet.dll
Thompson Sampling (Bayesian) exploration for discrete action spaces. Maintains a Beta distribution for each action and selects actions by sampling from the resulting posteriors.
public class ThompsonSamplingExploration<T> : ExplorationStrategyBase<T>, IExplorationStrategy<T>
Type Parameters
T: The numeric type used for calculations.
- Inheritance
- ExplorationStrategyBase<T>
ThompsonSamplingExploration<T>
- Implements
- IExplorationStrategy<T>
- Inherited Members
Constructors
ThompsonSamplingExploration(double, double)
Initializes a new instance of the Thompson Sampling exploration strategy.
public ThompsonSamplingExploration(double priorAlpha = 1, double priorBeta = 1)
Parameters
priorAlpha (double): Prior alpha parameter for the Beta distribution (default: 1.0).
priorBeta (double): Prior beta parameter for the Beta distribution (default: 1.0).
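To make the effect of the two prior parameters concrete, the following minimal sketch computes the mean and variance of a Beta(alpha, beta) distribution. `BetaPriorSketch` is a hypothetical helper written for illustration only; it is not part of AiDotNet.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet: summarizes a Beta(alpha, beta) distribution.
public static class BetaPriorSketch
{
    // Mean of Beta(alpha, beta): alpha / (alpha + beta).
    public static double Mean(double alpha, double beta) => alpha / (alpha + beta);

    // Variance of Beta(alpha, beta): alpha * beta / ((alpha + beta)^2 * (alpha + beta + 1)).
    public static double Variance(double alpha, double beta)
    {
        double sum = alpha + beta;
        return alpha * beta / (sum * sum * (sum + 1.0));
    }
}
```

With the defaults, Beta(1, 1) is the uniform distribution on [0, 1], so `Mean(1, 1)` is 0.5 and `Variance(1, 1)` is 1/12. Larger prior values shrink the variance, which means more observations are needed before the posterior moves away from the prior.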
Methods
GetExplorationAction(Vector<T>, Vector<T>, int, Random)
Selects an action by sampling from the Beta posterior of each action.
public override Vector<T> GetExplorationAction(Vector<T> state, Vector<T> policyAction, int actionSpaceSize, Random random)
Parameters
state (Vector<T>)
policyAction (Vector<T>)
actionSpaceSize (int)
random (Random)
Returns
- Vector<T>
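The selection step can be sketched as follows: draw one sample from each action's Beta distribution and pick the action with the largest draw. This is a standalone illustration, not the AiDotNet implementation. The class `ThompsonSelectionSketch`, its method names, and the choice of sampler (a Beta draw built from two Gamma draws using the Marsaglia-Tsang method) are all assumptions. The page does not say how the chosen action is encoded in the returned Vector<T>, so the sketch simply returns an integer index.

```csharp
using System;

// Hypothetical sketch, not the AiDotNet implementation.
public static class ThompsonSelectionSketch
{
    // Standard normal draw via the Box-Muller transform.
    private static double SampleNormal(Random random)
    {
        double u1 = 1.0 - random.NextDouble(); // in (0, 1], avoids log(0)
        double u2 = random.NextDouble();
        return Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
    }

    // Gamma(shape, 1) draw using the Marsaglia-Tsang method.
    private static double SampleGamma(double shape, Random random)
    {
        if (shape < 1.0)
        {
            // Boosting step: Gamma(a) is distributed as Gamma(a + 1) * U^(1/a).
            double u = 1.0 - random.NextDouble();
            return SampleGamma(shape + 1.0, random) * Math.Pow(u, 1.0 / shape);
        }

        double d = shape - 1.0 / 3.0;
        double c = 1.0 / Math.Sqrt(9.0 * d);
        while (true)
        {
            double x = SampleNormal(random);
            double v = 1.0 + c * x;
            if (v <= 0.0) continue; // reject and redraw
            v = v * v * v;
            double u = 1.0 - random.NextDouble();
            if (Math.Log(u) < 0.5 * x * x + d - d * v + d * Math.Log(v))
                return d * v;
        }
    }

    // Beta(alpha, beta) draw as X / (X + Y) with X ~ Gamma(alpha), Y ~ Gamma(beta).
    public static double SampleBeta(double alpha, double beta, Random random)
    {
        double x = SampleGamma(alpha, random);
        double y = SampleGamma(beta, random);
        return x / (x + y);
    }

    // Draws one sample per action and returns the index of the largest draw.
    public static int SelectAction(double[] alphas, double[] betas, Random random)
    {
        int best = 0;
        double bestSample = double.NegativeInfinity;
        for (int a = 0; a < alphas.Length; a++)
        {
            double sample = SampleBeta(alphas[a], betas[a], random);
            if (sample > bestSample)
            {
                bestSample = sample;
                best = a;
            }
        }
        return best;
    }
}
```

Because the choice depends on random draws rather than posterior means, an action with few observations (a wide posterior) still wins occasionally. That is how Thompson Sampling balances exploration against exploitation.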
Reset()
Resets all action distributions to the prior.
public override void Reset()
Update()
Updates internal parameters (call UpdateDistribution separately for each action).
public override void Update()
UpdateDistribution(int, double)
Updates the Beta distribution for a specific action based on reward.
public void UpdateDistribution(int actionIndex, double reward)
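The page does not state how a reward is turned into new Beta parameters. One common convention, shown here purely as an assumption, clamps the reward to [0, 1] and treats it as a fractional success: alpha grows by the reward and beta by one minus the reward. `BetaPosteriorSketch` and its members are hypothetical names; the sketch also shows what resetting to the prior amounts to under this convention.

```csharp
using System;

// Hypothetical sketch; the update rule below is an assumption, not AiDotNet's documented behavior.
public sealed class BetaPosteriorSketch
{
    private readonly double _priorAlpha;
    private readonly double _priorBeta;

    public double[] Alphas { get; }
    public double[] Betas { get; }

    public BetaPosteriorSketch(int actionCount, double priorAlpha = 1.0, double priorBeta = 1.0)
    {
        _priorAlpha = priorAlpha;
        _priorBeta = priorBeta;
        Alphas = new double[actionCount];
        Betas = new double[actionCount];
        Reset();
    }

    // Assumed rule: clamp the reward to [0, 1] and count it as a fractional success.
    public void UpdateDistribution(int actionIndex, double reward)
    {
        double r = Math.Clamp(reward, 0.0, 1.0);
        Alphas[actionIndex] += r;       // evidence of success
        Betas[actionIndex] += 1.0 - r;  // evidence of failure
    }

    // Restores every action to Beta(priorAlpha, priorBeta).
    public void Reset()
    {
        Array.Fill(Alphas, _priorAlpha);
        Array.Fill(Betas, _priorBeta);
    }
}
```

Under this rule and the default prior, a reward of 1 for action 0 moves it to Beta(2, 1), and a reward of 0 for action 1 moves it to Beta(1, 2). Rewards outside [0, 1] would need to be rescaled before the update for the clamp not to discard information.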