Class MultiModalPolicy<T>

Namespace: AiDotNet.ReinforcementLearning.Policies

Assembly: AiDotNet.dll

A multi-modal policy that uses a mixture of Gaussians to represent complex action distributions.

public class MultiModalPolicy<T> : PolicyBase<T>, IPolicy<T>, IDisposable
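For orientation, a mixture-of-Gaussians policy is usually written as the density below. This is the standard textbook form, not something taken from the AiDotNet source: the symbols are illustrative, K is assumed to correspond to numComponents, and the page does not say whether the covariances are diagonal or full.

```latex
\pi(a \mid s) \;=\; \sum_{k=1}^{K} w_k(s)\, \mathcal{N}\!\bigl(a;\ \mu_k(s),\ \Sigma_k(s)\bigr),
\qquad w_k(s) \ge 0,\quad \sum_{k=1}^{K} w_k(s) = 1
```

Because the density is a weighted sum of several Gaussians, it can place probability mass on several distinct actions at once, which a single Gaussian cannot do.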

Type Parameters

T

Inheritance: object → PolicyBase<T> → MultiModalPolicy<T>

Implements: IPolicy<T>, IDisposable

Inherited Members: PolicyBase<T>.NumOps

PolicyBase<T>._random

PolicyBase<T>._disposed

PolicyBase<T>.ValidateActionSize(int, int, string)

PolicyBase<T>.ValidateState(Vector<T>, string)

PolicyBase<T>.Dispose(bool)

PolicyBase<T>.Dispose()

object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Constructors

MultiModalPolicy(NeuralNetwork<T>, int, int, IExplorationStrategy<T>, Random?)

public MultiModalPolicy(NeuralNetwork<T> policyNetwork, int actionSize, int numComponents, IExplorationStrategy<T> explorationStrategy, Random? random = null)

Parameters

policyNetwork NeuralNetwork<T>
actionSize int
numComponents int
explorationStrategy IExplorationStrategy<T>
random Random?: Optional; defaults to null.

Methods

ComputeLogProb(Vector<T>, Vector<T>)

Computes the log probability of a given action in a given state. Used by policy gradient methods (PPO, A2C, etc.).

public override T ComputeLogProb(Vector<T> state, Vector<T> action)

Parameters

state Vector<T>: The state observation.
action Vector<T>: The action taken.

Returns

T: The log probability of the action.
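As a rough illustration of what a log probability under a Gaussian mixture involves, the sketch below evaluates it on plain double arrays. It is not the AiDotNet implementation: the class name GaussianMixtureLogProb, the diagonal-covariance assumption, and the weights/means/stdDevs layout are all hypothetical choices made for this example. It uses the log-sum-exp trick so that components with very small densities do not underflow.

```csharp
using System;

// Illustrative only: a hypothetical helper, not part of AiDotNet.
public static class GaussianMixtureLogProb
{
    // weights[k]    : mixture weight of component k (non-negative, summing to 1).
    // means[k][d]   : mean of component k in action dimension d.
    // stdDevs[k][d] : standard deviation of component k in dimension d
    //                 (diagonal covariance is an assumption of this sketch).
    public static double Compute(
        double[] action, double[] weights, double[][] means, double[][] stdDevs)
    {
        int numComponents = weights.Length;
        var terms = new double[numComponents];
        double maxTerm = double.NegativeInfinity;

        for (int k = 0; k < numComponents; k++)
        {
            // log w_k + log N(action | mean_k, diag(stdDev_k^2))
            double term = Math.Log(weights[k]);
            for (int d = 0; d < action.Length; d++)
            {
                double z = (action[d] - means[k][d]) / stdDevs[k][d];
                term += -0.5 * z * z
                        - Math.Log(stdDevs[k][d])
                        - 0.5 * Math.Log(2.0 * Math.PI);
            }
            terms[k] = term;
            maxTerm = Math.Max(maxTerm, term);
        }

        // Every component has zero density at this action.
        if (double.IsNegativeInfinity(maxTerm))
        {
            return double.NegativeInfinity;
        }

        // Log-sum-exp: log(sum_k exp(term_k)) computed relative to the largest term.
        double sum = 0.0;
        for (int k = 0; k < numComponents; k++)
        {
            sum += Math.Exp(terms[k] - maxTerm);
        }
        return maxTerm + Math.Log(sum);
    }
}
```

With a single unit-variance component centred on the action, the result reduces to the familiar -0.5 * ln(2π) per dimension, which is a convenient sanity check.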

GetNetworks()

Gets the neural networks used by this policy.

public override IReadOnlyList<INeuralNetwork<T>> GetNetworks()

Returns

IReadOnlyList<INeuralNetwork<T>>: A read-only list of neural networks.

Reset()

Resets any internal state (e.g., for recurrent policies, exploration noise).

public override void Reset()

SelectAction(Vector<T>, bool)

Selects an action given the current state.

public override Vector<T> SelectAction(Vector<T> state, bool training = true)

Parameters

state Vector<T>: The current state observation.
training bool: Whether the agent is training (enables exploration).

Returns

Vector<T>: The selected action vector.
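Drawing an action from a Gaussian mixture is commonly done in two steps: pick a component according to the mixture weights, then sample from that component's Gaussian. The sketch below shows that idea on plain double arrays. It is an assumption-laden illustration, not the library's code: GaussianMixtureSampler and its parameters are hypothetical names, the covariance is assumed diagonal, and the sketch does not model the exploration strategy or the training flag.

```csharp
using System;

// Illustrative only: a hypothetical helper, not part of AiDotNet.
public static class GaussianMixtureSampler
{
    // weights[k]    : mixture weight of component k (non-negative, summing to 1).
    // means[k][d]   : mean of component k in action dimension d.
    // stdDevs[k][d] : standard deviation of component k in dimension d.
    public static double[] Sample(
        double[] weights, double[][] means, double[][] stdDevs, Random random)
    {
        // Step 1: choose a component by walking the cumulative weights.
        double u = random.NextDouble();   // uniform in [0, 1)
        int chosen = weights.Length - 1;  // fallback if rounding leaves the sum just below u
        double cumulative = 0.0;
        for (int k = 0; k < weights.Length; k++)
        {
            cumulative += weights[k];
            if (u < cumulative)
            {
                chosen = k;
                break;
            }
        }

        // Step 2: sample each action dimension from the chosen Gaussian.
        var action = new double[means[chosen].Length];
        for (int d = 0; d < action.Length; d++)
        {
            action[d] = means[chosen][d] + stdDevs[chosen][d] * NextStandardNormal(random);
        }
        return action;
    }

    // Box-Muller transform: turns two uniform draws into one standard normal draw.
    private static double NextStandardNormal(Random random)
    {
        double u1 = 1.0 - random.NextDouble(); // in (0, 1], so Math.Log(u1) is finite
        double u2 = random.NextDouble();
        return Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
    }
}
```

Passing a seeded Random makes the draws reproducible, which is presumably the reason the constructor accepts an optional Random instance.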
