
Class MixedPolicy<T>

Namespace
AiDotNet.ReinforcementLearning.Policies
Assembly
AiDotNet.dll

Policy for environments with both discrete and continuous action spaces. Outputs a categorical distribution over the discrete actions and a Gaussian distribution over the continuous actions. Common in robotics, where an agent selects a discrete operating mode and controls continuous parameters.

public class MixedPolicy<T> : PolicyBase<T>, IPolicy<T>, IDisposable

Type Parameters

T

The numeric type used for calculations.

Inheritance
object
PolicyBase<T>
MixedPolicy<T>

Implements
IPolicy<T>
IDisposable
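Examples

A minimal usage sketch. Only the MixedPolicy<T> members documented on this page are taken from the source; the discreteNetwork, continuousNetwork, discreteExploration, continuousExploration, and state variables are assumed to have been built elsewhere with the appropriate AiDotNet types, and their construction is not shown here.

// Sketch only: discreteNetwork/continuousNetwork are NeuralNetwork<double> instances,
// discreteExploration/continuousExploration implement IExplorationStrategy<double>,
// and state is a Vector<double> observation. How they are built is assumed.
var policy = new MixedPolicy<double>(
    discreteNetwork,                               // emits logits for the discrete actions
    continuousNetwork,                             // emits mean and log_std for the continuous actions
    discreteActionSize: 3,
    continuousActionSize: 2,
    discreteExploration: discreteExploration,
    continuousExploration: continuousExploration,
    sharedFeatures: false);

// Training-time action with exploration applied; the vector layout is
// [discrete_action, continuous_actions].
Vector<double> action = policy.SelectAction(state, training: true);

// Log probability of that mixed action under the current policy.
double logProb = policy.ComputeLogProb(state, action);

// Reset both exploration strategies between runs.
policy.Reset();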

Constructors

MixedPolicy(NeuralNetwork<T>, NeuralNetwork<T>, int, int, IExplorationStrategy<T>, IExplorationStrategy<T>, bool, Random?)

Initializes a new instance of the MixedPolicy class.

public MixedPolicy(NeuralNetwork<T> discreteNetwork, NeuralNetwork<T> continuousNetwork, int discreteActionSize, int continuousActionSize, IExplorationStrategy<T> discreteExploration, IExplorationStrategy<T> continuousExploration, bool sharedFeatures = false, Random? random = null)

Parameters

discreteNetwork NeuralNetwork<T>

Network for discrete action logits.

continuousNetwork NeuralNetwork<T>

Network for continuous action parameters (mean and log_std).

discreteActionSize int

Number of discrete actions.

continuousActionSize int

Number of continuous action dimensions.

discreteExploration IExplorationStrategy<T>

Exploration strategy for discrete actions.

continuousExploration IExplorationStrategy<T>

Exploration strategy for continuous actions.

sharedFeatures bool

Whether networks share feature extraction layers.

random Random

Optional random number generator.
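Remarks

Based on the parameter descriptions above, the discrete network is expected to emit one logit per discrete action (output size = discreteActionSize), and the continuous network is expected to emit a mean and a log_std for each continuous dimension (presumably output size = 2 × continuousActionSize). The exact output layout is an assumption; this page does not specify it.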

Methods

ComputeLogProb(Vector<T>, Vector<T>)

Computes the log probability of a mixed action in a given state.

public override T ComputeLogProb(Vector<T> state, Vector<T> action)

Parameters

state Vector<T>
action Vector<T>

Returns

T
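
Remarks

For a mixed action a = [a_d, a_c], the log probability is typically the sum of the two components' log probabilities, assuming the discrete and continuous parts are conditionally independent given the state (an assumption; the exact factorization is not documented on this page):

log π(a | s) = log π_discrete(a_d | s) + Σᵢ log N(a_c,i | μᵢ(s), σᵢ(s)²)

where π_discrete is the categorical distribution from the discrete network and μᵢ, σᵢ are the per-dimension mean and standard deviation from the continuous network.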

GetNetworks()

Gets the neural networks used by this policy.

public override IReadOnlyList<INeuralNetwork<T>> GetNetworks()

Returns

IReadOnlyList<INeuralNetwork<T>>
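
Examples

A short sketch of inspecting the returned networks, for example when saving weights or counting parameters. The list is expected to hold the discrete and continuous networks passed to the constructor; the ordering is an assumption.

IReadOnlyList<INeuralNetwork<double>> networks = policy.GetNetworks();
Console.WriteLine($"MixedPolicy uses {networks.Count} network(s).");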

Reset()

Resets both exploration strategies.

public override void Reset()

SelectAction(Vector<T>, bool)

Selects a mixed action and returns it as a single vector laid out as [discrete_action, continuous_actions].

public override Vector<T> SelectAction(Vector<T> state, bool training = true)

Parameters

state Vector<T>
training bool

Returns

Vector<T>
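
Examples

A sketch of consuming the returned vector, assuming element 0 encodes the discrete action index, the remaining continuousActionSize elements are the continuous values, and Vector<T> exposes a standard indexer (all assumptions; only the [discrete_action, continuous_actions] layout is stated above).

// Greedy (non-exploratory) action for evaluation.
Vector<double> action = policy.SelectAction(state, training: false);

// Assumed layout: element 0 is the discrete action index,
// elements 1..continuousActionSize are the continuous values.
int discreteAction = (int)action[0];
double firstContinuous = action[1];
double secondContinuous = action[2];   // with continuousActionSize = 2, as in the constructor example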