Class RainbowDQNAgent<T>

Namespace: AiDotNet.ReinforcementLearning.Agents.Rainbow

Assembly: AiDotNet.dll

Rainbow DQN agent combining six extensions to DQN.

public class RainbowDQNAgent<T> : DeepReinforcementLearningAgentBase<T>, IRLAgent<T>, IFullModel<T, Vector<T>, Vector<T>>, IModel<Vector<T>, Vector<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Vector<T>, Vector<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Vector<T>, Vector<T>>>, IGradientComputable<T, Vector<T>, Vector<T>>, IJitCompilable<T>, IDisposable

Type Parameters

T: The numeric type used for calculations.

Inheritance: object

ReinforcementLearningAgentBase<T>

DeepReinforcementLearningAgentBase<T>

RainbowDQNAgent<T>

Implements: IRLAgent<T>

IFullModel<T, Vector<T>, Vector<T>>

IModel<Vector<T>, Vector<T>, ModelMetadata<T>>

IModelSerializer

ICheckpointableModel

IParameterizable<T, Vector<T>, Vector<T>>

IFeatureAware

IFeatureImportance<T>

ICloneable<IFullModel<T, Vector<T>, Vector<T>>>

IGradientComputable<T, Vector<T>, Vector<T>>

IJitCompilable<T>

IDisposable

Inherited Members: DeepReinforcementLearningAgentBase<T>.Networks

DeepReinforcementLearningAgentBase<T>.ParameterCount

DeepReinforcementLearningAgentBase<T>.Dispose()

DeepReinforcementLearningAgentBase<T>.GetPolicyNetworkForJit()

DeepReinforcementLearningAgentBase<T>.SupportsJitCompilation

DeepReinforcementLearningAgentBase<T>.ExportComputationGraph(List<ComputationNode<T>>)

ReinforcementLearningAgentBase<T>.NumOps

ReinforcementLearningAgentBase<T>.Random

ReinforcementLearningAgentBase<T>.LossFunction

ReinforcementLearningAgentBase<T>.LearningRate

ReinforcementLearningAgentBase<T>.DiscountFactor

ReinforcementLearningAgentBase<T>.TrainingSteps

ReinforcementLearningAgentBase<T>.Episodes

ReinforcementLearningAgentBase<T>.LossHistory

ReinforcementLearningAgentBase<T>.RewardHistory

ReinforcementLearningAgentBase<T>.Options

ReinforcementLearningAgentBase<T>.Predict(Vector<T>)

ReinforcementLearningAgentBase<T>.DefaultLossFunction

ReinforcementLearningAgentBase<T>.Train(Vector<T>, Vector<T>)

ReinforcementLearningAgentBase<T>.FeatureNames

ReinforcementLearningAgentBase<T>.GetFeatureImportance()

ReinforcementLearningAgentBase<T>.GetActiveFeatureIndices()

ReinforcementLearningAgentBase<T>.IsFeatureUsed(int)

ReinforcementLearningAgentBase<T>.SetActiveFeatureIndices(IEnumerable<int>)

ReinforcementLearningAgentBase<T>.DeepCopy()

ReinforcementLearningAgentBase<T>.WithParameters(Vector<T>)

ReinforcementLearningAgentBase<T>.ComputeAverage(IEnumerable<T>)

ReinforcementLearningAgentBase<T>.SaveState(Stream)

ReinforcementLearningAgentBase<T>.LoadState(Stream)

object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Extension Methods: DistributedExtensions.AsDistributedForHighBandwidth<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, ICommunicationBackend<T>)

DistributedExtensions.AsDistributedForLowBandwidth<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, ICommunicationBackend<T>)

DistributedExtensions.AsDistributed<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, ICommunicationBackend<T>)

DistributedExtensions.AsDistributed<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, IShardingConfiguration<T>)

Remarks

Rainbow combines: Double Q-learning, Dueling networks, Prioritized replay, Multi-step learning, Distributional RL (C51), and Noisy networks.

For Beginners: Rainbow takes the best ideas from six different DQN improvements and combines them into a single agent. At the time of its introduction (Hessel et al., 2018) it was the strongest published DQN variant.

Six components:

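Double Q-learning: Reduces overestimation of Q-values by selecting the next action with the online network and evaluating it with the target network
Dueling Architecture: Splits the network into separate state-value and action-advantage streams
Prioritized Replay: Samples experiences with a large TD error more often
Multi-step Returns: Bootstraps from n-step returns for faster credit assignment
Distributional RL (C51): Learns a categorical distribution over returns instead of a single expected value
Noisy Networks: Adds learnable noise to the network weights to drive exploration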
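
Three of the components above reduce to short formulas. The following is a minimal, library-independent sketch of the multi-step return, the Double Q-learning bootstrap, and the dueling aggregation. The class and method names (RainbowTargetSketch, NStepReturn, DoubleQBootstrap, DuelingQ) are hypothetical and are not part of AiDotNet; how RainbowDQNAgent<T> implements these steps internally is not documented on this page.

```csharp
using System;
using System.Linq;

// Illustrative only: these helpers are NOT part of the AiDotNet API.
public static class RainbowTargetSketch
{
    // Multi-step return: r_0 + gamma*r_1 + ... + gamma^(n-1)*r_(n-1) + gamma^n * bootstrap.
    public static double NStepReturn(double[] rewards, double gamma, double bootstrapValue)
    {
        double ret = 0.0;
        double discount = 1.0;
        foreach (double r in rewards)
        {
            ret += discount * r;
            discount *= gamma;
        }
        return ret + discount * bootstrapValue;
    }

    // Double Q-learning: the online network picks the action,
    // the target network supplies the value of that action.
    public static double DoubleQBootstrap(double[] onlineQ, double[] targetQ)
    {
        int best = 0;
        for (int a = 1; a < onlineQ.Length; a++)
        {
            if (onlineQ[a] > onlineQ[best]) best = a;
        }
        return targetQ[best];
    }

    // Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, a).
    public static double[] DuelingQ(double value, double[] advantages)
    {
        double mean = advantages.Average();
        return advantages.Select(a => value + a - mean).ToArray();
    }
}
```

Subtracting the mean advantage in DuelingQ keeps the value and advantage streams identifiable; this is the aggregation used in the original dueling-network formulation, assumed here rather than confirmed for this library.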

Famous for: DeepMind's demonstration that combining these extensions set a new state of the art on the Atari 2600 benchmark, in both data efficiency and final performance
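
The Prioritized Replay and Distributional RL (C51) components listed above can also be illustrated with plain arithmetic. The sketch below assumes the standard formulations: sampling probability proportional to priority raised to an exponent alpha, and a Q-value computed as the expectation of a categorical distribution over evenly spaced atoms. All names (DistributionalSketch, Support, ExpectedValue, SamplingProbabilities) and the parameters vMin, vMax, atoms, and alpha are hypothetical; the corresponding settings in RainbowDQNOptions<T> are not shown on this page.

```csharp
using System;
using System.Linq;

// Illustrative only: these helpers are NOT part of the AiDotNet API.
public static class DistributionalSketch
{
    // Evenly spaced atoms z_i = vMin + i * (vMax - vMin) / (atoms - 1).
    public static double[] Support(double vMin, double vMax, int atoms)
    {
        double delta = (vMax - vMin) / (atoms - 1);
        return Enumerable.Range(0, atoms).Select(i => vMin + i * delta).ToArray();
    }

    // Q(s, a) is the mean of the categorical return distribution: sum_i p_i * z_i.
    public static double ExpectedValue(double[] probabilities, double[] support)
    {
        return probabilities.Zip(support, (p, z) => p * z).Sum();
    }

    // Prioritized replay: P(i) = p_i^alpha / sum_k p_k^alpha.
    // alpha = 0 gives uniform sampling; alpha = 1 samples in proportion to priority.
    public static double[] SamplingProbabilities(double[] priorities, double alpha)
    {
        double[] scaled = priorities.Select(p => Math.Pow(p, alpha)).ToArray();
        double total = scaled.Sum();
        return scaled.Select(s => s / total).ToArray();
    }
}
```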

Constructors

RainbowDQNAgent(RainbowDQNOptions<T>, IOptimizer<T, Vector<T>, Vector<T>>?)

public RainbowDQNAgent(RainbowDQNOptions<T> options, IOptimizer<T, Vector<T>, Vector<T>>? optimizer = null)

Parameters

options RainbowDQNOptions<T>
optimizer IOptimizer<T, Vector<T>, Vector<T>>?: Optional optimizer; defaults to null.

Properties

FeatureCount

Gets the number of input features (state dimensions).

public override int FeatureCount { get; }

Property Value

int

Methods

ApplyGradients(Vector<T>, T)

Applies gradients to update the agent.

public override void ApplyGradients(Vector<T> gradients, T learningRate)

Parameters

gradients Vector<T>
learningRate T

Clone()

Clones the agent.

public override IFullModel<T, Vector<T>, Vector<T>> Clone()

Returns

IFullModel<T, Vector<T>, Vector<T>>

ComputeGradients(Vector<T>, Vector<T>, ILossFunction<T>?)

Computes gradients for the agent.

public override Vector<T> ComputeGradients(Vector<T> input, Vector<T> target, ILossFunction<T>? lossFunction = null)

Parameters

input Vector<T>
target Vector<T>
lossFunction ILossFunction<T>?: Optional loss function; defaults to null.

Returns

Vector<T>

Deserialize(byte[])

Deserializes the agent from bytes.

public override void Deserialize(byte[] data)

Parameters

data byte[]

GetMetrics()

Gets the current training metrics.

public override Dictionary<string, T> GetMetrics()

Returns

Dictionary<string, T>: Dictionary of metric names to values.

GetModelMetadata()

Gets model metadata.

public override ModelMetadata<T> GetModelMetadata()

Returns

ModelMetadata<T>

GetParameters()

Gets the agent's parameters.

public override Vector<T> GetParameters()

Returns

Vector<T>

LoadModel(string)

Loads the agent's state from a file.

public override void LoadModel(string filepath)

Parameters

filepath string: Path to load the agent from.

ResetEpisode()

Resets episode-specific state (if any).

public override void ResetEpisode()

SaveModel(string)

Saves the agent's state to a file.

public override void SaveModel(string filepath)

Parameters

filepath string: Path to save the agent.

SelectAction(Vector<T>, bool)

Selects an action given the current state observation.

public override Vector<T> SelectAction(Vector<T> state, bool training = true)

Parameters

state Vector<T>: The current state observation as a Vector.
training bool: Whether the agent is in training mode (affects exploration).

Returns

Vector<T>: Action as a Vector (can be discrete or continuous).

Serialize()

Serializes the agent to bytes.

public override byte[] Serialize()

Returns

byte[]

SetParameters(Vector<T>)

Sets the agent's parameters.

public override void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

StoreExperience(Vector<T>, Vector<T>, T, Vector<T>, bool)

Stores an experience tuple for later learning.

public override void StoreExperience(Vector<T> state, Vector<T> action, T reward, Vector<T> nextState, bool done)

Parameters

state Vector<T>: The state before action.
action Vector<T>: The action taken.
reward T: The reward received.
nextState Vector<T>: The state after action.
done bool: Whether the episode terminated.

Train()

Performs one training step, updating the agent's policy/value function.

public override T Train()

Returns

T: The training loss for monitoring.
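
The order in which ResetEpisode(), SelectAction(), StoreExperience(), and Train() are typically called can be sketched as a single-episode loop. To stay compilable without the library, the sketch uses a hypothetical interface, IAgentSketch, that mirrors the four signatures documented above, and a hypothetical environment contract, IEnvSketch; neither is part of AiDotNet. Calling Train() after every step is an assumption, as this page does not state whether Train() may be called before the replay buffer has filled.

```csharp
using System;

// Hypothetical stand-in mirroring the documented agent signatures,
// so the sketch compiles without referencing AiDotNet.
public interface IAgentSketch<TVec, T>
{
    TVec SelectAction(TVec state, bool training = true);
    void StoreExperience(TVec state, TVec action, T reward, TVec nextState, bool done);
    T Train();
    void ResetEpisode();
}

// Hypothetical environment contract; not part of the documented API.
public interface IEnvSketch<TVec, T>
{
    TVec Reset();
    (TVec NextState, T Reward, bool Done) Step(TVec action);
}

public static class TrainingLoopSketch
{
    // Runs one episode and returns the number of environment steps taken.
    public static int RunEpisode<TVec, T>(
        IAgentSketch<TVec, T> agent, IEnvSketch<TVec, T> env, int maxSteps)
    {
        agent.ResetEpisode();               // clear any episode-specific state
        TVec state = env.Reset();
        int steps = 0;
        while (steps < maxSteps)
        {
            TVec action = agent.SelectAction(state, training: true);
            var (nextState, reward, done) = env.Step(action);
            agent.StoreExperience(state, action, reward, nextState, done);
            agent.Train();                  // one update; the returned loss is ignored here
            state = nextState;
            steps++;
            if (done) break;
        }
        return steps;
    }
}
```

For evaluation, the same loop would pass training: false to SelectAction and skip the StoreExperience and Train calls.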
