Class DoubleDQNAgent<T>
- Namespace
- AiDotNet.ReinforcementLearning.Agents.DoubleDQN
- Assembly
- AiDotNet.dll
Double Deep Q-Network (Double DQN) agent for reinforcement learning.
public class DoubleDQNAgent<T> : DeepReinforcementLearningAgentBase<T>, IRLAgent<T>, IFullModel<T, Vector<T>, Vector<T>>, IModel<Vector<T>, Vector<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Vector<T>, Vector<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Vector<T>, Vector<T>>>, IGradientComputable<T, Vector<T>, Vector<T>>, IJitCompilable<T>, IDisposable
Type Parameters
- T: The numeric type used for calculations.
- Inheritance
- DeepReinforcementLearningAgentBase<T>
- DoubleDQNAgent<T>
- Implements
- IRLAgent<T>
Remarks
Double DQN addresses the overestimation bias in standard DQN by decoupling action selection from action evaluation. It uses the online network to select actions and the target network to evaluate them, leading to more accurate Q-value estimates.
For Beginners: Standard DQN tends to overestimate Q-values because it uses the same network to both select and evaluate actions (the max operator introduces a positive bias).
Double DQN fixes this by:
- Using online network to SELECT the best action
- Using target network to EVALUATE that action's value
Think of it like getting a second opinion: one expert picks what looks best, another expert judges its actual value. This reduces overoptimistic estimates.
Key Improvement: More stable learning and better performance, especially when the environment is noisy or stochastic.
Reference: van Hasselt et al., "Deep Reinforcement Learning with Double Q-learning", 2015.
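To make the decoupling concrete, the following is a minimal sketch of how a Double DQN target is computed. It is illustrative only; the agent performs this internally, and the onlineQ/targetQ delegates are hypothetical stand-ins for the online and target networks.
// Sketch of the Double DQN target (illustrative, not the library's internal code).
static double ComputeDoubleDqnTarget(
    Func<Vector<double>, double[]> onlineQ,   // hypothetical: Q-values per action from the online network
    Func<Vector<double>, double[]> targetQ,   // hypothetical: Q-values per action from the target network
    Vector<double> nextState,
    double reward,
    double gamma,
    bool done)
{
    if (done) return reward;

    // SELECT: argmax over the online network's Q-values for the next state.
    double[] qOnlineNext = onlineQ(nextState);
    int bestAction = 0;
    for (int a = 1; a < qOnlineNext.Length; a++)
        if (qOnlineNext[a] > qOnlineNext[bestAction]) bestAction = a;

    // EVALUATE: value of that action according to the target network.
    double[] qTargetNext = targetQ(nextState);
    return reward + gamma * qTargetNext[bestAction];
    // Standard DQN would instead use the max of qTargetNext, which biases estimates upward.
}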
Constructors
DoubleDQNAgent(DoubleDQNOptions<T>)
Initializes a new instance of the DoubleDQNAgent class.
public DoubleDQNAgent(DoubleDQNOptions<T> options)
Parameters
- options (DoubleDQNOptions<T>): Configuration options for the Double DQN agent.
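A minimal construction sketch. The specific members to configure (state/action dimensions, discount factor, exploration schedule, and so on) are defined on DoubleDQNOptions<T> and are not shown here.
// Sketch: construct an agent from configured options.
var options = new DoubleDQNOptions<double>();
// Configure state/action dimensions, discount factor, exploration, etc. on `options`.
var agent = new DoubleDQNAgent<double>(options);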
Properties
FeatureCount
Gets the number of input features (state dimensions).
public override int FeatureCount { get; }
Property Value
- int
Methods
ApplyGradients(Vector<T>, T)
Not supported for DoubleDQNAgent. Use the agent's internal Train() loop instead.
public override void ApplyGradients(Vector<T> gradients, T learningRate)
Parameters
- gradients (Vector<T>): Not used.
- learningRate (T): Not used.
Exceptions
- NotSupportedException
Always thrown. DoubleDQN manages gradient computation and parameter updates internally through backpropagation.
Clone()
Clones the agent.
public override IFullModel<T, Vector<T>, Vector<T>> Clone()
Returns
- IFullModel<T, Vector<T>, Vector<T>>
ComputeGradients(Vector<T>, Vector<T>, ILossFunction<T>?)
Computes gradients for the agent.
public override Vector<T> ComputeGradients(Vector<T> input, Vector<T> target, ILossFunction<T>? lossFunction = null)
Parameters
- input (Vector<T>): The input to compute gradients for.
- target (Vector<T>): The target output.
- lossFunction (ILossFunction<T>): Optional loss function; may be null.
Returns
- Vector<T>
Deserialize(byte[])
Deserializes the agent from bytes.
public override void Deserialize(byte[] data)
Parameters
- data (byte[]): The serialized agent state produced by Serialize().
GetMetrics()
Gets the current training metrics.
public override Dictionary<string, T> GetMetrics()
Returns
- Dictionary<string, T>
Dictionary of metric names to values.
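A quick way to inspect training progress after calling Train(). The exact metric keys (for example a loss or exploration value) depend on the agent's implementation and are not specified here.
// Sketch: print the current training metrics.
var metrics = agent.GetMetrics();
foreach (var kvp in metrics)
{
    Console.WriteLine($"{kvp.Key}: {kvp.Value}"); // key names depend on the implementation
}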
GetModelMetadata()
Gets model metadata.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
GetParameters()
Gets the agent's parameters.
public override Vector<T> GetParameters()
Returns
- Vector<T>
LoadModel(string)
Loads the agent's state from a file.
public override void LoadModel(string filepath)
Parameters
- filepath (string): Path to load the agent from.
SaveModel(string)
Saves the agent's state to a file.
public override void SaveModel(string filepath)
Parameters
- filepath (string): Path to save the agent.
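A simple persistence round trip using the documented SaveModel/LoadModel pair. The file name is arbitrary, and `options` below is assumed to be the same configuration used to build the original agent.
// Sketch: save a trained agent and restore it later.
agent.SaveModel("double_dqn_checkpoint.bin");
// ... later, possibly in another process ...
var restored = new DoubleDQNAgent<double>(options);
restored.LoadModel("double_dqn_checkpoint.bin");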
SelectAction(Vector<T>, bool)
Selects an action given the current state observation.
public override Vector<T> SelectAction(Vector<T> state, bool training = true)
Parameters
- state (Vector<T>): The current state observation as a Vector.
- training (bool): Whether the agent is in training mode (affects exploration).
Returns
- Vector<T>
Action as a Vector (can be discrete or continuous).
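A usage sketch: pass training: false at evaluation time so action selection is not affected by exploration, per the training parameter described above.
// Sketch: action selection in evaluation vs. training mode.
Vector<double> greedyAction = agent.SelectAction(state, training: false); // evaluation: no exploration
Vector<double> trainingAction = agent.SelectAction(state);                // training mode: may explore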
Serialize()
Serializes the agent to bytes.
public override byte[] Serialize()
Returns
- byte[]
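Serialize() and Deserialize(byte[]) provide an in-memory alternative to the file-based SaveModel/LoadModel, for example when transferring an agent over a network. A sketch, assuming `options` matches the original agent's configuration:
// Sketch: in-memory serialization round trip.
byte[] blob = agent.Serialize();
var copy = new DoubleDQNAgent<double>(options); // same configuration as the original
copy.Deserialize(blob);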
SetParameters(Vector<T>)
Sets the agent's parameters.
public override void SetParameters(Vector<T> parameters)
Parameters
- parameters (Vector<T>): The parameter vector to apply to the agent.
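Combined with GetParameters(), this can copy weights between agents. A sketch, assuming both agents were built with the same configuration so their parameter vectors have identical layouts:
// Sketch: transfer parameters from a trained agent to a fresh one.
Vector<double> weights = trainedAgent.GetParameters();
freshAgent.SetParameters(weights); // both agents must share the same network architecture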
StoreExperience(Vector<T>, Vector<T>, T, Vector<T>, bool)
Stores an experience tuple for later learning.
public override void StoreExperience(Vector<T> state, Vector<T> action, T reward, Vector<T> nextState, bool done)
Parameters
- state (Vector<T>): The state before the action.
- action (Vector<T>): The action taken.
- reward (T): The reward received.
- nextState (Vector<T>): The state after the action.
- done (bool): Whether the episode terminated.
Train()
Performs one training step, updating the agent's policy/value function.
public override T Train()
Returns
- T
The training loss for monitoring.
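A typical interaction loop ties SelectAction, StoreExperience, and Train together. The `environment` object and its Reset/Step methods below are hypothetical stand-ins for whatever simulator is used; only the agent calls come from this API.
// Sketch: a basic training loop (environment API is hypothetical).
for (int episode = 0; episode < 500; episode++)
{
    Vector<double> state = environment.Reset();   // hypothetical: initial observation
    bool done = false;

    while (!done)
    {
        Vector<double> action = agent.SelectAction(state, training: true);
        (Vector<double> nextState, double reward, bool isDone) = environment.Step(action); // hypothetical

        agent.StoreExperience(state, action, reward, nextState, isDone);
        double loss = agent.Train();               // one training step; returns the loss for monitoring

        state = nextState;
        done = isDone;
    }
}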