Class MuZeroOptions<T>

Namespace: AiDotNet.Models.Options

Assembly: AiDotNet.dll

Configuration options for MuZero agents.

public class MuZeroOptions<T> : ReinforcementLearningOptions<T>

Type Parameters

T: The numeric type used for calculations.

Inheritance: object

ReinforcementLearningOptions<T>

MuZeroOptions<T>

Inherited Members: ReinforcementLearningOptions<T>.LearningRate

ReinforcementLearningOptions<T>.DiscountFactor

ReinforcementLearningOptions<T>.LossFunction

ReinforcementLearningOptions<T>.Seed

ReinforcementLearningOptions<T>.BatchSize

ReinforcementLearningOptions<T>.ReplayBufferSize

ReinforcementLearningOptions<T>.TargetUpdateFrequency

ReinforcementLearningOptions<T>.UsePrioritizedReplay

ReinforcementLearningOptions<T>.EpsilonStart

ReinforcementLearningOptions<T>.EpsilonEnd

ReinforcementLearningOptions<T>.EpsilonDecay

ReinforcementLearningOptions<T>.WarmupSteps

ReinforcementLearningOptions<T>.MaxGradientNorm

object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Remarks

MuZero combines tree search (as in AlphaZero) with a learned model. It learns to predict the dynamics, rewards, policies, and values it needs for planning without being given the environment's rules.

For Beginners: MuZero is a DeepMind algorithm that mastered Atari, Go, chess, and shogi without being told the rules. It learns its own "internal model" of the game and uses tree search to plan ahead.

Key innovations:

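Learned Model: Learns the environment dynamics itself, so no game rules are needed
MCTS: Uses Monte Carlo Tree Search for planning
Three Networks: Representation (observation to latent state), dynamics (latent state and action to next latent state and reward), and prediction (latent state to policy and value)
Planning: Searches through imagined futures inside the learned model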

Think of it like: Learning to play chess by watching games, figuring out the rules yourself, then planning moves by mentally simulating the game.

Famous for: Superhuman performance on Atari and board games without being given the rules.
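
Example

The following is a minimal configuration sketch, not a recommended setup. It uses only the property names listed on this page. It assumes that MuZeroOptions<T> exposes a public parameterless constructor and that the AiDotNet package is referenced. All values are illustrative and are not the library's defaults. Inherited members such as DiscountFactor and BatchSize are left untouched.

```csharp
using System.Collections.Generic;
using AiDotNet.Models.Options;

// Illustrative values only (assumptions, not library defaults).
var options = new MuZeroOptions<double>
{
    ObservationSize = 4,                                  // length of the observation vector
    ActionSize = 2,                                       // number of discrete actions
    LatentStateSize = 64,                                 // size of the learned hidden state
    RepresentationLayers = new List<int> { 128, 128 },    // hidden layer widths (assumed meaning)
    DynamicsLayers = new List<int> { 128, 128 },
    PredictionLayers = new List<int> { 128, 128 },
    NumSimulations = 50,                                  // tree search simulations per move
    PUCTConstant = 1.25,
    RootDirichletAlpha = 0.3,
    RootExplorationFraction = 0.25,
    UnrollSteps = 5,
    TDSteps = 10,
    PriorityAlpha = 1.0,
    UseValuePrefix = false
};
```

Because every property uses an init accessor, values can only be set in the object initializer. To change a setting later, create a new options instance.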

Properties

ActionSize

public int ActionSize { get; init; }

Property Value

int

DynamicsLayers

public List<int> DynamicsLayers { get; init; }

Property Value

List<int>

LatentStateSize

public int LatentStateSize { get; init; }

Property Value

int

NumSimulations

public int NumSimulations { get; init; }

Property Value

int

ObservationSize

public int ObservationSize { get; init; }

Property Value

int

Optimizer

The optimizer used to update network parameters. If null, the Adam optimizer is used.

public IOptimizer<T, Vector<T>, Vector<T>>? Optimizer { get; init; }

Property Value

IOptimizer<T, Vector<T>, Vector<T>>

PUCTConstant

public double PUCTConstant { get; init; }

Property Value

double
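
This page does not describe how PUCTConstant is used. In AlphaZero-style tree search, a constant of this kind scales the exploration bonus when choosing which child node to visit. The sketch below shows the simplified form Q + c * P * sqrt(N_parent) / (1 + N_child). It is an assumption about the role of this property, not the library's implementation. The published MuZero rule adds a second, slowly growing term that is omitted here, and the Score function and its parameter names are hypothetical.

```csharp
using System;

// Simplified PUCT score for one child node (hypothetical helper).
// qValue: mean value of the child, prior: policy prior for the action,
// puctConstant: exploration weight, assumed to correspond to PUCTConstant.
double Score(double qValue, double prior, int parentVisits, int childVisits, double puctConstant)
{
    double exploration = puctConstant * prior * Math.Sqrt(parentVisits) / (1 + childVisits);
    return qValue + exploration;
}

double exampleScore = Score(qValue: 0.5, prior: 0.2, parentVisits: 16, childVisits: 3, puctConstant: 1.25);
Console.WriteLine(exampleScore);
```

A larger constant favors actions with a high prior and few visits, while a smaller one makes the search rely more on the value estimates gathered so far.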

PredictionLayers

public List<int> PredictionLayers { get; init; }

Property Value

List<int>

PriorityAlpha

public double PriorityAlpha { get; init; }

Property Value

double
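
The role of PriorityAlpha is not documented on this page. Assuming it is the exponent used in proportional prioritized experience replay (which would pair with the inherited UsePrioritizedReplay option), each stored item is sampled with probability proportional to its priority raised to that exponent. The helper below is a hypothetical sketch of that calculation, not code from the library.

```csharp
using System;
using System.Linq;

// Converts raw priorities into sampling probabilities: P(i) = p_i^alpha / sum_k p_k^alpha.
// priorityAlpha is assumed to correspond to the PriorityAlpha property.
double[] SamplingProbabilities(double[] priorities, double priorityAlpha)
{
    double[] scaled = priorities.Select(p => Math.Pow(p, priorityAlpha)).ToArray();
    double total = scaled.Sum();
    return scaled.Select(s => s / total).ToArray();
}

double[] exampleProbabilities = SamplingProbabilities(new[] { 1.0, 3.0 }, priorityAlpha: 1.0);
Console.WriteLine(string.Join(", ", exampleProbabilities));
```

With an exponent of 0 every item is equally likely to be sampled, and with an exponent of 1 sampling is directly proportional to priority.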

RepresentationLayers

public List<int> RepresentationLayers { get; init; }

Property Value

List<int>

RootDirichletAlpha

public double RootDirichletAlpha { get; init; }

Property Value

double

RootExplorationFraction

public double RootExplorationFraction { get; init; }

Property Value

double
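
RootDirichletAlpha and RootExplorationFraction share their names with the root exploration settings in the published MuZero pseudocode. Assuming they play the same role here, Dirichlet noise drawn with the given alpha is blended into the policy priors at the root of the search tree, weighted by the exploration fraction. The sketch below is hypothetical: all helper names are invented for illustration, and the Dirichlet sample is built from Gamma draws because .NET has no built-in Dirichlet sampler.

```csharp
using System;
using System.Linq;

// Standard normal sample via the Box-Muller transform.
double SampleNormal(Random rng)
{
    double u1 = 1.0 - rng.NextDouble(); // shift to (0, 1] so Log is finite
    double u2 = rng.NextDouble();
    return Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
}

// Gamma(shape, 1) sample using the Marsaglia-Tsang method.
double SampleGamma(Random rng, double shape)
{
    if (shape < 1.0)
    {
        // Boost for shape < 1: Gamma(a) = Gamma(a + 1) * U^(1 / a).
        double boost = Math.Pow(1.0 - rng.NextDouble(), 1.0 / shape);
        return SampleGamma(rng, shape + 1.0) * boost;
    }
    double d = shape - 1.0 / 3.0;
    double c = 1.0 / Math.Sqrt(9.0 * d);
    while (true)
    {
        double x = SampleNormal(rng);
        double v = 1.0 + c * x;
        if (v <= 0.0) continue;
        v = v * v * v;
        double u = 1.0 - rng.NextDouble();
        if (Math.Log(u) < 0.5 * x * x + d - d * v + d * Math.Log(v))
            return d * v;
    }
}

// Symmetric Dirichlet sample: normalize independent Gamma(alpha, 1) draws.
double[] SampleDirichlet(Random rng, double alpha, int size)
{
    double[] gammas = Enumerable.Range(0, size).Select(_ => SampleGamma(rng, alpha)).ToArray();
    double total = gammas.Sum();
    return gammas.Select(g => g / total).ToArray();
}

// Blend: (1 - fraction) * prior + fraction * noise, applied per action.
double[] AddRootNoise(double[] priors, double[] noise, double explorationFraction)
{
    return priors
        .Select((p, i) => (1.0 - explorationFraction) * p + explorationFraction * noise[i])
        .ToArray();
}

var random = new Random(0);
double[] rootPriors = { 0.7, 0.2, 0.1 };
double[] rootNoise = SampleDirichlet(random, alpha: 0.3, size: rootPriors.Length);
double[] noisyPriors = AddRootNoise(rootPriors, rootNoise, explorationFraction: 0.25);
Console.WriteLine(string.Join(", ", noisyPriors));
```

A small alpha concentrates the noise on a few actions, and a fraction of 0 disables the noise entirely. Because both inputs sum to 1, the blended priors also sum to 1.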

TDSteps

public int TDSteps { get; init; }

Property Value

int
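
TDSteps is not described on this page. In MuZero, the value target is an n-step return: the discounted sum of the next n observed rewards plus a discounted bootstrap value taken n steps ahead. Assuming TDSteps is that n and the inherited DiscountFactor is the discount, the target can be sketched as follows. The function name and the array layout are hypothetical.

```csharp
using System;

// rewards[i]: reward received after acting at step i.
// values[i]:  value estimate (for example, the search value) at step i.
// Returns sum_{k=0}^{n-1} discount^k * rewards[t + k] + discount^n * values[t + n],
// dropping any terms that fall past the end of the episode.
double NStepTarget(double[] rewards, double[] values, int t, int tdSteps, double discount)
{
    double target = 0.0;
    double scale = 1.0;
    int end = Math.Min(t + tdSteps, rewards.Length);
    for (int i = t; i < end; i++)
    {
        target += scale * rewards[i];
        scale *= discount;
    }
    int bootstrap = t + tdSteps;
    if (bootstrap < values.Length)
    {
        target += Math.Pow(discount, tdSteps) * values[bootstrap];
    }
    return target;
}

double[] episodeRewards = { 1.0, 1.0, 1.0, 1.0 };
double[] episodeValues = { 0.0, 0.0, 10.0, 0.0 };
Console.WriteLine(NStepTarget(episodeRewards, episodeValues, t: 0, tdSteps: 2, discount: 0.5));
```

Larger values of n rely more on observed rewards and less on the bootstrap estimate, at the cost of higher variance.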

UnrollSteps

public int UnrollSteps { get; init; }

Property Value

int
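
UnrollSteps is not described on this page. In MuZero training, the learned model is unrolled for a fixed number of steps: the representation network encodes the observation once, then the dynamics network is applied repeatedly using the actions that were actually taken, and the prediction network is evaluated at every step. The sketch below assumes UnrollSteps is that step count. The three networks are passed in as plain delegates, and the Unroll function is hypothetical, not part of AiDotNet.

```csharp
using System;
using System.Collections.Generic;

// representation: observation -> latent state
// dynamics:       (latent state, action) -> (reward, next latent state)
// prediction:     latent state -> (policy, value)
List<(double Reward, double Value)> Unroll(
    Func<double[], double[]> representation,
    Func<double[], int, (double Reward, double[] NextState)> dynamics,
    Func<double[], (double[] Policy, double Value)> prediction,
    double[] observation,
    int[] actions,
    int unrollSteps)
{
    var outputs = new List<(double Reward, double Value)>();
    double[] state = representation(observation);
    outputs.Add((0.0, prediction(state).Value)); // root step: no predicted reward yet

    for (int k = 0; k < unrollSteps && k < actions.Length; k++)
    {
        var (reward, nextState) = dynamics(state, actions[k]);
        state = nextState;
        outputs.Add((reward, prediction(state).Value));
    }
    return outputs;
}

// Toy stand-ins for the three networks, for illustration only.
var trace = Unroll(
    obs => new[] { obs[0] },
    (latent, action) => ((double)action, new[] { latent[0] + action }),
    latent => (new[] { 1.0 }, latent[0]),
    new[] { 1.0 },
    new[] { 2, 3 },
    2);
Console.WriteLine(trace.Count);
```

Each entry in the returned list would be compared against a training target for that step, so a larger unroll length trains the dynamics network to stay accurate further into the imagined future.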

UseValuePrefix

public bool UseValuePrefix { get; init; }

Property Value

bool
