Class LSPIOptions<T>
Configuration options for LSPI (Least-Squares Policy Iteration) agents.
public class LSPIOptions<T> : ReinforcementLearningOptions<T>
Type Parameters
T
The numeric type used for calculations.
- Inheritance: ReinforcementLearningOptions<T> → LSPIOptions<T>
Remarks
LSPI combines least-squares methods with policy iteration. It alternates between policy evaluation (using LSTDQ) and policy improvement, iteratively refining the policy until convergence.
For Beginners: LSPI is like repeatedly asking "what's the best policy?" and "how good is it?" until the answers stop changing. Each iteration uses LSTDQ to evaluate the current policy, then improves it based on those evaluations. A configuration sketch follows the lists below.
Best for:
- Batch reinforcement learning
- Offline learning from fixed datasets
- Sample-efficient policy learning
- When you need guaranteed convergence
Not suitable for:
- Online/streaming scenarios
- Very large feature spaces
- Continuous action spaces
- Real-time learning requirements
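As a quick illustration, the sketch below configures LSPI for a hypothetical task with 20 state features and 4 discrete actions. The property names and types come from this page; the chosen values and the surrounding setup are assumptions, not recommendations.

```csharp
// A minimal configuration sketch. Property names/types are from this page;
// the containing namespace and the agent that consumes these options are
// assumptions that depend on your project.
var options = new LSPIOptions<double>
{
    FeatureSize = 20,             // length of the state feature vector
    ActionSize = 4,               // number of discrete actions
    MaxIterations = 50,           // cap on policy-iteration rounds
    ConvergenceThreshold = 1e-6,  // stop once weight changes fall below this
    RegularizationParam = 1e-3    // ridge term for numerical stability
};
```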
Properties
ActionSize
Size of the action space (number of possible actions).
public int ActionSize { get; init; }
Property Value
int
ConvergenceThreshold
Weight change threshold for determining convergence.
public double ConvergenceThreshold { get; init; }
Property Value
double
FeatureSize
Number of features in the state representation.
public int FeatureSize { get; init; }
Property Value
int
MaxIterations
Maximum number of policy iteration steps before stopping.
public int MaxIterations { get; init; }
Property Value
int
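To show how MaxIterations and ConvergenceThreshold interact, here is a schematic of the LSPI outer loop. It is a sketch under stated assumptions, not this library's internals: samples and LstdqEvaluate are hypothetical placeholders for the batch of transitions and the LSTDQ policy-evaluation step.

```csharp
// Schematic LSPI outer loop (illustrative only). `samples` and
// `LstdqEvaluate` are hypothetical placeholders, not part of this API.
double[] weights = new double[options.FeatureSize * options.ActionSize];
for (int i = 0; i < options.MaxIterations; i++)
{
    // Policy evaluation: solve for the Q-function weights of the
    // greedy policy implied by the current weights.
    double[] newWeights = LstdqEvaluate(samples, weights, options);

    // Policy improvement is implicit: the next iteration's greedy policy
    // is derived from newWeights. Measure convergence via the Euclidean
    // norm of the weight change.
    double delta = 0.0;
    for (int j = 0; j < weights.Length; j++)
    {
        double d = newWeights[j] - weights[j];
        delta += d * d;
    }
    weights = newWeights;

    if (Math.Sqrt(delta) < options.ConvergenceThreshold)
        break; // weights stable: the policy has (approximately) converged
}
```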
RegularizationParam
Regularization parameter to prevent overfitting and ensure numerical stability.
public double RegularizationParam { get; init; }
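Property Value
double

For intuition, a ridge-style regularizer of this kind typically enters the LSTDQ solve by adding RegularizationParam to the diagonal of the accumulated A matrix before solving A w = b. The sketch below is illustrative, not this library's internals; A and b are hypothetical placeholders for the accumulated least-squares statistics.

```csharp
// Illustrative only: where a ridge term typically enters an LSTDQ-style
// solve. `A` (n x n) and `b` (n) are hypothetical placeholders for the
// accumulated least-squares statistics.
int n = A.GetLength(0);
for (int j = 0; j < n; j++)
{
    A[j, j] += options.RegularizationParam; // A <- A + lambda * I
}
// ... then solve the (now well-conditioned) linear system A w = b
// for the Q-function weights w.
```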