Class LSTDOptions<T>
Configuration options for LSTD (Least-Squares Temporal Difference) agents.
public class LSTDOptions<T> : ReinforcementLearningOptions<T>
Type Parameters
T
The numeric type used for calculations.
- Inheritance
- ReinforcementLearningOptions<T>
- LSTDOptions<T>
Remarks
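LSTD solves for the linear value-function weights in closed form using matrix operations (w = A^-1 * b) rather than incremental updates. This makes more efficient use of each sample than step-by-step temporal-difference updates, but it requires building and solving a linear system whose size grows with the number of features.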
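For reference, the textbook LSTD(0) formulation that the A^-1 * b expression refers to can be written as below. The symbols are assumptions introduced here, not names from this library: \phi(s) is the feature vector of state s, r_t the observed reward, \gamma the discount factor, and the sum runs over the collected transitions. The agent configured by this class may build its features per state-action pair instead; this page does not say.

```latex
A = \sum_{t} \phi(s_t)\,\bigl(\phi(s_t) - \gamma\,\phi(s_{t+1})\bigr)^{\top},
\qquad
b = \sum_{t} \phi(s_t)\, r_t,
\qquad
w = A^{-1} b
```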
For Beginners: LSTD is like solving a math equation directly instead of guessing and checking. It collects experiences and then computes the best weights all at once using linear algebra, rather than slowly adjusting them one step at a time.
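The "collect experiences, then solve once" idea can be sketched in a few lines of stand-alone C#. This is a minimal illustration of generic LSTD(0), not the library's implementation: `LstdSketch`, `Solve`, `GaussianSolve`, the tuple layout, and the `gamma` and `lambda` parameters are all hypothetical names chosen for this example. The `lambda` term plays the role a regularization parameter usually plays in LSTD (it is added to the diagonal of A); whether `RegularizationParam` is applied exactly this way is an assumption.

```csharp
using System;

// Stand-alone LSTD(0) sketch. Nothing here calls the library;
// every name in this file is hypothetical.
public static class LstdSketch
{
    // batch: (features of s, reward, features of s') for each observed transition.
    public static double[] Solve(
        (double[] Phi, double Reward, double[] PhiNext)[] batch,
        double gamma,
        double lambda)
    {
        int n = batch[0].Phi.Length;
        var a = new double[n, n];
        var b = new double[n];

        // Start A at lambda * I (ridge term) to keep the system well conditioned.
        for (int i = 0; i < n; i++) a[i, i] = lambda;

        // Accumulate A += phi * (phi - gamma * phi')^T and b += phi * r.
        foreach (var t in batch)
        {
            for (int i = 0; i < n; i++)
            {
                b[i] += t.Phi[i] * t.Reward;
                for (int j = 0; j < n; j++)
                    a[i, j] += t.Phi[i] * (t.Phi[j] - gamma * t.PhiNext[j]);
            }
        }

        return GaussianSolve(a, b);
    }

    // Solves a * w = b by Gaussian elimination with partial pivoting.
    // Overwrites a and b.
    private static double[] GaussianSolve(double[,] a, double[] b)
    {
        int n = b.Length;
        for (int col = 0; col < n; col++)
        {
            // Pick the row with the largest entry in this column as pivot.
            int pivot = col;
            for (int r = col + 1; r < n; r++)
                if (Math.Abs(a[r, col]) > Math.Abs(a[pivot, col])) pivot = r;
            if (Math.Abs(a[pivot, col]) < 1e-12)
                throw new InvalidOperationException("A is singular; try a larger lambda.");

            // Swap the pivot row into place.
            for (int c = 0; c < n; c++)
                (a[col, c], a[pivot, c]) = (a[pivot, c], a[col, c]);
            (b[col], b[pivot]) = (b[pivot], b[col]);

            // Eliminate the entries below the pivot.
            for (int r = col + 1; r < n; r++)
            {
                double f = a[r, col] / a[col, col];
                for (int c = col; c < n; c++) a[r, c] -= f * a[col, c];
                b[r] -= f * b[col];
            }
        }

        // Back substitution.
        var w = new double[n];
        for (int r = n - 1; r >= 0; r--)
        {
            double s = b[r];
            for (int c = r + 1; c < n; c++) s -= a[r, c] * w[c];
            w[r] = s / a[r, r];
        }
        return w;
    }

    public static void Main()
    {
        // Two-state chain with one-hot features:
        // state 0 -> state 1 (reward 0), state 1 -> terminal (reward 1).
        var batch = new (double[] Phi, double Reward, double[] PhiNext)[]
        {
            (new[] { 1.0, 0.0 }, 0.0, new[] { 0.0, 1.0 }),
            (new[] { 0.0, 1.0 }, 1.0, new[] { 0.0, 0.0 }),
        };

        // With lambda = 0 this system has the exact solution w = [0.9, 1.0];
        // a small positive lambda shrinks the weights slightly.
        double[] w = Solve(batch, gamma: 0.9, lambda: 0.001);
        Console.WriteLine(string.Join(", ", w));
    }
}
```

A dense solve like this costs on the order of n^3 operations for n features, which is why the approach becomes impractical for very large feature spaces.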
Best for:
- Limited data scenarios (sample efficient)
- Batch learning from fixed datasets
- When you have computational power for matrix operations
- Problems where convergence speed matters
Not suitable for:
- Very large feature spaces (matrix becomes huge)
- Online learning (needs batches)
- When computational resources are limited
- Non-linear function approximation needs
Properties
ActionSize
Size of the action space (number of possible actions).
public int ActionSize { get; init; }
Property Value
int
FeatureSize
Number of features in the state representation.
public int FeatureSize { get; init; }
Property Value
int
RegularizationParam
Regularization parameter to prevent overfitting and ensure numerical stability.
public double RegularizationParam { get; init; }
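
Because all three properties use `init` accessors, they can be set once in an object initializer and not changed afterwards. The fragment below is a sketch only: the numeric values are illustrative, the required `using` directive is omitted because the namespace is not shown on this page, and any additional settings inherited from ReinforcementLearningOptions<T> are left at their defaults on the assumption that none of them is mandatory.

```csharp
// Illustrative values; pick sizes that match your environment.
// Add the using directive for the namespace that contains LSTDOptions<T>.
var options = new LSTDOptions<double>
{
    FeatureSize = 8,            // length of the state feature vector
    ActionSize = 4,             // number of possible actions
    RegularizationParam = 0.01  // larger values favor stability over fit
};
```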