Class DecisionTreeClassifierOptions<T>

Namespace
AiDotNet.Models.Options
Assembly
AiDotNet.dll

Configuration options for decision tree classifiers.

public class DecisionTreeClassifierOptions<T> : ClassifierOptions<T>

Type Parameters

T

The data type used for calculations.

Inheritance
DecisionTreeClassifierOptions<T>

Remarks

Decision trees are supervised learning algorithms that learn a hierarchy of if/else rules from training data. They are easy to interpret and can handle both numerical and categorical features.

For Beginners: Decision trees are like a game of 20 questions.

At each step, the tree asks a question about a feature:

"Is age > 30?"
  -> Yes: "Is income > 50000?"
    -> No: "Deny loan"
  -> No: "Is student?"
    -> Yes: "Approve loan"

Key settings:

  • MaxDepth: Limits how many questions deep the tree can go
  • MinSamplesSplit: Minimum samples needed to continue splitting
  • MaxFeatures: How many features to consider at each split
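
As a hedged illustration of the key settings above, the options might be populated with an object initializer. This sketch assumes the class exposes a public parameterless constructor and that double is a supported type argument; neither is confirmed on this page. Only properties documented in the Properties section are used.

```csharp
using AiDotNet.Models.Options;

// Assumption: a public parameterless constructor exists.
var options = new DecisionTreeClassifierOptions<double>
{
    MaxDepth = 5,          // at most 5 questions deep
    MinSamplesSplit = 10,  // a node needs at least 10 samples to be split
    MaxFeatures = null     // consider every feature at each split (the default)
};
```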

Properties

Criterion

Gets or sets the criterion used to measure the quality of a split.

public ClassificationSplitCriterion Criterion { get; set; }

Property Value

ClassificationSplitCriterion

The split criterion. Default is Gini impurity.

Remarks

The criterion determines how the tree evaluates potential splits. Gini impurity and entropy (information gain) are common choices.

For Beginners: This determines how the tree decides which question to ask.

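  • Gini: Measures how mixed the classes are in each group after a split (0 means a perfectly pure group)
  • Entropy: Measures the uncertainty in each group; the drop in entropy after a split is the information gain

Both work well in practice; Gini is slightly faster to compute because it needs no logarithms.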
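
To make the two criteria concrete, here is a minimal sketch that computes them from per-class sample counts using the standard textbook formulas. The helper names Gini and Entropy are hypothetical and are not part of the library; how the library computes these internally is not shown on this page.

```csharp
using System;
using System.Linq;

// Gini impurity: 1 - sum(p_i^2), where p_i is the fraction of samples in class i.
double Gini(int[] classCounts)
{
    double total = classCounts.Sum();
    if (total == 0) return 0.0;
    return 1.0 - classCounts.Sum(c => (c / total) * (c / total));
}

// Shannon entropy in bits: -sum(p_i * log2(p_i)), skipping empty classes.
double Entropy(int[] classCounts)
{
    double total = classCounts.Sum();
    double entropy = 0.0;
    foreach (int count in classCounts)
    {
        if (count == 0) continue;          // 0 * log(0) is treated as 0
        double p = count / total;
        entropy -= p * Math.Log2(p);
    }
    return entropy;
}

int[] mixed = { 5, 5 };   // evenly mixed two-class node
int[] pure = { 10, 0 };   // node containing a single class

double giniMixed = Gini(mixed);        // 0.5, the two-class maximum
double entropyMixed = Entropy(mixed);  // 1.0 bit, the two-class maximum
double giniPure = Gini(pure);          // 0.0
double entropyPure = Entropy(pure);    // 0.0
```

Both measures reach zero on a pure node and peak on an evenly mixed one, which is why they usually pick similar splits.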

MaxDepth

Gets or sets the maximum depth of the tree.

public int? MaxDepth { get; set; }

Property Value

int?

The maximum depth, or null for unlimited depth. Default is null.

Remarks

Limiting tree depth helps prevent overfitting by restricting the complexity of the learned model. Deeper trees can capture more complex patterns but are more prone to memorizing noise in the training data.

For Beginners: MaxDepth limits how many questions the tree can ask in a row before it must give an answer.

  • MaxDepth = 2: Tree asks at most 2 questions before deciding
  • MaxDepth = 10: Tree can ask up to 10 questions
  • MaxDepth = null: No limit (careful - can lead to overfitting!)

Start with a smaller depth (3-5) and increase if needed.
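
A quick way to see why depth controls complexity: if every split is binary (an assumption; this page does not state the branching factor), each extra level can double the number of leaves. The helper MaxLeaves is hypothetical and only illustrates the arithmetic.

```csharp
using System;

// Assumption: binary splits, so a tree of depth d has at most 2^d leaves.
long MaxLeaves(int maxDepth) => 1L << maxDepth;

foreach (int depth in new[] { 2, 5, 10 })
{
    Console.WriteLine($"MaxDepth = {depth}: up to {MaxLeaves(depth)} leaves");
}
```

Under that assumption, going from depth 5 to depth 10 raises the ceiling from 32 to 1024 distinct decision rules, which is why small increases in depth can noticeably increase overfitting risk.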

MaxFeatures

Gets or sets the number of features to consider when looking for the best split.

public int? MaxFeatures { get; set; }

Property Value

int?

The number of features, or null to use all features. Default is null.

Remarks

Using a subset of features at each split can improve generalization and reduce training time for high-dimensional datasets.

For Beginners: At each decision point, the tree considers this many features.

  • null: Consider all features (default)
  • The square root of the feature count: Common for random forests (compute the value yourself, since this property takes an integer)
  • A specific number: Limits the features considered

Using fewer features speeds up training and can help reduce overfitting, at the cost of sometimes missing the best available split.
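
Because MaxFeatures is an int?, the square-root heuristic has to be turned into an integer before it is assigned. The sketch below shows one way to do that; the helper name SqrtFeatureCount and the choice to round down with a floor of 1 are assumptions, not library behavior.

```csharp
using System;

// Hypothetical helper: square-root heuristic, rounded down, never below 1.
int SqrtFeatureCount(int featureCount) =>
    Math.Max(1, (int)Math.Sqrt(featureCount));

int forSixteen = SqrtFeatureCount(16);   // 4
int forHundred = SqrtFeatureCount(100);  // 10
int forTwo = SqrtFeatureCount(2);        // 1 (sqrt(2) rounds down to 1)
```

The result could then be assigned to MaxFeatures on the options object.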

MinImpurityDecrease

Gets or sets the minimum impurity decrease required for a split.

public double MinImpurityDecrease { get; set; }

Property Value

double

The minimum impurity decrease. Default is 0.0.

Remarks

A node will only be split if the split induces a decrease in impurity greater than or equal to this value. This can be used for pre-pruning.

For Beginners: Only split if it significantly improves the classification.

A higher value (e.g., 0.01) means the tree will only split when the split clearly helps, resulting in a simpler tree that often generalizes better.
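
The threshold test can be sketched as follows. This example uses Gini impurity and defines the decrease as the parent impurity minus the size-weighted average of the child impurities. That is a common textbook definition and an assumption here: the exact formula and any additional weighting the library applies are not documented on this page. The helper names are hypothetical.

```csharp
using System;
using System.Linq;

// Gini impurity from per-class sample counts.
double Gini(int[] counts)
{
    double total = counts.Sum();
    if (total == 0) return 0.0;
    return 1.0 - counts.Sum(c => (c / total) * (c / total));
}

// Assumed definition: parent impurity minus the size-weighted child impurities.
double ImpurityDecrease(int[] parent, int[] left, int[] right)
{
    double n = parent.Sum();
    double weightedChildren =
        (left.Sum() / n) * Gini(left) + (right.Sum() / n) * Gini(right);
    return Gini(parent) - weightedChildren;
}

double minImpurityDecrease = 0.01;   // the example threshold from the text

// Parent has 5 samples of each class; the split separates them fairly well.
double decrease = ImpurityDecrease(
    new[] { 5, 5 }, new[] { 4, 1 }, new[] { 1, 4 });   // about 0.18

bool acceptSplit = decrease >= minImpurityDecrease;    // true
```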

MinSamplesLeaf

Gets or sets the minimum number of samples required at a leaf node.

public int MinSamplesLeaf { get; set; }

Property Value

int

The minimum number of samples at leaf nodes. Default is 1.

Remarks

This parameter ensures that each leaf node represents at least this many training samples, which helps prevent overfitting to individual samples.

For Beginners: Each final decision (leaf) must apply to at least this many examples.

If MinSamplesLeaf = 5, every leaf must have at least 5 training samples. This prevents the tree from creating rules for individual outliers.

MinSamplesSplit

Gets or sets the minimum number of samples required to split an internal node.

public int MinSamplesSplit { get; set; }

Property Value

int

The minimum number of samples. Default is 2.

Remarks

Increasing this value stops the tree from learning patterns specific to very small groups of samples, which reduces overfitting.

For Beginners: This prevents the tree from making decisions based on too few examples.

If MinSamplesSplit = 10, the tree won't split a node unless it has at least 10 samples. This makes the tree more robust and less likely to memorize noise.
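
MinSamplesSplit and MinSamplesLeaf are checked at different points: the first looks at the node before splitting, the second at the children a split would produce. The sketch below combines both checks in one hypothetical helper, IsSplitAllowed; it illustrates the rules as described here and is not the library's implementation.

```csharp
// A split is allowed only if the node is large enough to split
// and both resulting children are large enough to be leaves.
bool IsSplitAllowed(int nodeSamples, int leftSamples, int rightSamples,
                    int minSamplesSplit, int minSamplesLeaf)
{
    return nodeSamples >= minSamplesSplit
        && leftSamples >= minSamplesLeaf
        && rightSamples >= minSamplesLeaf;
}

// Node has only 8 samples, below MinSamplesSplit = 10.
bool tooSmall = IsSplitAllowed(8, 4, 4, minSamplesSplit: 10, minSamplesLeaf: 5);   // false

// Node is large enough, but the right child would hold only 3 samples.
bool tinyLeaf = IsSplitAllowed(12, 9, 3, minSamplesSplit: 10, minSamplesLeaf: 5);  // false

// Node and both children satisfy the limits.
bool allowed = IsSplitAllowed(12, 6, 6, minSamplesSplit: 10, minSamplesLeaf: 5);   // true
```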

RandomState

Gets or sets the random state for reproducibility.

public int? RandomState { get; set; }

Property Value

int?

The random seed, or null for non-deterministic behavior. Default is null.

Remarks

Randomness is used when MaxFeatures is set or when candidate splits tie. Setting RandomState to a fixed value makes results reproducible from run to run.
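
The effect of a fixed seed can be illustrated with System.Random. The sketch below draws a random subset of feature indices with a partial Fisher-Yates shuffle; SampleFeatures is a hypothetical helper, and whether the library samples features this way is an assumption. It expects k to be no larger than featureCount.

```csharp
using System;
using System.Linq;

// Pick k distinct feature indices out of featureCount (requires k <= featureCount).
int[] SampleFeatures(int featureCount, int k, int? randomState)
{
    // A null seed gives an unseeded Random, so repeated runs may differ.
    Random rng = randomState.HasValue ? new Random(randomState.Value) : new Random();
    int[] indices = Enumerable.Range(0, featureCount).ToArray();
    for (int i = 0; i < k; i++)
    {
        int j = rng.Next(i, featureCount);   // random position in [i, featureCount)
        (indices[i], indices[j]) = (indices[j], indices[i]);
    }
    return indices.Take(k).ToArray();
}

int[] first = SampleFeatures(10, 3, randomState: 42);
int[] second = SampleFeatures(10, 3, randomState: 42);

bool sameSubset = first.SequenceEqual(second);   // true: same seed, same draw
```

With randomState set to null, the two calls would generally return different subsets, which is the non-deterministic behavior described for the default.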