Class DecisionTreeNode<T>
- Namespace: AiDotNet.LinearAlgebra
- Assembly: AiDotNet.dll
Represents a node in a decision tree for machine learning algorithms.
public class DecisionTreeNode<T>
Type Parameters
- T: The numeric data type used for calculations (e.g., float, double).
- Inheritance: object → DecisionTreeNode<T>
Remarks
A decision tree is a flowchart-like structure where each internal node represents a decision based on a feature, each branch represents an outcome of that decision, and each leaf node represents a prediction or classification.
For Beginners: Think of a decision tree like a flowchart of questions. Starting at the top (root), each question (node) splits the data based on a feature (like "Is temperature > 70°F?"). Following the answers (branches) leads you to more questions or eventually to a final answer (leaf node). Decision trees are popular because they are easy to understand and visualize, and they make decisions much as humans do.
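The flowchart idea above can be sketched as a short loop that walks from the root to a leaf. This is a minimal illustration, not the library's implementation: `SketchNode` is a hypothetical stand-in that copies only a few of the documented property names, and the rule "values below the split go left, all others go right" is an assumption based on the descriptions of Left and Right.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for DecisionTreeNode<double>; only the members this sketch needs.
public class SketchNode
{
    public int FeatureIndex { get; set; }
    public double SplitValue { get; set; }
    public bool IsLeaf { get; set; } = true;
    public double Prediction { get; set; }
    public SketchNode? Left { get; set; }
    public SketchNode? Right { get; set; }
}

public static class TraversalSketch
{
    // Walks from the root to a leaf and returns that leaf's prediction.
    // Assumed convention: feature values below SplitValue go left, all others go right.
    public static double Predict(SketchNode root, double[] features)
    {
        SketchNode node = root;
        while (!node.IsLeaf)
        {
            SketchNode? next = features[node.FeatureIndex] < node.SplitValue
                ? node.Left
                : node.Right;
            if (next is null)
                throw new InvalidOperationException("Internal node is missing a child.");
            node = next;
        }
        return node.Prediction;
    }

    public static void Main()
    {
        // The root asks about feature 0 (temperature) with a split at 70.
        var root = new SketchNode
        {
            IsLeaf = false,
            FeatureIndex = 0,
            SplitValue = 70.0,
            Left = new SketchNode { Prediction = 0.0 },
            Right = new SketchNode { Prediction = 1.0 },
        };

        Console.WriteLine(Predict(root, new[] { 65.0 })); // 65 is below 70, so the left leaf answers
        Console.WriteLine(Predict(root, new[] { 82.0 })); // 82 is not below 70, so the right leaf answers
    }
}
```

A deeper tree works the same way: each internal node narrows the path by one question until a leaf is reached.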
Constructors
DecisionTreeNode()
Initializes a new instance of the DecisionTreeNode<T> class as a leaf node.
public DecisionTreeNode()
Remarks
For Beginners: This creates a new "final answer" node with default values. It's typically used when building a tree and creating new nodes that will be configured later.
DecisionTreeNode(int, T)
Initializes a new instance of the DecisionTreeNode<T> class as an internal decision node.
public DecisionTreeNode(int featureIndex, T splitValue)
Parameters
- featureIndex (int): The index of the feature to split on.
- splitValue (T): The value to use for splitting.
Remarks
For Beginners: This creates a new "question" node. You specify which feature to ask about (featureIndex) and what value to compare against (splitValue).
DecisionTreeNode(T)
Initializes a new instance of the DecisionTreeNode<T> class as a leaf node with a prediction.
public DecisionTreeNode(T prediction)
Parameters
- prediction (T): The prediction value for this leaf node.
Remarks
For Beginners: This creates a new "final answer" node with a specific prediction value. For example, if your tree is predicting house prices, this might create a leaf node that predicts "$250,000".
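To show how the three constructors might be combined, here is a small sketch that builds a one-question tree by hand. `SketchNode<T>` is a hypothetical stand-in that mirrors the documented constructor signatures. What the real constructors initialize beyond their parameters (for example, how they set IsLeaf) is an assumption here, inferred from the "leaf node" and "internal decision node" wording.

```csharp
#nullable enable
using System;

// Hypothetical stand-in that mirrors the three documented constructors.
// The IsLeaf values set below are assumptions, not confirmed library behavior.
public class SketchNode<T>
{
    public int FeatureIndex { get; set; }
    public T SplitValue { get; set; } = default!;
    public T Prediction { get; set; } = default!;
    public bool IsLeaf { get; set; }
    public SketchNode<T>? Left { get; set; }
    public SketchNode<T>? Right { get; set; }

    // Leaf with default values, to be configured later.
    public SketchNode()
    {
        IsLeaf = true;
    }

    // Internal "question" node.
    public SketchNode(int featureIndex, T splitValue)
    {
        FeatureIndex = featureIndex;
        SplitValue = splitValue;
        IsLeaf = false;
    }

    // Leaf "final answer" node.
    public SketchNode(T prediction)
    {
        Prediction = prediction;
        IsLeaf = true;
    }
}

public static class ConstructionSketch
{
    // Builds a tiny house-price tree: one question about feature 0, two answers.
    public static SketchNode<double> BuildHousePriceTree()
    {
        return new SketchNode<double>(0, 1500.0)
        {
            Left = new SketchNode<double>(180000.0),
            Right = new SketchNode<double>(250000.0),
        };
    }

    public static void Main()
    {
        SketchNode<double> root = BuildHousePriceTree();
        Console.WriteLine($"Root is leaf: {root.IsLeaf}");
        Console.WriteLine($"Left prediction: {root.Left!.Prediction}");
        Console.WriteLine($"Right prediction: {root.Right!.Prediction}");
    }
}
```

The feature meaning (index 0 as square footage) and the price values are made up for illustration.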
Properties
FeatureIndex
Gets or sets the index of the feature used for splitting at this node.
public int FeatureIndex { get; set; }
Property Value
- int
Remarks
For Beginners: This is like asking "Which question should I ask at this point?" For example, if your data has features like [temperature, humidity, wind speed], a FeatureIndex of 0 means this node is making a decision based on temperature.
IsLeaf
Gets or sets a value indicating whether this node is a leaf node (has no children).
public bool IsLeaf { get; set; }
Property Value
- bool
Remarks
For Beginners: A leaf node is a final answer, not another question. If IsLeaf is true, this node doesn't lead to more questions - it provides a prediction directly.
Left
Gets or sets the left child node (typically represents the "less than" or "no" branch).
public DecisionTreeNode<T>? Left { get; set; }
Property Value
- DecisionTreeNode<T>
Remarks
For Beginners: If the answer to the node's question is "no" or "less than" (e.g., "Is temperature > 70°F?" "No"), the decision tree follows this path to the next question or answer.
LeftSampleCount
Gets or sets the number of samples that went to the left child after splitting.
public int LeftSampleCount { get; set; }
Property Value
- int
Remarks
For Beginners: This counts how many examples answered "no" or "less than" to this node's question and followed the left path.
LinearModel
Gets or sets the linear regression model for this node (used in some advanced tree variants).
public SimpleRegression<T>? LinearModel { get; set; }
Property Value
- SimpleRegression<T>
Remarks
For Beginners: Some advanced decision trees (like model trees) use a simple linear equation at the leaves instead of a single prediction value. This property holds that equation if used. Think of it as making a more nuanced prediction based on a formula rather than a single value.
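The difference between a constant leaf and a model-tree leaf can be illustrated with a few lines of code. This sketch does not use the SimpleRegression<T> API, whose members are not described here. `LinearLeafSketch` is a hypothetical stand-in for any fitted one-variable linear model, and the intercept and slope values are invented for the example.

```csharp
using System;

// Hypothetical stand-in for a fitted one-variable linear model.
// It is not the SimpleRegression<T> API.
public sealed class LinearLeafSketch
{
    public double Intercept { get; }
    public double Slope { get; }

    public LinearLeafSketch(double intercept, double slope)
    {
        Intercept = intercept;
        Slope = slope;
    }

    // Prediction from the formula: intercept + slope * x.
    public double Predict(double x) => Intercept + Slope * x;
}

public static class ModelTreeSketch
{
    public static void Main()
    {
        // A plain leaf returns the same value for every sample that reaches it.
        double constantLeaf = 250000.0;

        // A model-tree leaf adjusts its answer to the sample's feature value.
        var linearLeaf = new LinearLeafSketch(intercept: 100000.0, slope: 100.0);

        foreach (double squareFeet in new[] { 1200.0, 1500.0, 1800.0 })
        {
            Console.WriteLine(
                $"{squareFeet} sq ft: constant={constantLeaf}, linear={linearLeaf.Predict(squareFeet)}");
        }
    }
}
```

Both leaves agree at 1,500 square feet, but only the linear leaf gives smaller homes a lower estimate and larger homes a higher one.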
Prediction
Gets or sets the prediction value for this node when it's a leaf node.
public T Prediction { get; set; }
Property Value
- T
Remarks
For Beginners: If this node is a final answer (leaf node), this is the actual prediction. For example, in a tree predicting house prices, this might be "$250,000".
Predictions
Gets or sets the vector of predictions for samples at this node.
public Vector<T>? Predictions { get; set; }
Property Value
- Vector<T>
Remarks
For Beginners: This is a list of predictions made for each sample that reached this node. It's used during training to evaluate how well the node is performing.
Right
Gets or sets the right child node (typically represents the "greater than or equal to" or "yes" branch).
public DecisionTreeNode<T>? Right { get; set; }
Property Value
- DecisionTreeNode<T>
Remarks
For Beginners: If the answer to the node's question is "yes" or "greater than or equal to" (e.g., "Is temperature > 70°F?" "Yes"), the decision tree follows this path to the next question or answer.
RightSampleCount
Gets or sets the number of samples that went to the right child after splitting.
public int RightSampleCount { get; set; }
Property Value
- int
Remarks
For Beginners: This counts how many examples answered "yes" or "greater than or equal to" to this node's question and followed the right path.
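As a rough illustration of where LeftSampleCount and RightSampleCount could come from, the sketch below counts how many feature values fall on each side of a split. The helper `CountSplit` is hypothetical, and the rule that values below the split go left while all others go right is an assumption based on the Left and Right descriptions.

```csharp
using System;
using System.Linq;

public static class SplitCountSketch
{
    // Counts how many values fall on each side of the split.
    // Assumed convention: below the split goes left, everything else goes right.
    public static (int Left, int Right) CountSplit(double[] featureValues, double splitValue)
    {
        int left = featureValues.Count(v => v < splitValue);
        return (left, featureValues.Length - left);
    }

    public static void Main()
    {
        double[] temperatures = { 62.0, 68.0, 70.0, 75.0, 81.0 };
        var (left, right) = CountSplit(temperatures, 70.0);

        // 62 and 68 are below 70; 70, 75 and 81 are not.
        Console.WriteLine($"Left: {left}, Right: {right}");
    }
}
```

The two counts always add up to the number of samples at the node, which makes them a convenient sanity check after a split.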
SampleValues
Gets or sets the list of target values from the samples at this node.
public List<T> SampleValues { get; set; }
Property Value
- List<T>
Remarks
For Beginners: These are the actual "answers" or "outcomes" from your training examples that reached this node. For instance, if predicting house prices, these would be the actual prices of houses in your training data that matched all the conditions to reach this node.
Samples
Gets or sets the list of data samples that reached this node during training.
public List<Sample<T>> Samples { get; set; }
Property Value
- List<Sample<T>>
Remarks
For Beginners: These are the examples from your training data that ended up at this node after answering all the previous questions. The decision tree uses these samples to make decisions about how to structure itself or what prediction to make.
SplitValue
Gets or sets the value used to split the data at this node.
public T SplitValue { get; set; }
Property Value
- T
Remarks
For Beginners: This is the actual value that was found to be the best point to split the data. While similar to Threshold, SplitValue is often the specific value from the dataset that was chosen as the optimal splitting point during tree construction.
SumSquaredError
Gets or sets the sum of squared errors for predictions at this node.
public T SumSquaredError { get; set; }
Property Value
- T
Remarks
For Beginners: This measures how accurate the node's predictions are. It calculates the difference between each prediction and the actual value, squares these differences (to make them positive), and adds them up. A smaller value means better predictions.
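The calculation described above can be written out in a few lines. This is a sketch of the general sum-of-squared-errors formula using plain `double` values, not the code the library runs. The helper name `SumSquaredError` and the sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class ErrorSketch
{
    // Sum of squared differences between each prediction and the matching actual value.
    public static double SumSquaredError(
        IReadOnlyList<double> predictions,
        IReadOnlyList<double> actuals)
    {
        if (predictions.Count != actuals.Count)
            throw new ArgumentException("Predictions and actuals must have the same length.");

        double sum = 0.0;
        for (int i = 0; i < predictions.Count; i++)
        {
            double difference = predictions[i] - actuals[i];
            sum += difference * difference;
        }
        return sum;
    }

    public static void Main()
    {
        // Three training prices (in thousands) at a leaf that predicts 250 for all of them.
        double[] actuals = { 200.0, 250.0, 300.0 };
        double[] predictions = { 250.0, 250.0, 250.0 };

        // Squared differences: 2500 + 0 + 2500 = 5000.
        Console.WriteLine(SumSquaredError(predictions, actuals));
    }
}
```

A node whose samples all share the same target value would score 0, which is why splits that lower this number are preferred in regression trees.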
Threshold
Gets or sets the threshold value used to determine the split direction.
public T Threshold { get; set; }
Property Value
- T
Remarks
For Beginners: This is the specific value used in the question. For example, if FeatureIndex refers to temperature, Threshold might be 70°F, so the question becomes "Is temperature > 70°F?"
Methods
UpdateNodeStatistics()
Updates statistical information for this node based on its samples.
public void UpdateNodeStatistics()
Remarks
For Beginners: This method recalculates how well this node is performing by measuring the error between its predictions and the actual values. It's typically called after the node has been modified or when samples have been added or removed.