Class ExpressionTree<T, TInput, TOutput>
- Namespace: AiDotNet.LinearAlgebra
- Assembly: AiDotNet.dll
Represents a symbolic expression tree for mathematical operations that can be used for symbolic regression.
public class ExpressionTree<T, TInput, TOutput> : IFullModel<T, TInput, TOutput>, IModel<TInput, TOutput, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, TInput, TOutput>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, TInput, TOutput>>, IGradientComputable<T, TInput, TOutput>, IJitCompilable<T>
Type Parameters
T: The numeric type used in the expression tree (e.g., double, float).
TInput
TOutput
- Inheritance: ExpressionTree<T, TInput, TOutput>
- Implements: IFullModel<T, TInput, TOutput>, IModel<TInput, TOutput, ModelMetadata<T>>, IParameterizable<T, TInput, TOutput>, ICloneable<IFullModel<T, TInput, TOutput>>, IGradientComputable<T, TInput, TOutput>
Remarks
For Beginners: An ExpressionTree is like a mathematical formula represented as a tree structure. Each node in the tree is either a number, a variable, or an operation (like addition or multiplication). This allows the AI to create and evolve mathematical formulas that can model your data.
Constructors
ExpressionTree(ExpressionNodeType, T?, ExpressionTree<T, TInput, TOutput>?, ExpressionTree<T, TInput, TOutput>?, ILossFunction<T>?)
Creates a new expression tree node with the specified properties.
public ExpressionTree(ExpressionNodeType type, T? value = default, ExpressionTree<T, TInput, TOutput>? left = null, ExpressionTree<T, TInput, TOutput>? right = null, ILossFunction<T>? lossFunction = null)
Parameters
type (ExpressionNodeType): The type of node to create.
value (T): The value for this node (for constants and variables).
left (ExpressionTree<T, TInput, TOutput>): The left child node.
right (ExpressionTree<T, TInput, TOutput>): The right child node.
lossFunction (ILossFunction<T>): Optional loss function to use for training. If null, uses Mean Squared Error (MSE) for symbolic regression.
Remarks
For Beginners: This creates a new part of your mathematical formula. You can create simple nodes (like numbers or variables) or operation nodes (like addition or multiplication) that connect to other nodes.
Properties
Coefficients
Gets a vector containing all coefficient values in this expression tree.
public Vector<T> Coefficients { get; }
Property Value
- Vector<T>
Remarks
For Beginners: This collects all the constant numbers from your formula into a list. For example, if your formula is "2x + 3y + 5", this would give you [2, 3, 5]. These numbers are called "coefficients" and are important when optimizing your model.
Complexity
Gets the complexity of this expression tree, measured as the total number of nodes.
public int Complexity { get; }
Property Value
- int
Remarks
For Beginners: Complexity tells you how complicated the formula is. A higher number means a more complex formula with more terms and operations.
DefaultLossFunction
Gets the default loss function used by this model for gradient computation.
public ILossFunction<T> DefaultLossFunction { get; }
Property Value
- ILossFunction<T>
Remarks
For ExpressionTree (symbolic regression), the default loss function is Mean Squared Error (MSE), which is the standard loss function for regression problems.
FeatureCount
Gets the number of features (variables) used in this expression tree.
public int FeatureCount { get; }
Property Value
- int
Remarks
For Beginners: This tells you how many different input variables your formula uses. For example, if your formula uses x[0], x[1], and x[2], the feature count would be 3.
Id
Gets the unique identifier for this node.
public int Id { get; }
Property Value
- int
Remarks
For Beginners: This is a unique number assigned to each part of your formula, making it easy to identify and reference specific parts of the expression tree.
Left
Gets the left child node of this node.
public ExpressionTree<T, TInput, TOutput>? Left { get; }
Property Value
- ExpressionTree<T, TInput, TOutput>
Remarks
For Beginners: In operations like addition (a + b), the left child represents 'a'.
ParameterCount
Gets the number of parameters (constant nodes) in this expression tree.
public virtual int ParameterCount { get; }
Property Value
- int
Remarks
For Beginners: This tells you how many constant values are in your formula. For example, if your formula is "2x + 3y + 5", there are 3 parameters: 2, 3, and 5. This value is obtained from the Coefficients property, which returns a vector of all constant values.
Parent
Gets the parent node of this node.
public ExpressionTree<T, TInput, TOutput>? Parent { get; }
Property Value
- ExpressionTree<T, TInput, TOutput>
RequiredFeatureCount
Gets the minimum number of features required for input data to this expression tree.
public int RequiredFeatureCount { get; }
Property Value
- int
Remarks
For Beginners: This tells you the minimum number of columns your input data must have. It equals the maximum variable index used plus one. For example, if your formula uses x[5], the required feature count is 6 (indices 0 through 5).
Note: This is different from FeatureCount which counts unique variables used. A tree using only x[5] has FeatureCount=1 but RequiredFeatureCount=6.
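The difference between the two counts can be sketched from the variable indices a formula references. This is a standalone snippet that does not call the library; `variableIndices` is a hypothetical stand-in for the variable nodes of a tree, and returning 0 for a formula with no variables is an assumption of the sketch.

```csharp
using System;
using System.Linq;

// Hypothetical: indices of the variable nodes in a formula such as x[5] * x[5] + x[2].
int[] variableIndices = { 5, 5, 2 };

// FeatureCount-style value: how many distinct variables appear.
int featureCount = variableIndices.Distinct().Count();

// RequiredFeatureCount-style value: highest index used, plus one.
// Returning 0 when there are no variables is an assumption of this sketch.
int requiredFeatureCount = variableIndices.Length == 0 ? 0 : variableIndices.Max() + 1;

Console.WriteLine($"FeatureCount={featureCount}, RequiredFeatureCount={requiredFeatureCount}");
// Prints: FeatureCount=2, RequiredFeatureCount=6
```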
Right
Gets the right child node of this node.
public ExpressionTree<T, TInput, TOutput>? Right { get; }
Property Value
- ExpressionTree<T, TInput, TOutput>
Remarks
For Beginners: In operations like addition (a + b), the right child represents 'b'.
SupportsJitCompilation
Gets whether this expression tree supports JIT compilation.
public bool SupportsJitCompilation { get; }
Property Value
- bool
True - expression trees are inherently computation graphs and support JIT compilation.
Remarks
Expression trees are already symbolic computation graphs, making them ideal for JIT compilation. The tree structure directly represents the mathematical operations to be performed, which can be compiled into optimized native code.
For Beginners: Expression trees are like ready-made recipes for JIT compilation.
Since an expression tree already describes your formula as a series of operations (add, multiply, etc.), the JIT compiler can:
- Convert it directly to fast machine code
- Optimize common patterns (e.g., constant folding)
- Inline operations for better performance
This can provide a speedup of roughly 2-5x for complex symbolic expressions.
Type
Gets the type of this node (constant, variable, or operation).
public ExpressionNodeType Type { get; }
Property Value
- ExpressionNodeType
Value
Gets the value stored in this node. For constants, this is the actual value. For variables, this is the index of the variable in the input vector.
public T Value { get; }
Property Value
- T
Methods
ApplyGradients(Vector<T>, T)
Applies pre-computed gradients to update the model parameters (constants in the expression tree).
public void ApplyGradients(Vector<T> gradients, T learningRate)
Parameters
gradients (Vector<T>): The gradient vector to apply.
learningRate (T): The learning rate for the update.
Remarks
Updates constants using: constant = constant - learningRate * gradient
For Beginners: After computing gradients (seeing which direction to adjust each constant), this method actually adjusts them. The learning rate controls how big of an adjustment to make.
Exceptions
- ArgumentNullException
If gradients is null.
- ArgumentException
If gradient vector length doesn't match parameter count.
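The update rule and the two documented checks can be sketched on a plain array. This is a standalone illustration, not the library's code; `ApplyGradientStep` and the array standing in for the tree's constants are hypothetical.

```csharp
using System;

// Standalone sketch of the documented update rule; not the library's code.
// `values` stands in for the constant nodes of a tree (hypothetical layout).
static void ApplyGradientStep(double[] values, double[] gradients, double learningRate)
{
    if (gradients is null) throw new ArgumentNullException(nameof(gradients));
    if (gradients.Length != values.Length)
        throw new ArgumentException("Gradient length must match parameter count.", nameof(gradients));

    for (int i = 0; i < values.Length; i++)
        values[i] -= learningRate * gradients[i]; // constant = constant - learningRate * gradient
}

double[] constants = { 2.0, 3.0, 5.0 };
ApplyGradientStep(constants, new[] { 2.0, -2.0, 0.0 }, learningRate: 0.5);
Console.WriteLine(string.Join(", ", constants)); // 1, 4, 5
```

A positive gradient lowers the constant and a negative gradient raises it, which is why the first value falls and the second rises.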
Clone()
Creates a clone of this expression tree.
public IFullModel<T, TInput, TOutput> Clone()
Returns
- IFullModel<T, TInput, TOutput>
A new expression tree with the same structure and values.
Remarks
For Beginners: This creates an exact duplicate of your formula. It's essentially the same as DeepCopy and Copy - it makes a complete duplicate that you can modify without changing the original.
ComputeGradients(TInput, TOutput, ILossFunction<T>?)
Computes gradients of the loss function with respect to model parameters WITHOUT updating parameters.
public Vector<T> ComputeGradients(TInput input, TOutput target, ILossFunction<T>? lossFunction = null)
Parameters
input (TInput): The input data.
target (TOutput): The target/expected output.
lossFunction (ILossFunction<T>): The loss function to use. If null, uses the model's default loss function.
Returns
- Vector<T>
A vector containing gradients with respect to all model parameters (constants in the expression tree).
Remarks
This method computes gradients using numerical differentiation (finite differences). For each constant in the expression tree, it slightly perturbs the value and measures how the loss changes, approximating the gradient.
For Beginners: This calculates how to adjust each constant in your mathematical formula to reduce error. Since expression trees are symbolic, we use a numerical approximation: we slightly change each constant and see how much the error changes.
Exceptions
- ArgumentNullException
If input or target is null.
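A minimal sketch of finite-difference gradients follows. The source does not say which difference scheme or step size the library uses, so the central difference and `epsilon` below are assumptions, and `NumericalGradient` is a hypothetical helper that works on a plain array of constants.

```csharp
using System;

// Standalone sketch of finite-difference gradients; not the library's code.
// `loss` maps a set of constants to an error value (hypothetical delegate).
static double[] NumericalGradient(Func<double[], double> loss, double[] values, double epsilon = 1e-6)
{
    var gradient = new double[values.Length];
    var probe = (double[])values.Clone(); // perturb a copy so the caller's values stay unchanged

    for (int i = 0; i < values.Length; i++)
    {
        double original = probe[i];

        probe[i] = original + epsilon;
        double lossPlus = loss(probe);

        probe[i] = original - epsilon;
        double lossMinus = loss(probe);

        probe[i] = original; // restore before moving to the next constant
        gradient[i] = (lossPlus - lossMinus) / (2 * epsilon); // central difference (assumed scheme)
    }
    return gradient;
}

// Formula c[0] * x + c[1] with one sample (x = 3, target = 10) and squared error.
Func<double[], double> squaredError = c =>
{
    double prediction = c[0] * 3.0 + c[1];
    return (prediction - 10.0) * (prediction - 10.0);
};

double[] grad = NumericalGradient(squaredError, new[] { 2.0, 1.0 });
Console.WriteLine($"{grad[0]:F3}, {grad[1]:F3}"); // approximately -18 and -6
```

Both gradients are negative here because the prediction (7) is below the target (10), so increasing either constant reduces the error.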
Copy()
Creates a deep copy of this expression tree.
public IFullModel<T, TInput, TOutput> Copy()
Returns
- IFullModel<T, TInput, TOutput>
A new expression tree with the same structure and values as this one.
Remarks
For Beginners: This creates an exact duplicate of the formula, like making a photocopy. This is important because we often need to make changes to a formula without modifying the original.
Crossover(IFullModel<T, TInput, TOutput>, double)
Combines this expression tree with another to create a new "offspring" expression tree.
public IFullModel<T, TInput, TOutput> Crossover(IFullModel<T, TInput, TOutput> other, double crossoverRate)
Parameters
other (IFullModel<T, TInput, TOutput>): The other expression tree to combine with.
crossoverRate (double): The probability (0.0 to 1.0) that crossover will occur.
Returns
- IFullModel<T, TInput, TOutput>
A new expression tree that combines parts from both parent trees.
Remarks
For Beginners: Crossover is like taking parts from two different formulas and combining them to create a new formula. For example, if one formula is (x + 2) and another is (y * 3), crossover might create (x * 3) by taking parts from each. This mimics how genetic traits are passed from parents to children in nature.
DeepCopy()
Creates a deep copy of this expression tree.
public IFullModel<T, TInput, TOutput> DeepCopy()
Returns
- IFullModel<T, TInput, TOutput>
A new, identical expression tree.
Remarks
For Beginners: This creates an exact duplicate of the entire formula tree. Like Copy, it is declared to return IFullModel<T, TInput, TOutput>; the object returned is a new expression tree. This is useful when you need to make changes to a copy without affecting the original formula.
Deserialize(byte[])
Loads an expression tree from a byte array, replacing the current tree's structure.
public void Deserialize(byte[] data)
Parameters
data (byte[]): The byte array containing the serialized expression tree.
Remarks
For Beginners: This loads a previously saved formula from a compact format and replaces the current formula with it. It's like opening a saved file and loading its contents.
Deserialize(BinaryReader)
Deserializes an expression tree from a binary reader.
public ExpressionTree<T, TInput, TOutput> Deserialize(BinaryReader reader)
Parameters
reader (BinaryReader): The binary reader containing the serialized tree data.
Returns
- ExpressionTree<T, TInput, TOutput>
A new ExpressionTree instance created from the serialized data.
Remarks
For Beginners: This method reads a saved expression tree from binary data and reconstructs it. Think of it like opening a saved file that contains your mathematical formula.
Evaluate(Vector<T>)
Evaluates this expression tree for a given input vector.
public T Evaluate(Vector<T> input)
Parameters
input (Vector<T>): The input vector containing values for variables.
Returns
- T
The result of evaluating the expression.
Remarks
For Beginners: This calculates the result of your formula for a specific set of input values. For example, if your formula is "2*x[0] + x[1]" and your input is [3, 4], the result would be 2*3 + 4 = 10.
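Recursive evaluation of a small formula can be sketched with closures, where each node is a function from the input vector to a value. This is a standalone illustration rather than the library's implementation; the helper names `Constant`, `Variable`, `Add`, and `Multiply` are hypothetical.

```csharp
using System;

// Standalone sketch: each node is a closure that maps the input vector to a value.
// These helper names are hypothetical and are not part of the library.
static Func<double[], double> Constant(double c) => _ => c;          // returns its stored value
static Func<double[], double> Variable(int index) => x => x[index];  // looks up an input by index
static Func<double[], double> Add(Func<double[], double> l, Func<double[], double> r) => x => l(x) + r(x);
static Func<double[], double> Multiply(Func<double[], double> l, Func<double[], double> r) => x => l(x) * r(x);

// 2 * x[0] + x[1]
Func<double[], double> formula = Add(Multiply(Constant(2.0), Variable(0)), Variable(1));

double result = formula(new[] { 3.0, 4.0 });
Console.WriteLine(result); // 10
```

Operation nodes evaluate their left and right children first and then combine the two results, which mirrors the Left and Right properties of the tree.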
ExportComputationGraph(List<ComputationNode<T>>)
Exports the expression tree as a computation graph for JIT compilation.
public ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): List to populate with input computation nodes. Only variable nodes are added; constants are embedded in the graph.
Returns
- ComputationNode<T>
The root computation node representing the complete expression.
Remarks
This method converts the expression tree into a computation graph by:
1. Creating variable nodes for each unique variable in the tree
2. Recursively building the computation graph from the tree structure
3. Adding all input nodes (variables) to the inputNodes list
For Beginners: This converts your symbolic formula into a computation graph.
For example, the expression tree representing "(x[0] * 2) + x[1]" becomes:
- Variable node for x[0]
- Constant node for 2
- Multiply node connecting them
- Variable node for x[1]
- Add node combining the multiply result with x[1]
The JIT compiler then optimizes this graph and generates fast code.
Note: Only variables are added to inputNodes. Constants are embedded in the graph.
Exceptions
- ArgumentNullException
Thrown when inputNodes is null.
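The idea of turning the formula "(x[0] * 2) + x[1]" into a graph and then into executable code can be illustrated with .NET's own System.Linq.Expressions API. This is an analogy only: it is a different mechanism from the ComputationNode<T> graph documented here, and nothing in the snippet calls the library.

```csharp
using System;
using System.Linq.Expressions;

// Standalone sketch using .NET's System.Linq.Expressions, not AiDotNet's JIT compiler.
// Builds (x[0] * 2) + x[1] as a tree, then compiles it into a delegate.
ParameterExpression x = Expression.Parameter(typeof(double[]), "x");

Expression x0 = Expression.ArrayIndex(x, Expression.Constant(0)); // variable node for x[0]
Expression x1 = Expression.ArrayIndex(x, Expression.Constant(1)); // variable node for x[1]
Expression body = Expression.Add(
    Expression.Multiply(x0, Expression.Constant(2.0)),            // constant 2 is embedded in the tree
    x1);

Func<double[], double> compiled = Expression.Lambda<Func<double[], double>>(body, x).Compile();

Console.WriteLine(compiled(new[] { 3.0, 4.0 })); // 10
```

As in the description above, the only input to the compiled function is the variable vector; the constant lives inside the graph.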
FindNodeById(int)
Finds a node in the tree by its unique identifier.
public ExpressionTree<T, TInput, TOutput>? FindNodeById(int id)
Parameters
id (int): The unique identifier of the node to find.
Returns
- ExpressionTree<T, TInput, TOutput>
The node with the specified ID, or null if no such node exists.
Remarks
For Beginners: Every part of your formula has a unique ID number. This method helps you find a specific part by its ID, like finding a person by their social security number.
Fit(Matrix<T>, Vector<T>)
Fits the expression tree to the provided training data.
public void Fit(Matrix<T> X, Vector<T> y)
Parameters
X (Matrix<T>): The input features matrix.
y (Vector<T>): The target values vector.
Remarks
For Beginners: For expression trees, "fitting" just checks if the formula can work with your data. Unlike other AI models, the formula itself doesn't change during fitting - it's predefined by the tree structure.
GetActiveFeatureIndices()
Gets the indices of all features (variables) used in this expression tree.
public IEnumerable<int> GetActiveFeatureIndices()
Returns
- IEnumerable<int>
A collection of feature indices.
Remarks
For Beginners: This tells you which input variables are actually used in your formula. For example, if your formula only uses x[0] and x[2], this returns [0, 2], showing that the formula uses the first and third variables but not the second one.
GetAllNodes()
Gets a list of all nodes in this expression tree.
public List<ExpressionTree<T, TInput, TOutput>> GetAllNodes()
Returns
- List<ExpressionTree<T, TInput, TOutput>>
A list containing all nodes in the tree.
Remarks
For Beginners: This collects all the parts of your formula into a list. For example, if your formula is (x + 2) * y, this would give you a list containing: the multiplication operation, the addition operation, the x variable, the constant 2, and the y variable.
GetFeatureImportance()
Gets the feature importance scores for this expression tree.
public virtual Dictionary<string, T> GetFeatureImportance()
Returns
- Dictionary<string, T>
A dictionary mapping feature names to importance scores.
Remarks
For Beginners: Feature importance tells you which input variables matter most in your formula. For expression trees, importance is calculated by counting how many times each variable appears in the formula. Variables that appear more frequently are considered more important.
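Occurrence counting can be sketched as follows. This standalone snippet does not call the library: `variableIndices` is a hypothetical stand-in for the variable nodes of a tree, the "x[i]" key format is chosen for illustration, and whether the library normalizes the counts is not stated in the source.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical: indices of the variable nodes in x[0] * x[0] + x[2].
int[] variableIndices = { 0, 0, 2 };

var importance = new Dictionary<string, double>();
foreach (int index in variableIndices)
{
    string key = $"x[{index}]"; // assumed key format, chosen for illustration
    importance.TryGetValue(key, out double count); // count stays 0 when the key is new
    importance[key] = count + 1.0;
}

foreach (var pair in importance)
    Console.WriteLine($"{pair.Key}: {pair.Value}");
```

Here x[0] appears twice and x[2] once, so x[0] receives the higher score, and x[1] has no entry because it never appears.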
GetModelMetadata()
Gets metadata about this expression tree model.
public ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
A ModelMetadata object containing information about this model.
Remarks
For Beginners: This provides useful information about your formula, like how complex it is and how many input variables it needs. Think of it as a summary sheet about your mathematical model.
GetParameters()
Gets the parameters of this expression tree.
public Vector<T> GetParameters()
Returns
- Vector<T>
A vector containing all coefficient values in this expression tree.
Remarks
For Beginners: This returns all the constant numbers from your formula. For example, if your formula is "2x + 3y + 5", this would give you [2, 3, 5]. These numbers are the adjustable parameters that can be tuned to improve predictions.
IsFeatureUsed(int)
Checks if a specific feature (variable) is used in this expression tree.
public bool IsFeatureUsed(int featureIndex)
Parameters
featureIndex (int): The index of the feature to check.
Returns
- bool
True if the feature is used, false otherwise.
Remarks
For Beginners: This checks if your formula uses a specific input variable. For example, if featureIndex is 2, it checks if x[2] appears anywhere in your formula.
LoadModel(string)
Loads an expression tree model from a file.
public virtual void LoadModel(string filePath)
Parameters
filePath (string): The path to the file containing the saved model.
Remarks
For Beginners: This loads a previously saved formula from a file, allowing you to reuse it without recreating it. The loaded formula can immediately be used for predictions.
LoadState(Stream)
Loads the expression tree's state (structure and values) from a stream.
public virtual void LoadState(Stream stream)
Parameters
stream (Stream): The stream to read the expression tree state from.
Remarks
This method deserializes expression tree state that was previously saved with SaveState, restoring the complete tree structure, node types, values, and connections. It uses the existing Deserialize method after reading data from the stream.
For Beginners: This is like loading a saved snapshot of your mathematical formula.
When you call LoadState:
- The tree structure is read from the stream
- All node types and values are restored
- The formula becomes identical to when SaveState was called
After loading, the formula can:
- Make predictions using the restored structure
- Continue evolving during optimization
- Be used for symbolic regression or genetic programming
This is essential for:
- Resuming interrupted evolutionary training
- Loading the best formula after optimization
- Deploying symbolic models to production
- Knowledge distillation with interpretable models
Exceptions
- ArgumentNullException
Thrown when stream is null.
- IOException
Thrown when there's an error reading from the stream.
- InvalidOperationException
Thrown when the stream contains invalid or incompatible data.
Mutate(double)
Creates a modified version of this expression tree by applying random mutations.
public IFullModel<T, TInput, TOutput> Mutate(double mutationRate)
Parameters
mutationRate (double): The probability (0.0 to 1.0) that a mutation will occur at each node.
Returns
- IFullModel<T, TInput, TOutput>
A new expression tree with mutations applied.
Remarks
For Beginners: Mutation is like making small random changes to a formula to see if it improves. For example, changing a "+" to a "*" or changing a constant from 2.5 to 3.1. This is inspired by how genetic mutations work in nature and helps the AI explore different solutions.
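Per-node mutation with a given probability can be sketched on a simplified representation. This is not the library's algorithm: the formula is reduced to a hypothetical array of operator symbols, and swapping "+" and "*" is the only mutation shown.

```csharp
using System;

// Standalone sketch of per-node mutation; not the library's algorithm.
// The formula is flattened to its operator symbols only (hypothetical representation).
static string[] MutateOperators(string[] operators, double mutationRate, Random random)
{
    var mutated = (string[])operators.Clone(); // work on a copy; the original is left intact
    for (int i = 0; i < mutated.Length; i++)
    {
        // Each node mutates independently with probability mutationRate.
        if (random.NextDouble() < mutationRate)
            mutated[i] = mutated[i] == "+" ? "*" : "+";
    }
    return mutated;
}

string[] parent = { "+", "*", "+" };
var rng = new Random(42);

Console.WriteLine(string.Join(" ", MutateOperators(parent, 0.0, rng))); // + * +  (rate 0: no change)
Console.WriteLine(string.Join(" ", MutateOperators(parent, 1.0, rng))); // * + *  (rate 1: every node flips)
Console.WriteLine(string.Join(" ", parent));                            // + * +  (original untouched)
```

Rates between 0 and 1 give a random subset of changed nodes, so only the two extreme rates are printed here.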
Predict(Matrix<T>)
Makes predictions using this expression tree for multiple input samples.
public Vector<T> Predict(Matrix<T> input)
Parameters
input (Matrix<T>): A matrix where each row represents a sample and each column represents a feature.
Returns
- Vector<T>
A vector containing the predicted values for each input sample.
Remarks
For Beginners: This method takes your data (like height, weight, age values) and runs each row through the mathematical formula represented by this tree to get predictions. For example, if your tree represents "2x + y", and your input has values [3,4], the prediction would be 2*3 + 4 = 10.
Note: If the input has more features than the model requires, the extra features are allowed but ignored. Only the features up to RequiredFeatureCount are used in predictions. This flexibility supports transfer learning scenarios where input data may contain additional features not used by this particular model.
Exceptions
- ArgumentException
Thrown when the input matrix has incorrect dimensions.
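Row-wise prediction, including the handling of extra and missing columns, can be sketched with plain arrays. This is a standalone illustration, not the library's code; `PredictRows`, the lambda standing in for the tree "2x + y", and the exact condition that triggers the exception are assumptions.

```csharp
using System;

// Standalone sketch of row-wise prediction; not the library's code.
// `formula` and `requiredFeatureCount` are hypothetical stand-ins for a tree
// and its RequiredFeatureCount.
static double[] PredictRows(double[][] rows, Func<double[], double> formula, int requiredFeatureCount)
{
    var predictions = new double[rows.Length];
    for (int i = 0; i < rows.Length; i++)
    {
        if (rows[i].Length < requiredFeatureCount)
            throw new ArgumentException($"Row {i} has {rows[i].Length} features; {requiredFeatureCount} required.");
        predictions[i] = formula(rows[i]); // columns beyond requiredFeatureCount are never read
    }
    return predictions;
}

double[][] samples =
{
    new[] { 3.0, 4.0 },        // exactly the required features
    new[] { 1.0, 1.0, 99.0 },  // the extra third feature is ignored
};

double[] outputs = PredictRows(samples, x => 2.0 * x[0] + x[1], requiredFeatureCount: 2);
Console.WriteLine(string.Join(", ", outputs)); // 10, 3
```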
Predict(TInput)
Makes a prediction for an input example.
public TOutput Predict(TInput input)
Parameters
input (TInput): The input data (Vector, Matrix, or Tensor).
Returns
- TOutput
The predicted output.
Remarks
For Beginners: This method applies your mathematical formula to the input data to calculate a prediction. It handles different types of inputs (vectors, matrices, or tensors).
SaveModel(string)
Saves the expression tree model to a file.
public virtual void SaveModel(string filePath)
Parameters
filePath (string): The path where the model should be saved.
Remarks
For Beginners: This saves your mathematical formula to a file so you can load it later without having to recreate it. The file contains the tree structure, all node types, and values.
SaveState(Stream)
Saves the expression tree's current state (structure and values) to a stream.
public virtual void SaveState(Stream stream)
Parameters
stream (Stream): The stream to write the expression tree state to.
Remarks
This method serializes the complete expression tree structure, including all node types, values, and connections. It uses the existing Serialize method and writes the data to the provided stream.
For Beginners: This is like creating a snapshot of your mathematical formula.
When you call SaveState:
- The entire tree structure is written to the stream
- All node types (constants, variables, operations) are preserved
- All values and connections are saved
This is particularly useful for:
- Checkpointing during evolutionary algorithm training
- Knowledge distillation with symbolic models
- Saving the best formula found during optimization
- Creating formula ensembles
You can later use LoadState to restore the formula to this exact state.
Exceptions
- ArgumentNullException
Thrown when stream is null.
- IOException
Thrown when there's an error writing to the stream.
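The save-then-restore pattern shared by SaveState and LoadState can be sketched with a MemoryStream. This is a standalone illustration: `WriteState` and `ReadState` are hypothetical helpers, the byte array stands in for the output of Serialize(), and the length prefix is an assumption of the sketch, not the library's actual format.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

// Standalone sketch of the save/load pattern; not the library's code or file format.
// `payload` stands in for the bytes returned by Serialize() (hypothetical content).
static void WriteState(Stream stream, byte[] payload)
{
    if (stream is null) throw new ArgumentNullException(nameof(stream));
    using var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true);
    writer.Write(payload.Length); // length prefix is an assumption of this sketch
    writer.Write(payload);
}

static byte[] ReadState(Stream stream)
{
    if (stream is null) throw new ArgumentNullException(nameof(stream));
    using var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true);
    int length = reader.ReadInt32();
    byte[] payload = reader.ReadBytes(length);
    if (payload.Length != length)
        throw new InvalidOperationException("Stream ended before the full state was read.");
    return payload;
}

byte[] original = { 1, 2, 3, 4 };
using var memory = new MemoryStream();
WriteState(memory, original);
memory.Position = 0; // rewind before reading
byte[] restored = ReadState(memory);
Console.WriteLine(restored.SequenceEqual(original)); // True
```

Passing `leaveOpen: true` keeps the caller's stream usable after the helper returns, which matters when several models are checkpointed to one stream.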
Serialize()
Converts this expression tree to a byte array for storage or transmission.
public byte[] Serialize()
Returns
- byte[]
A byte array representing the serialized expression tree.
Remarks
For Beginners: This converts your mathematical formula into a compact format that can be saved to a file or sent over the internet. It's like zipping up your formula for storage.
Serialize(BinaryWriter)
Writes this expression tree to a binary stream.
public void Serialize(BinaryWriter writer)
Parameters
writer (BinaryWriter): The binary writer to write to.
Remarks
For Beginners: This saves your formula to a file or stream so you can load it later.
SetActiveFeatureIndices(IEnumerable<int>)
Sets the active feature indices for this expression tree.
public virtual void SetActiveFeatureIndices(IEnumerable<int> featureIndices)
Parameters
featureIndices (IEnumerable<int>): The feature indices to use.
Remarks
For Beginners: This restricts the formula to only use specific input variables. Any variables in the tree that are not in the active set will be replaced with constant zero values. This is useful for feature selection and understanding which variables are most important.
SetLeft(ExpressionTree<T, TInput, TOutput>?)
Sets the left child of this node and updates the parent reference of the child.
public void SetLeft(ExpressionTree<T, TInput, TOutput>? left)
Parameters
left (ExpressionTree<T, TInput, TOutput>): The node to set as the left child.
SetParameters(Vector<T>)
Sets the parameters (constant values) of this expression tree, modifying it in place.
public virtual void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): The new parameter values to assign to constant nodes.
Remarks
For Beginners: This method replaces all the constant numbers in your formula with new values, modifying the current tree directly. Unlike UpdateCoefficients and WithParameters which create new trees with the updated values, this method mutates the tree in place. Use this when you want to modify the tree directly, such as during optimization iterations.
Note: This implementation uses two tree traversals (counting and assignment) to validate parameter count BEFORE modifying the tree. This ensures atomicity: if the parameter count is wrong, the tree remains unchanged.
Exceptions
- ArgumentException
Thrown when the parameter count doesn't match the number of constant nodes.
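The validate-then-assign approach described in the note can be sketched on a flattened formula. This is a standalone illustration, not the library's traversal code; `SetConstants` and the two parallel arrays are a hypothetical layout.

```csharp
using System;

// Standalone sketch of validate-then-assign; not the library's code.
// A flattened formula: `isConstant[i]` marks which nodes hold constants (hypothetical layout).
static void SetConstants(bool[] isConstant, double[] nodeValues, double[] parameters)
{
    // Pass 1: count constants without touching anything.
    int constantCount = 0;
    foreach (bool flag in isConstant)
        if (flag) constantCount++;

    if (parameters.Length != constantCount)
        throw new ArgumentException($"Expected {constantCount} parameters but received {parameters.Length}.");

    // Pass 2: assign, now that the count is known to be correct.
    int next = 0;
    for (int i = 0; i < nodeValues.Length; i++)
        if (isConstant[i]) nodeValues[i] = parameters[next++];
}

// Nodes of 2 * x[0] + 5 in some traversal order: constant 2, variable index 0, constant 5.
bool[] kinds = { true, false, true };
double[] values = { 2.0, 0.0, 5.0 };

try
{
    SetConstants(kinds, values, new[] { 9.0 }); // wrong count
}
catch (ArgumentException)
{
    Console.WriteLine(string.Join(", ", values)); // unchanged: 2, 0, 5
}

SetConstants(kinds, values, new[] { 4.0, 1.0 });
Console.WriteLine(string.Join(", ", values)); // 4, 0, 1
```

Counting first costs one extra pass, but it rules out a tree left half-updated by a wrong-sized parameter vector.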
SetRight(ExpressionTree<T, TInput, TOutput>?)
Sets the right child of this node and updates the parent reference of the child.
public void SetRight(ExpressionTree<T, TInput, TOutput>? right)
Parameters
right (ExpressionTree<T, TInput, TOutput>): The node to set as the right child.
SetType(ExpressionNodeType)
Sets the type of this node.
public void SetType(ExpressionNodeType type)
Parameters
type (ExpressionNodeType): The node type to set.
SetValue(T)
Sets the value of this node.
public void SetValue(T value)
Parameters
value (T): The value to set.
ToString()
Returns a string representation of this expression tree.
public override string ToString()
Returns
- string
A string representing the mathematical expression.
Remarks
For Beginners: This converts the tree into a readable mathematical formula. For example, an addition node with children might return "(2 + x[0])".
Train(Matrix<T>, Vector<T>)
Trains the expression tree on the provided data.
public void Train(Matrix<T> x, Vector<T> y)
Parameters
x (Matrix<T>): The input features matrix.
y (Vector<T>): The target values vector.
Remarks
For Beginners: For expression trees, "training" just validates that the formula can process your data. The formula itself doesn't learn or change during training - it's predefined by the tree structure.
Train(TInput, TOutput)
Trains the expression tree on a single input-output pair.
public void Train(TInput input, TOutput expectedOutput)
Parameters
input (TInput): The input data (Vector, Matrix, or Tensor).
expectedOutput (TOutput): The expected output value.
Remarks
For Beginners: For expression trees, training doesn't actually change the formula. This method validates that the formula can process your input data correctly.
UpdateCoefficients(Vector<T>)
Creates a new expression tree with updated coefficient values.
public IFullModel<T, TInput, TOutput> UpdateCoefficients(Vector<T> newCoefficients)
Parameters
newCoefficients (Vector<T>): The new coefficient values to use.
Returns
- IFullModel<T, TInput, TOutput>
A new expression tree with the updated coefficients.
Remarks
For Beginners: This returns a copy of your formula with different constant numbers; the structure stays the same and the original tree is not modified. For example, if your formula is "2x + 3", the new tree might be "4x + 1" after replacing the coefficients 2 and 3. This is useful when fine-tuning a model to make better predictions.
Exceptions
- ArgumentException
Thrown when the number of new coefficients doesn't match the current number.
WithParameters(Vector<T>)
Creates a new expression tree with updated parameters.
public IFullModel<T, TInput, TOutput> WithParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): The new parameter values to use.
Returns
- IFullModel<T, TInput, TOutput>
A new expression tree with the updated parameters.
Remarks
For Beginners: This returns a new formula in which all the constant numbers are replaced with new values, for example changing "2x + 3" to "4x + 1" by providing [4, 1] as the new parameters. The structure of the formula stays the same, and the original tree is left unchanged.