Class ExpressionTree<T, TInput, TOutput>

Namespace
AiDotNet.LinearAlgebra
Assembly
AiDotNet.dll

Represents a symbolic expression tree for mathematical operations that can be used for symbolic regression.

public class ExpressionTree<T, TInput, TOutput> : IFullModel<T, TInput, TOutput>, IModel<TInput, TOutput, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, TInput, TOutput>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, TInput, TOutput>>, IGradientComputable<T, TInput, TOutput>, IJitCompilable<T>

Type Parameters

T

The numeric type used in the expression tree (e.g., double, float).

TInput
TOutput
Inheritance
ExpressionTree<T, TInput, TOutput>
Implements
IFullModel<T, TInput, TOutput>
IModel<TInput, TOutput, ModelMetadata<T>>
IParameterizable<T, TInput, TOutput>
ICloneable<IFullModel<T, TInput, TOutput>>
IGradientComputable<T, TInput, TOutput>

Remarks

For Beginners: An ExpressionTree is like a mathematical formula represented as a tree structure. Each node in the tree is either a number, a variable, or an operation (like addition or multiplication). This allows the AI to create and evolve mathematical formulas that can model your data.

Constructors

ExpressionTree(ExpressionNodeType, T?, ExpressionTree<T, TInput, TOutput>?, ExpressionTree<T, TInput, TOutput>?, ILossFunction<T>?)

Creates a new expression tree node with the specified properties.

public ExpressionTree(ExpressionNodeType type, T? value = default, ExpressionTree<T, TInput, TOutput>? left = null, ExpressionTree<T, TInput, TOutput>? right = null, ILossFunction<T>? lossFunction = null)

Parameters

type ExpressionNodeType

The type of node to create.

value T

The value for this node (for constants and variables).

left ExpressionTree<T, TInput, TOutput>

The left child node.

right ExpressionTree<T, TInput, TOutput>

The right child node.

lossFunction ILossFunction<T>

Optional loss function to use for training. If null, uses Mean Squared Error (MSE) for symbolic regression.

Remarks

For Beginners: This creates a new part of your mathematical formula. You can create simple nodes (like numbers or variables) or operation nodes (like addition or multiplication) that connect to other nodes.

Properties

Coefficients

Gets a vector containing all coefficient values in this expression tree.

public Vector<T> Coefficients { get; }

Property Value

Vector<T>

Remarks

For Beginners: This collects all the constant numbers from your formula into a list. For example, if your formula is "2x + 3y + 5", this would give you [2, 3, 5]. These numbers are called "coefficients" and are important when optimizing your model.

Complexity

Gets the complexity of this expression tree, measured as the total number of nodes.

public int Complexity { get; }

Property Value

int

Remarks

For Beginners: Complexity tells you how complicated the formula is. A higher number means a more complex formula with more terms and operations.

DefaultLossFunction

Gets the default loss function used by this model for gradient computation.

public ILossFunction<T> DefaultLossFunction { get; }

Property Value

ILossFunction<T>

Remarks

For ExpressionTree (symbolic regression), the default loss function is Mean Squared Error (MSE), which is the standard loss function for regression problems.

FeatureCount

Gets the number of features (variables) used in this expression tree.

public int FeatureCount { get; }

Property Value

int

Remarks

For Beginners: This tells you how many different input variables your formula uses. For example, if your formula uses x[0], x[1], and x[2], the feature count would be 3.

Id

Gets the unique identifier for this node.

public int Id { get; }

Property Value

int

Remarks

For Beginners: This is a unique number assigned to each part of your formula, making it easy to identify and reference specific parts of the expression tree.

Left

Gets the left child node of this node.

public ExpressionTree<T, TInput, TOutput>? Left { get; }

Property Value

ExpressionTree<T, TInput, TOutput>

Remarks

For Beginners: In operations like addition (a + b), the left child represents 'a'.

ParameterCount

Gets the number of parameters (constant nodes) in this expression tree.

public virtual int ParameterCount { get; }

Property Value

int

Remarks

For Beginners: This tells you how many constant values are in your formula. For example, if your formula is "2x + 3y + 5", there are 3 parameters: 2, 3, and 5. This value is obtained from the Coefficients property, which returns a vector of all constant values.

Parent

Gets the parent node of this node.

public ExpressionTree<T, TInput, TOutput>? Parent { get; }

Property Value

ExpressionTree<T, TInput, TOutput>

RequiredFeatureCount

Gets the minimum number of features required for input data to this expression tree.

public int RequiredFeatureCount { get; }

Property Value

int

Remarks

For Beginners: This tells you the minimum number of columns your input data must have. It equals the maximum variable index used plus one. For example, if your formula uses x[5], the required feature count is 6 (indices 0 through 5).

Note: This is different from FeatureCount, which counts the unique variables used. A tree using only x[5] has FeatureCount=1 but RequiredFeatureCount=6.
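
The difference between the two counts can be illustrated with a small standalone sketch. The `FeatureCountSketch` class below is hypothetical and is not part of AiDotNet; it works on a plain list of variable indices rather than on a tree, and returning 0 for a formula with no variables is an assumption.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not the AiDotNet implementation.
public static class FeatureCountSketch
{
    // Number of distinct variable indices referenced by the formula.
    public static int FeatureCount(IEnumerable<int> variableIndices) =>
        variableIndices.Distinct().Count();

    // Highest referenced index plus one (assumed to be 0 when no variables are used).
    public static int RequiredFeatureCount(IEnumerable<int> variableIndices) =>
        variableIndices.Select(i => i + 1).DefaultIfEmpty(0).Max();
}
```

For a formula that references only x[5], `FeatureCount(new[] { 5 })` is 1 and `RequiredFeatureCount(new[] { 5 })` is 6.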

Right

Gets the right child node of this node.

public ExpressionTree<T, TInput, TOutput>? Right { get; }

Property Value

ExpressionTree<T, TInput, TOutput>

Remarks

For Beginners: In operations like addition (a + b), the right child represents 'b'.

SupportsJitCompilation

Gets whether this expression tree supports JIT compilation.

public bool SupportsJitCompilation { get; }

Property Value

bool

True - expression trees are inherently computation graphs and support JIT compilation.

Remarks

Expression trees are already symbolic computation graphs, making them ideal for JIT compilation. The tree structure directly represents the mathematical operations to be performed, which can be compiled into optimized native code.

For Beginners: Expression trees are like ready-made recipes for JIT compilation.

Since an expression tree already describes your formula as a series of operations (add, multiply, etc.), the JIT compiler can:

  • Convert it directly to fast machine code
  • Optimize common patterns (e.g., constant folding)
  • Inline operations for better performance

This typically provides a 2-5x speedup for complex symbolic expressions.

Type

Gets the type of this node (constant, variable, or operation).

public ExpressionNodeType Type { get; }

Property Value

ExpressionNodeType

Value

Gets the value stored in this node. For constants, this is the actual value. For variables, this is the index of the variable in the input vector.

public T Value { get; }

Property Value

T

Methods

ApplyGradients(Vector<T>, T)

Applies pre-computed gradients to update the model parameters (constants in the expression tree).

public void ApplyGradients(Vector<T> gradients, T learningRate)

Parameters

gradients Vector<T>

The gradient vector to apply.

learningRate T

The learning rate for the update.

Remarks

Updates constants using: constant = constant - learningRate * gradient

For Beginners: After computing gradients (seeing which direction to adjust each constant), this method actually adjusts them. The learning rate controls how big of an adjustment to make.
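
The update rule, constant = constant - learningRate * gradient, can be sketched on plain arrays. The `GradientStepSketch` class is hypothetical and returns a new array for clarity, whereas the method documented here updates the tree's constants directly. The two checks mirror the exceptions listed below.

```csharp
using System;

// Hypothetical helper for illustration; not the AiDotNet implementation.
public static class GradientStepSketch
{
    // Returns constants[i] - learningRate * gradients[i] for each i.
    public static double[] Step(double[] constants, double[] gradients, double learningRate)
    {
        if (gradients is null) throw new ArgumentNullException(nameof(gradients));
        if (gradients.Length != constants.Length)
            throw new ArgumentException("Gradient length must match parameter count.", nameof(gradients));

        var updated = new double[constants.Length];
        for (int i = 0; i < constants.Length; i++)
            updated[i] = constants[i] - learningRate * gradients[i];
        return updated;
    }
}
```

With constants [2, 3], gradients [1, -2], and a learning rate of 0.5, the result is [1.5, 4].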

Exceptions

ArgumentNullException

If gradients is null.

ArgumentException

If gradient vector length doesn't match parameter count.

Clone()

Creates a clone of this expression tree.

public IFullModel<T, TInput, TOutput> Clone()

Returns

IFullModel<T, TInput, TOutput>

A new expression tree with the same structure and values.

Remarks

For Beginners: This creates an exact duplicate of your formula. It's essentially the same as DeepCopy and Copy - it makes a complete duplicate that you can modify without changing the original.

ComputeGradients(TInput, TOutput, ILossFunction<T>?)

Computes gradients of the loss function with respect to model parameters WITHOUT updating parameters.

public Vector<T> ComputeGradients(TInput input, TOutput target, ILossFunction<T>? lossFunction = null)

Parameters

input TInput

The input data.

target TOutput

The target/expected output.

lossFunction ILossFunction<T>

The loss function to use. If null, uses the model's default loss function.

Returns

Vector<T>

A vector containing gradients with respect to all model parameters (constants in the expression tree).

Remarks

This method computes gradients using numerical differentiation (finite differences). For each constant in the expression tree, it slightly perturbs the value and measures how the loss changes, approximating the gradient.

For Beginners: This calculates how to adjust each constant in your mathematical formula to reduce error. Since expression trees are symbolic, we use a numerical approximation: we slightly change each constant and see how much the error changes.
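
As a rough illustration of the finite-difference idea, the sketch below perturbs each constant in turn and measures how the loss changes. It is not the library's code: the `FiniteDifferenceSketch` class, the central-difference formula, and the step size `eps` are all assumptions, since the text above does not say which variant is used.

```csharp
using System;

// Hypothetical helper for illustration; not the AiDotNet implementation.
public static class FiniteDifferenceSketch
{
    // Approximates d(loss)/d(c[i]) with a central difference.
    // Each constant is restored after it has been perturbed.
    public static double[] Gradient(Func<double[], double> loss, double[] c, double eps = 1e-6)
    {
        var grad = new double[c.Length];
        for (int i = 0; i < c.Length; i++)
        {
            double original = c[i];
            c[i] = original + eps;
            double up = loss(c);
            c[i] = original - eps;
            double down = loss(c);
            c[i] = original;                       // leave the parameters unchanged
            grad[i] = (up - down) / (2 * eps);
        }
        return grad;
    }
}
```

For the model c[0]*x + c[1] with x = 3, target 10, and squared-error loss, the gradient at c = [2, 1] is approximately [-18, -6].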

Exceptions

ArgumentNullException

If input or target is null.

Copy()

Creates a deep copy of this expression tree.

public IFullModel<T, TInput, TOutput> Copy()

Returns

IFullModel<T, TInput, TOutput>

A new expression tree with the same structure and values as this one.

Remarks

For Beginners: This creates an exact duplicate of the formula, like making a photocopy. This is important because we often need to make changes to a formula without modifying the original.

Crossover(IFullModel<T, TInput, TOutput>, double)

Combines this expression tree with another to create a new "offspring" expression tree.

public IFullModel<T, TInput, TOutput> Crossover(IFullModel<T, TInput, TOutput> other, double crossoverRate)

Parameters

other IFullModel<T, TInput, TOutput>

The other expression tree to combine with.

crossoverRate double

The probability (0.0 to 1.0) that crossover will occur.

Returns

IFullModel<T, TInput, TOutput>

A new expression tree that combines parts from both parent trees.

Remarks

For Beginners: Crossover is like taking parts from two different formulas and combining them to create a new formula. For example, if one formula is (x + 2) and another is (y * 3), crossover might create (x * 3) by taking parts from each. This mimics how genetic traits are passed from parents to children in nature.

DeepCopy()

Creates a deep copy of this expression tree.

public IFullModel<T, TInput, TOutput> DeepCopy()

Returns

IFullModel<T, TInput, TOutput>

A new, identical expression tree.

Remarks

For Beginners: This creates an exact duplicate of the entire formula tree. Like Copy, it is declared to return IFullModel<T, TInput, TOutput>; the object returned is a new expression tree. This is useful when you need to make changes to a copy without affecting the original formula.

Deserialize(byte[])

Loads an expression tree from a byte array, replacing the current tree's structure.

public void Deserialize(byte[] data)

Parameters

data byte[]

The byte array containing the serialized expression tree.

Remarks

For Beginners: This loads a previously saved formula from a compact format and replaces the current formula with it. It's like opening a saved file and loading its contents.

Deserialize(BinaryReader)

Deserializes an expression tree from a binary reader.

public ExpressionTree<T, TInput, TOutput> Deserialize(BinaryReader reader)

Parameters

reader BinaryReader

The binary reader containing the serialized tree data.

Returns

ExpressionTree<T, TInput, TOutput>

A new ExpressionTree instance created from the serialized data.

Remarks

For Beginners: This method reads a saved expression tree from binary data and reconstructs it. Think of it like opening a saved file that contains your mathematical formula.

Evaluate(Vector<T>)

Evaluates this expression tree for a given input vector.

public T Evaluate(Vector<T> input)

Parameters

input Vector<T>

The input vector containing values for variables.

Returns

T

The result of evaluating the expression.

Remarks

For Beginners: This calculates the result of your formula for a specific set of input values. For example, if your formula is "2x[0] + x[1]" and your input is [3, 4], the result would be 2*3 + 4 = 10.
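
To make the arithmetic concrete, here is a minimal standalone sketch that evaluates 2*x[0] + x[1] recursively. The `EvalKind`, `EvalNode`, and `EvalSketch` types are hypothetical stand-ins written for illustration, not the AiDotNet types, and only four node kinds are assumed.

```csharp
#nullable enable
using System;

// Hypothetical stand-in types for illustration; not the AiDotNet API.
public enum EvalKind { Constant, Variable, Add, Multiply }

// Value holds the constant, or the variable index for Variable nodes.
public sealed record EvalNode(EvalKind Kind, double Value = 0, EvalNode? Left = null, EvalNode? Right = null);

public static class EvalSketch
{
    // Recursively evaluates the tree against the input values x.
    public static double Eval(EvalNode n, double[] x) => n.Kind switch
    {
        EvalKind.Constant => n.Value,
        EvalKind.Variable => x[(int)n.Value],
        EvalKind.Add => Eval(n.Left!, x) + Eval(n.Right!, x),
        EvalKind.Multiply => Eval(n.Left!, x) * Eval(n.Right!, x),
        _ => throw new InvalidOperationException("Unknown node kind")
    };

    // Builds the tree for 2*x[0] + x[1].
    public static EvalNode Example() =>
        new EvalNode(EvalKind.Add, 0,
            new EvalNode(EvalKind.Multiply, 0,
                new EvalNode(EvalKind.Constant, 2),
                new EvalNode(EvalKind.Variable, 0)),
            new EvalNode(EvalKind.Variable, 1));
}
```

Calling `EvalSketch.Eval(EvalSketch.Example(), new double[] { 3, 4 })` computes 2*3 + 4 and returns 10.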

ExportComputationGraph(List<ComputationNode<T>>)

Exports the expression tree as a computation graph for JIT compilation.

public ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)

Parameters

inputNodes List<ComputationNode<T>>

List to populate with input computation nodes (variables only; constants are embedded in the graph).

Returns

ComputationNode<T>

The root computation node representing the complete expression.

Remarks

This method converts the expression tree into a computation graph by:

  1. Creating variable nodes for each unique variable in the tree
  2. Recursively building the computation graph from the tree structure
  3. Adding all input nodes (variables) to the inputNodes list

For Beginners: This converts your symbolic formula into a computation graph.

For example, the expression tree representing "(x[0] * 2) + x[1]" becomes:

  • Variable node for x[0]
  • Constant node for 2
  • Multiply node connecting them
  • Variable node for x[1]
  • Add node combining the multiply result with x[1]

The JIT compiler then optimizes this graph and generates fast code.

Note: Only variables are added to inputNodes. Constants are embedded in the graph.

Exceptions

ArgumentNullException

Thrown when inputNodes is null.

FindNodeById(int)

Finds a node in the tree by its unique identifier.

public ExpressionTree<T, TInput, TOutput>? FindNodeById(int id)

Parameters

id int

The unique identifier of the node to find.

Returns

ExpressionTree<T, TInput, TOutput>

The node with the specified ID, or null if no such node exists.

Remarks

For Beginners: Every part of your formula has a unique ID number. This method helps you find a specific part by its ID, like finding a person by their social security number.

Fit(Matrix<T>, Vector<T>)

Fits the expression tree to the provided training data.

public void Fit(Matrix<T> X, Vector<T> y)

Parameters

X Matrix<T>

The input features matrix.

y Vector<T>

The target values vector.

Remarks

For Beginners: For expression trees, "fitting" just checks if the formula can work with your data. Unlike other AI models, the formula itself doesn't change during fitting - it's predefined by the tree structure.

GetActiveFeatureIndices()

Gets the indices of all features (variables) used in this expression tree.

public IEnumerable<int> GetActiveFeatureIndices()

Returns

IEnumerable<int>

A collection of feature indices.

Remarks

For Beginners: This tells you which input variables are actually used in your formula. For example, if your formula only uses x[0] and x[2], this returns [0, 2], showing that the formula uses the first and third variables but not the second one.

GetAllNodes()

Gets a list of all nodes in this expression tree.

public List<ExpressionTree<T, TInput, TOutput>> GetAllNodes()

Returns

List<ExpressionTree<T, TInput, TOutput>>

A list containing all nodes in the tree.

Remarks

For Beginners: This collects all the parts of your formula into a list. For example, if your formula is (x + 2) * y, this would give you a list containing: the multiplication operation, the addition operation, the x variable, the constant 2, and the y variable.
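
The listing order in the (x + 2) * y example matches a pre-order walk: a node first, then its left subtree, then its right subtree. The sketch below shows such a walk on a hypothetical `GNode` record; the order the library actually returns is an assumption.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical stand-in node for illustration; not the AiDotNet type.
public sealed record GNode(string Label, GNode? Left = null, GNode? Right = null);

public static class TraversalSketch
{
    // Collects every node in pre-order: node, left subtree, right subtree.
    public static List<GNode> AllNodes(GNode root)
    {
        var result = new List<GNode>();
        Collect(root, result);
        return result;
    }

    private static void Collect(GNode? n, List<GNode> acc)
    {
        if (n is null) return;
        acc.Add(n);
        Collect(n.Left, acc);
        Collect(n.Right, acc);
    }
}
```

For (x + 2) * y the list has five entries, labelled *, +, x, 2, y in that order. The length of this list is the node count that Complexity reports.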

GetFeatureImportance()

Gets the feature importance scores for this expression tree.

public virtual Dictionary<string, T> GetFeatureImportance()

Returns

Dictionary<string, T>

A dictionary mapping feature names to importance scores.

Remarks

For Beginners: Feature importance tells you which input variables matter most in your formula. For expression trees, importance is calculated by counting how many times each variable appears in the formula. Variables that appear more frequently are considered more important.
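
Occurrence counting can be sketched as follows. The `ImportanceSketch` class and the "x[i]" key format are hypothetical; the real method may name features differently or normalize the scores.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration; not the AiDotNet implementation.
public static class ImportanceSketch
{
    // variableIndices lists the index of every variable node, repeats included.
    public static Dictionary<string, double> CountOccurrences(IEnumerable<int> variableIndices)
    {
        var counts = new Dictionary<string, double>();
        foreach (int i in variableIndices)
        {
            string key = $"x[{i}]";   // assumed naming scheme
            counts[key] = counts.TryGetValue(key, out double c) ? c + 1 : 1;
        }
        return counts;
    }
}
```

For a formula such as x[0] * x[1] + x[0], the variable indices are 0, 1, 0, so x[0] scores 2 and x[1] scores 1.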

GetModelMetadata()

Gets metadata about this expression tree model.

public ModelMetadata<T> GetModelMetadata()

Returns

ModelMetadata<T>

A ModelMetadata object containing information about this model.

Remarks

For Beginners: This provides useful information about your formula, like how complex it is and how many input variables it needs. Think of it as a summary sheet about your mathematical model.

GetParameters()

Gets the parameters of this expression tree.

public Vector<T> GetParameters()

Returns

Vector<T>

A vector containing all coefficient values in this expression tree.

Remarks

For Beginners: This returns all the constant numbers from your formula. For example, if your formula is "2x + 3y + 5", this would give you [2, 3, 5]. These numbers are the adjustable parameters that can be tuned to improve predictions.

IsFeatureUsed(int)

Checks if a specific feature (variable) is used in this expression tree.

public bool IsFeatureUsed(int featureIndex)

Parameters

featureIndex int

The index of the feature to check.

Returns

bool

True if the feature is used, false otherwise.

Remarks

For Beginners: This checks if your formula uses a specific input variable. For example, if featureIndex is 2, it checks if x[2] appears anywhere in your formula.

LoadModel(string)

Loads an expression tree model from a file.

public virtual void LoadModel(string filePath)

Parameters

filePath string

The path to the file containing the saved model.

Remarks

For Beginners: This loads a previously saved formula from a file, allowing you to reuse it without recreating it. The loaded formula can immediately be used for predictions.

LoadState(Stream)

Loads the expression tree's state (structure and values) from a stream.

public virtual void LoadState(Stream stream)

Parameters

stream Stream

The stream to read the expression tree state from.

Remarks

This method deserializes expression tree state that was previously saved with SaveState, restoring the complete tree structure, node types, values, and connections. It uses the existing Deserialize method after reading data from the stream.

For Beginners: This is like loading a saved snapshot of your mathematical formula.

When you call LoadState:

  • The tree structure is read from the stream
  • All node types and values are restored
  • The formula becomes identical to when SaveState was called

After loading, the formula can:

  • Make predictions using the restored structure
  • Continue evolving during optimization
  • Be used for symbolic regression or genetic programming

This is essential for:

  • Resuming interrupted evolutionary training
  • Loading the best formula after optimization
  • Deploying symbolic models to production
  • Knowledge distillation with interpretable models

Exceptions

ArgumentNullException

Thrown when stream is null.

IOException

Thrown when there's an error reading from the stream.

InvalidOperationException

Thrown when the stream contains invalid or incompatible data.

Mutate(double)

Creates a modified version of this expression tree by applying random mutations.

public IFullModel<T, TInput, TOutput> Mutate(double mutationRate)

Parameters

mutationRate double

The probability (0.0 to 1.0) that a mutation will occur at each node.

Returns

IFullModel<T, TInput, TOutput>

A new expression tree with mutations applied.

Remarks

For Beginners: Mutation is like making small random changes to a formula to see if it improves. For example, changing a "+" to a "*" or changing a constant from 2.5 to 3.1. This is inspired by how genetic mutations work in nature and helps the AI explore different solutions.
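
One kind of mutation, nudging constants, can be sketched as below. This is an illustration only: the `MNode` record, the perturbation range, and the choice to mutate constants but not operators are assumptions, and the library's mutation operators may differ. Like the method documented here, the sketch returns a new tree and leaves the original untouched.

```csharp
#nullable enable
using System;

// Hypothetical stand-in node for illustration; not the AiDotNet type.
public sealed record MNode(bool IsConstant, double Value, MNode? Left = null, MNode? Right = null);

public static class MutationSketch
{
    // Returns a new tree in which each constant is perturbed with probability mutationRate.
    public static MNode? Mutate(MNode? n, double mutationRate, Random rng)
    {
        if (n is null) return null;

        double value = n.Value;
        if (n.IsConstant && rng.NextDouble() < mutationRate)
            value += rng.NextDouble() - 0.5;   // assumed perturbation in [-0.5, 0.5)

        return new MNode(n.IsConstant, value,
            Mutate(n.Left, mutationRate, rng),
            Mutate(n.Right, mutationRate, rng));
    }
}
```

With a mutation rate of 0.0 the result is structurally equal to the input; higher rates change more constants on average.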

Predict(Matrix<T>)

Makes predictions using this expression tree for multiple input samples.

public Vector<T> Predict(Matrix<T> input)

Parameters

input Matrix<T>

A matrix where each row represents a sample and each column represents a feature.

Returns

Vector<T>

A vector containing the predicted values for each input sample.

Remarks

For Beginners: This method takes your data (like height, weight, age values) and runs each row through the mathematical formula represented by this tree to get predictions. For example, if your tree represents "2x + y", and your input has values [3,4], the prediction would be 2*3 + 4 = 10.

Note: If the input has more features than the model requires, the extra features are allowed but ignored. Only the features up to RequiredFeatureCount are used in predictions. This flexibility supports transfer learning scenarios where input data may contain additional features not used by this particular model.

Exceptions

ArgumentException

Thrown when the input matrix has incorrect dimensions.

Predict(TInput)

Makes a prediction for an input example.

public TOutput Predict(TInput input)

Parameters

input TInput

The input data (Vector, Matrix, or Tensor).

Returns

TOutput

The predicted output.

Remarks

For Beginners: This method applies your mathematical formula to the input data to calculate a prediction. It handles different types of inputs (vectors, matrices, or tensors).

SaveModel(string)

Saves the expression tree model to a file.

public virtual void SaveModel(string filePath)

Parameters

filePath string

The path where the model should be saved.

Remarks

For Beginners: This saves your mathematical formula to a file so you can load it later without having to recreate it. The file contains the tree structure, all node types, and values.

SaveState(Stream)

Saves the expression tree's current state (structure and values) to a stream.

public virtual void SaveState(Stream stream)

Parameters

stream Stream

The stream to write the expression tree state to.

Remarks

This method serializes the complete expression tree structure, including all node types, values, and connections. It uses the existing Serialize method and writes the data to the provided stream.

For Beginners: This is like creating a snapshot of your mathematical formula.

When you call SaveState:

  • The entire tree structure is written to the stream
  • All node types (constants, variables, operations) are preserved
  • All values and connections are saved

This is particularly useful for:

  • Checkpointing during evolutionary algorithm training
  • Knowledge distillation with symbolic models
  • Saving the best formula found during optimization
  • Creating formula ensembles

You can later use LoadState to restore the formula to this exact state.
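
A save-and-restore round trip through a stream can be sketched with the standard .NET `BinaryWriter` and `BinaryReader`. The byte layout used here (a presence flag, then node type, value, and both children in pre-order) and the `SNode` record are hypothetical; the format AiDotNet actually writes is not described on this page.

```csharp
#nullable enable
using System.IO;
using System.Text;

// Hypothetical stand-in node for illustration; not the AiDotNet type.
public sealed record SNode(int Type, double Value, SNode? Left = null, SNode? Right = null);

public static class SerializationSketch
{
    // Writes a presence flag, then the node and its children in pre-order.
    public static void Write(BinaryWriter w, SNode? n)
    {
        w.Write(n is not null);
        if (n is null) return;
        w.Write(n.Type);
        w.Write(n.Value);
        Write(w, n.Left);
        Write(w, n.Right);
    }

    // Reads back exactly what Write produced.
    public static SNode? Read(BinaryReader r)
    {
        if (!r.ReadBoolean()) return null;
        int type = r.ReadInt32();
        double value = r.ReadDouble();
        SNode? left = Read(r);
        SNode? right = Read(r);
        return new SNode(type, value, left, right);
    }

    // Saves the tree to an in-memory stream and loads it again.
    public static SNode? RoundTrip(SNode root)
    {
        using var stream = new MemoryStream();
        using (var w = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
            Write(w, root);
        stream.Position = 0;
        using var r = new BinaryReader(stream);
        return Read(r);
    }
}
```

Because records compare by value, `root.Equals(SerializationSketch.RoundTrip(root))` is true when the restored tree matches the saved one.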

Exceptions

ArgumentNullException

Thrown when stream is null.

IOException

Thrown when there's an error writing to the stream.

Serialize()

Converts this expression tree to a byte array for storage or transmission.

public byte[] Serialize()

Returns

byte[]

A byte array representing the serialized expression tree.

Remarks

For Beginners: This converts your mathematical formula into a compact format that can be saved to a file or sent over the internet. It's like zipping up your formula for storage.

Serialize(BinaryWriter)

Writes this expression tree to a binary stream.

public void Serialize(BinaryWriter writer)

Parameters

writer BinaryWriter

The binary writer to write to.

Remarks

For Beginners: This saves your formula to a file or stream so you can load it later.

SetActiveFeatureIndices(IEnumerable<int>)

Sets the active feature indices for this expression tree.

public virtual void SetActiveFeatureIndices(IEnumerable<int> featureIndices)

Parameters

featureIndices IEnumerable<int>

The feature indices to use.

Remarks

For Beginners: This restricts the formula to only use specific input variables. Any variables in the tree that are not in the active set will be replaced with constant zero values. This is useful for feature selection and understanding which variables are most important.
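
The replacement of inactive variables by constant zeros can be sketched on a small mutable tree. The `FKind`, `FNode`, and `FeatureMaskSketch` types are hypothetical stand-ins for illustration, not the AiDotNet types.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical stand-in types for illustration; not the AiDotNet API.
public enum FKind { Constant, Variable, Operation }

public sealed class FNode
{
    public FKind Kind;
    public double Value;          // constant value, or variable index for Variable nodes
    public FNode? Left, Right;
}

public static class FeatureMaskSketch
{
    // Turns every variable whose index is not in the active set into the constant 0.
    public static void Restrict(FNode? n, ISet<int> active)
    {
        if (n is null) return;
        if (n.Kind == FKind.Variable && !active.Contains((int)n.Value))
        {
            n.Kind = FKind.Constant;
            n.Value = 0;
        }
        Restrict(n.Left, active);
        Restrict(n.Right, active);
    }
}
```

For x[0] + x[2] with the active set {0}, the tree becomes x[0] + 0.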

SetLeft(ExpressionTree<T, TInput, TOutput>?)

Sets the left child of this node and updates the parent reference of the child.

public void SetLeft(ExpressionTree<T, TInput, TOutput>? left)

Parameters

left ExpressionTree<T, TInput, TOutput>

The node to set as the left child.

SetParameters(Vector<T>)

Sets the parameters (constant values) of this expression tree, modifying it in place.

public virtual void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

The new parameter values to assign to constant nodes.

Remarks

For Beginners: This method replaces all the constant numbers in your formula with new values, modifying the current tree directly. Unlike UpdateCoefficients and WithParameters which create new trees with the updated values, this method mutates the tree in place. Use this when you want to modify the tree directly, such as during optimization iterations.

Note: This implementation uses two tree traversals (counting and assignment) to validate parameter count BEFORE modifying the tree. This ensures atomicity: if the parameter count is wrong, the tree remains unchanged.
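
The validate-then-assign pattern described in the note can be sketched as two traversals. The `PNode` class and the pre-order assignment order are assumptions made for illustration; only the two-pass idea comes from the text above.

```csharp
#nullable enable
using System;

// Hypothetical stand-in node for illustration; not the AiDotNet type.
public sealed class PNode
{
    public bool IsConstant;
    public double Value;
    public PNode? Left, Right;
}

public static class SetParametersSketch
{
    // Pass 1: count constant nodes without changing anything.
    private static int CountConstants(PNode? n) =>
        n is null ? 0 : (n.IsConstant ? 1 : 0) + CountConstants(n.Left) + CountConstants(n.Right);

    // Pass 2: assign values in pre-order; returns the next unused index.
    private static int Assign(PNode? n, double[] p, int next)
    {
        if (n is null) return next;
        if (n.IsConstant) n.Value = p[next++];
        next = Assign(n.Left, p, next);
        return Assign(n.Right, p, next);
    }

    public static void SetParameters(PNode root, double[] parameters)
    {
        // Validate first so a wrong count leaves the tree untouched.
        if (CountConstants(root) != parameters.Length)
            throw new ArgumentException("Parameter count does not match the number of constant nodes.");
        Assign(root, parameters, 0);
    }
}
```

If the count check fails, no constant has been written yet, which is what makes the update all-or-nothing.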

Exceptions

ArgumentException

Thrown when the parameter count doesn't match the number of constant nodes.

SetRight(ExpressionTree<T, TInput, TOutput>?)

Sets the right child of this node and updates the parent reference of the child.

public void SetRight(ExpressionTree<T, TInput, TOutput>? right)

Parameters

right ExpressionTree<T, TInput, TOutput>

The node to set as the right child.

SetType(ExpressionNodeType)

Sets the type of this node.

public void SetType(ExpressionNodeType type)

Parameters

type ExpressionNodeType

The node type to set.

SetValue(T)

Sets the value of this node.

public void SetValue(T value)

Parameters

value T

The value to set.

ToString()

Returns a string representation of this expression tree.

public override string ToString()

Returns

string

A string representing the mathematical expression.

Remarks

For Beginners: This converts the tree into a readable mathematical formula. For example, an addition node with children might return "(2 + x[0])".
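
A recursive infix rendering that produces strings of this shape might look like the sketch below. The `TextKind`, `TextNode`, and `RenderSketch` types are hypothetical, and the use of "*" for multiplication is an assumption; only the "(2 + x[0])" form is taken from the text above.

```csharp
#nullable enable
using System;
using System.Globalization;

// Hypothetical stand-in types for illustration; not the AiDotNet API.
public enum TextKind { Constant, Variable, Add, Multiply }

public sealed record TextNode(TextKind Kind, double Value = 0, TextNode? Left = null, TextNode? Right = null);

public static class RenderSketch
{
    // Renders the tree as a fully parenthesized infix expression.
    public static string Render(TextNode n) => n.Kind switch
    {
        TextKind.Constant => n.Value.ToString(CultureInfo.InvariantCulture),
        TextKind.Variable => $"x[{(int)n.Value}]",
        TextKind.Add => $"({Render(n.Left!)} + {Render(n.Right!)})",
        TextKind.Multiply => $"({Render(n.Left!)} * {Render(n.Right!)})",
        _ => throw new InvalidOperationException("Unknown node kind")
    };
}
```

An addition node whose children are the constant 2 and the variable with index 0 renders as "(2 + x[0])".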

Train(Matrix<T>, Vector<T>)

Trains the expression tree on the provided data.

public void Train(Matrix<T> x, Vector<T> y)

Parameters

x Matrix<T>

The input features matrix.

y Vector<T>

The target values vector.

Remarks

For Beginners: For expression trees, "training" just validates that the formula can process your data. The formula itself doesn't learn or change during training - it's predefined by the tree structure.

Train(TInput, TOutput)

Trains the expression tree on a single input-output pair.

public void Train(TInput input, TOutput expectedOutput)

Parameters

input TInput

The input data (Vector, Matrix, or Tensor).

expectedOutput TOutput

The expected output value.

Remarks

For Beginners: For expression trees, training doesn't actually change the formula. This method validates that the formula can process your input data correctly.

UpdateCoefficients(Vector<T>)

Creates a new expression tree with updated coefficient values.

public IFullModel<T, TInput, TOutput> UpdateCoefficients(Vector<T> newCoefficients)

Parameters

newCoefficients Vector<T>

The new coefficient values to use.

Returns

IFullModel<T, TInput, TOutput>

A new expression tree with the updated coefficients.

Remarks

For Beginners: This returns a copy of your formula with new constant numbers, without changing its structure. For example, if your formula is "2x + 3", the new tree might be "4x + 1" after replacing the coefficients 2 and 3. The original tree is left unmodified. This is useful when fine-tuning a model to make better predictions.

Exceptions

ArgumentException

Thrown when the number of new coefficients doesn't match the current number.

WithParameters(Vector<T>)

Creates a new expression tree with updated parameters.

public IFullModel<T, TInput, TOutput> WithParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

The new parameter values to use.

Returns

IFullModel<T, TInput, TOutput>

A new expression tree with the updated parameters.

Remarks

For Beginners: This returns a new formula in which all the constant numbers are replaced with new values. For example, providing [4, 1] as the new parameters for "2x + 3" produces "4x + 1". The structure of the formula stays the same, and the original tree is not modified; use SetParameters to update a tree in place.