Class ComputationNode<T>
Represents a node in the automatic differentiation computation graph.
public class ComputationNode<T>
Type Parameters
T
The numeric type used for calculations.
Inheritance
object → ComputationNode<T>
Remarks
A ComputationNode is a fundamental building block of automatic differentiation. It represents a value in a computational graph, along with information about how to compute gradients with respect to that value. Each node stores its value, gradient, parent nodes (inputs), and a backward function for gradient computation.
For Beginners: This represents a single step in a calculation that can be differentiated.
Think of it like this:
- A node stores a value (like the output of adding two numbers)
- It remembers what inputs were used to create this value (the two numbers)
- It knows how to calculate gradients (derivatives) with respect to its inputs
- Connecting nodes together forms a graph that tracks the entire calculation
This enables automatic differentiation, where gradients can be computed automatically for complex operations by chaining together simple derivatives.
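As a concrete illustration, here is a minimal sketch of wiring up c = a + b by hand. It assumes two existing Tensor<double> values and a hypothetical element-wise Add method on Tensor<T>; in practice the library's tensor operations would build these nodes for you.

ComputationNode<double> BuildAddition(Tensor<double> aValue, Tensor<double> bValue)
{
    var a = new ComputationNode<double>(aValue, requiresGradient: true, name: "a");
    var b = new ComputationNode<double>(bValue, requiresGradient: true, name: "b");

    var c = new ComputationNode<double>(
        value: aValue.Add(bValue),                       // hypothetical element-wise add
        requiresGradient: true,
        parents: new List<ComputationNode<double>> { a, b },
        backwardFunction: grad =>
        {
            // d(a + b)/da = 1 and d(a + b)/db = 1, so the incoming gradient
            // flows to both parents unchanged. (A full implementation may need
            // to accumulate into existing gradients rather than overwrite.)
            a.Gradient = grad;
            b.Gradient = grad;
        },
        name: "c");

    return c;
}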
Constructors
ComputationNode(Tensor<T>, bool, List<ComputationNode<T>>?, Action<Tensor<T>>?, string?)
Initializes a new instance of the ComputationNode<T> class.
public ComputationNode(Tensor<T> value, bool requiresGradient = false, List<ComputationNode<T>>? parents = null, Action<Tensor<T>>? backwardFunction = null, string? name = null)
Parameters
value Tensor<T>
The value stored in this node.
requiresGradient bool
Whether this node requires gradient computation.
parents List<ComputationNode<T>>
The parent nodes that were used to compute this value.
backwardFunction Action<Tensor<T>>
The function to compute gradients during backpropagation.
name string
Optional name for this node.
Remarks
Creates a new computation node with the specified properties. If requiresGradient is true and a backward function is provided, this node will participate in automatic differentiation.
For Beginners: This creates a new node in the computation graph.
When creating a node:
- Provide the computed value
- Specify if it needs gradients (usually true for parameters, false for constants)
- List the parent nodes (inputs) if any
- Provide a backward function if gradients are needed
- Optionally give it a name for debugging
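For instance, a trainable parameter is typically a leaf node: it has a value and needs gradients, but no parents and no backward function. A minimal sketch, assuming weights is an existing Tensor<float>:

var parameter = new ComputationNode<float>(
    value: weights,
    requiresGradient: true,   // trainable, so gradients are tracked
    parents: null,            // a leaf node has no inputs
    backwardFunction: null,   // leaves receive gradients but do not propagate further
    name: "weights");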
Properties
BackwardFunction
Gets or sets the backward function that computes gradients for parent nodes.
public Action<Tensor<T>>? BackwardFunction { get; set; }
Property Value
- Action<Tensor<T>>
A function that takes the current gradient and computes parent gradients.
Remarks
The backward function implements the chain rule of calculus. Given the gradient at this node, it computes how much gradient should flow to each parent node. This is the core of automatic differentiation.
For Beginners: This function knows how to pass gradients backwards through this operation.
The backward function:
- Takes the gradient at this node as input
- Calculates what gradients should go to each parent
- Implements the mathematical rule for differentiating this operation
For example:
- For addition (c = a + b), the backward function passes the gradient of c unchanged to both a and b
- For multiplication (c = a * b), it multiplies c's gradient by the other input's value, as sketched below
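A sketch of what such a function might look like for multiplication, assuming nodes a, b, and c from an expression c = a * b, and a hypothetical element-wise Multiply method on Tensor<T>:

c.BackwardFunction = grad =>
{
    // Chain rule: dc/da = b and dc/db = a, so each parent receives
    // the incoming gradient scaled by the other parent's value.
    a.Gradient = grad.Multiply(b.Value);   // hypothetical element-wise multiply
    b.Gradient = grad.Multiply(a.Value);
};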
Gradient
Gets or sets the gradient accumulated at this node.
public Tensor<T>? Gradient { get; set; }
Property Value
- Tensor<T>
A tensor containing the gradient, or null if not yet computed.
Remarks
The gradient represents the derivative of the loss with respect to this node's value. It's computed during the backward pass and accumulated from all paths in the graph that use this node's output.
For Beginners: This shows how much the final result changes when this value changes.
The gradient tells you:
- How sensitive the final output is to this value
- Which direction to adjust this value to reduce the loss
- How much to adjust it (larger gradient = bigger adjustment needed)
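For example, after running a backward pass (see Backward() below), you might inspect a parameter's gradient like this, assuming the loss and parameter nodes from the earlier sketches:

loss.Backward();
if (parameter.Gradient != null)
{
    // A large gradient means small changes to this value move the loss a lot.
    Console.WriteLine($"Gradient of {parameter.Name}: {parameter.Gradient}");
}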
Name
Gets or sets an optional name for this node (useful for debugging).
public string? Name { get; set; }
Property Value
- string
A string name for the node, or null if not named.
Remarks
Node names help with debugging and visualization of the computation graph. They're optional but recommended for important nodes like parameters.
OperationParams
Gets or sets additional operation-specific parameters (used for JIT compilation).
public Dictionary<string, object>? OperationParams { get; set; }
Property Value
- Dictionary<string, object>
A dictionary of parameter names to values, or null if not set.
Remarks
Some operations require additional parameters beyond their inputs. For example, convolution needs stride and padding, softmax needs an axis, etc. This dictionary stores those parameters for use by the JIT compiler.
For Beginners: This stores extra settings for operations.
For example:
- A Power operation might store {"Exponent": 2.0}
- A Softmax operation might store {"Axis": -1}
- A Conv2D operation might store {"Stride": [1, 1], "Padding": [0, 0]}
These parameters tell the JIT compiler exactly how the operation should behave, enabling it to generate the correct optimized code.
This is optional and only needed when using JIT compilation.
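A sketch of annotating nodes for the JIT compiler; the exact keys shown follow the examples above but are assumptions, not a documented contract:

softmaxNode.OperationParams = new Dictionary<string, object>
{
    ["Axis"] = -1
};

conv2dNode.OperationParams = new Dictionary<string, object>
{
    ["Stride"] = new[] { 1, 1 },
    ["Padding"] = new[] { 0, 0 }
};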
OperationType
Gets or sets the type of operation that created this node (used for JIT compilation).
public OperationType? OperationType { get; set; }
Property Value
- OperationType?
An OperationType value identifying the operation (e.g., Add, MatMul, ReLU), or null if not set.
Remarks
This property is used by the JIT compiler to convert ComputationNode graphs to IR operations. It stores the name of the operation that produced this node's value, enabling the compiler to reconstruct the operation graph and optimize it for faster execution.
For Beginners: This records what operation created this node's value.
For example:
- If this node was created by adding two tensors, OperationType would be "Add"
- If created by matrix multiplication, OperationType would be "MatMul"
- If created by ReLU activation, OperationType would be "ReLU"
This information allows the JIT compiler to:
- Understand what operations are in the graph
- Optimize sequences of operations
- Generate fast compiled code
This is optional and only needed when using JIT compilation.
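For example, assuming the OperationType enum defines members matching the operation names above:

sumNode.OperationType = OperationType.Add;        // produced by tensor addition
productNode.OperationType = OperationType.MatMul; // produced by matrix multiplication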
Parents
Gets or sets the parent nodes (inputs) that were used to compute this node's value.
public List<ComputationNode<T>> Parents { get; set; }
Property Value
- List<ComputationNode<T>>
A list of parent computation nodes.
Remarks
Parent nodes are the inputs to the operation that produced this node's value. During backpropagation, gradients flow from this node back to its parents.
For Beginners: These are the inputs that were combined to create this node's value.
For example, if this node represents c = a + b:
- a and b would be the parent nodes
- When computing gradients, we need to know how c's gradient affects a and b
- This parent list lets us trace back through the calculation
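For example, a small sketch that prints a node's ancestry by walking the Parents lists (assuming leaf nodes have an empty list rather than null):

void PrintAncestry(ComputationNode<double> node, int depth = 0)
{
    Console.WriteLine($"{new string(' ', depth * 2)}{node.Name ?? "(unnamed)"}");
    foreach (var parent in node.Parents)
        PrintAncestry(parent, depth + 1);
}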
RequiresGradient
Gets or sets a value indicating whether this node requires gradient computation.
public bool RequiresGradient { get; set; }
Property Value
- bool
True if gradients should be computed for this node; false otherwise.
Remarks
This flag controls whether this node participates in gradient computation. Setting it to false can improve performance for constants or intermediate values that don't need gradients.
For Beginners: This controls whether to track gradients for this value.
Set to true for:
- Model parameters that need to be trained
- Values you want to optimize
Set to false for:
- Constants that never change
- Intermediate values you don't need gradients for (saves memory)
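A sketch of the typical split, assuming weightTensor and scaleTensor are existing tensors:

// Trainable parameter: gradients are needed to update it.
var weight = new ComputationNode<double>(weightTensor, requiresGradient: true, name: "weight");

// Fixed constant: no gradients, which saves memory and computation.
var scale = new ComputationNode<double>(scaleTensor, requiresGradient: false, name: "scale");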
Value
Gets or sets the value stored in this node.
public Tensor<T> Value { get; set; }
Property Value
- Tensor<T>
A tensor containing the computed value.
Remarks
This is the forward pass result - the actual output of the operation that this node represents.
Methods
Backward()
Performs backward propagation from this node.
public void Backward()
Remarks
This method triggers backpropagation through the computation graph starting from this node. It performs a topological sort to determine the correct order for computing gradients, then calls each node's backward function in reverse topological order.
For Beginners: This computes gradients for all nodes that led to this one.
The backward process:
- Starts with this node's gradient (usually set to 1 for the final loss)
- Works backwards through the graph in the right order
- Each node passes gradients to its parents
- Gradients accumulate if a node has multiple children
This is how neural networks learn - by computing gradients and using them to update parameters.
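A typical usage sketch; whether you must seed the starting gradient yourself or Backward() seeds it with ones automatically depends on the implementation, and OnesLike is a hypothetical helper here:

// Seed the starting gradient with 1s (d loss / d loss = 1).
loss.Gradient = Tensor<double>.OnesLike(loss.Value);   // hypothetical helper
loss.Backward();

// Every ancestor with RequiresGradient == true now has its Gradient populated.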
ZeroGradient()
Zeros out the gradient for this node.
public void ZeroGradient()
Remarks
This method resets the gradient to zero. It should be called between training iterations to prevent gradient accumulation across batches (unless intentional).
For Beginners: This clears the gradient to prepare for the next calculation.
When to use:
- Before each new training iteration
- When you want to start fresh gradient computation
- To prevent gradients from adding up across multiple backward passes
In most training loops, you zero gradients at the start of each batch.
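A training-loop sketch; ComputeLoss and the parameters list are hypothetical stand-ins for your model code:

foreach (var batch in batches)
{
    foreach (var p in parameters)   // parameters: List<ComputationNode<double>>
        p.ZeroGradient();           // start the iteration with clean gradients

    var loss = ComputeLoss(batch);  // hypothetical forward pass returning a node
    loss.Backward();

    // ... use each p.Gradient in an optimizer step ...
}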
ZeroGradientRecursive()
Recursively zeros out gradients for this node and all its ancestors.
public void ZeroGradientRecursive()
Remarks
This method traverses the computation graph and zeros all gradients. It's useful for clearing the entire graph between training iterations.
For Beginners: This clears gradients for this node and everything it depends on.
Use this when:
- You want to reset the entire computation graph
- Starting a new training iteration
- You need to ensure no old gradients remain
This is more thorough than ZeroGradient() as it clears the whole graph.
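For example, one call from the loss node clears the whole graph, rather than zeroing each node individually:

loss.ZeroGradientRecursive();   // clears loss and every node it depends on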