Class ComputationNode<T>
Represents a node in the automatic differentiation computation graph.
public class ComputationNode<T>
Type Parameters
T
The numeric type used for calculations.
Inheritance
object → ComputationNode<T>
Remarks
A ComputationNode is a fundamental building block of automatic differentiation. It represents a value in a computational graph, along with information about how to compute gradients with respect to that value. Each node stores its value, gradient, parent nodes (inputs), and a backward function for gradient computation.
For Beginners: This represents a single step in a calculation that can be differentiated.
Think of it like this:
- A node stores a value (like the output of adding two numbers)
- It remembers what inputs were used to create this value (the two numbers)
- It knows how to calculate gradients (derivatives) with respect to its inputs
- Connecting nodes together forms a graph that tracks the entire calculation
This enables automatic differentiation, where gradients can be computed automatically for complex operations by chaining together simple derivatives.
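As a concrete illustration, here is a minimal sketch of wiring up c = a + b by hand. It assumes two existing Tensor<double> values and a hypothetical element-wise Add method on Tensor<T>; in practice the library's tensor operations would build these nodes for you.

ComputationNode<double> BuildAddition(Tensor<double> aValue, Tensor<double> bValue)
{
    var a = new ComputationNode<double>(aValue, requiresGradient: true, name: "a");
    var b = new ComputationNode<double>(bValue, requiresGradient: true, name: "b");

    var c = new ComputationNode<double>(
        value: aValue.Add(bValue),                       // hypothetical element-wise add
        requiresGradient: true,
        parents: new List<ComputationNode<double>> { a, b },
        backwardFunction: grad =>
        {
            // d(a + b)/da = 1 and d(a + b)/db = 1, so the incoming gradient
            // flows to both parents unchanged. (A full implementation may need
            // to accumulate into existing gradients rather than overwrite.)
            a.Gradient = grad;
            b.Gradient = grad;
        },
        name: "c");

    return c;
}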
Constructors
ComputationNode(Tensor<T>, bool, List<ComputationNode<T>>?, Action<Tensor<T>>?, string?)
Initializes a new instance of the ComputationNode<T> class.
public ComputationNode(Tensor<T> value, bool requiresGradient = false, List<ComputationNode<T>>? parents = null, Action<Tensor<T>>? backwardFunction = null, string? name = null)
Parameters
value Tensor<T>
The value stored in this node.
requiresGradient bool
Whether this node requires gradient computation.
parents List<ComputationNode<T>>
The parent nodes that were used to compute this value.
backwardFunction Action<Tensor<T>>
The function to compute gradients during backpropagation.
name string
Optional name for this node.
Remarks
Creates a new computation node with the specified properties. If requiresGradient is true and a backward function is provided, this node will participate in automatic differentiation.
For Beginners: This creates a new node in the computation graph.
When creating a node:
- Provide the computed value
- Specify if it needs gradients (usually true for parameters, false for constants)
- List the parent nodes (inputs) if any
- Provide a backward function if gradients are needed
- Optionally give it a name for debugging
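For instance, a trainable parameter is typically a leaf node: it has a value and needs gradients, but no parents and no backward function. A minimal sketch, assuming weights is an existing Tensor<float>:

var parameter = new ComputationNode<float>(
    value: weights,
    requiresGradient: true,   // trainable, so gradients are tracked
    parents: null,            // a leaf node has no inputs
    backwardFunction: null,   // leaves receive gradients but do not propagate further
    name: "weights");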
Properties
BackwardFunction
Gets or sets the backward function that computes gradients for parent nodes.
public Action<Tensor<T>>? BackwardFunction { get; set; }
Property Value
- Action<Tensor<T>>
A function that takes the current gradient and computes parent gradients.
Remarks
The backward function implements the chain rule of calculus. Given the gradient at this node, it computes how much gradient should flow to each parent node. This is the core of automatic differentiation.
For Beginners: This function knows how to pass gradients backwards through this operation.
The backward function:
- Takes the gradient at this node as input
- Calculates what gradients should go to each parent
- Implements the mathematical rule for differentiating this operation
For example:
- For addition (c = a + b), the backward function passes the gradient of c unchanged to both a and b
- For multiplication (c = a * b), it multiplies c's gradient by the other input's value, as sketched below
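A sketch of what such a function might look like for multiplication, assuming nodes a, b, and c from an expression c = a * b, and a hypothetical element-wise Multiply method on Tensor<T>:

c.BackwardFunction = grad =>
{
    // Chain rule: dc/da = b and dc/db = a, so each parent receives
    // the incoming gradient scaled by the other parent's value.
    a.Gradient = grad.Multiply(b.Value);   // hypothetical element-wise multiply
    b.Gradient = grad.Multiply(a.Value);
};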
Gradient
Gets or sets the gradient accumulated at this node.
public Tensor<T>? Gradient { get; set; }
Property Value
- Tensor<T>
A tensor containing the gradient, or null if not yet computed.
Remarks
The gradient represents the derivative of the loss with respect to this node's value. It's computed during the backward pass and accumulated from all paths in the graph that use this node's output.
For Beginners: This shows how much the final result changes when this value changes.
The gradient tells you:
- How sensitive the final output is to this value
- Which direction to adjust this value to reduce the loss
- How much to adjust it (larger gradient = bigger adjustment needed)
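For example, after running a backward pass (see Backward() below), you might inspect a parameter's gradient like this, assuming the loss and parameter nodes from the earlier sketches:

loss.Backward();
if (parameter.Gradient != null)
{
    // A large gradient means small changes to this value move the loss a lot.
    Console.WriteLine($"Gradient of {parameter.Name}: {parameter.Gradient}");
}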
Name
Gets or sets an optional name for this node (useful for debugging).
public string? Name { get; set; }
Property Value
- string
A string name for the node, or null if not named.
Remarks
Node names help with debugging and visualization of the computation graph. They're optional but recommended for important nodes like parameters.
OperationParams
Gets or sets additional operation-specific parameters (used for JIT compilation).
public Dictionary<string, object>? OperationParams { get; set; }
Property Value
- Dictionary<string, object>
A dictionary of parameter names to values, or null if not set.
Remarks
Some operations require additional parameters beyond their inputs. For example, convolution needs stride and padding, softmax needs an axis, etc. This dictionary stores those parameters for use by the JIT compiler.
For Beginners: This stores extra settings for operations.
For example:
- A Power operation might store {"Exponent": 2.0}
- A Softmax operation might store {"Axis": -1}
- A Conv2D operation might store {"Stride": [1, 1], "Padding": [0, 0]}
These parameters tell the JIT compiler exactly how the operation should behave, enabling it to generate the correct optimized code.
This is optional and only needed when using JIT compilation.
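A sketch of annotating nodes for the JIT compiler; the exact keys shown follow the examples above but are assumptions, not a documented contract:

softmaxNode.OperationParams = new Dictionary<string, object>
{
    ["Axis"] = -1
};

conv2dNode.OperationParams = new Dictionary<string, object>
{
    ["Stride"] = new[] { 1, 1 },
    ["Padding"] = new[] { 0, 0 }
};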
OperationType
Gets or sets the type of operation that created this node (used for JIT compilation).
public OperationType? OperationType { get; set; }
Property Value
- OperationType?
An OperationType value identifying the operation (e.g., Add, MatMul, ReLU), or null if not set.
Remarks
This property is used by the JIT compiler to convert ComputationNode graphs to IR operations. It stores the name of the operation that produced this node's value, enabling the compiler to reconstruct the operation graph and optimize it for faster execution.
For Beginners: This records what operation created this node's value.
For example:
- If this node was created by adding two tensors, OperationType would be "Add"
- If created by matrix multiplication, OperationType would be "MatMul"
- If created by ReLU activation, OperationType would be "ReLU"
This information allows the JIT compiler to:
- Understand what operations are in the graph
- Optimize sequences of operations
- Generate fast compiled code
This is optional and only needed when using JIT compilation.
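For example, assuming the OperationType enum defines members matching the operation names above:

sumNode.OperationType = OperationType.Add;        // produced by tensor addition
productNode.OperationType = OperationType.MatMul; // produced by matrix multiplication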
Parents
Gets or sets the parent nodes (inputs) that were used to compute this node's value.
public List<ComputationNode<T>> Parents { get; set; }
Property Value
- List<ComputationNode<T>>
A list of parent computation nodes.
Remarks
Parent nodes are the inputs to the operation that produced this node's value. During backpropagation, gradients flow from this node back to its parents.
For Beginners: These are the inputs that were combined to create this node's value.
For example, if this node represents c = a + b:
- a and b would be the parent nodes
- When computing gradients, we need to know how c's gradient affects a and b
- This parent list lets us trace back through the calculation
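For example, a small sketch that prints a node's ancestry by walking the Parents lists (assuming leaf nodes have an empty list rather than null):

void PrintAncestry(ComputationNode<double> node, int depth = 0)
{
    Console.WriteLine($"{new string(' ', depth * 2)}{node.Name ?? "(unnamed)"}");
    foreach (var parent in node.Parents)
        PrintAncestry(parent, depth + 1);
}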
RequiresGradient
Gets or sets a value indicating whether this node requires gradient computation.
public bool RequiresGradient { get; set; }
Property Value
- bool
True if gradients should be computed for this node; false otherwise.
Remarks
This flag controls whether this node participates in gradient computation. Setting it to false can improve performance for constants or intermediate values that don't need gradients.
For Beginners: This controls whether to track gradients for this value.
Set to true for:
- Model parameters that need to be trained
- Values you want to optimize
Set to false for:
- Constants that never change
- Intermediate values you don't need gradients for (saves memory)
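A sketch of the typical split, assuming weightTensor and scaleTensor are existing tensors:

// Trainable parameter: gradients are needed to update it.
var weight = new ComputationNode<double>(weightTensor, requiresGradient: true, name: "weight");

// Fixed constant: no gradients, which saves memory and computation.
var scale = new ComputationNode<double>(scaleTensor, requiresGradient: false, name: "scale");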
Value
Gets or sets the value stored in this node.
public Tensor<T> Value { get; set; }
Property Value
- Tensor<T>
A tensor containing the computed value.
Remarks
This is the forward pass result - the actual output of the operation that this node represents.
Methods
Backward()
Performs backward propagation from this node.
public void Backward()
Remarks
This method triggers backpropagation through the computation graph starting from this node. It performs a topological sort to determine the correct order for computing gradients, then calls each node's backward function in reverse topological order.
For Beginners: This computes gradients for all nodes that led to this one.
The backward process:
- Starts with this node's gradient (usually set to 1 for the final loss)
- Works backwards through the graph in the right order
- Each node passes gradients to its parents
- Gradients accumulate if a node has multiple children
This is how neural networks learn - by computing gradients and using them to update parameters.
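A typical usage sketch; whether you must seed the starting gradient yourself or Backward() seeds it with ones automatically depends on the implementation, and OnesLike is a hypothetical helper here:

// Seed the starting gradient with 1s (d loss / d loss = 1).
loss.Gradient = Tensor<double>.OnesLike(loss.Value);   // hypothetical helper
loss.Backward();

// Every ancestor with RequiresGradient == true now has its Gradient populated.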
ZeroGradient()
Zeros out the gradient for this node.
public void ZeroGradient()
Remarks
This method resets the gradient to zero. It should be called between training iterations to prevent gradient accumulation across batches (unless intentional).
For Beginners: This clears the gradient to prepare for the next calculation.
When to use:
- Before each new training iteration
- When you want to start fresh gradient computation
- To prevent gradients from adding up across multiple backward passes
In most training loops, you zero gradients at the start of each batch.
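A training-loop sketch; ComputeLoss and the parameters list are hypothetical stand-ins for your model code:

foreach (var batch in batches)
{
    foreach (var p in parameters)   // parameters: List<ComputationNode<double>>
        p.ZeroGradient();           // start the iteration with clean gradients

    var loss = ComputeLoss(batch);  // hypothetical forward pass returning a node
    loss.Backward();

    // ... use each p.Gradient in an optimizer step ...
}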
ZeroGradientRecursive()
Recursively zeros out gradients for this node and all its ancestors.
public void ZeroGradientRecursive()
Remarks
This method traverses the computation graph and zeros all gradients. It's useful for clearing the entire graph between training iterations.
For Beginners: This clears gradients for this node and everything it depends on.
Use this when:
- You want to reset the entire computation graph
- Starting a new training iteration
- You need to ensure no old gradients remain
This is more thorough than ZeroGradient() as it clears the whole graph.
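For example, one call from the loss node clears the whole graph, rather than zeroing each node individually:

loss.ZeroGradientRecursive();   // clears loss and every node it depends on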