Class DeepOperatorNetwork<T>

Namespace
AiDotNet.PhysicsInformed.NeuralOperators
Assembly
AiDotNet.dll

Implements a Deep Operator Network (DeepONet) for learning operators, i.e., mappings from input functions to output functions.

public class DeepOperatorNetwork<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations.

Inheritance
NeuralNetworkBase<T> ← DeepOperatorNetwork<T>
Implements
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IParameterizable<T, Tensor<T>, Tensor<T>>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>

Remarks

For Beginners: DeepONet is another approach to learning operators (alongside the Fourier Neural Operator, FNO), but with a different architecture.

Universal Approximation Theorem for Operators: Just as neural networks can approximate any function, DeepONet can approximate any continuous nonlinear operator! This is based on a theorem by Chen and Chen (1995).

The Key Idea - Decomposition: DeepONet represents an operator G as: G(u)(y) = Σᵢ bᵢ(u) * tᵢ(y)

Where:

  • u is the input function
  • y is the query location
  • bᵢ(u) are "basis functions" of the input (learned by Branch Net)
  • tᵢ(y) are "basis functions" of the location (learned by Trunk Net)

Architecture: DeepONet has TWO networks plus a combination step:

  1. Branch Network:

    • Input: The entire input function u(x) (sampled at sensors)
    • Output: Coefficients b₁, b₂, ..., bₚ
    • Role: Encodes information about the input function
  2. Trunk Network:

    • Input: Query location y (where we want to evaluate output)
    • Output: Basis functions t₁(y), t₂(y), ..., tₚ(y)
    • Role: Encodes spatial/temporal patterns
  3. Combination:

    • Output: G(u)(y) = b · t = Σᵢ bᵢ * tᵢ(y)
    • Simple dot product of the two network outputs

Analogy: Think of it like a bilinear form or low-rank factorization:

  • Branch net learns "what" information matters in the input
  • Trunk net learns "where" patterns occur spatially
  • Their interaction gives the output
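
To see how lightweight the combination step is, here is a minimal, self-contained C# sketch (illustrative only, not part of the library API; the class and method names are invented for this example):

using System;

// Minimal sketch of the DeepONet combination step:
// G(u)(y) = Σᵢ bᵢ(u) * tᵢ(y), i.e., a dot product in the latent space.
// The arrays b and t stand in for the branch and trunk network outputs.
static class DeepONetCombinationSketch
{
    public static double Combine(double[] b, double[] t)
    {
        if (b.Length != t.Length)
            throw new ArgumentException("Branch and trunk outputs must share the latent dimension p.");

        double sum = 0.0;
        for (int i = 0; i < b.Length; i++)
            sum += b[i] * t[i]; // accumulate bᵢ * tᵢ(y)
        return sum;
    }
}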

Example - Heat Equation: Given the initial temperature u(x,0), find the temperature u(x,t) at later times.

Branch Net:

  • Input: u(x,0) sampled at many points → [u(x₁,0), u(x₂,0), ..., u(xₙ,0)]
  • Learns: "This initial condition is smooth/peaked/oscillatory"
  • Output: Coefficients [b₁, b₂, ..., bₚ]

Trunk Net:

  • Input: (x, t) where we want to know the temperature
  • Learns: Spatial-temporal basis functions
  • Output: Basis values [t₁(x,t), t₂(x,t), ..., tₚ(x,t)]

Result: u(x,t) = Σᵢ bᵢ * tᵢ(x,t)
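
As a hedged sketch, here is how this example maps onto the API documented below. It assumes net is an already-trained DeepOperatorNetwork<double> built with numSensors = 100; the initial condition and sensor grid are illustrative:

// Sample the initial temperature u(x,0) at 100 evenly spaced sensors on [0, 1].
int numSensors = 100;
var u0 = new double[numSensors];
for (int i = 0; i < numSensors; i++)
{
    double x = i / (double)(numSensors - 1);
    u0[i] = Math.Exp(-50.0 * (x - 0.5) * (x - 0.5)); // a peaked initial temperature
}

// Temperature at position x = 0.3 and time t = 0.1:
// u(0.3, 0.1) = Σᵢ bᵢ * tᵢ(0.3, 0.1)
double u = net.Evaluate(u0, new[] { 0.3, 0.1 });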

Key Advantages:

  1. Sensor flexibility: Sensors can be placed at arbitrary (scattered) locations, not just on a regular grid
  2. Query flexibility: Can evaluate at any location y
  3. Theoretical foundation: Universal approximation theorem
  4. Efficient: Once trained, very fast evaluation
  5. Interpretable: Decomposition into branch/trunk has clear meaning

Comparison with FNO:

DeepONet:

  • Works on unstructured data (any sensor locations)
  • More flexible for irregular domains
  • Requires specifying sensor locations
  • Good for problems with sparse/irregular data

FNO:

  • Works on structured grids
  • Uses FFT (very efficient)
  • Resolution-invariant
  • Good for periodic/regular problems

Both are powerful; the choice depends on your problem!

Applications:

  • Same as FNO: PDEs, climate, fluids, etc.
  • Particularly good for:
    • Inverse problems (finding unknown parameters)
    • Problems with sparse measurements
    • Irregular geometries
    • Multi-scale phenomena

Historical Note: DeepONet was introduced by Lu et al. (2021) and has been highly successful in learning solution operators for PDEs with theoretical guarantees.

Constructors

DeepOperatorNetwork(NeuralNetworkArchitecture<T>, NeuralNetworkArchitecture<T>, NeuralNetworkArchitecture<T>, int, int, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>?)

Initializes a new instance of the DeepOperatorNetwork<T> class.

public DeepOperatorNetwork(NeuralNetworkArchitecture<T> architecture, NeuralNetworkArchitecture<T> branchArchitecture, NeuralNetworkArchitecture<T> trunkArchitecture, int latentDimension = 128, int numSensors = 100, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>? optimizer = null)

Parameters

architecture NeuralNetworkArchitecture<T>

The overall architecture (mainly for metadata).

branchArchitecture NeuralNetworkArchitecture<T>

Architecture for the branch network.

trunkArchitecture NeuralNetworkArchitecture<T>

Architecture for the trunk network.

latentDimension int

Dimension p of the latent space (number of basis functions).

numSensors int

Number of sensor locations where input function is sampled.

optimizer IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>

Optional gradient-based optimizer used during training.

Remarks

For Beginners:

Parameters:

latentDimension (p): Number of basis functions

  • Controls the expressiveness of the operator
  • Higher p = more expressive but more parameters
  • Typical: 100-400
  • Like the rank in low-rank matrix factorization

numSensors: How many points to sample the input function

  • More sensors = more information about input
  • Must be enough to capture important features
  • Typical: 50-200
  • Sensor locations may be scattered (no grid needed), though the same sensors are typically used at train and test time

Branch Network:

  • Input size: numSensors (values of input function at sensors)
  • Output size: latentDimension (p)
  • Architecture: Deep feedforward network
  • Typical: 3-5 layers, 100-200 neurons per layer

Trunk Network:

  • Input size: dimension of query location (e.g., 2 for (x,y), 3 for (x,y,t))
  • Output size: latentDimension (p)
  • Architecture: Deep feedforward network
  • Typical: 3-5 layers, 100-200 neurons per layer
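
A hedged construction sketch follows. How NeuralNetworkArchitecture<T> instances are configured is not covered on this page, so they are taken as parameters here; only the DeepOperatorNetwork<T> constructor call reflects the documented signature:

// Builds a DeepONet with p = 128 basis functions and 100 sensors.
public static DeepOperatorNetwork<double> BuildDeepONet(
    NeuralNetworkArchitecture<double> overallArch, // metadata architecture
    NeuralNetworkArchitecture<double> branchArch,  // numSensors inputs -> latentDimension outputs
    NeuralNetworkArchitecture<double> trunkArch)   // spatialDim inputs -> latentDimension outputs
{
    return new DeepOperatorNetwork<double>(
        overallArch,
        branchArch,
        trunkArch,
        latentDimension: 128, // number of basis functions (typical: 100-400)
        numSensors: 100);     // sensor samples of the input function (typical: 50-200)
}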

Properties

ParameterCount

Gets the total number of parameters across branch and trunk networks.

public override int ParameterCount { get; }

Property Value

int

SupportsJitCompilation

Gets whether this model currently supports JIT compilation.

public override bool SupportsJitCompilation { get; }

Property Value

bool

True if the model can be JIT compiled, false otherwise.

Remarks

Some models may not support JIT compilation due to:

  • Dynamic graph structure (changes based on input)
  • Lack of computation graph representation
  • Use of operations not yet supported by the JIT compiler

For Beginners: This tells you whether this specific model can benefit from JIT compilation.

Models return false if they:

  • Use layer-based architecture without graph export (e.g., current neural networks)
  • Have control flow that changes based on input data
  • Use operations the JIT compiler doesn't understand yet

In these cases, the model will still work normally, just without JIT acceleration.

SupportsTraining

Indicates whether this network supports training (learning from data).

public override bool SupportsTraining { get; }

Property Value

bool

Remarks

For Beginners: Not all neural networks can learn. Some are designed only for making predictions with pre-set parameters. This property tells you if the network can learn from data.

Methods

CreateNewInstance()

Creates a new instance with the same configuration.

protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()

Returns

IFullModel<T, Tensor<T>, Tensor<T>>

New DeepONet instance.

DeserializeNetworkSpecificData(BinaryReader)

Deserializes DeepONet-specific data.

protected override void DeserializeNetworkSpecificData(BinaryReader reader)

Parameters

reader BinaryReader

Binary reader.

Evaluate(T[], T[])

Evaluates the operator: G(u)(y) = branch(u) · trunk(y).

public T Evaluate(T[] inputFunction, T[] queryLocation)

Parameters

inputFunction T[]

Values of input function at sensor locations [numSensors].

queryLocation T[]

Location at which to evaluate the output [spatialDim].

Returns

T

Output value at the query location.

Remarks

For Beginners: This is the forward pass of DeepONet.

Steps:

  1. Pass input function values through branch network → get coefficients b
  2. Pass query location through trunk network → get basis functions t
  3. Compute dot product: output = b · t

Example:

  • inputFunction = [0.5, 0.7, 0.3, ...] (100 values) → branch net → [b₁, b₂, ..., b₁₂₈] (128 coefficients)

  • queryLocation = [0.3, 0.5] (x=0.3, y=0.5) → trunk net → [t₁, t₂, ..., t₁₂₈] (128 basis values)

  • output = b₁·t₁ + b₂·t₂ + ... + b₁₂₈·t₁₂₈ (single number)

To get outputs at multiple locations, use EvaluateMultiple (below), which runs the branch network only once and reuses its coefficients for every query location.
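
A short, hedged usage sketch matching the numbers above; net is assumed to be a trained instance and the all-zero array is a stand-in for real sensor samples:

double[] inputFunction = new double[100];                        // u at the 100 sensors
double output = net.Evaluate(inputFunction, new[] { 0.3, 0.5 }); // G(u) at (x, y) = (0.3, 0.5)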

EvaluateMultiple(T[], T[,])

Evaluates the operator at multiple query locations efficiently.

public T[] EvaluateMultiple(T[] inputFunction, T[,] queryLocations)

Parameters

inputFunction T[]

Input function values at sensors.

queryLocations T[,]

Multiple query locations [numQueries, spatialDim].

Returns

T[]

Output values at all query locations [numQueries].

Remarks

For Beginners: This is more efficient than calling Evaluate() multiple times because:

  • Branch network is evaluated only once (not per query point)
  • Only trunk network is evaluated for each query location

This is a key advantage of DeepONet: once you encode the input function via the branch network, you can query the solution at many locations very cheaply (just trunk network evaluations).
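
A hedged sketch of this pattern, querying a 2-D field at five points along the line y = 0.5 (net is assumed trained; the sensor values are stand-ins):

double[] inputFunction = new double[100]; // u at the sensors (stand-in values)
var queries = new double[5, 2];           // [numQueries, spatialDim]
for (int q = 0; q < 5; q++)
{
    queries[q, 0] = 0.1 + 0.2 * q;        // x coordinates 0.1, 0.3, 0.5, 0.7, 0.9
    queries[q, 1] = 0.5;                  // y coordinate
}
double[] outputs = net.EvaluateMultiple(inputFunction, queries); // branch net runs once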

GetGradients()

Gets the gradients from all layers in the neural network.

public override Vector<T> GetGradients()

Returns

Vector<T>

A vector containing all gradients from all layers concatenated together.

Remarks

This method collects the gradients from every layer in the network and combines them into a single vector. This is useful for optimization algorithms that need access to all gradients at once.

For Beginners: During training, each layer calculates how its parameters should change (the gradients). This method gathers all those gradients from every layer and puts them into one long list.

Think of it like:

  • Each layer has notes about how to improve (gradients)
  • This method collects all those notes into one document
  • The optimizer can then use this document to update the entire network

This is essential for the learning process, as it tells the optimizer how to adjust all the network's parameters to improve performance.
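
As a hedged illustration of how an optimizer might consume this vector, the sketch below pairs GetGradients() with GetParameters() and UpdateParameters(Vector<T>) from this class. Note: the Length property and element indexer on Vector<T> are assumptions, not confirmed by this page:

// One manual gradient-descent step (sketch; Vector<T> member access is assumed).
Vector<double> parameters = net.GetParameters();
Vector<double> gradients = net.GetGradients();

double learningRate = 0.001;
for (int i = 0; i < parameters.Length; i++)
    parameters[i] -= learningRate * gradients[i]; // step against the gradient

net.UpdateParameters(parameters);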

GetModelMetadata()

Gets metadata about the DeepONet model.

public override ModelMetadata<T> GetModelMetadata()

Returns

ModelMetadata<T>

Model metadata.

GetParameters()

Gets the trainable parameters as a flattened vector.

public override Vector<T> GetParameters()

Returns

Vector<T>

A vector containing all trainable parameters from the branch and trunk networks.

InitializeLayers()

Initializes the layers of the neural network based on the architecture.

protected override void InitializeLayers()

Remarks

For Beginners: This method sets up all the layers in your neural network according to the architecture you've defined. It's like assembling the parts of your network before you can use it.

Predict(Tensor<T>)

Makes a prediction using the DeepONet for a batch of input/query pairs.

public override Tensor<T> Predict(Tensor<T> input)

Parameters

input Tensor<T>

Tensor containing input function values followed by query locations.

Returns

Tensor<T>

Predicted output tensor.

SerializeNetworkSpecificData(BinaryWriter)

Serializes DeepONet-specific data.

protected override void SerializeNetworkSpecificData(BinaryWriter writer)

Parameters

writer BinaryWriter

Binary writer.

Train(Tensor<T>, Tensor<T>)

Performs a basic supervised training step using MSE loss.

public override void Train(Tensor<T> input, Tensor<T> expectedOutput)

Parameters

input Tensor<T>

Training input tensor.

expectedOutput Tensor<T>

Expected output tensor.

Remarks

Uses standard backpropagation through both branch and trunk networks.

Train(T[,], T[,,], T[,], int, double, bool)

Trains DeepONet on input-output function pairs.

public TrainingHistory<T> Train(T[,] inputFunctions, T[,,] queryLocations, T[,] targetValues, int epochs = 100, double learningRate = 0.001, bool verbose = true)

Parameters

inputFunctions T[,]

Training input functions [numSamples, numSensors].

queryLocations T[,,]

Query locations for each sample [numSamples, numQueries, spatialDim].

targetValues T[,]

Target output values [numSamples, numQueries].

epochs int

Number of training epochs.

learningRate double

Learning rate used for parameter updates.

verbose bool

Whether to print training progress.

Returns

TrainingHistory<T>

Training history.

Remarks

For Beginners: Training DeepONet involves:

  1. For each training example (input function, query locations, target outputs):
    a) Evaluate DeepONet at query locations
    b) Compute loss (MSE between predictions and targets)
    c) Backpropagate through both branch and trunk networks
    d) Update all parameters

The beauty is that both networks learn together:

  • Branch learns what features of the input matter
  • Trunk learns what spatial patterns exist
  • They coordinate through the shared latent space

Training Data Format:

  • inputFunctions[i]: Values at sensor locations for sample i
  • queryLocations[i]: Where to evaluate output for sample i
  • targetValues[i]: Ground truth outputs at those locations

You can use different query locations for each training sample! This flexibility is a key advantage of DeepONet.
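
A hedged end-to-end training sketch using the array shapes documented above; the arrays are allocated as stand-ins and must be filled with real simulation or measurement data first, and net is assumed to be constructed as shown earlier:

int numSamples = 1000, numSensors = 100, numQueries = 50, spatialDim = 2;

var inputFunctions = new double[numSamples, numSensors];             // u at sensors, per sample
var queryLocations = new double[numSamples, numQueries, spatialDim]; // where to evaluate, per sample
var targetValues = new double[numSamples, numQueries];               // ground-truth G(u)(y)

// ... populate the three arrays from simulations or measurements ...

TrainingHistory<double> history = net.Train(
    inputFunctions, queryLocations, targetValues,
    epochs: 100, learningRate: 0.001, verbose: true);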

UpdateParameters(Vector<T>)

Updates the branch and trunk network parameters from a flattened vector.

public override void UpdateParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

Parameter vector.