Class DeepOperatorNetwork<T>
Namespace: AiDotNet.PhysicsInformed.NeuralOperators
Assembly: AiDotNet.dll
Implements Deep Operator Network (DeepONet) for learning operators.
public class DeepOperatorNetwork<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations.
Inheritance: object → NeuralNetworkBase<T> → DeepOperatorNetwork<T>
Remarks
For Beginners: DeepONet is another approach to learning operators (like FNO), but with a different architecture.
Universal Approximation Theorem for Operators: Just as neural networks can approximate any continuous function, DeepONet can approximate any continuous nonlinear operator. This is based on a theorem by Chen and Chen (1995).
The Key Idea - Decomposition: DeepONet represents an operator G as: G(u)(y) = Σᵢ bᵢ(u) * tᵢ(y)
Where:
- u is the input function
- y is the query location
- bᵢ(u) are "basis functions" of the input (learned by Branch Net)
- tᵢ(y) are "basis functions" of the location (learned by Trunk Net)
Architecture: DeepONet has TWO networks:
Branch Network:
- Input: The entire input function u(x) (sampled at sensors)
- Output: Coefficients b₁, b₂, ..., bₚ
- Role: Encodes information about the input function
Trunk Network:
- Input: Query location y (where we want to evaluate output)
- Output: Basis functions t₁(y), t₂(y), ..., tₚ(y)
- Role: Encodes spatial/temporal patterns
Combination:
- Output: G(u)(y) = b · t = Σᵢ bᵢ * tᵢ(y)
- Simple dot product of the two network outputs
Analogy: Think of it like a bilinear form or low-rank factorization:
- Branch net learns "what" information matters in the input
- Trunk net learns "where" patterns occur spatially
- Their interaction gives the output
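To make the combination concrete, here is a minimal self-contained sketch in plain C# (it deliberately avoids AiDotNet types; Branch and Trunk below are illustrative stand-ins with fixed, arbitrary weights rather than trained networks):

using System;

class DeepONetSketch
{
    const int P = 4; // latent dimension p (number of basis functions)

    // Stand-in for the branch network: sensor values -> p coefficients b_i(u).
    static double[] Branch(double[] u)
    {
        var b = new double[P];
        for (int i = 0; i < P; i++)
            for (int j = 0; j < u.Length; j++)
                b[i] += Math.Sin(i + 1) * u[j] / u.Length; // arbitrary fixed "weights"
        return b;
    }

    // Stand-in for the trunk network: query location (x, y) -> p basis values t_i(y).
    static double[] Trunk(double[] y)
    {
        var t = new double[P];
        for (int i = 0; i < P; i++)
            t[i] = Math.Cos((i + 1) * y[0]) * Math.Sin((i + 1) * y[1]);
        return t;
    }

    // G(u)(y) = sum_i b_i(u) * t_i(y): a plain dot product of the two outputs.
    static double Evaluate(double[] u, double[] y)
    {
        double[] b = Branch(u), t = Trunk(y);
        double sum = 0.0;
        for (int i = 0; i < P; i++) sum += b[i] * t[i];
        return sum;
    }

    static void Main()
    {
        double[] u = { 0.5, 0.7, 0.3, 0.9 };                // input function at 4 sensors
        Console.WriteLine(Evaluate(u, new[] { 0.3, 0.5 })); // output at (x=0.3, y=0.5)
    }
}

In a real DeepONet, Branch and Trunk are deep feedforward networks whose weights are learned jointly; only the final dot product is fixed by the architecture.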
Example - Heat Equation: Problem: Given initial temperature u(x,0), find temperature u(x,t)
Branch Net:
- Input: u(x,0) sampled at many points → [u(x₁,0), u(x₂,0), ..., u(xₙ,0)]
- Learns: "This initial condition is smooth/peaked/oscillatory"
- Output: Coefficients [b₁, b₂, ..., bₚ]
Trunk Net:
- Input: (x, t) where we want to know the temperature
- Learns: Spatial-temporal basis functions
- Output: Basis values [t₁(x,t), t₂(x,t), ..., tₚ(x,t)]
Result: u(x,t) = Σᵢ bᵢ * tᵢ(x,t)
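For instance (with made-up numbers and p = 2): if the branch net outputs b = [1.2, -0.4] for a given initial condition and the trunk net gives t(0.3, 0.1) = [0.8, 0.5], then u(0.3, 0.1) = 1.2·0.8 + (-0.4)·0.5 = 0.96 - 0.20 = 0.76.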
Key Advantages:
- Sensor flexibility: Sensors can be placed at arbitrary, unstructured locations (the standard formulation then keeps that sensor set fixed between training and testing)
- Query flexibility: Can evaluate at any location y
- Theoretical foundation: Universal approximation theorem
- Efficient: Once trained, very fast evaluation
- Interpretable: Decomposition into branch/trunk has clear meaning
Comparison with FNO: DeepONet:
- Works on unstructured data (any sensor locations)
- More flexible for irregular domains
- Requires specifying sensor locations
- Good for problems with sparse/irregular data
FNO:
- Works on structured grids
- Uses FFT (very efficient)
- Resolution-invariant
- Good for periodic/regular problems
Both are powerful; the choice depends on your problem.
Applications:
- Same as FNO: PDEs, climate, fluids, etc.
- Particularly good for:
- Inverse problems (finding unknown parameters)
- Problems with sparse measurements
- Irregular geometries
- Multi-scale phenomena
Historical Note: DeepONet was introduced by Lu et al. (2021) and has been highly successful in learning solution operators for PDEs with theoretical guarantees.
Constructors
DeepOperatorNetwork(NeuralNetworkArchitecture<T>, NeuralNetworkArchitecture<T>, NeuralNetworkArchitecture<T>, int, int, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>?)
Initializes a new instance of DeepONet.
public DeepOperatorNetwork(NeuralNetworkArchitecture<T> architecture, NeuralNetworkArchitecture<T> branchArchitecture, NeuralNetworkArchitecture<T> trunkArchitecture, int latentDimension = 128, int numSensors = 100, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>? optimizer = null)
Parameters
architecture (NeuralNetworkArchitecture<T>): The overall architecture (mainly for metadata).
branchArchitecture (NeuralNetworkArchitecture<T>): Architecture for the branch network.
trunkArchitecture (NeuralNetworkArchitecture<T>): Architecture for the trunk network.
latentDimension (int): Dimension p of the latent space (number of basis functions).
numSensors (int): Number of sensor locations where the input function is sampled.
optimizer (IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>?): Optional gradient-based optimizer; may be null.
Remarks
For Beginners:
Parameters:
latentDimension (p): Number of basis functions
- Controls the expressiveness of the operator
- Higher p = more expressive but more parameters
- Typical: 100-400
- Like the rank in low-rank matrix factorization
numSensors: How many points to sample the input function
- More sensors = more information about input
- Must be enough to capture important features
- Typical: 50-200
- Sensor locations can be chosen freely when setting up the problem, but the standard DeepONet formulation uses the same sensor set for training and testing (the branch network sees only the sampled values, not the sensor coordinates)
Branch Network:
- Input size: numSensors (values of input function at sensors)
- Output size: latentDimension (p)
- Architecture: Deep feedforward network
- Typical: 3-5 layers, 100-200 neurons per layer
Trunk Network:
- Input size: dimension of query location (e.g., 2 for (x,y), 3 for (x,y,t))
- Output size: latentDimension (p)
- Architecture: Deep feedforward network
- Typical: 3-5 layers, 100-200 neurons per layer
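As a back-of-the-envelope check of model size, the sketch below counts dense-layer parameters for branch and trunk networks using the typical sizes quoted above (these layer sizes are illustrative, not defaults of this class):

using System;

class ParamCountSketch
{
    // Parameters of one dense layer with bias: (inputSize + 1) * outputSize.
    static int DenseParams(params int[] layerSizes)
    {
        int total = 0;
        for (int i = 0; i + 1 < layerSizes.Length; i++)
            total += (layerSizes[i] + 1) * layerSizes[i + 1];
        return total;
    }

    static void Main()
    {
        // Branch: numSensors = 100 -> three hidden layers of 128 -> p = 128
        int branch = DenseParams(100, 128, 128, 128, 128);
        // Trunk: spatialDim = 2 -> three hidden layers of 128 -> p = 128
        int trunk = DenseParams(2, 128, 128, 128, 128);
        Console.WriteLine($"branch: {branch:N0}, trunk: {trunk:N0}, total: {branch + trunk:N0}");
    }
}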
Properties
ParameterCount
Gets the total number of parameters across branch and trunk networks.
public override int ParameterCount { get; }
Property Value
- int
SupportsJitCompilation
Gets whether this model currently supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
True if the model can be JIT compiled, false otherwise.
Remarks
Some models may not support JIT compilation due to:
- Dynamic graph structure (changes based on input)
- Lack of a computation graph representation
- Use of operations not yet supported by the JIT compiler
For Beginners: This tells you whether this specific model can benefit from JIT compilation.
Models return false if they:
- Use layer-based architecture without graph export (e.g., current neural networks)
- Have control flow that changes based on input data
- Use operations the JIT compiler doesn't understand yet
In these cases, the model will still work normally, just without JIT acceleration.
SupportsTraining
Indicates whether this network supports training (learning from data).
public override bool SupportsTraining { get; }
Property Value
- bool
Remarks
For Beginners: Not all neural networks can learn. Some are designed only for making predictions with pre-set parameters. This property tells you if the network can learn from data.
Methods
CreateNewInstance()
Creates a new instance with the same configuration.
protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()
Returns
- IFullModel<T, Tensor<T>, Tensor<T>>
New DeepONet instance.
DeserializeNetworkSpecificData(BinaryReader)
Deserializes DeepONet-specific data.
protected override void DeserializeNetworkSpecificData(BinaryReader reader)
Parameters
reader (BinaryReader): Binary reader.
Evaluate(T[], T[])
Evaluates the operator: G(u)(y) = branch(u) · trunk(y).
public T Evaluate(T[] inputFunction, T[] queryLocation)
Parameters
inputFunction (T[]): Values of the input function at sensor locations [numSensors].
queryLocation (T[]): Location at which to evaluate the output [spatialDim].
Returns
- T
Output value at the query location.
Remarks
For Beginners: This is the forward pass of DeepONet.
Steps:
- Pass input function values through branch network → get coefficients b
- Pass query location through trunk network → get basis functions t
- Compute dot product: output = b · t
Example:
inputFunction = [0.5, 0.7, 0.3, ...] (100 values) → branch net → [b₁, b₂, ..., b₁₂₈] (128 coefficients)
queryLocation = [0.3, 0.5] (x=0.3, y=0.5) → trunk net → [t₁, t₂, ..., t₁₂₈] (128 basis values)
output = b₁*t₁ + b₂*t₂ + ... + b₁₂₈*t₁₂₈ (a single number)
To get outputs at multiple locations, prefer EvaluateMultiple below: it evaluates the branch network once and runs only the trunk network per query location.
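A usage sketch (this assumes a trained DeepOperatorNetwork<double> named model, configured with numSensors = 100 and 2D query locations; the input function here is an arbitrary sine profile):

// Sample the input function at the 100 sensor locations.
double[] u = new double[100];
for (int i = 0; i < u.Length; i++)
    u[i] = Math.Sin(Math.PI * i / 99.0);

// One branch + one trunk evaluation, combined by a dot product.
double value = model.Evaluate(u, new[] { 0.3, 0.5 }); // output at (x=0.3, y=0.5)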
EvaluateMultiple(T[], T[,])
Evaluates the operator at multiple query locations efficiently.
public T[] EvaluateMultiple(T[] inputFunction, T[,] queryLocations)
Parameters
inputFunction (T[]): Input function values at sensors.
queryLocations (T[,]): Multiple query locations [numQueries, spatialDim].
Returns
- T[]
Output values at all query locations [numQueries].
Remarks
For Beginners: This is more efficient than calling Evaluate() multiple times because:
- Branch network is evaluated only once (not per query point)
- Only trunk network is evaluated for each query location
This is a key advantage of DeepONet: once you encode the input function via the branch network, you can query the solution at many locations very cheaply (just trunk network evaluations).
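A usage sketch (same hypothetical model and input u as in the Evaluate example above):

// Three 2D query locations: shape [numQueries = 3, spatialDim = 2].
double[,] queries =
{
    { 0.1, 0.1 },
    { 0.5, 0.5 },
    { 0.9, 0.9 }
};

// One branch evaluation, then one cheap trunk evaluation per query.
double[] values = model.EvaluateMultiple(u, queries);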
GetGradients()
Gets the gradients from all layers in the neural network.
public override Vector<T> GetGradients()
Returns
- Vector<T>
A vector containing all gradients from all layers concatenated together.
Remarks
This method collects the gradients from every layer in the network and combines them into a single vector. This is useful for optimization algorithms that need access to all gradients at once.
For Beginners: During training, each layer calculates how its parameters should change (the gradients). This method gathers all those gradients from every layer and puts them into one long list.
Think of it like:
- Each layer has notes about how to improve (gradients)
- This method collects all those notes into one document
- The optimizer can then use this document to update the entire network
This is essential for the learning process, as it tells the optimizer how to adjust all the network's parameters to improve performance.
GetModelMetadata()
Gets metadata about the DeepONet model.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
Model metadata.
GetParameters()
Gets the trainable parameters as a flattened vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
All trainable parameters from the branch and trunk networks as a single flattened vector.
InitializeLayers()
Initializes the layers of the neural network based on the architecture.
protected override void InitializeLayers()
Remarks
For Beginners: This method sets up all the layers in your neural network according to the architecture you've defined. It's like assembling the parts of your network before you can use it.
Predict(Tensor<T>)
Makes a prediction using the DeepONet for a batch of input/query pairs.
public override Tensor<T> Predict(Tensor<T> input)
Parameters
input (Tensor<T>): Tensor containing input function values followed by query locations.
Returns
- Tensor<T>
Predicted output tensor.
SerializeNetworkSpecificData(BinaryWriter)
Serializes DeepONet-specific data.
protected override void SerializeNetworkSpecificData(BinaryWriter writer)
Parameters
writer (BinaryWriter): Binary writer.
Train(Tensor<T>, Tensor<T>)
Performs a basic supervised training step using MSE loss.
public override void Train(Tensor<T> input, Tensor<T> expectedOutput)
Parameters
input (Tensor<T>): Training input tensor.
expectedOutput (Tensor<T>): Expected output tensor.
Remarks
Uses standard backpropagation through both branch and trunk networks.
Train(T[,], T[,,], T[,], int, double, bool)
Trains DeepONet on input-output function pairs.
public TrainingHistory<T> Train(T[,] inputFunctions, T[,,] queryLocations, T[,] targetValues, int epochs = 100, double learningRate = 0.001, bool verbose = true)
Parameters
inputFunctions (T[,]): Training input functions [numSamples, numSensors].
queryLocations (T[,,]): Query locations for each sample [numSamples, numQueries, spatialDim].
targetValues (T[,]): Target output values [numSamples, numQueries].
epochs (int): Number of training epochs.
learningRate (double): Learning rate for parameter updates.
verbose (bool): Whether to print progress during training.
Returns
- TrainingHistory<T>
Training history.
Remarks
For Beginners: Training DeepONet involves:
- For each training example (input function, query locations, target outputs):
  a) Evaluate DeepONet at the query locations
  b) Compute the loss (MSE between predictions and targets)
  c) Backpropagate through both branch and trunk networks
  d) Update all parameters
The beauty is that both networks learn together:
- Branch learns what features of the input matter
- Trunk learns what spatial patterns exist
- They coordinate through the shared latent space
Training Data Format:
- inputFunctions[i]: Values at sensor locations for sample i
- queryLocations[i]: Where to evaluate output for sample i
- targetValues[i]: Ground truth outputs at those locations
You can use different query locations for each training sample! This flexibility is a key advantage of DeepONet.
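A data-layout sketch (array shapes follow the parameter documentation above; the sample counts and the trained model instance are hypothetical):

int numSamples = 500, numSensors = 100, numQueries = 50, spatialDim = 2;

double[,] inputFunctions = new double[numSamples, numSensors];          // u sampled at sensors
double[,,] queryLocations = new double[numSamples, numQueries, spatialDim];
double[,] targetValues = new double[numSamples, numQueries];            // ground truth G(u)(y)
// ... fill the arrays from simulation or measurement data ...

TrainingHistory<double> history = model.Train(
    inputFunctions, queryLocations, targetValues,
    epochs: 200, learningRate: 0.001, verbose: true);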
UpdateParameters(Vector<T>)
Updates the branch and trunk network parameters from a flattened vector.
public override void UpdateParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): Parameter vector.