Class DeepOperatorNetwork<T>
Namespace: AiDotNet.PhysicsInformed.NeuralOperators
Assembly: AiDotNet.dll
Implements Deep Operator Network (DeepONet) for learning operators.
public class DeepOperatorNetwork<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations.
Inheritance: object → NeuralNetworkBase<T> → DeepOperatorNetwork<T>
Remarks
For Beginners: DeepONet is another approach to learning operators (like FNO), but with a different architecture.
Universal Approximation Theorem for Operators: Just as neural networks can approximate any continuous function, DeepONet can approximate any continuous nonlinear operator. This is based on a theorem by Chen and Chen (1995).
The Key Idea - Decomposition: DeepONet represents an operator G as: G(u)(y) = Σᵢ bᵢ(u) * tᵢ(y)
Where:
- u is the input function
- y is the query location
- bᵢ(u) are "basis functions" of the input (learned by Branch Net)
- tᵢ(y) are "basis functions" of the location (learned by Trunk Net)
Architecture: DeepONet has TWO networks:
Branch Network:
- Input: The entire input function u(x) (sampled at sensors)
- Output: Coefficients b₁, b₂, ..., bₚ
- Role: Encodes information about the input function
Trunk Network:
- Input: Query location y (where we want to evaluate output)
- Output: Basis functions t₁(y), t₂(y), ..., tₚ(y)
- Role: Encodes spatial/temporal patterns
Combination:
- Output: G(u)(y) = b · t = Σᵢ bᵢ * tᵢ(y)
- Simple dot product of the two network outputs
Analogy: Think of it like a bilinear form or low-rank factorization:
- Branch net learns "what" information matters in the input
- Trunk net learns "where" patterns occur spatially
- Their interaction gives the output
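To make the combination concrete, here is a minimal self-contained sketch in plain C# (it deliberately avoids AiDotNet types; Branch and Trunk below are illustrative stand-ins with fixed, arbitrary weights rather than trained networks):

using System;

class DeepONetSketch
{
    const int P = 4; // latent dimension p (number of basis functions)

    // Stand-in for the branch network: sensor values -> p coefficients b_i(u).
    static double[] Branch(double[] u)
    {
        var b = new double[P];
        for (int i = 0; i < P; i++)
            for (int j = 0; j < u.Length; j++)
                b[i] += Math.Sin(i + 1) * u[j] / u.Length; // arbitrary fixed "weights"
        return b;
    }

    // Stand-in for the trunk network: query location (x, y) -> p basis values t_i(y).
    static double[] Trunk(double[] y)
    {
        var t = new double[P];
        for (int i = 0; i < P; i++)
            t[i] = Math.Cos((i + 1) * y[0]) * Math.Sin((i + 1) * y[1]);
        return t;
    }

    // G(u)(y) = sum_i b_i(u) * t_i(y): a plain dot product of the two outputs.
    static double Evaluate(double[] u, double[] y)
    {
        double[] b = Branch(u), t = Trunk(y);
        double sum = 0.0;
        for (int i = 0; i < P; i++) sum += b[i] * t[i];
        return sum;
    }

    static void Main()
    {
        double[] u = { 0.5, 0.7, 0.3, 0.9 };                // input function at 4 sensors
        Console.WriteLine(Evaluate(u, new[] { 0.3, 0.5 })); // output at (x=0.3, y=0.5)
    }
}

In a real DeepONet, Branch and Trunk are deep feedforward networks whose weights are learned jointly; only the final dot product is fixed by the architecture.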
Example - Heat Equation: Problem: Given initial temperature u(x,0), find temperature u(x,t)
Branch Net:
- Input: u(x,0) sampled at many points → [u(x₁,0), u(x₂,0), ..., u(xₙ,0)]
- Learns: "This initial condition is smooth/peaked/oscillatory"
- Output: Coefficients [b₁, b₂, ..., bₚ]
Trunk Net:
- Input: (x, t) where we want to know the temperature
- Learns: Spatial-temporal basis functions
- Output: Basis values [t₁(x,t), t₂(x,t), ..., tₚ(x,t)]
Result: u(x,t) = Σᵢ bᵢ * tᵢ(x,t)
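For instance (with made-up numbers and p = 2): if the branch net outputs b = [1.2, -0.4] for a given initial condition and the trunk net gives t(0.3, 0.1) = [0.8, 0.5], then u(0.3, 0.1) = 1.2·0.8 + (-0.4)·0.5 = 0.96 - 0.20 = 0.76.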
Key Advantages:
- Sensor flexibility: Sensors can be placed at arbitrary, unstructured locations (the standard formulation then keeps that sensor set fixed between training and testing)
- Query flexibility: Can evaluate at any location y
- Theoretical foundation: Universal approximation theorem
- Efficient: Once trained, very fast evaluation
- Interpretable: Decomposition into branch/trunk has clear meaning
Comparison with FNO: DeepONet:
- Works on unstructured data (any sensor locations)
- More flexible for irregular domains
- Requires specifying sensor locations
- Good for problems with sparse/irregular data
FNO:
- Works on structured grids
- Uses FFT (very efficient)
- Resolution-invariant
- Good for periodic/regular problems
Both are powerful; the choice depends on your problem.
Applications:
- Same as FNO: PDEs, climate, fluids, etc.
- Particularly good for:
- Inverse problems (finding unknown parameters)
- Problems with sparse measurements
- Irregular geometries
- Multi-scale phenomena
Historical Note: DeepONet was introduced by Lu et al. (2021) and has been highly successful in learning solution operators for PDEs with theoretical guarantees.
Constructors
DeepOperatorNetwork(NeuralNetworkArchitecture<T>, NeuralNetworkArchitecture<T>, NeuralNetworkArchitecture<T>, int, int, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>?)
Initializes a new instance of DeepONet.
public DeepOperatorNetwork(NeuralNetworkArchitecture<T> architecture, NeuralNetworkArchitecture<T> branchArchitecture, NeuralNetworkArchitecture<T> trunkArchitecture, int latentDimension = 128, int numSensors = 100, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>? optimizer = null)
Parameters
architecture (NeuralNetworkArchitecture<T>): The overall architecture (mainly for metadata).
branchArchitecture (NeuralNetworkArchitecture<T>): Architecture for the branch network.
trunkArchitecture (NeuralNetworkArchitecture<T>): Architecture for the trunk network.
latentDimension (int): Dimension p of the latent space (number of basis functions).
numSensors (int): Number of sensor locations where the input function is sampled.
optimizer (IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>?): Optional gradient-based optimizer; may be null.
Remarks
For Beginners:
Parameters:
latentDimension (p): Number of basis functions
- Controls the expressiveness of the operator
- Higher p = more expressive but more parameters
- Typical: 100-400
- Like the rank in low-rank matrix factorization
numSensors: How many points to sample the input function
- More sensors = more information about input
- Must be enough to capture important features
- Typical: 50-200
- Sensor locations can be chosen freely when setting up the problem, but the standard DeepONet formulation uses the same sensor set for training and testing (the branch network sees only the sampled values, not the sensor coordinates)
Branch Network:
- Input size: numSensors (values of input function at sensors)
- Output size: latentDimension (p)
- Architecture: Deep feedforward network
- Typical: 3-5 layers, 100-200 neurons per layer
Trunk Network:
- Input size: dimension of query location (e.g., 2 for (x,y), 3 for (x,y,t))
- Output size: latentDimension (p)
- Architecture: Deep feedforward network
- Typical: 3-5 layers, 100-200 neurons per layer
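As a back-of-the-envelope check of model size, the sketch below counts dense-layer parameters for branch and trunk networks using the typical sizes quoted above (these layer sizes are illustrative, not defaults of this class):

using System;

class ParamCountSketch
{
    // Parameters of one dense layer with bias: (inputSize + 1) * outputSize.
    static int DenseParams(params int[] layerSizes)
    {
        int total = 0;
        for (int i = 0; i + 1 < layerSizes.Length; i++)
            total += (layerSizes[i] + 1) * layerSizes[i + 1];
        return total;
    }

    static void Main()
    {
        // Branch: numSensors = 100 -> three hidden layers of 128 -> p = 128
        int branch = DenseParams(100, 128, 128, 128, 128);
        // Trunk: spatialDim = 2 -> three hidden layers of 128 -> p = 128
        int trunk = DenseParams(2, 128, 128, 128, 128);
        Console.WriteLine($"branch: {branch:N0}, trunk: {trunk:N0}, total: {branch + trunk:N0}");
    }
}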
Properties
ParameterCount
Gets the total number of parameters across branch and trunk networks.
public override int ParameterCount { get; }
Property Value
- int
SupportsJitCompilation
Gets whether this model currently supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
True if the model can be JIT compiled, false otherwise.
Remarks
Some models may not support JIT compilation due to:
- Dynamic graph structure (changes based on input)
- Lack of a computation graph representation
- Use of operations not yet supported by the JIT compiler
For Beginners: This tells you whether this specific model can benefit from JIT compilation.
Models return false if they:
- Use layer-based architecture without graph export (e.g., current neural networks)
- Have control flow that changes based on input data
- Use operations the JIT compiler doesn't understand yet
In these cases, the model will still work normally, just without JIT acceleration.
SupportsTraining
Indicates whether this network supports training (learning from data).
public override bool SupportsTraining { get; }
Property Value
- bool
Remarks
For Beginners: Not all neural networks can learn. Some are designed only for making predictions with pre-set parameters. This property tells you if the network can learn from data.
Methods
CreateNewInstance()
Creates a new instance with the same configuration.
protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()
Returns
- IFullModel<T, Tensor<T>, Tensor<T>>
New DeepONet instance.
DeserializeNetworkSpecificData(BinaryReader)
Deserializes DeepONet-specific data.
protected override void DeserializeNetworkSpecificData(BinaryReader reader)
Parameters
reader (BinaryReader): Binary reader.
Evaluate(T[], T[])
Evaluates the operator: G(u)(y) = branch(u) · trunk(y).
public T Evaluate(T[] inputFunction, T[] queryLocation)
Parameters
inputFunction (T[]): Values of the input function at sensor locations [numSensors].
queryLocation (T[]): Location at which to evaluate the output [spatialDim].
Returns
- T
Output value at the query location.
Remarks
For Beginners: This is the forward pass of DeepONet.
Steps:
- Pass input function values through branch network → get coefficients b
- Pass query location through trunk network → get basis functions t
- Compute dot product: output = b · t
Example:
inputFunction = [0.5, 0.7, 0.3, ...] (100 values) → branch net → [b₁, b₂, ..., b₁₂₈] (128 coefficients)
queryLocation = [0.3, 0.5] (x=0.3, y=0.5) → trunk net → [t₁, t₂, ..., t₁₂₈] (128 basis values)
output = b₁*t₁ + b₂*t₂ + ... + b₁₂₈*t₁₂₈ (a single number)
To get outputs at multiple locations, prefer EvaluateMultiple below: it evaluates the branch network once and runs only the trunk network per query location.
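A usage sketch (this assumes a trained DeepOperatorNetwork<double> named model, configured with numSensors = 100 and 2D query locations; the input function here is an arbitrary sine profile):

// Sample the input function at the 100 sensor locations.
double[] u = new double[100];
for (int i = 0; i < u.Length; i++)
    u[i] = Math.Sin(Math.PI * i / 99.0);

// One branch + one trunk evaluation, combined by a dot product.
double value = model.Evaluate(u, new[] { 0.3, 0.5 }); // output at (x=0.3, y=0.5)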
EvaluateMultiple(T[], T[,])
Evaluates the operator at multiple query locations efficiently.
public T[] EvaluateMultiple(T[] inputFunction, T[,] queryLocations)
Parameters
inputFunction (T[]): Input function values at sensors.
queryLocations (T[,]): Multiple query locations [numQueries, spatialDim].
Returns
- T[]
Output values at all query locations [numQueries].
Remarks
For Beginners: This is more efficient than calling Evaluate() multiple times because:
- Branch network is evaluated only once (not per query point)
- Only trunk network is evaluated for each query location
This is a key advantage of DeepONet: once you encode the input function via the branch network, you can query the solution at many locations very cheaply (just trunk network evaluations).
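A usage sketch (same hypothetical model and input u as in the Evaluate example above):

// Three 2D query locations: shape [numQueries = 3, spatialDim = 2].
double[,] queries =
{
    { 0.1, 0.1 },
    { 0.5, 0.5 },
    { 0.9, 0.9 }
};

// One branch evaluation, then one cheap trunk evaluation per query.
double[] values = model.EvaluateMultiple(u, queries);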
GetGradients()
Gets the gradients from all layers in the neural network.
public override Vector<T> GetGradients()
Returns
- Vector<T>
A vector containing all gradients from all layers concatenated together.
Remarks
This method collects the gradients from every layer in the network and combines them into a single vector. This is useful for optimization algorithms that need access to all gradients at once.
For Beginners: During training, each layer calculates how its parameters should change (the gradients). This method gathers all those gradients from every layer and puts them into one long list.
Think of it like:
- Each layer has notes about how to improve (gradients)
- This method collects all those notes into one document
- The optimizer can then use this document to update the entire network
This is essential for the learning process, as it tells the optimizer how to adjust all the network's parameters to improve performance.
GetModelMetadata()
Gets metadata about the DeepONet model.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
Model metadata.
GetParameters()
Gets the trainable parameters as a flattened vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
All trainable parameters from the branch and trunk networks as a single flattened vector.
InitializeLayers()
Initializes the layers of the neural network based on the architecture.
protected override void InitializeLayers()
Remarks
For Beginners: This method sets up all the layers in your neural network according to the architecture you've defined. It's like assembling the parts of your network before you can use it.
Predict(Tensor<T>)
Makes a prediction using the DeepONet for a batch of input/query pairs.
public override Tensor<T> Predict(Tensor<T> input)
Parameters
input (Tensor<T>): Tensor containing input function values followed by query locations.
Returns
- Tensor<T>
Predicted output tensor.
SerializeNetworkSpecificData(BinaryWriter)
Serializes DeepONet-specific data.
protected override void SerializeNetworkSpecificData(BinaryWriter writer)
Parameters
writer (BinaryWriter): Binary writer.
Train(Tensor<T>, Tensor<T>)
Performs a basic supervised training step using MSE loss.
public override void Train(Tensor<T> input, Tensor<T> expectedOutput)
Parameters
input (Tensor<T>): Training input tensor.
expectedOutput (Tensor<T>): Expected output tensor.
Remarks
Uses standard backpropagation through both branch and trunk networks.
Train(T[,], T[,,], T[,], int, double, bool)
Trains DeepONet on input-output function pairs.
public TrainingHistory<T> Train(T[,] inputFunctions, T[,,] queryLocations, T[,] targetValues, int epochs = 100, double learningRate = 0.001, bool verbose = true)
Parameters
inputFunctions (T[,]): Training input functions [numSamples, numSensors].
queryLocations (T[,,]): Query locations for each sample [numSamples, numQueries, spatialDim].
targetValues (T[,]): Target output values [numSamples, numQueries].
epochs (int): Number of training epochs.
learningRate (double): Learning rate for parameter updates.
verbose (bool): Whether to print progress during training.
Returns
- TrainingHistory<T>
Training history.
Remarks
For Beginners: Training DeepONet involves:
- For each training example (input function, query locations, target outputs):
  a) Evaluate DeepONet at the query locations
  b) Compute the loss (MSE between predictions and targets)
  c) Backpropagate through both branch and trunk networks
  d) Update all parameters
The beauty is that both networks learn together:
- Branch learns what features of the input matter
- Trunk learns what spatial patterns exist
- They coordinate through the shared latent space
Training Data Format:
- inputFunctions[i]: Values at sensor locations for sample i
- queryLocations[i]: Where to evaluate output for sample i
- targetValues[i]: Ground truth outputs at those locations
You can use different query locations for each training sample! This flexibility is a key advantage of DeepONet.
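A data-layout sketch (array shapes follow the parameter documentation above; the sample counts and the trained model instance are hypothetical):

int numSamples = 500, numSensors = 100, numQueries = 50, spatialDim = 2;

double[,] inputFunctions = new double[numSamples, numSensors];          // u sampled at sensors
double[,,] queryLocations = new double[numSamples, numQueries, spatialDim];
double[,] targetValues = new double[numSamples, numQueries];            // ground truth G(u)(y)
// ... fill the arrays from simulation or measurement data ...

TrainingHistory<double> history = model.Train(
    inputFunctions, queryLocations, targetValues,
    epochs: 200, learningRate: 0.001, verbose: true);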
UpdateParameters(Vector<T>)
Updates the branch and trunk network parameters from a flattened vector.
public override void UpdateParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): Parameter vector.