Class BayesianNeuralNetwork<T>
- Assembly
- AiDotNet.dll
Implements a Bayesian Neural Network that provides uncertainty estimates with predictions.
public class BayesianNeuralNetwork<T> : NeuralNetwork<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable, IUncertaintyEstimator<T>
Type Parameters
T: The numeric type used for calculations (e.g., float, double).
- Inheritance: NeuralNetwork<T> → BayesianNeuralNetwork<T>
Remarks
For Beginners: A Bayesian Neural Network (BNN) is a neural network that can tell you not just what it predicts, but also how uncertain it is about that prediction.
This is incredibly important for safety-critical applications like:
- Medical diagnosis: "This might be cancer, but I'm very uncertain - get a second opinion"
- Autonomous driving: "I'm not sure what that object is - proceed with caution"
- Financial predictions: "The market might go up, but there's high uncertainty"
The network achieves this by making multiple predictions with slightly different weights (sampled from learned probability distributions) and analyzing how much these predictions vary.
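The sketch below illustrates this sampling idea in miniature, outside the library: thirty jittered values stand in for thirty stochastic forward passes, and their spread plays the role of the uncertainty estimate.

```csharp
using System;
using System.Linq;

// Conceptual sketch only (not AiDotNet internals): simulate 30 forward
// passes whose outputs jitter because each pass uses different sampled weights.
var rng = new Random(42);
double[] predictions = Enumerable.Range(0, 30)
    .Select(_ => 0.8 + (rng.NextDouble() - 0.5) * 0.1) // stand-in for one stochastic pass
    .ToArray();

double mean = predictions.Average();
double variance = predictions.Select(p => (p - mean) * (p - mean)).Average();

// Tight agreement across passes => low uncertainty; wide spread => high uncertainty.
Console.WriteLine($"mean = {mean:F3}, variance = {variance:F6}");
```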
Constructors
BayesianNeuralNetwork(NeuralNetworkArchitecture<T>, int)
Initializes a new instance of the BayesianNeuralNetwork class.
public BayesianNeuralNetwork(NeuralNetworkArchitecture<T> architecture, int numSamples = 30)
Parameters
architecture (NeuralNetworkArchitecture<T>): The network architecture.
numSamples (int): The number of forward passes used for uncertainty estimation (default: 30).
Remarks
For Beginners: The number of samples determines how many times we run the network with different weight samples to estimate uncertainty. More samples = better uncertainty estimates but slower inference. 30 is usually a good balance.
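As a usage sketch (assuming you already have a configured NeuralNetworkArchitecture<T>; building one is covered by that type's own documentation):

```csharp
// Sketch: constructing BNNs from an already-configured architecture.
BayesianNeuralNetwork<double> BuildNetwork(NeuralNetworkArchitecture<double> architecture)
{
    // The default of 30 samples balances estimate quality against speed.
    return new BayesianNeuralNetwork<double>(architecture);
}

BayesianNeuralNetwork<double> BuildCarefulNetwork(NeuralNetworkArchitecture<double> architecture)
{
    // For safety-critical use, more samples give steadier uncertainty
    // estimates at the cost of proportionally slower inference.
    return new BayesianNeuralNetwork<double>(architecture, numSamples: 100);
}
```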
Properties
SupportsTraining
Indicates whether this network supports training (learning from data).
public override bool SupportsTraining { get; }
Property Value
- bool
Remarks
A neural network is considered trainable when at least one layer supports training.
Methods
ComputeKLDivergence()
Computes the total KL divergence from all Bayesian layers.
public T ComputeKLDivergence()
Returns
- T
The sum of KL divergences.
Remarks
For Beginners: This is used during training to regularize the weight distributions. It's added to the main loss to prevent the network from becoming overconfident.
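If you write a custom training loop, the KL term is typically folded into the loss roughly as sketched below. The data loss and the scaling by dataset size are standard variational-inference practice, not something this page documents for AiDotNet specifically:

```csharp
// Illustrative sketch (with T = double, so ComputeKLDivergence() returns double).
// `dataLoss` and `trainingSetSize` are assumed to come from your own loop.
static double TotalLoss(double dataLoss, double kl, int trainingSetSize)
{
    // Dividing the KL term by the dataset size keeps the regularizer from
    // overwhelming the data term, so the network stays calibrated rather
    // than overconfident.
    return dataLoss + kl / trainingSetSize;
}

// Usage: double loss = TotalLoss(dataLoss, bnn.ComputeKLDivergence(), 50_000);
```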
EstimateAleatoricUncertainty(Tensor<T>)
Estimates aleatoric (data) uncertainty.
public Tensor<T> EstimateAleatoricUncertainty(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor.
Returns
- Tensor<T>
The aleatoric uncertainty estimate.
Remarks
For Beginners: Aleatoric uncertainty represents irreducible randomness in the data itself. For example, if you're predicting dice rolls, there's inherent randomness that can't be eliminated.
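The dice example can be made concrete: the variance of a fair die is a property of the die itself, so collecting more rolls sharpens your estimate of that variance but never shrinks it. A standalone demonstration (outside the library):

```csharp
using System;
using System.Linq;

// More data does not reduce aleatoric uncertainty: the sample variance of a
// fair die converges to its true value (35/12 ≈ 2.917) instead of shrinking to zero.
var rng = new Random(0);
foreach (int n in new[] { 100, 10_000, 1_000_000 })
{
    double[] rolls = Enumerable.Range(0, n).Select(_ => (double)rng.Next(1, 7)).ToArray();
    double mean = rolls.Average();
    double variance = rolls.Select(r => (r - mean) * (r - mean)).Average();
    Console.WriteLine($"n = {n,9}: sample variance ≈ {variance:F3}");
}
```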
EstimateEpistemicUncertainty(Tensor<T>)
Estimates epistemic (model) uncertainty.
public Tensor<T> EstimateEpistemicUncertainty(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor.
Returns
- Tensor<T>
The epistemic uncertainty estimate.
Remarks
For Beginners: Epistemic uncertainty represents the model's lack of knowledge. This type of uncertainty can be reduced by collecting more training data. It's high when the model encounters inputs unlike anything it was trained on.
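A sketch contrasting the two estimates for the same input. How you reduce the returned tensors to scalars depends on Tensor<T>'s API, which this page does not document, so that step is left as a comment:

```csharp
// Sketch: querying both uncertainty types and acting on them.
static void CompareUncertainties(BayesianNeuralNetwork<double> bnn, Tensor<double> input)
{
    // High epistemic uncertainty: the input is unlike the training data;
    // collecting more data in that region should reduce it.
    Tensor<double> epistemic = bnn.EstimateEpistemicUncertainty(input);

    // High aleatoric uncertainty: the data itself is noisy; no amount of
    // extra data will reduce it, so widen error bars instead of retraining.
    Tensor<double> aleatoric = bnn.EstimateAleatoricUncertainty(input);

    // Reduce each tensor to a scalar (e.g. a mean over the outputs) using
    // the Tensor<T> API of your AiDotNet version, then branch on thresholds.
}
```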
InitializeLayers()
Initializes the layers of the neural network based on the architecture.
protected override void InitializeLayers()
Remarks
This method sets up the neural network's structure by either:
1. Using custom layers provided in the architecture, or
2. Creating default layers if none were specified.
The layers determine how data flows through the network and how computations are performed.
For Beginners: This method sets up the building blocks of your neural network.
Think of this as assembling the components of your network:
- If you've specified exactly what layers you want, those are used
- If not, standard layers are created based on your architecture settings
Layers are the key processing units in a neural network. Common types include:
- Input Layer: Receives your data
- Hidden Layers: Process the information, extracting patterns
- Output Layer: Produces the final prediction
Each layer contains neurons that apply mathematical operations and activation functions to transform the data as it flows through the network.
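In schematic form, the method performs the two-branch dispatch below. The member names (Architecture.Layers, Layers, the default-layer factory) are placeholders for illustration, not AiDotNet's actual internals:

```csharp
// Schematic of the two-branch setup, with hypothetical member names.
protected override void InitializeLayers()
{
    if (Architecture.Layers is { Count: > 0 })
    {
        // Custom layers were provided: use them exactly as specified.
        Layers.AddRange(Architecture.Layers);
    }
    else
    {
        // Nothing specified: build a default input -> hidden -> output
        // stack from the architecture's size settings.
        Layers.AddRange(CreateDefaultLayers(Architecture));
    }
}
```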
PredictWithUncertainty(Tensor<T>)
Predicts output with uncertainty estimates.
public UncertaintyPredictionResult<T, Tensor<T>> PredictWithUncertainty(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor.
Returns
- UncertaintyPredictionResult<T, Tensor<T>>
A prediction result augmented with uncertainty information.
Remarks
For Beginners: This method runs the network multiple times with different sampled weights and returns both the average prediction and how much the predictions varied.
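Usage sketch. The members exposed by UncertaintyPredictionResult<T, Tensor<T>> (the averaged prediction and the spread) aren't documented on this page, so they are referenced only in comments:

```csharp
// Sketch: route uncertain predictions to a human instead of acting on them.
static void ScreenInput(BayesianNeuralNetwork<double> bnn, Tensor<double> input)
{
    UncertaintyPredictionResult<double, Tensor<double>> result =
        bnn.PredictWithUncertainty(input);

    // `result` carries the prediction averaged over the sampled forward
    // passes plus a measure of how much those passes disagreed. In a
    // medical or driving pipeline, high disagreement is the signal to
    // defer: flag the case for review rather than trusting the mean.
}
```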
Train(Tensor<T>, Tensor<T>)
Trains the neural network on input-output pairs.
public override void Train(Tensor<T> input, Tensor<T> expectedOutput)
Parameters
input (Tensor<T>): The input tensor for training.
expectedOutput (Tensor<T>): The expected output tensor.
Remarks
This method performs one step of training on a single input-output pair or batch. It computes the forward pass, calculates the error, and backpropagates to update the network's parameters. For full training, this method should be called repeatedly with different inputs from the training dataset.
For Beginners: This method teaches the network to make better predictions.
The training process works like this:
- Input data is fed into the network
- The network makes a prediction (forward pass)
- The prediction is compared to the expected output to calculate error
- The error is propagated backward through the network (backpropagation)
- The network's parameters are adjusted to reduce the error
Think of it like learning from mistakes:
- The network makes a guess
- It sees how far off it was
- It adjusts its approach to do better next time
This method performs one iteration of this process. To fully train a network, you'd typically call this method many times with different examples from your training data.
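A minimal outer loop built from repeated Train calls, as the remarks describe. The batch container and epoch count are assumptions about your own data pipeline, not part of this API:

```csharp
using System.Collections.Generic;

// Sketch: full training = many Train(...) calls over the dataset's batches.
static void Fit(BayesianNeuralNetwork<double> bnn,
                IReadOnlyList<(Tensor<double> Input, Tensor<double> Target)> batches,
                int epochs = 10)
{
    for (int epoch = 0; epoch < epochs; epoch++)
    {
        foreach (var (input, target) in batches)
        {
            // Each call runs one forward pass, computes the error, and
            // backpropagates one parameter update.
            bnn.Train(input, target);
        }
    }
}
```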