Class DeepBeliefNetwork<T>
- Namespace: AiDotNet.NeuralNetworks
- Assembly: AiDotNet.dll
Represents a Deep Belief Network, a generative graphical model composed of multiple layers of Restricted Boltzmann Machines.
public class DeepBeliefNetwork<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance: NeuralNetworkBase<T> -> DeepBeliefNetwork<T>
Remarks
A Deep Belief Network (DBN) is a probabilistic, generative model composed of multiple layers of stochastic latent variables. It is built by stacking multiple Restricted Boltzmann Machines (RBMs), where each RBM's hidden layer serves as the visible (input) layer of the next RBM. DBNs are trained in two phases: unsupervised pre-training, followed by supervised fine-tuning. Because the first phase needs no labels, a DBN can learn complex patterns in data even with limited labeled examples.
For Beginners: A Deep Belief Network is like a tower of pattern-recognizing layers.
Imagine building a tower where:
- Each floor of the tower is a Restricted Boltzmann Machine (RBM)
- The bottom floor learns simple patterns from the raw data
- Each higher floor learns more complex patterns based on what the floor below it discovered
- The tower is built and trained one floor at a time, from bottom to top
For example, if analyzing images of faces:
- The first floor might learn to detect edges and basic shapes
- The middle floors might recognize features like eyes, noses, and mouths
- The top floors might identify complete facial expressions or identities
This layer-by-layer approach helps the network discover meaningful patterns even when you don't have a lot of labeled examples.
Constructors
DeepBeliefNetwork(NeuralNetworkArchitecture<T>, T?, int, int, ILossFunction<T>?)
Initializes a new instance of the DeepBeliefNetwork<T> class with the specified architecture.
public DeepBeliefNetwork(NeuralNetworkArchitecture<T> architecture, T? learningRate = default, int epochs = 10, int batchSize = 32, ILossFunction<T>? lossFunction = null)
Parameters
architecture (NeuralNetworkArchitecture<T>): The neural network architecture configuration, which must include RBM layers.
learningRate (T): The learning rate for fine-tuning. Default is 0.01 converted to type T.
epochs (int): The number of epochs for fine-tuning. Default is 10.
batchSize (int): The batch size for training. Default is 32.
lossFunction (ILossFunction<T>): Optional. Default is null.
Remarks
This constructor creates a new Deep Belief Network with the specified architecture. The architecture must include a collection of Restricted Boltzmann Machines (RBMs) that will form the pre-training layers of the network. The constructor initializes the network by setting up these RBM layers.
For Beginners: This creates a new Deep Belief Network with your chosen settings.
When creating a Deep Belief Network:
- You provide an "architecture" that defines how the network is structured
- The architecture must include a set of RBM layers (the floors of our tower)
- The constructor sets up the initial structure, but doesn't train the network yet
Think of it like designing a blueprint for the tower before construction begins.
Properties
SupportsTraining
Indicates whether the network supports training (learning from data).
public override bool SupportsTraining { get; }
Property Value
- bool
Remarks
For Beginners: This tells you if the network can learn from data.
The Deep Belief Network supports both:
- Unsupervised pre-training (learning patterns without labels)
- Supervised fine-tuning (improving performance for specific tasks)
This property always returns true because DBNs are designed to learn and improve with training.
Methods
CreateNewInstance()
Creates a new instance of the deep belief network model.
protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()
Returns
- IFullModel<T, Tensor<T>, Tensor<T>>
A new instance of the deep belief network model with the same configuration.
Remarks
This method creates a new instance of the deep belief network model with the same configuration as the current instance. It is used internally during serialization/deserialization processes to create a fresh instance that can be populated with the serialized data. The new instance will have the same architecture, learning rate, epochs, batch size, and loss function as the original.
For Beginners: This method creates a copy of the network structure without copying the learned data.
Think of it like making a blueprint copy of the tower:
- It copies the same overall design (architecture)
- It preserves settings like learning rate and batch size
- It maintains the same RBM layer structure
- But it doesn't copy any of the learned patterns and weights
This is primarily used when saving or loading models, creating an empty framework that the saved parameters can be loaded into later.
DeserializeNetworkSpecificData(BinaryReader)
Deserializes network-specific data for the Deep Belief Network.
protected override void DeserializeNetworkSpecificData(BinaryReader reader)
Parameters
reader (BinaryReader): The BinaryReader to read the data from.
Remarks
This method reads the specific configuration and state of the Deep Belief Network from a binary stream. It reconstructs the network's structure, including RBM layers and training parameters, to match the state of the network when it was serialized.
For Beginners: This method loads the unique settings of your Deep Belief Network.
It reads:
- The number of RBM layers
- The configuration of each RBM layer
- Training parameters like learning rate, epochs, and batch size
Loading these details allows you to recreate the exact same network structure and state that was previously saved.
GetModelMetadata()
Gets metadata about the Deep Belief Network model.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
A ModelMetadata<T> object containing information about the model.
Remarks
This method returns metadata that describes the Deep Belief Network, including its type, architecture details, and training parameters. This information can be useful for model management, documentation, and versioning.
For Beginners: This provides a summary of your network's configuration.
The metadata includes:
- The type of model (Deep Belief Network)
- The number of RBM layers in the network
- The size of each layer
- Training parameters like learning rate and epochs
This is useful for:
- Documenting your model
- Comparing different model configurations
- Reproducing your model setup later
InitializeLayers()
Initializes the layers of the Deep Belief Network based on the architecture.
protected override void InitializeLayers()
Remarks
This method sets up the layers of the Deep Belief Network. If custom layers are provided in the architecture, those layers are used. Otherwise, default layers are created based on the architecture's specifications. After setting up the regular layers, the method validates the RBM layers to ensure they have compatible dimensions and are properly configured.
For Beginners: This method builds the actual structure of the network.
When initializing the layers:
- If you've specified your own custom layers, the network will use those
- If not, the network will create a standard set of layers
- The method also checks that the RBM layers (our tower floors) are properly designed
- Each floor must connect properly to the floors above and below it
This is like making sure all the pieces of your tower will fit together properly before you start building it.
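The compatibility check described above can be sketched in a few lines. This is a minimal illustration, not AiDotNet's implementation: the RbmStackValidation class and the (Visible, Hidden) tuples are hypothetical stand-ins for the real RBM layer objects, and the rule that one layer's hidden size must equal the next layer's visible size is assumed from the stacking description in the class remarks.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class RbmStackValidation
{
    // Each entry is (visible size, hidden size) for one RBM, ordered bottom to top.
    public static void Validate(IReadOnlyList<(int Visible, int Hidden)> rbms)
    {
        if (rbms.Count == 0)
            throw new ArgumentException("At least one RBM layer is required.");

        for (int i = 0; i < rbms.Count; i++)
        {
            if (rbms[i].Visible <= 0 || rbms[i].Hidden <= 0)
                throw new ArgumentException($"RBM {i} has a non-positive size.");

            // The hidden layer of the floor below feeds the visible layer of this floor.
            if (i > 0 && rbms[i].Visible != rbms[i - 1].Hidden)
                throw new ArgumentException(
                    $"RBM {i} expects {rbms[i].Visible} inputs, but RBM {i - 1} produces {rbms[i - 1].Hidden}.");
        }
    }
}
```

For example, a stack of (784, 500) followed by (500, 200) passes, while (784, 500) followed by (400, 200) is rejected because the second floor does not line up with the first.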
Predict(Tensor<T>)
Makes a prediction using the current state of the Deep Belief Network.
public override Tensor<T> Predict(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to make a prediction for.
Returns
- Tensor<T>
The predicted output tensor after passing through all layers of the network.
Remarks
This method generates a prediction by passing the input tensor through each layer of the Deep Belief Network in sequence. Each layer processes the output of the previous layer, transforming the data until it reaches the final output layer. The result is a tensor representing the network's prediction.
For Beginners: This method uses the network to make a prediction based on input data.
The prediction process works like this:
- The input data enters the first layer of the network
- Each layer processes the data and passes it to the next layer
- The data is transformed as it flows up through the tower
- The final layer produces the prediction result
Once the network is trained, this is how you use it to recognize patterns, classify new data, or make predictions.
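The layer-by-layer flow can be pictured with plain delegates. The sketch below is an assumption-laden simplification: SequentialForward is a hypothetical helper, and each layer is modeled as a function from double[] to double[] rather than as an AiDotNet layer operating on Tensor<T>.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class SequentialForward
{
    // Feeds the input through every layer in order; each layer's output
    // becomes the next layer's input.
    public static double[] Predict(double[] input, IEnumerable<Func<double[], double[]>> layers)
    {
        double[] current = input;
        foreach (Func<double[], double[]> layer in layers)
            current = layer(current);
        return current;
    }
}
```

The output of the last delegate is the prediction. With no layers at all, the input is returned unchanged.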
PretrainGreedyLayerwise(Tensor<T>, int, T?, int)
Performs unsupervised pre-training of the DBN using a greedy layer-wise approach.
public void PretrainGreedyLayerwise(Tensor<T> trainingData, int pretrainingEpochs = 10, T? pretrainingLearningRate = default, int cdSteps = 1)
Parameters
trainingData (Tensor<T>): The training data tensor.
pretrainingEpochs (int): The number of epochs for pre-training each RBM layer. Default is 10.
pretrainingLearningRate (T): The learning rate for pre-training. Default is 0.1 converted to type T.
cdSteps (int): The number of contrastive divergence steps. Default is 1.
Remarks
This method implements the greedy layer-wise pre-training algorithm for Deep Belief Networks. Each RBM layer is trained separately, starting from the bottom layer and moving up. After a layer is trained, the training data is transformed through that layer to create the training data for the next layer. This bottom-up approach helps the network learn a hierarchical representation of the data.
For Beginners: This method teaches each floor of the tower one at a time, from bottom to top.
The pre-training process works like this:
- Start with the raw input data and train the bottom RBM layer
- Use the bottom layer to transform the data and train the second layer
- Continue this process, training each layer with data transformed by all layers below it
This step-by-step approach:
- Helps the network learn increasingly abstract patterns
- Makes training more stable and effective
- Allows the network to learn useful features even without labeled data
After pre-training, the network has learned general patterns in your data and is ready for fine-tuning on specific tasks.
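The greedy layer-wise procedure can be sketched with a tiny, self-contained RBM. Everything here is hypothetical and simplified: TinyRbm and GreedyPretraining are illustrative names, the code works on double[] rather than Tensor<T>, it updates one example at a time instead of in batches, and it passes hidden probabilities (not samples) up to the next layer. It shows the general technique, not necessarily what PretrainGreedyLayerwise does internally.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal RBM for illustration; not AiDotNet's RBM layer type.
public sealed class TinyRbm
{
    private readonly int _visible, _hidden;
    private readonly double[,] _w;     // weights, indexed [visible, hidden]
    private readonly double[] _a, _b;  // visible and hidden biases
    private readonly Random _rng;

    public TinyRbm(int visible, int hidden, Random rng)
    {
        _visible = visible; _hidden = hidden; _rng = rng;
        _w = new double[visible, hidden];
        _a = new double[visible];
        _b = new double[hidden];
        for (int i = 0; i < visible; i++)
            for (int j = 0; j < hidden; j++)
                _w[i, j] = (rng.NextDouble() - 0.5) * 0.02; // small random start
    }

    private static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    // P(h_j = 1 | v) for every hidden unit.
    public double[] HiddenProbs(double[] v)
    {
        var h = new double[_hidden];
        for (int j = 0; j < _hidden; j++)
        {
            double sum = _b[j];
            for (int i = 0; i < _visible; i++) sum += v[i] * _w[i, j];
            h[j] = Sigmoid(sum);
        }
        return h;
    }

    // P(v_i = 1 | h) for every visible unit.
    private double[] VisibleProbs(double[] h)
    {
        var v = new double[_visible];
        for (int i = 0; i < _visible; i++)
        {
            double sum = _a[i];
            for (int j = 0; j < _hidden; j++) sum += h[j] * _w[i, j];
            v[i] = Sigmoid(sum);
        }
        return v;
    }

    // Draws a binary sample from unit-wise probabilities.
    private double[] Sample(double[] p)
    {
        var s = new double[p.Length];
        for (int i = 0; i < p.Length; i++) s[i] = _rng.NextDouble() < p[i] ? 1.0 : 0.0;
        return s;
    }

    // One contrastive divergence (CD-k) update from a single example.
    public void Update(double[] v0, double learningRate, int cdSteps)
    {
        double[] h0 = HiddenProbs(v0);
        double[] vk = v0, hk = h0;
        for (int step = 0; step < cdSteps; step++)
        {
            vk = VisibleProbs(Sample(hk)); // reconstruct the input from sampled hidden units
            hk = HiddenProbs(vk);
        }
        for (int i = 0; i < _visible; i++)
        {
            for (int j = 0; j < _hidden; j++)
                _w[i, j] += learningRate * (v0[i] * h0[j] - vk[i] * hk[j]);
            _a[i] += learningRate * (v0[i] - vk[i]);
        }
        for (int j = 0; j < _hidden; j++) _b[j] += learningRate * (h0[j] - hk[j]);
    }
}

public static class GreedyPretraining
{
    // Trains each RBM in turn, bottom to top, and returns the data as
    // transformed by the full stack.
    public static double[][] Pretrain(IReadOnlyList<TinyRbm> rbms, double[][] data,
        int epochs, double learningRate, int cdSteps)
    {
        double[][] current = data;
        foreach (TinyRbm rbm in rbms)
        {
            for (int epoch = 0; epoch < epochs; epoch++)
                foreach (double[] example in current)
                    rbm.Update(example, learningRate, cdSteps);

            // The trained layer's hidden activations become the next layer's training data.
            var next = new double[current.Length][];
            for (int n = 0; n < current.Length; n++) next[n] = rbm.HiddenProbs(current[n]);
            current = next;
        }
        return current;
    }
}
```

Note that each RBM is finished before the next one starts, and lower layers are never revisited. That is what makes the procedure "greedy".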
SerializeNetworkSpecificData(BinaryWriter)
Serializes network-specific data for the Deep Belief Network.
protected override void SerializeNetworkSpecificData(BinaryWriter writer)
Parameters
writer (BinaryWriter): The BinaryWriter to write the data to.
Remarks
This method writes the specific configuration and state of the Deep Belief Network to a binary stream. This includes training parameters and RBM layer configurations that need to be preserved for later reconstruction of the network.
For Beginners: This method saves the unique settings of your Deep Belief Network.
It writes:
- The number of RBM layers
- The configuration of each RBM layer
- Training parameters like learning rate, epochs, and batch size
Saving these details allows you to recreate the exact same network structure and state later.
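A round trip through BinaryWriter and BinaryReader might look like the sketch below. The DbnSettings and DbnSettingsIo types, the field order, and the use of double for the learning rate are all assumptions made for illustration; the actual binary layout used by AiDotNet is not described here. The one rule that always holds is that values must be read back in exactly the order they were written.

```csharp
using System;
using System.IO;

// Hypothetical settings container for illustration; not an AiDotNet type.
public sealed class DbnSettings
{
    public (int Visible, int Hidden)[] Rbms = new (int Visible, int Hidden)[0];
    public double LearningRate;
    public int Epochs;
    public int BatchSize;
}

public static class DbnSettingsIo
{
    // Writes the layer count, each layer's sizes, then the training parameters.
    public static void Write(BinaryWriter writer, DbnSettings settings)
    {
        writer.Write(settings.Rbms.Length);
        foreach (var (visible, hidden) in settings.Rbms)
        {
            writer.Write(visible);
            writer.Write(hidden);
        }
        writer.Write(settings.LearningRate);
        writer.Write(settings.Epochs);
        writer.Write(settings.BatchSize);
    }

    // Reads the same fields in the same order they were written.
    public static DbnSettings Read(BinaryReader reader)
    {
        int count = reader.ReadInt32();
        var rbms = new (int Visible, int Hidden)[count];
        for (int i = 0; i < count; i++)
        {
            int visible = reader.ReadInt32();
            int hidden = reader.ReadInt32();
            rbms[i] = (visible, hidden);
        }
        double learningRate = reader.ReadDouble();
        int epochs = reader.ReadInt32();
        int batchSize = reader.ReadInt32();
        return new DbnSettings
        {
            Rbms = rbms, LearningRate = learningRate, Epochs = epochs, BatchSize = batchSize
        };
    }
}
```

Writing the layer count first lets the reader know how many size pairs to expect before it reaches the training parameters.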
Train(Tensor<T>, Tensor<T>)
Performs supervised fine-tuning of the Deep Belief Network after pre-training.
public override void Train(Tensor<T> input, Tensor<T> expectedOutput)
Parameters
input (Tensor<T>): The input training data.
expectedOutput (Tensor<T>): The expected output for the given input data.
Remarks
This method implements the supervised fine-tuning phase of training a Deep Belief Network. After the unsupervised pre-training of individual RBM layers, this method uses backpropagation and gradient descent to fine-tune the entire network for a specific task. This phase helps the network adapt its pre-trained features to perform well on the specific supervised learning task.
For Beginners: This method fine-tunes the entire network for a specific task.
After pre-training each layer individually:
- We now train the entire network end-to-end
- We use labeled data (inputs with known correct outputs)
- The network compares its predictions with the expected outputs
- It adjusts its parameters to make its predictions more accurate
Think of it like:
- Pre-training: Teaching general pattern recognition (like learning to see)
- Fine-tuning: Teaching a specific task using those patterns (like identifying specific objects)
This two-phase approach often works better than training from scratch, especially when you don't have a lot of labeled examples.
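The "compare, then adjust" loop can be illustrated with a single gradient-descent step on one sigmoid output unit. This is a deliberately reduced sketch: FineTuneStep is a hypothetical helper, it assumes a cross-entropy loss, and it updates only the output unit's weights. Full fine-tuning would propagate the same error signal back through the pre-trained layers as well, and the loss actually used depends on the lossFunction passed to the constructor.

```csharp
using System;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class FineTuneStep
{
    private static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    // Prediction of a single sigmoid unit sitting on top of the learned features.
    public static double Predict(double[] weights, double bias, double[] features)
    {
        double sum = bias;
        for (int i = 0; i < weights.Length; i++) sum += weights[i] * features[i];
        return Sigmoid(sum);
    }

    // Performs one gradient-descent step in place and returns the
    // cross-entropy loss measured before the update.
    public static double Step(double[] weights, ref double bias,
        double[] features, double target, double learningRate)
    {
        double prediction = Predict(weights, bias, features);
        double error = prediction - target; // loss gradient w.r.t. the pre-activation

        for (int i = 0; i < weights.Length; i++)
            weights[i] -= learningRate * error * features[i];
        bias -= learningRate * error;

        return -(target * Math.Log(prediction) + (1.0 - target) * Math.Log(1.0 - prediction));
    }
}
```

Calling Step repeatedly on the same example should make the returned loss shrink, which is the "predictions become more accurate" behavior described above.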
UpdateParameters(Vector<T>)
Updates the parameters of all layers in the Deep Belief Network.
public override void UpdateParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): A vector containing the parameters to update all layers with.
Remarks
This method distributes the provided parameter vector among all the layers in the network. Each layer receives a portion of the parameter vector corresponding to its number of parameters. The method keeps track of the starting index for each layer's parameters in the input vector. This is typically used during the supervised fine-tuning phase that follows pre-training.
For Beginners: This method adjusts the network's internal values during fine-tuning.
When updating parameters:
- The input is a long list of numbers representing all values in the entire network
- The method divides this list into smaller chunks
- Each layer gets its own chunk of values
- The layers use these values to adjust their internal settings
After pre-training the individual RBM layers, this method helps fine-tune the entire network to improve its performance on specific tasks.
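The chunking described above amounts to walking the flat vector with a running start index. The sketch below uses plain arrays: ParameterSlicing is a hypothetical helper, and the per-layer parameter counts are passed in directly rather than queried from AiDotNet layer objects.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class ParameterSlicing
{
    // Splits a flat parameter array into one chunk per layer.
    // layerParameterCounts[i] is the number of parameters owned by layer i.
    public static List<double[]> Split(double[] parameters, IReadOnlyList<int> layerParameterCounts)
    {
        var chunks = new List<double[]>(layerParameterCounts.Count);
        int startIndex = 0; // where the current layer's parameters begin

        foreach (int count in layerParameterCounts)
        {
            if (startIndex + count > parameters.Length)
                throw new ArgumentException("Parameter vector is shorter than the layers require.");

            var chunk = new double[count];
            Array.Copy(parameters, startIndex, chunk, 0, count);
            chunks.Add(chunk);
            startIndex += count; // the next layer starts where this one ended
        }

        if (startIndex != parameters.Length)
            throw new ArgumentException("Parameter vector is longer than the layers require.");

        return chunks;
    }
}
```

For a vector of five values and layers owning two and three parameters, the first layer receives the first two values and the second layer receives the remaining three.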