Class DeepBoltzmannMachine<T>
- Namespace: AiDotNet.NeuralNetworks
- Assembly: AiDotNet.dll
Represents a Deep Boltzmann Machine (DBM), a hierarchical generative model consisting of multiple layers of stochastic neurons.
public class DeepBoltzmannMachine<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
- NeuralNetworkBase<T> → DeepBoltzmannMachine<T>
Remarks
A Deep Boltzmann Machine is an extension of the Restricted Boltzmann Machine to multiple hidden layers. It consists of a visible layer and multiple hidden layers with connections between adjacent layers but no connections within the same layer. DBMs are used for unsupervised learning, feature extraction, and generative modeling.
For Beginners: A Deep Boltzmann Machine is like a multi-story pattern detector.
Think of it as a series of layers, each learning increasingly abstract patterns:
- The visible layer represents the raw data (e.g., pixel values in an image)
- The first hidden layer might learn simple patterns (e.g., edges, corners)
- Higher hidden layers learn more complex patterns (e.g., shapes, objects)
- The deeper the network, the more abstract the patterns it can learn
For example, in an image recognition system:
- Layer 1 might detect edges and basic textures
- Layer 2 might combine these into simple shapes
- Layer 3 might recognize more complex objects
DBMs can both recognize patterns in data and generate new data with similar patterns.
Constructors
DeepBoltzmannMachine(NeuralNetworkArchitecture<T>, int, T, double, ILossFunction<T>?, IActivationFunction<T>?, int, int)
Initializes a new instance of the DeepBoltzmannMachine class with scalar activation.
public DeepBoltzmannMachine(NeuralNetworkArchitecture<T> architecture, int epochs, T learningRate, double learningRateDecay = 1, ILossFunction<T>? lossFunction = null, IActivationFunction<T>? activationFunction = null, int batchSize = 32, int cdSteps = 1)
Parameters
architecture (NeuralNetworkArchitecture<T>): The neural network architecture configuration.
epochs (int): The number of training epochs.
learningRate (T): The learning rate for parameter updates.
learningRateDecay (double): The learning rate decay factor per epoch. Default is 1.0 (no decay).
lossFunction (ILossFunction<T>?): The loss function used during training. Default is null, which selects the library's default loss.
activationFunction (IActivationFunction<T>?): The scalar activation function to use. Default is sigmoid.
batchSize (int): The number of examples in each training batch. Default is 32.
cdSteps (int): The number of contrastive divergence steps. Default is 1.
Remarks
This constructor creates a Deep Boltzmann Machine with the specified architecture and training parameters, using a scalar activation function that is applied element-wise to unit activations.
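A minimal construction sketch, assuming T = double and this scalar-activation overload. BuildArchitecture() is a hypothetical helper that stands in for however your project configures NeuralNetworkArchitecture<T>; the numeric values are illustrative, not recommendations.
// Construction sketch for a DBM with T = double and scalar activation.
// BuildArchitecture() is a hypothetical helper that returns a configured
// NeuralNetworkArchitecture<double> (layer sizes, etc.).
NeuralNetworkArchitecture<double> architecture = BuildArchitecture();
var dbm = new DeepBoltzmannMachine<double>(
    architecture,
    epochs: 50,                 // full passes over the training data
    learningRate: 0.01,         // step size for weight updates
    learningRateDecay: 0.95,    // shrink the learning rate by 5% each epoch
    lossFunction: null,         // null keeps the default loss
    activationFunction: null,   // null keeps the default sigmoid activation
    batchSize: 64,              // examples per mini-batch
    cdSteps: 1);                // contrastive divergence steps per update
Naming activationFunction explicitly also tells the compiler which of the two constructor overloads to bind.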
DeepBoltzmannMachine(NeuralNetworkArchitecture<T>, int, T, double, ILossFunction<T>?, IVectorActivationFunction<T>?, int, int)
Initializes a new instance of the DeepBoltzmannMachine class with vector activation.
public DeepBoltzmannMachine(NeuralNetworkArchitecture<T> architecture, int epochs, T learningRate, double learningRateDecay = 1, ILossFunction<T>? lossFunction = null, IVectorActivationFunction<T>? vectorActivationFunction = null, int batchSize = 32, int cdSteps = 1)
Parameters
architecture (NeuralNetworkArchitecture<T>): The neural network architecture configuration.
epochs (int): The number of training epochs.
learningRate (T): The learning rate for parameter updates.
learningRateDecay (double): The learning rate decay factor per epoch. Default is 1.0 (no decay).
lossFunction (ILossFunction<T>?): The loss function used during training. Default is null, which selects the library's default loss.
vectorActivationFunction (IVectorActivationFunction<T>?): The vector activation function to use. Default is sigmoid.
batchSize (int): The number of examples in each training batch. Default is 32.
cdSteps (int): The number of contrastive divergence steps. Default is 1.
Remarks
This constructor creates a Deep Boltzmann Machine with the specified architecture and training parameters, using a vector activation function that processes entire tensors at once for improved performance.
Methods
CreateNewInstance()
Creates a new instance of the Deep Boltzmann Machine model.
protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()
Returns
- IFullModel<T, Tensor<T>, Tensor<T>>
A new instance of the Deep Boltzmann Machine model with the same configuration.
Remarks
This method creates a new instance of the Deep Boltzmann Machine model with the same configuration as the current instance. It is used internally during serialization/deserialization processes to create a fresh instance that can be populated with the serialized data. The new instance will have the same architecture, training parameters, and activation function type as the original.
For Beginners: This method creates a copy of the network structure without copying the learned data.
Think of it like making a blueprint copy of the DBM:
- It copies the same multi-layer structure (architecture)
- It uses the same learning settings (learning rate, epochs, etc.)
- It keeps the same activation function (how neurons respond to input)
- But it doesn't copy any of the weights and biases (the learned knowledge)
This is primarily used when saving or loading models, creating an empty framework that the saved parameters can be loaded into later.
DeserializeNetworkSpecificData(BinaryReader)
Deserializes Deep Boltzmann Machine-specific data from a binary reader.
protected override void DeserializeNetworkSpecificData(BinaryReader reader)
Parameters
reader (BinaryReader): The BinaryReader to read the data from.
Remarks
This method reads the specific parameters and state of the Deep Boltzmann Machine from a binary stream. It reconstructs the layer sizes, training parameters, activation functions, weights, and biases. After reading all data, it reinitializes the layers to ensure the network structure is properly set up.
For Beginners: This method rebuilds the DBM from saved data.
Imagine "unpacking" the neural network suitcase we packed earlier:
- We unpack the network's structure (number and sizes of layers)
- We set up the learning settings (learning rate, epochs, etc.)
- We restore the activation function
- We carefully place all the weights and biases back where they belong
After unpacking, we make sure everything is connected properly (reinitialize layers). This allows us to continue using the network exactly where we left off, with all its learned knowledge intact.
GetModelMetadata()
Gets metadata about the Deep Boltzmann Machine model.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
A ModelMetadata<T> object containing information about the model.
Remarks
This method returns metadata about the DBM, including the model type, number of layers, layer sizes, and training parameters. This information can be useful for model management and serialization.
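A quick retrieval sketch, reusing the dbm instance from the construction example above; the exact properties exposed by ModelMetadata<T> are not shown to avoid assuming names the library may not use.
// Retrieve descriptive metadata for model management or logging.
ModelMetadata<double> metadata = dbm.GetModelMetadata();
// The returned object bundles the model type, layer sizes, and training
// parameters described in the remarks above.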
InitializeLayers()
Initializes the layers of the neural network.
protected override void InitializeLayers()
Remarks
This method sets up the layer structure of the DBM based on the provided architecture. It either uses user-specified layers or creates default layers if none are provided. After initializing the layers, it extracts the layer sizes and initializes the parameters.
Predict(Tensor<T>)
Makes a prediction using the Deep Boltzmann Machine.
public override Tensor<T> Predict(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to make predictions for.
Returns
- Tensor<T>
The predicted reconstruction of the input.
Remarks
This method makes a prediction by reconstructing the input through the DBM. It propagates the input up through all hidden layers and then back down to generate a reconstruction of the original input.
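A reconstruction sketch, reusing the dbm instance constructed earlier. LoadTrainingData() is a hypothetical helper standing in for however you obtain a Tensor<double> of visible-layer examples.
// LoadTrainingData() is a hypothetical helper: obtain a Tensor<double> of
// visible-layer data however your project normally does.
Tensor<double> trainingData = LoadTrainingData();
// Propagate the data up through the hidden layers and back down to the
// visible layer to obtain a reconstruction.
Tensor<double> reconstruction = dbm.Predict(trainingData);
// The closer the reconstruction is to the original input, the better the DBM
// has captured the structure of the data.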
PretrainLayerwise(Tensor<T>, int, T)
Performs layer-wise pretraining of the DBM using a greedy approach.
public void PretrainLayerwise(Tensor<T> input, int pretrainingEpochs, T pretrainingLearningRate)
Parameters
input (Tensor<T>): The input training data.
pretrainingEpochs (int): The number of epochs for pretraining each layer.
pretrainingLearningRate (T): The learning rate for pretraining.
Remarks
This method pretrains the DBM layer by layer, treating each adjacent pair of layers as a separate RBM. This greedy approach often leads to better final results.
For Beginners: This is like training the network one layer at a time.
Instead of training the whole network at once:
- First train the bottom two layers (visible and first hidden)
- Then freeze those learned weights and train the first and second hidden layers as the next pair
- Continue this process up the network
This step-by-step approach:
- Makes training more stable
- Often leads to better final results
- Can be thought of as "teaching the basics before the advanced concepts" (see the sketch below)
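A pretraining sketch for the workflow described above, reusing the hypothetical dbm and trainingData from the earlier sketches; the epoch count and learning rate are placeholders, not recommendations.
// Greedy layer-wise pretraining: each adjacent pair of layers is trained as a
// separate RBM before the full DBM is trained.
dbm.PretrainLayerwise(trainingData, pretrainingEpochs: 10, pretrainingLearningRate: 0.05);
A common workflow is to call PretrainLayerwise first and then Train on the same data.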
SerializeNetworkSpecificData(BinaryWriter)
Serializes Deep Boltzmann Machine-specific data to a binary writer.
protected override void SerializeNetworkSpecificData(BinaryWriter writer)
Parameters
writer (BinaryWriter): The BinaryWriter to write the data to.
Remarks
This method writes the specific parameters and state of the Deep Boltzmann Machine to a binary stream. It includes layer sizes, training parameters, activation functions, weights, and biases.
For Beginners: This method saves all the important details of the DBM to a file.
Think of it like packing a suitcase for your neural network:
- We pack the number and sizes of layers (the network's structure)
- We include training settings like learning rate and epochs (how the network learns)
- We save the activation function (how neurons in the network activate)
- We carefully pack all the weights and biases (what the network has learned)
This allows us to later "unpack" the network exactly as it was, preserving all its learned knowledge.
Train(Tensor<T>, Tensor<T>)
Trains the Deep Boltzmann Machine on the provided data.
public override void Train(Tensor<T> input, Tensor<T> expectedOutput)
Parameters
input (Tensor<T>): The input training data.
expectedOutput (Tensor<T>): The expected output (unused in DBMs, as they are self-supervised).
Remarks
This method trains the DBM on the provided data for the specified number of epochs. It divides the data into batches and trains on each batch, tracking and reporting the average loss for each epoch. The learning rate decays according to the specified learning rate decay factor.
For Beginners: This method teaches the DBM to recognize patterns in your data.
The training process:
- Divides your data into smaller batches for efficient processing
- Processes each batch through the DBM
- Updates the weights and biases to better reconstruct the input
- Repeats this for the specified number of epochs
- Tracks and reports the average error for each epoch
You should see the error decrease over time as the DBM learns.
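A training sketch, reusing dbm and trainingData from the earlier sketches. Because DBM training is self-supervised, the expected-output argument is ignored; passing the input tensor again is a convention assumed here to satisfy the signature, not a library requirement.
// Train for the number of epochs specified at construction time.
// The second argument is unused for DBMs.
dbm.Train(trainingData, trainingData);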
UpdateParameters(Vector<T>)
Updates the parameters of the DBM with the given vector of parameter values.
public override void UpdateParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): A vector containing all parameters to set.
Remarks
This method updates all the parameters of the DBM (weights and biases) from a single vector. It expects the parameters to be arranged in the same order as they are returned by GetParameters.
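A round-trip sketch reusing the dbm instance from the earlier examples. The remarks reference GetParameters; its return type is assumed here to be Vector<double> for T = double.
// Read the flat parameter vector, optionally replace its values (for example
// with parameters restored from elsewhere), and write it back in the same order.
Vector<double> parameters = dbm.GetParameters();
dbm.UpdateParameters(parameters);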