Class DeepBoltzmannMachine<T>
- Namespace: AiDotNet.NeuralNetworks
- Assembly: AiDotNet.dll
Represents a Deep Boltzmann Machine (DBM), a hierarchical generative model consisting of multiple layers of stochastic neurons.
public class DeepBoltzmannMachine<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
- NeuralNetworkBase<T> → DeepBoltzmannMachine<T>
Remarks
A Deep Boltzmann Machine is an extension of the Restricted Boltzmann Machine to multiple hidden layers. It consists of a visible layer and multiple hidden layers with connections between adjacent layers but no connections within the same layer. DBMs are used for unsupervised learning, feature extraction, and generative modeling.
For Beginners: A Deep Boltzmann Machine is like a multi-story pattern detector.
Think of it as a series of layers, each learning increasingly abstract patterns:
- The visible layer represents the raw data (e.g., pixel values in an image)
- The first hidden layer might learn simple patterns (e.g., edges, corners)
- Higher hidden layers learn more complex patterns (e.g., shapes, objects)
- The deeper the network, the more abstract the patterns it can learn
For example, in an image recognition system:
- Layer 1 might detect edges and basic textures
- Layer 2 might combine these into simple shapes
- Layer 3 might recognize more complex objects
DBMs can both recognize patterns in data and generate new data with similar patterns.
Constructors
DeepBoltzmannMachine(NeuralNetworkArchitecture<T>, int, T, double, ILossFunction<T>?, IActivationFunction<T>?, int, int)
Initializes a new instance of the DeepBoltzmannMachine class with scalar activation.
public DeepBoltzmannMachine(NeuralNetworkArchitecture<T> architecture, int epochs, T learningRate, double learningRateDecay = 1, ILossFunction<T>? lossFunction = null, IActivationFunction<T>? activationFunction = null, int batchSize = 32, int cdSteps = 1)
Parameters
architecture (NeuralNetworkArchitecture<T>): The neural network architecture configuration.
epochs (int): The number of training epochs.
learningRate (T): The learning rate for parameter updates.
learningRateDecay (double): The learning rate decay factor per epoch. Default is 1.0 (no decay).
lossFunction (ILossFunction<T>?): The loss function used during training. Default is null, which selects the library's default loss.
activationFunction (IActivationFunction<T>?): The scalar activation function to use. Default is sigmoid.
batchSize (int): The number of examples in each training batch. Default is 32.
cdSteps (int): The number of contrastive divergence steps. Default is 1.
Remarks
This constructor creates a Deep Boltzmann Machine with the specified architecture and training parameters, using a scalar activation function that is applied element-wise to unit activations.
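A minimal construction sketch, assuming T = double and this scalar-activation overload. BuildArchitecture() is a hypothetical helper that stands in for however your project configures NeuralNetworkArchitecture<T>; the numeric values are illustrative, not recommendations.
// Construction sketch for a DBM with T = double and scalar activation.
// BuildArchitecture() is a hypothetical helper that returns a configured
// NeuralNetworkArchitecture<double> (layer sizes, etc.).
NeuralNetworkArchitecture<double> architecture = BuildArchitecture();
var dbm = new DeepBoltzmannMachine<double>(
    architecture,
    epochs: 50,                 // full passes over the training data
    learningRate: 0.01,         // step size for weight updates
    learningRateDecay: 0.95,    // shrink the learning rate by 5% each epoch
    lossFunction: null,         // null keeps the default loss
    activationFunction: null,   // null keeps the default sigmoid activation
    batchSize: 64,              // examples per mini-batch
    cdSteps: 1);                // contrastive divergence steps per update
Naming activationFunction explicitly also tells the compiler which of the two constructor overloads to bind.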
DeepBoltzmannMachine(NeuralNetworkArchitecture<T>, int, T, double, ILossFunction<T>?, IVectorActivationFunction<T>?, int, int)
Initializes a new instance of the DeepBoltzmannMachine class with vector activation.
public DeepBoltzmannMachine(NeuralNetworkArchitecture<T> architecture, int epochs, T learningRate, double learningRateDecay = 1, ILossFunction<T>? lossFunction = null, IVectorActivationFunction<T>? vectorActivationFunction = null, int batchSize = 32, int cdSteps = 1)
Parameters
architecture (NeuralNetworkArchitecture<T>): The neural network architecture configuration.
epochs (int): The number of training epochs.
learningRate (T): The learning rate for parameter updates.
learningRateDecay (double): The learning rate decay factor per epoch. Default is 1.0 (no decay).
lossFunction (ILossFunction<T>?): The loss function used during training. Default is null, which selects the library's default loss.
vectorActivationFunction (IVectorActivationFunction<T>?): The vector activation function to use. Default is sigmoid.
batchSize (int): The number of examples in each training batch. Default is 32.
cdSteps (int): The number of contrastive divergence steps. Default is 1.
Remarks
This constructor creates a Deep Boltzmann Machine with the specified architecture and training parameters, using a vector activation function that processes entire tensors at once for improved performance.
Methods
CreateNewInstance()
Creates a new instance of the Deep Boltzmann Machine model.
protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()
Returns
- IFullModel<T, Tensor<T>, Tensor<T>>
A new instance of the Deep Boltzmann Machine model with the same configuration.
Remarks
This method creates a new instance of the Deep Boltzmann Machine model with the same configuration as the current instance. It is used internally during serialization/deserialization processes to create a fresh instance that can be populated with the serialized data. The new instance will have the same architecture, training parameters, and activation function type as the original.
For Beginners: This method creates a copy of the network structure without copying the learned data.
Think of it like making a blueprint copy of the DBM:
- It copies the same multi-layer structure (architecture)
- It uses the same learning settings (learning rate, epochs, etc.)
- It keeps the same activation function (how neurons respond to input)
- But it doesn't copy any of the weights and biases (the learned knowledge)
This is primarily used when saving or loading models, creating an empty framework that the saved parameters can be loaded into later.
DeserializeNetworkSpecificData(BinaryReader)
Deserializes Deep Boltzmann Machine-specific data from a binary reader.
protected override void DeserializeNetworkSpecificData(BinaryReader reader)
Parameters
reader (BinaryReader): The BinaryReader to read the data from.
Remarks
This method reads the specific parameters and state of the Deep Boltzmann Machine from a binary stream. It reconstructs the layer sizes, training parameters, activation functions, weights, and biases. After reading all data, it reinitializes the layers to ensure the network structure is properly set up.
For Beginners: This method rebuilds the DBM from saved data.
Imagine "unpacking" the neural network suitcase we packed earlier:
- We unpack the network's structure (number and sizes of layers)
- We set up the learning settings (learning rate, epochs, etc.)
- We restore the activation function
- We carefully place all the weights and biases back where they belong
After unpacking, we make sure everything is connected properly (reinitialize layers). This allows us to continue using the network exactly where we left off, with all its learned knowledge intact.
GetModelMetadata()
Gets metadata about the Deep Boltzmann Machine model.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
A ModelMetadata<T> object containing information about the model.
Remarks
This method returns metadata about the DBM, including the model type, number of layers, layer sizes, and training parameters. This information can be useful for model management and serialization.
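A quick retrieval sketch, reusing the dbm instance from the construction example above; the exact properties exposed by ModelMetadata<T> are not shown to avoid assuming names the library may not use.
// Retrieve descriptive metadata for model management or logging.
ModelMetadata<double> metadata = dbm.GetModelMetadata();
// The returned object bundles the model type, layer sizes, and training
// parameters described in the remarks above.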
InitializeLayers()
Initializes the layers of the neural network.
protected override void InitializeLayers()
Remarks
This method sets up the layer structure of the DBM based on the provided architecture. It either uses user-specified layers or creates default layers if none are provided. After initializing the layers, it extracts the layer sizes and initializes the parameters.
Predict(Tensor<T>)
Makes a prediction using the Deep Boltzmann Machine.
public override Tensor<T> Predict(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to make predictions for.
Returns
- Tensor<T>
The predicted reconstruction of the input.
Remarks
This method makes a prediction by reconstructing the input through the DBM. It propagates the input up through all hidden layers and then back down to generate a reconstruction of the original input.
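A reconstruction sketch, reusing the dbm instance constructed earlier. LoadTrainingData() is a hypothetical helper standing in for however you obtain a Tensor<double> of visible-layer examples.
// LoadTrainingData() is a hypothetical helper: obtain a Tensor<double> of
// visible-layer data however your project normally does.
Tensor<double> trainingData = LoadTrainingData();
// Propagate the data up through the hidden layers and back down to the
// visible layer to obtain a reconstruction.
Tensor<double> reconstruction = dbm.Predict(trainingData);
// The closer the reconstruction is to the original input, the better the DBM
// has captured the structure of the data.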
PretrainLayerwise(Tensor<T>, int, T)
Performs layer-wise pretraining of the DBM using a greedy approach.
public void PretrainLayerwise(Tensor<T> input, int pretrainingEpochs, T pretrainingLearningRate)
Parameters
input (Tensor<T>): The input training data.
pretrainingEpochs (int): The number of epochs for pretraining each layer.
pretrainingLearningRate (T): The learning rate for pretraining.
Remarks
This method pretrains the DBM layer by layer, treating each adjacent pair of layers as a separate RBM. This greedy approach often leads to better final results.
For Beginners: This is like training the network one layer at a time.
Instead of training the whole network at once:
- First train the bottom two layers (visible and first hidden)
- Then freeze those learned weights and train the first and second hidden layers as the next pair
- Continue this process up the network
This step-by-step approach:
- Makes training more stable
- Often leads to better final results
- Can be thought of as "teaching the basics before the advanced concepts" (see the sketch below)
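A pretraining sketch for the workflow described above, reusing the hypothetical dbm and trainingData from the earlier sketches; the epoch count and learning rate are placeholders, not recommendations.
// Greedy layer-wise pretraining: each adjacent pair of layers is trained as a
// separate RBM before the full DBM is trained.
dbm.PretrainLayerwise(trainingData, pretrainingEpochs: 10, pretrainingLearningRate: 0.05);
A common workflow is to call PretrainLayerwise first and then Train on the same data.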
SerializeNetworkSpecificData(BinaryWriter)
Serializes Deep Boltzmann Machine-specific data to a binary writer.
protected override void SerializeNetworkSpecificData(BinaryWriter writer)
Parameters
writer (BinaryWriter): The BinaryWriter to write the data to.
Remarks
This method writes the specific parameters and state of the Deep Boltzmann Machine to a binary stream. It includes layer sizes, training parameters, activation functions, weights, and biases.
For Beginners: This method saves all the important details of the DBM to a file.
Think of it like packing a suitcase for your neural network:
- We pack the number and sizes of layers (the network's structure)
- We include training settings like learning rate and epochs (how the network learns)
- We save the activation function (how neurons in the network activate)
- We carefully pack all the weights and biases (what the network has learned)
This allows us to later "unpack" the network exactly as it was, preserving all its learned knowledge.
Train(Tensor<T>, Tensor<T>)
Trains the Deep Boltzmann Machine on the provided data.
public override void Train(Tensor<T> input, Tensor<T> expectedOutput)
Parameters
input (Tensor<T>): The input training data.
expectedOutput (Tensor<T>): The expected output (unused in DBMs, as they are self-supervised).
Remarks
This method trains the DBM on the provided data for the specified number of epochs. It divides the data into batches and trains on each batch, tracking and reporting the average loss for each epoch. The learning rate decays according to the specified learning rate decay factor.
For Beginners: This method teaches the DBM to recognize patterns in your data.
The training process:
- Divides your data into smaller batches for efficient processing
- Processes each batch through the DBM
- Updates the weights and biases to better reconstruct the input
- Repeats this for the specified number of epochs
- Tracks and reports the average error for each epoch
You should see the error decrease over time as the DBM learns.
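A training sketch, reusing dbm and trainingData from the earlier sketches. Because DBM training is self-supervised, the expected-output argument is ignored; passing the input tensor again is a convention assumed here to satisfy the signature, not a library requirement.
// Train for the number of epochs specified at construction time.
// The second argument is unused for DBMs.
dbm.Train(trainingData, trainingData);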
UpdateParameters(Vector<T>)
Updates the parameters of the DBM with the given vector of parameter values.
public override void UpdateParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): A vector containing all parameters to set.
Remarks
This method updates all the parameters of the DBM (weights and biases) from a single vector. It expects the parameters to be arranged in the same order as they are returned by GetParameters.
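A round-trip sketch reusing the dbm instance from the earlier examples. The remarks reference GetParameters; its return type is assumed here to be Vector<double> for T = double.
// Read the flat parameter vector, optionally replace its values (for example
// with parameters restored from elsewhere), and write it back in the same order.
Vector<double> parameters = dbm.GetParameters();
dbm.UpdateParameters(parameters);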