Class RestrictedBoltzmannMachine<T>
- Namespace
- AiDotNet.NeuralNetworks
- Assembly
- AiDotNet.dll
Represents a Restricted Boltzmann Machine, which is a type of neural network that learns probability distributions over its inputs.
public class RestrictedBoltzmannMachine<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations, typically float or double.
- Inheritance
- object → NeuralNetworkBase<T> → RestrictedBoltzmannMachine<T>
Remarks
A Restricted Boltzmann Machine (RBM) is a two-layer neural network that learns to reconstruct its input data. Unlike feedforward networks, RBMs are generative models that learn the probability distribution of the training data. They consist of a visible layer (representing the input data) and a hidden layer (representing features), with connections between layers but no connections within a layer (hence "restricted"). RBMs are trained using an algorithm called Contrastive Divergence, which involves both forward and backward passes between layers.
For Beginners: A Restricted Boltzmann Machine is like a two-way translator between data and features.
Think of it like this:
- The visible layer is like words in English
- The hidden layer is like words in French
- The network learns how to translate back and forth between the languages
When you train an RBM:
- It learns to recognize patterns in your data (translate English to French)
- It also learns to recreate the original data from those patterns (translate French back to English)
For example, if you train an RBM on images of faces:
- The visible layer represents the pixel values of the images
- The hidden layer might learn to recognize features like "has a mustache" or "is smiling"
- Once trained, you could activate certain hidden units to generate new face images with specific features
RBMs can be used for dimensionality reduction, feature learning, pattern completion, and even generating new data samples similar to the training data.
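A rough end-to-end sketch of that workflow is shown below (hedged: the NeuralNetworkArchitecture<double> configuration and the imageBatch tensor are placeholders that depend on your data):

// Build an RBM with 784 visible units (e.g., 28×28 images) and 100 hidden features.
var architecture = new NeuralNetworkArchitecture<double>(/* configured for your data */);
var rbm = new RestrictedBoltzmannMachine<double>(
    architecture, visibleSize: 784, hiddenSize: 100,
    learningRate: 0.01, cdSteps: 1);

// Unsupervised training: the second argument is ignored for RBMs.
for (int epoch = 0; epoch < 10; epoch++)
    rbm.Train(imageBatch, imageBatch);

// Compress inputs into 100 learned features, or sample new data.
Tensor<double> features = rbm.ExtractFeatures(imageBatch);
Tensor<double> samples = rbm.GenerateSamples(numSamples: 5);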
Constructors
RestrictedBoltzmannMachine(NeuralNetworkArchitecture<T>, int, int, double, int, IActivationFunction<T>?, ILossFunction<T>?)
Initializes a new instance of the RestrictedBoltzmannMachine<T> class with the specified architecture, sizes, and scalar activation function.
public RestrictedBoltzmannMachine(NeuralNetworkArchitecture<T> architecture, int visibleSize, int hiddenSize, double learningRate = 0.01, int cdSteps = 1, IActivationFunction<T>? scalarActivation = null, ILossFunction<T>? lossFunction = null)
Parameters
architecture (NeuralNetworkArchitecture<T>): The neural network architecture to use for the RBM.
visibleSize (int): The number of neurons in the visible layer.
hiddenSize (int): The number of neurons in the hidden layer.
learningRate (double): The learning rate for weight updates. Defaults to 0.01.
cdSteps (int): The number of Contrastive Divergence steps per update. Defaults to 1.
scalarActivation (IActivationFunction<T>?): The scalar activation function to use. If null, a default activation is used.
lossFunction (ILossFunction<T>?): The loss function to use. If null, a default loss function is used.
Remarks
This constructor creates a new Restricted Boltzmann Machine with the specified visible and hidden layer sizes, using the provided scalar activation function. It initializes weights to small random values and biases to zero, which is a common starting point for training RBMs.
For Beginners: This sets up the RBM with specific dimensions and an activation function that works on one neuron at a time.
When creating a new RBM this way:
- You specify how many visible neurons (input values) you have
- You specify how many hidden neurons (feature detectors) you want
- You can optionally provide a specific activation function
The constructor sets up:
- A weights matrix connecting all visible neurons to all hidden neurons
- Bias values for all neurons (initially set to zero)
- The specified scalar activation function
This prepares the RBM for training, but it won't actually learn anything until you train it with data.
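For instance (a sketch; SigmoidActivation<double> is a hypothetical IActivationFunction<T> implementation, standing in for whatever your project provides):

var rbm = new RestrictedBoltzmannMachine<double>(
    architecture,        // an existing NeuralNetworkArchitecture<double>
    visibleSize: 784,
    hiddenSize: 100,
    learningRate: 0.01,
    cdSteps: 1,
    scalarActivation: new SigmoidActivation<double>());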
RestrictedBoltzmannMachine(NeuralNetworkArchitecture<T>, int, int, double, int, IVectorActivationFunction<T>?, ILossFunction<T>?)
Initializes a new instance of the RestrictedBoltzmannMachine<T> class with the specified architecture, sizes, and vector activation function.
public RestrictedBoltzmannMachine(NeuralNetworkArchitecture<T> architecture, int visibleSize, int hiddenSize, double learningRate = 0.01, int cdSteps = 1, IVectorActivationFunction<T>? vectorActivation = null, ILossFunction<T>? lossFunction = null)
Parameters
architecture (NeuralNetworkArchitecture<T>): The neural network architecture to use for the RBM.
visibleSize (int): The number of neurons in the visible layer.
hiddenSize (int): The number of neurons in the hidden layer.
learningRate (double): The learning rate for weight updates. Defaults to 0.01.
cdSteps (int): The number of Contrastive Divergence steps per update. Defaults to 1.
vectorActivation (IVectorActivationFunction<T>?): The vector activation function to use. If null, a default activation is used.
lossFunction (ILossFunction<T>?): The loss function to use. If null, a default loss function is used.
Remarks
This constructor creates a new Restricted Boltzmann Machine with the specified visible and hidden layer sizes, using the provided vector activation function. It initializes weights to small random values and biases to zero, which is a common starting point for training RBMs. The vector activation function operates on entire layers at once, which may be more efficient for certain implementations.
For Beginners: This sets up the RBM with specific dimensions and an activation function that works on many neurons at once.
When creating a new RBM this way:
- You specify how many visible neurons (input values) you have
- You specify how many hidden neurons (feature detectors) you want
- You can optionally provide a specific vector activation function
The constructor sets up:
- A weights matrix connecting all visible neurons to all hidden neurons
- Bias values for all neurons (initially set to zero)
- The specified vector activation function
The main difference from the previous constructor is that this one uses an activation function that can process all neurons in a layer simultaneously, which can be more efficient.
Properties
HiddenSize
Gets the number of neurons in the hidden layer.
public int HiddenSize { get; }
Property Value
- int
Remarks
The hidden size determines the capacity of the RBM to learn patterns and features from the input data. A larger hidden size allows the RBM to learn more complex representations but may require more data and time to train effectively.
For Beginners: This is how many pattern detectors or features the RBM can learn.
Choosing the right hidden size is important:
- Too small: The RBM won't be able to capture all important patterns in your data
- Too large: The RBM might "memorize" the training data instead of learning general patterns
For example, if analyzing face images:
- HiddenSize = 10 might only let the RBM learn very basic features
- HiddenSize = 100 might allow it to learn more subtle patterns like facial expressions
Think of it as the number of "concepts" the network can understand about your data.
ParameterCount
Gets the total number of parameters (weights and biases) in the RBM.
public override int ParameterCount { get; }
Property Value
- int
Remarks
The parameter count includes:
- Weights matrix: HiddenSize × VisibleSize parameters
- Visible biases: VisibleSize parameters
- Hidden biases: HiddenSize parameters
For Beginners: This tells you the total number of learnable values in the RBM. More parameters means the RBM can learn more complex patterns, but also requires more data and computation.
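For example, with 784 visible units (28×28 images) and 100 hidden units, the formula above gives the following count (sizes chosen only for illustration):

// Weights + visible biases + hidden biases, per the formula above.
int visibleSize = 784;
int hiddenSize = 100;
int parameterCount = hiddenSize * visibleSize + visibleSize + hiddenSize; // 79,284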
VisibleSize
Gets the number of neurons in the visible layer.
public int VisibleSize { get; }
Property Value
- int
Remarks
The visible size determines the dimensionality of the input data that the RBM can process. It should match the number of features in the input data (e.g., the number of pixels in an image).
For Beginners: This is how many input values the RBM can accept.
For example:
- If processing 28×28 pixel images, VisibleSize would be 784 (28×28)
- If processing customer data with 15 attributes, VisibleSize would be 15
Think of it as the number of "sensors" the network has to observe the input data.
Methods
ComputeReconstructionError(Tensor<T>)
Computes the reconstruction error of the RBM for the given input data, measuring how closely the RBM's reconstruction matches the original input.
public T ComputeReconstructionError(Tensor<T> input)
Parameters
input (Tensor<T>): The input data tensor.
Returns
- T
The reconstruction error for the input.
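A common use is monitoring training progress (a sketch, assuming rbm is a RestrictedBoltzmannMachine<double> and batch is a training tensor):

// Lower reconstruction error generally means the RBM models the data better.
double errorBefore = rbm.ComputeReconstructionError(batch);
rbm.Train(batch, batch); // the second argument is ignored for RBMs
double errorAfter = rbm.ComputeReconstructionError(batch);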
CreateNewInstance()
Creates a new instance of the same type as this neural network.
protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()
Returns
- IFullModel<T, Tensor<T>, Tensor<T>>
A new instance of the same neural network type.
Remarks
For Beginners: This creates a blank version of the same type of neural network.
It's used internally by methods like DeepCopy and Clone to create the right type of network before copying the data into it.
DeserializeNetworkSpecificData(BinaryReader)
Deserializes RBM-specific data from a binary reader.
protected override void DeserializeNetworkSpecificData(BinaryReader reader)
Parameters
reader (BinaryReader): The binary reader to read from.
Remarks
This method loads RBM-specific data from the binary stream, including the weights, biases, and configuration parameters like learning rate and CD steps. It restores the RBM to the exact state it was in when serialized.
For Beginners: This method loads all the RBM's saved knowledge from a file.
The deserialization process loads:
- All weights between visible and hidden neurons
- All bias values for both layers
- Configuration settings like learning rate
This allows you to restore a previously trained RBM exactly as it was, without needing to retrain it from scratch.
ExtractFeatures(Tensor<T>, bool)
Extracts features from input data using the trained RBM.
public Tensor<T> ExtractFeatures(Tensor<T> input, bool binarize = false)
Parameters
input (Tensor<T>): The input data tensor.
binarize (bool): Whether to binarize the hidden activations. Defaults to false.
Returns
- Tensor<T>
The hidden layer features as a tensor.
Remarks
This method transforms input data into features learned by the RBM's hidden layer. It can be used for feature extraction, dimensionality reduction, or as a pre-processing step before using the data with another algorithm.
For Beginners: This method converts raw data into abstract features.
When extracting features:
- The input data is passed to the visible layer
- The hidden layer activations represent learned features
- These features can capture important patterns in the data
You can choose to get:
- Probability values (binarize=false) showing how strongly each feature is detected
- Binary values (binarize=true) indicating whether each feature is present or not
This is useful for:
- Reducing data dimensionality (e.g., compressing 784 pixels to 100 features)
- Extracting meaningful patterns for other algorithms to use
- Pre-processing data for classification or other tasks
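For example (a sketch, assuming a trained RestrictedBoltzmannMachine<double> named rbm):

// Soft feature strengths in [0, 1]:
Tensor<double> strengths = rbm.ExtractFeatures(input, binarize: false);
// Hard on/off feature indicators:
Tensor<double> indicators = rbm.ExtractFeatures(input, binarize: true);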
GenerateSamples(int, int)
Generates samples from the RBM by starting with a random visible state and performing Gibbs sampling.
public Tensor<T> GenerateSamples(int numSamples, int numSteps = 1000)
Parameters
numSamples (int): The number of samples to generate.
numSteps (int): The number of Gibbs sampling steps to perform. Defaults to 1000.
Returns
- Tensor<T>
Tensor containing the generated samples.
Remarks
This method generates new data samples that follow the distribution learned by the RBM. It starts with random visible units, then repeatedly samples the hidden and visible layers in a process called Gibbs sampling to get samples from the model's learned distribution.
For Beginners: This method creates new data samples based on patterns the RBM has learned.
The generation process works like this:
1. Start with random values for the visible layer
2. Compute hidden layer activations based on these visible values
3. Reconstruct a new visible layer from the hidden activations
4. Repeat steps 2-3 multiple times (Gibbs sampling)
5. Return the final visible layer as a generated sample
This allows the RBM to "dream up" new data that resembles the training data. For example, if trained on face images, it might generate new faces that don't exist but look realistic.
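For example (a sketch, assuming a trained RestrictedBoltzmannMachine<double> named rbm):

// Generate 10 samples, running 1,000 Gibbs steps so the chain has
// time to mix toward the learned distribution.
Tensor<double> samples = rbm.GenerateSamples(numSamples: 10, numSteps: 1000);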
GetHiddenLayerActivation(Tensor<T>)
Calculates the activation probabilities of the hidden layer given the visible layer.
public Tensor<T> GetHiddenLayerActivation(Tensor<T> visibleLayer)
Parameters
visibleLayer (Tensor<T>): The visible layer tensor.
Returns
- Tensor<T>
A tensor containing the activation probabilities of the hidden layer.
Remarks
This method computes the activation probabilities of each hidden unit given the state of the visible layer. It calculates the weighted sum of visible unit values for each hidden unit, adds the hidden bias, and applies the activation function to obtain the probability of activation.
For Beginners: This method finds which patterns or features are present in the input data.
When calculating hidden layer activations:
- Each hidden neuron receives input from all visible neurons
- The inputs are weighted by the connection strengths
- The hidden neuron's bias is added
- An activation function converts this sum to a probability
This is like asking each feature detector: "Based on what you see in the input data, how confident are you that your specific pattern is present?"
The result is a set of probabilities for each hidden neuron, indicating how strongly each feature is detected in the current input.
Exceptions
- InvalidOperationException
Thrown when no activation function is specified.
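Conceptually, each hidden probability is the activation function applied to that unit's bias plus a weighted sum of the visible values. A minimal sketch on plain arrays (illustrative only; the real method operates on Tensor<T> and the RBM's internal weights, and sigmoid stands in for whatever activation is configured):

// p(h[j] = 1 | v) = activate(hiddenBias[j] + sum_i weights[j, i] * v[i])
static double[] HiddenProbabilities(double[] visible, double[,] weights, double[] hiddenBias)
{
    var probs = new double[hiddenBias.Length];
    for (int j = 0; j < probs.Length; j++)
    {
        double sum = hiddenBias[j];
        for (int i = 0; i < visible.Length; i++)
            sum += weights[j, i] * visible[i];
        probs[j] = 1.0 / (1.0 + System.Math.Exp(-sum)); // sigmoid (assumed)
    }
    return probs;
}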
GetModelMetadata()
Gets metadata about the RBM model.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
A ModelMetadata<T> object containing information about the RBM.
Remarks
This method returns comprehensive metadata about the RBM, including its architecture, layer sizes, and other relevant parameters. This information is useful for model management, tracking experiments, and reporting.
For Beginners: This provides detailed information about your RBM.
The metadata includes:
- The sizes of visible and hidden layers
- Information about the activation functions used
- The total number of parameters (weights and biases)
- Other configuration details
This information is useful for documentation, comparing different RBM configurations, and understanding the structure of your model at a glance.
GetParameters()
Gets all parameters of the RBM as a single vector. This method is not typically used in RBMs and throws a NotImplementedException.
public override Vector<T> GetParameters()
Returns
- Vector<T>
Remarks
RBMs typically use specialized training algorithms like Contrastive Divergence rather than the generic parameter update approach used by other neural networks. This method throws a NotImplementedException to indicate that RBMs should be trained using the Train method instead.
For Beginners: This method is not used in RBMs because they train differently.
While standard neural networks update their parameters based on error gradients:
- RBMs use a different approach called Contrastive Divergence
- They compare "reality" (input data) with "imagination" (reconstructions)
- They directly adjust weights based on this comparison
Instead of using this method, you should use the Train method to train an RBM.
Exceptions
- NotImplementedException
Always thrown as this method is not implemented for RBMs.
GetVisibleLayerActivation(Tensor<T>)
Calculates the activation probabilities of the visible layer given the hidden layer.
public Tensor<T> GetVisibleLayerActivation(Tensor<T> hiddenLayer)
Parameters
hiddenLayer (Tensor<T>): The hidden layer tensor.
Returns
- Tensor<T>
A tensor containing the activation probabilities of the visible layer.
Remarks
This method computes the activation probabilities of each visible unit given the state of the hidden layer. It calculates the weighted sum of hidden unit values for each visible unit, adds the visible bias, and applies the activation function to obtain the probability of activation.
For Beginners: This method reconstructs the input data based on detected patterns.
When calculating visible layer activations:
- Each visible neuron receives input from all hidden neurons
- The inputs are weighted by the connection strengths
- The visible neuron's bias is added
- An activation function converts this sum to a probability
This is like asking each input neuron: "Based on the patterns the network detected, what's the probability that you should be active?"
The result is a reconstruction of the input data based on the patterns detected, which might not be identical to the original input.
Exceptions
- InvalidOperationException
Thrown when no activation function is specified.
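Together with GetHiddenLayerActivation, this enables a one-step reconstruction round trip (a sketch, assuming a trained rbm and an input tensor):

// Encode the input into hidden probabilities, then decode back.
Tensor<double> hidden = rbm.GetHiddenLayerActivation(input);
Tensor<double> reconstruction = rbm.GetVisibleLayerActivation(hidden);
// reconstruction approximates input; the gap between them is the
// reconstruction error that training drives down.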
InitializeLayers()
Initializes the neural network layers. In an RBM, this method is typically empty as RBMs use direct weight and bias parameters rather than standard neural network layers.
protected override void InitializeLayers()
Remarks
RBMs differ from feedforward neural networks in that they don't use a layer-based computation model. Instead, they directly manipulate weights and biases for the visible and hidden units. Therefore, this method is typically empty or performs specialized initialization for RBMs.
For Beginners: RBMs work differently from standard neural networks.
While standard neural networks process data through sequential layers:
- RBMs work by going back and forth between just two layers
- They don't use the same layer concept as feedforward networks
- They operate directly on the weights and biases connecting the visible and hidden layers
That's why this method is empty - the RBM initializes its weights and biases directly rather than creating a sequence of layers like a standard neural network.
Predict(Tensor<T>)
Makes predictions using the RBM by computing hidden layer activations.
public override Tensor<T> Predict(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor to process.
Returns
- Tensor<T>
The hidden layer activations as a tensor.
Remarks
This method performs a forward pass through the RBM, mapping the input data to its corresponding hidden representation. For RBMs, "prediction" typically means extracting features or transforming the input data to a different representation.
For Beginners: This method extracts patterns or features from the input data.
Unlike standard neural networks that might predict a class or value:
- RBMs transform input data into a representation of detected patterns
- The output tells you which features or patterns were found in the input
- This can be used for feature extraction or dimensionality reduction
For example, if your RBM has learned to recognize features in face images, this method would tell you which of those features (like "has glasses" or "is smiling") are present in a new face image you provide.
SerializeNetworkSpecificData(BinaryWriter)
Serializes the RBM-specific data to a binary writer.
protected override void SerializeNetworkSpecificData(BinaryWriter writer)
Parameters
writer (BinaryWriter): The binary writer to write to.
Remarks
This method saves RBM-specific data to the binary stream, including the weights, biases, and configuration parameters like learning rate and CD steps.
For Beginners: This method saves all the RBM's learned knowledge to a file.
The serialization process saves:
- All weights between visible and hidden neurons
- All bias values for both layers
- Configuration settings like learning rate
This allows you to save a trained RBM and reload it later without having to retrain it from scratch, which can be time-consuming.
SetParameters(Vector<T>)
Sets the parameters of the neural network.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): The parameters to set.
Remarks
This method distributes the parameters to all layers in the network. The parameters should be in the same format as returned by GetParameters.
SetTrainingParameters(T, int)
Sets the training parameters for the RBM.
public void SetTrainingParameters(T learningRate, int cdSteps = 1)
Parameters
learningRate (T): The learning rate for weight updates.
cdSteps (int): The number of Contrastive Divergence steps. Defaults to 1.
Remarks
This method configures the learning rate and the number of Contrastive Divergence steps used during training. The learning rate controls how quickly the RBM updates its weights, while the CD steps control how many Gibbs sampling steps are performed in each update.
For Beginners: This method lets you adjust how the RBM learns.
You can configure:
- Learning rate: How big each learning step is (typical values: 0.001 to 0.1)
- CD steps: How many back-and-forth cycles to run during training (often 1, sometimes more)
These parameters affect learning quality and speed:
- Higher learning rates learn faster but may be less stable
- More CD steps give more accurate updates but take longer
Finding the right balance for your specific data is important for effective training.
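For example (assuming a RestrictedBoltzmannMachine<double> named rbm):

// Use a smaller learning rate and 5 CD steps for more accurate
// (but slower) updates.
rbm.SetTrainingParameters(learningRate: 0.005, cdSteps: 5);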
Train(Tensor<T>, Tensor<T>)
Trains the RBM using Contrastive Divergence.
public override void Train(Tensor<T> input, Tensor<T> expectedOutput)
Parameters
input (Tensor<T>): The input data tensor.
expectedOutput (Tensor<T>): Not used for RBMs, as they are unsupervised models.
Remarks
This method implements Contrastive Divergence (CD) training for the RBM. It compares the correlation between visible and hidden units when driven by the data to the correlation when driven by the model's own reconstructions, and updates the weights and biases accordingly.
For Beginners: This method teaches the RBM to recognize patterns in your data.
The training process works like this:
1. Start with real data (the visible layer)
2. Compute which patterns (hidden layer) are activated by this data
3. Reconstruct an approximation of the data from these patterns
4. See what patterns this reconstruction would activate
5. Update the weights based on the difference between steps 2 and 4
The goal is for the RBM to generate reconstructions that are statistically similar to the real data, which means it has learned the underlying patterns.
Note that unlike supervised learning, RBMs don't use expected outputs - they learn the structure of the input data on their own.
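The weight update behind steps 2-5 is the Contrastive Divergence rule. A minimal sketch of one CD-1 update on plain arrays (illustrative only; the real method operates on Tensor<T> and the RBM's internal state):

// deltaW[j, i] = lr * (hData[j] * vData[i] - hRecon[j] * vRecon[i])
static void ContrastiveDivergenceStep(
    double[] vData, double[] hData,   // activations driven by the data
    double[] vRecon, double[] hRecon, // activations driven by the reconstruction
    double[,] weights, double lr)
{
    for (int j = 0; j < hData.Length; j++)
        for (int i = 0; i < vData.Length; i++)
            weights[j, i] += lr * (hData[j] * vData[i] - hRecon[j] * vRecon[i]);
}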
UpdateParameters(Vector<T>)
Updates the network's parameters with new values.
public override void UpdateParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): The new parameter values to set.
Remarks
For Beginners: During training, a neural network's internal values (parameters) get adjusted to improve its performance. This method allows you to update all those values at once by providing a complete set of new parameters.
This is typically used by optimization algorithms that calculate better parameter values based on training data.