Class SiameseNetwork<T>
- Namespace
- AiDotNet.NeuralNetworks
- Assembly
- AiDotNet.dll
Implements a Siamese Neural Network for comparing pairs of inputs and determining their similarity.
public class SiameseNetwork<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable, IAuxiliaryLossLayer<T>, IDiagnosticsProvider
Type Parameters
T: The numeric type used for calculations (e.g., double, float).
- Inheritance
- NeuralNetworkBase<T> → SiameseNetwork<T>
Remarks
For Beginners: A Siamese Network is a special type of neural network designed to compare two inputs and determine how similar they are to each other.
Imagine you have two photos and want to know if they show the same person. A Siamese Network processes both photos through identical neural networks (like twins, hence the name "Siamese"), creates a compact representation (called an "embedding") of each photo, and then compares these representations to determine similarity.
Common applications include:
- Face recognition (are these two faces the same person?)
- Signature verification (is this signature authentic?)
- Document similarity (how similar are these two texts?)
- Product recommendations (finding similar products)
The key advantage of Siamese Networks is that they can learn to recognize similarity even for inputs they've never seen before during training.
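The twin-network idea can be sketched in a few lines (a conceptual Python illustration with a made-up two-number embedding; the real subnetwork and its learned embedding come from your architecture, not from this hand-written function):

```python
import math

def embed(x):
    # Stand-in for the shared subnetwork: both inputs pass through
    # the SAME function, so identical inputs get identical embeddings.
    return [sum(x) / len(x), max(x) - min(x)]

def similarity(a, b):
    # Euclidean distance between the two embeddings...
    d = math.sqrt(sum((p - q) ** 2 for p, q in zip(embed(a), embed(b))))
    # ...squashed through a sigmoid-like mapping into (0, 1],
    # where 1 means "identical" and values near 0 mean "very different".
    return 1.0 / (1.0 + d)

print(similarity([1, 2, 3], [1, 2, 3]))  # identical inputs -> 1.0
print(similarity([1, 2, 3], [9, 0, 5]))  # different inputs -> a smaller score
```

Because both inputs go through the same `embed` function, the comparison generalizes to inputs the network has never seen: only the distance between embeddings matters.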
Constructors
SiameseNetwork(NeuralNetworkArchitecture<T>, ILossFunction<T>?)
Initializes a new instance of the SiameseNetwork class.
public SiameseNetwork(NeuralNetworkArchitecture<T> architecture, ILossFunction<T>? lossFunction = null)
Parameters
architecture (NeuralNetworkArchitecture<T>): The neural network architecture defining the structure of the shared subnetwork.
lossFunction (ILossFunction<T>, optional): The loss function used during training; may be null, per the default value in the signature.
Remarks
For Beginners: This constructor sets up your Siamese Network with the specified architecture.
The architecture defines the structure of the shared subnetwork that will process each input. The constructor creates:
- A shared subnetwork (the identical twin networks)
- An output layer that takes the embeddings from both inputs and produces a similarity score
The "embedding size" refers to how many numbers are used to represent each processed input. For example, a face might be represented by 128 numbers that capture its key features.
The sigmoid activation function at the end ensures the output is between 0 and 1, where 0 means "completely different" and 1 means "identical".
Properties
AuxiliaryLossWeight
Gets or sets the weight for the contrastive auxiliary loss.
public T AuxiliaryLossWeight { get; set; }
Property Value
- T
Remarks
This weight controls how much contrastive loss contributes to the total loss. Typical values range from 0.1 to 1.0.
For Beginners: This controls how much we encourage good similarity learning.
Common values:
- 0.5 (default): Balanced contribution
- 0.1-0.3: Light contrastive emphasis
- 0.7-1.0: Strong contrastive emphasis
Higher values make the network focus more on learning good embeddings.
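A hypothetical sketch of how such a weight typically enters the total loss (the exact combination inside AiDotNet may differ):

```python
def total_loss(main_loss, contrastive_loss, auxiliary_loss_weight=0.5):
    # The auxiliary (contrastive) loss is scaled by the weight and added
    # to the main loss; a weight of 0 removes its influence entirely.
    return main_loss + auxiliary_loss_weight * contrastive_loss

print(total_loss(0.8, 0.4))                             # default weight 0.5 -> 1.0
print(total_loss(0.8, 0.4, auxiliary_loss_weight=1.0))  # strong contrastive emphasis
```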
ContrastiveMargin
Gets or sets the margin for contrastive loss.
public T ContrastiveMargin { get; set; }
Property Value
- T
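The margin only affects dissimilar pairs: once their embedding distance D exceeds the margin, the hinge term max(0, margin - D)² is zero and the pair is no longer pushed apart. A small plain-Python illustration of that hinge term (not the AiDotNet implementation):

```python
def dissimilar_penalty(distance, margin=1.0):
    # Hinge term of contrastive loss for a dissimilar pair:
    # pairs closer than the margin are penalized, pairs beyond it are not.
    return 0.5 * max(0.0, margin - distance) ** 2

print(dissimilar_penalty(0.2))  # well inside the margin: large penalty
print(dissimilar_penalty(0.9))  # almost at the margin: tiny penalty
print(dissimilar_penalty(1.5))  # beyond the margin: 0.0
```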
ParameterCount
Gets the total number of trainable parameters in the Siamese network.
public override int ParameterCount { get; }
Property Value
- int
Remarks
For Beginners: This property tells you how many numbers (parameters) define your neural network.
Neural networks learn by adjusting these parameters during training. The parameter count gives you an idea of how complex your model is:
- A network with more parameters can potentially learn more complex patterns
- A network with too many parameters might "memorize" the training data instead of learning general patterns
- More parameters require more training data and computational resources
For example, a Siamese network for face recognition might have millions of parameters to capture all the subtle features that distinguish different faces.
This property adds together:
- The number of parameters in the shared subnetwork (which processes each input)
- The number of parameters in the output layer (which compares the embeddings)
You might use this information to:
- Estimate how much memory your model will need
- Compare the complexity of different network architectures
- Determine if you have enough training data (typically you want many times more examples than parameters)
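As a rough illustration of how such a count adds up, here is the standard formula for fully connected layers, applied to hypothetical layer sizes (a generic sketch, not AiDotNet's internals):

```python
def dense_layer_params(inputs, outputs):
    # A fully connected layer has inputs*outputs weights plus one bias per output.
    return inputs * outputs + outputs

# Hypothetical shared subnetwork: 784 -> 256 -> 128 (embedding size 128).
subnetwork = dense_layer_params(784, 256) + dense_layer_params(256, 128)

# Hypothetical output layer comparing two 128-dim embeddings -> 1 similarity score.
output_layer = dense_layer_params(2 * 128, 1)

print(subnetwork + output_layer)  # total trainable parameters
```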
UseAuxiliaryLoss
Gets or sets whether auxiliary loss (contrastive/triplet loss) should be used during training.
public bool UseAuxiliaryLoss { get; set; }
Property Value
- bool
Remarks
Contrastive loss encourages similar pairs to have small distances and dissimilar pairs to have large distances. Triplet loss ensures that an anchor is closer to positive examples than negative examples by a margin.
For Beginners: This helps the Siamese network learn better similarity representations.
Contrastive loss works like this:
- Similar pairs should have embeddings close together
- Dissimilar pairs should have embeddings far apart
- Formula: L = Y * 0.5 * D² + (1 - Y) * 0.5 * max(0, margin - D)², where Y = 1 for similar pairs, Y = 0 for dissimilar pairs, and D = distance between embeddings
This helps the network:
- Learn meaningful similarity measures
- Create well-separated embedding spaces
- Improve discrimination between similar/dissimilar pairs
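The idea above can be written out directly. This is a plain-Python sketch of standard contrastive loss with Y = 1 for similar and Y = 0 for dissimilar pairs; AiDotNet computes this internally over the subnetwork's embeddings:

```python
import math

def contrastive_loss(emb_a, emb_b, y, margin=1.0):
    # Euclidean distance between the two embeddings.
    d = math.sqrt(sum((p - q) ** 2 for p, q in zip(emb_a, emb_b)))
    # Similar pairs (y=1) are penalized for being far apart;
    # dissimilar pairs (y=0) are penalized for being closer than the margin.
    return y * 0.5 * d ** 2 + (1 - y) * 0.5 * max(0.0, margin - d) ** 2

print(contrastive_loss([0.0, 0.0], [0.6, 0.8], y=1))  # similar pair far apart: penalized
print(contrastive_loss([0.0, 0.0], [0.6, 0.8], y=0))  # dissimilar pair at the margin: ~0
print(contrastive_loss([0.0, 0.0], [0.3, 0.4], y=0))  # dissimilar pair too close: penalized
```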
Methods
ComputeAuxiliaryLoss()
Computes the auxiliary loss (contrastive loss) for similarity learning.
public T ComputeAuxiliaryLoss()
Returns
- T
The computed contrastive auxiliary loss.
Remarks
This method computes contrastive loss to improve embedding quality. Formula: L = Y * 0.5 * D² + (1 - Y) * 0.5 * max(0, margin - D)², where Y = 1 for similar pairs, Y = 0 for dissimilar pairs, and D = Euclidean distance between embeddings.
For Beginners: This calculates how well the network separates similar from dissimilar pairs.
Contrastive loss works by:
- For similar pairs: Penalize large distances (pull them together)
- For dissimilar pairs: Penalize small distances (push them apart)
- Use a margin to define "far enough" for dissimilar pairs
This helps because:
- Creates well-organized embedding spaces
- Similar items cluster together
- Dissimilar items stay separated
- Improves the network's ability to judge similarity
The auxiliary loss is combined with the main loss during training.
CreateNewInstance()
Creates a new instance of the Siamese network with the same architecture.
protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()
Returns
- IFullModel<T, Tensor<T>, Tensor<T>>
A new instance of the Siamese network.
Remarks
This method creates a new Siamese network with the same architecture as the current instance. The new instance has freshly initialized parameters and is ready for training.
For Beginners: This creates a brand new Siamese network with the same structure.
Think of it like creating a copy of your current network's blueprint:
- It has the same subnetwork structure for processing inputs
- It processes the same types of inputs (like images of the same size)
- But it starts with fresh, untrained parameters
This is useful when you want to:
- Start over with a fresh network but keep the same design
- Create multiple networks with identical structures for comparison
- Train networks with different data but the same architecture
The new network will need to be trained from scratch, as it doesn't inherit any of the "knowledge" from the original network.
DeserializeNetworkSpecificData(BinaryReader)
Deserializes Siamese network-specific data from a binary reader.
protected override void DeserializeNetworkSpecificData(BinaryReader reader)
Parameters
reader (BinaryReader): The binary reader to read from.
Remarks
This method loads the state of a previously saved Siamese network from a binary stream, reconstructing both the shared subnetwork and the output layer.
For Beginners: This method loads a previously saved Siamese network from a file, restoring all its learned parameters so you can use it without retraining.
GetAuxiliaryLossDiagnostics()
Gets diagnostic information about the contrastive auxiliary loss.
public Dictionary<string, string> GetAuxiliaryLossDiagnostics()
Returns
- Dictionary<string, string>
A dictionary containing diagnostic information about contrastive learning.
Remarks
This method returns detailed diagnostics about contrastive loss, including the computed loss value, margin, weight, and configuration parameters. This information is useful for monitoring similarity learning and debugging.
For Beginners: This provides information about how well the network learns similarity.
The diagnostics include:
- Total contrastive loss (how well embeddings are organized)
- Contrastive margin (minimum distance for dissimilar pairs)
- Weight applied to the contrastive loss
- Whether contrastive learning is enabled
This helps you:
- Monitor embedding quality during training
- Debug issues with similarity learning
- Understand the impact of contrastive loss on performance
You can use this information to adjust margin and weight for better results.
GetDiagnostics()
Gets diagnostic information about this component's state and behavior. Overrides GetDiagnostics() to include auxiliary loss diagnostics.
public Dictionary<string, string> GetDiagnostics()
Returns
- Dictionary<string, string>
A dictionary containing diagnostic metrics including both base layer diagnostics and auxiliary loss diagnostics from GetAuxiliaryLossDiagnostics().
GetModelMetadata()
Gets metadata about the Siamese Network.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
A ModelMetadata<T> object containing information about the network.
Remarks
This method returns comprehensive metadata about the Siamese network, including information about its architecture, embedding size, and other relevant parameters.
For Beginners: This provides detailed information about your Siamese network, such as the size of embeddings and the structure of the subnetwork. This information is useful for documentation, debugging, and understanding the network's configuration.
InitializeLayers()
Initializes the layers of the neural network.
protected override void InitializeLayers()
Remarks
This method is overridden but empty because the layers are initialized in the constructor.
Predict(Tensor<T>)
Makes a prediction using the Siamese network to compare the similarity between inputs.
public override Tensor<T> Predict(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor containing pairs to compare. Expected shape: [batchSize, 2, ...dimensions].
Returns
- Tensor<T>
The similarity scores between each pair as a tensor with shape [batchSize, 1].
Remarks
The prediction process involves passing each input through the shared subnetwork to generate embeddings, then comparing these embeddings using the output layer to produce similarity scores.
For Beginners: This method takes pairs of inputs and tells you how similar they are to each other. Each input (like an image or text) is processed through the same network to create a compact representation (embedding). These representations are then compared to produce a similarity score between 0 (completely different) and 1 (identical).
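The expected input layout stacks each pair along a second axis. A small Python sketch (using nested lists to stand in for Tensor<T>) shows how a batch of pairs is arranged and how it splits back into the two inputs:

```python
# Two pairs of 3-element inputs, arranged as [batchSize, 2, features] = [2, 2, 3].
batch = [
    [[0.1, 0.2, 0.3], [0.1, 0.2, 0.3]],  # pair 0: identical items
    [[0.5, 0.5, 0.5], [0.9, 0.1, 0.0]],  # pair 1: different items
]

# Conceptually, Predict splits each pair and sends each half
# through the same shared subnetwork before comparing embeddings.
first_inputs = [pair[0] for pair in batch]
second_inputs = [pair[1] for pair in batch]

print(len(batch))       # batchSize = 2
print(first_inputs[1])  # [0.5, 0.5, 0.5]
```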
SerializeNetworkSpecificData(BinaryWriter)
Serializes Siamese network-specific data to a binary writer.
protected override void SerializeNetworkSpecificData(BinaryWriter writer)
Parameters
writer (BinaryWriter): The binary writer to write to.
Remarks
This method saves the state of the Siamese network to a binary stream, including the shared subnetwork and the output layer parameters.
For Beginners: This method saves your trained Siamese network to a file, allowing you to load it later without having to retrain it.
Train(Tensor<T>, Tensor<T>)
Trains the Siamese network on pairs of inputs with their expected similarity.
public override void Train(Tensor<T> input, Tensor<T> expectedOutput)
Parameters
input (Tensor<T>): The input tensor containing pairs of items. Expected shape: [batchSize, 2, ...dimensions].
expectedOutput (Tensor<T>): The expected similarity scores. Shape: [batchSize, 1].
Remarks
This method trains the Siamese network by processing pairs through the shared subnetwork, calculating the similarity between their embeddings, and updating the network parameters based on the difference between predicted and expected similarity scores.
For Beginners: This method teaches the network to recognize when two inputs are similar. You provide pairs of inputs along with how similar they should be (0 to 1). The network learns to produce embeddings that are close together for similar inputs and far apart for different inputs.
UpdateParameters(Vector<T>)
Updates the network parameters with new values.
public override void UpdateParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): The vector containing all parameters for the network.
Remarks
For Beginners: This method updates the internal values (weights and biases) of the neural network during training.
The parameters vector contains all the numbers that define how the network processes inputs. These parameters are split into two parts:
- Parameters for the shared subnetwork (which processes each input)
- Parameters for the output layer (which compares the embeddings)
During training, these parameters are gradually adjusted to make the network better at determining whether two inputs are similar or different.
You typically won't call this method directly - it's used by the training algorithms that optimize the network.
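Conceptually, the flat parameter vector is consumed in order: one chunk for the shared subnetwork, the remainder for the output layer. A minimal Python sketch of that split (an assumed layout for illustration; the real ordering is defined by AiDotNet):

```python
def split_parameters(parameters, subnetwork_count):
    # The first subnetwork_count values belong to the shared subnetwork,
    # the rest to the output layer that compares the two embeddings.
    return parameters[:subnetwork_count], parameters[subnetwork_count:]

params = [0.1, -0.3, 0.7, 0.2, 0.05, -0.9]
sub_params, out_params = split_parameters(params, subnetwork_count=4)
print(sub_params)  # [0.1, -0.3, 0.7, 0.2]
print(out_params)  # [0.05, -0.9]
```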