Class SiameseNetwork<T>
- Namespace
- AiDotNet.NeuralNetworks
- Assembly
- AiDotNet.dll
Implements a Siamese Neural Network for comparing pairs of inputs and determining their similarity.
public class SiameseNetwork<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable, IAuxiliaryLossLayer<T>, IDiagnosticsProvider
Type Parameters
T: The numeric type used for calculations (e.g., double, float).
- Inheritance
- NeuralNetworkBase<T> → SiameseNetwork<T>
Remarks
For Beginners: A Siamese Network is a special type of neural network designed to compare two inputs and determine how similar they are to each other.
Imagine you have two photos and want to know if they show the same person. A Siamese Network processes both photos through identical neural networks (like twins, hence the name "Siamese"), creates a compact representation (called an "embedding") of each photo, and then compares these representations to determine similarity.
Common applications include:
- Face recognition (are these two faces the same person?)
- Signature verification (is this signature authentic?)
- Document similarity (how similar are these two texts?)
- Product recommendations (finding similar products)
The key advantage of Siamese Networks is that they can learn to recognize similarity even for inputs they've never seen before during training.
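The twin-network idea can be sketched in a few lines (a conceptual Python illustration with a made-up two-number embedding; the real subnetwork and its learned embedding come from your architecture, not from this hand-written function):

```python
import math

def embed(x):
    # Stand-in for the shared subnetwork: both inputs pass through
    # the SAME function, so identical inputs get identical embeddings.
    return [sum(x) / len(x), max(x) - min(x)]

def similarity(a, b):
    # Euclidean distance between the two embeddings...
    d = math.sqrt(sum((p - q) ** 2 for p, q in zip(embed(a), embed(b))))
    # ...squashed through a sigmoid-like mapping into (0, 1],
    # where 1 means "identical" and values near 0 mean "very different".
    return 1.0 / (1.0 + d)

print(similarity([1, 2, 3], [1, 2, 3]))  # identical inputs -> 1.0
print(similarity([1, 2, 3], [9, 0, 5]))  # different inputs -> a smaller score
```

Because both inputs go through the same `embed` function, the comparison generalizes to inputs the network has never seen: only the distance between embeddings matters.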
Constructors
SiameseNetwork(NeuralNetworkArchitecture<T>, ILossFunction<T>?)
Initializes a new instance of the SiameseNetwork class.
public SiameseNetwork(NeuralNetworkArchitecture<T> architecture, ILossFunction<T>? lossFunction = null)
Parameters
architecture (NeuralNetworkArchitecture<T>): The neural network architecture defining the structure of the shared subnetwork.
lossFunction (ILossFunction<T>, optional): The loss function used during training; may be null, per the default value in the signature.
Remarks
For Beginners: This constructor sets up your Siamese Network with the specified architecture.
The architecture defines the structure of the shared subnetwork that will process each input. The constructor creates:
- A shared subnetwork (the identical twin networks)
- An output layer that takes the embeddings from both inputs and produces a similarity score
The "embedding size" refers to how many numbers are used to represent each processed input. For example, a face might be represented by 128 numbers that capture its key features.
The sigmoid activation function at the end ensures the output is between 0 and 1, where 0 means "completely different" and 1 means "identical".
Properties
AuxiliaryLossWeight
Gets or sets the weight for the contrastive auxiliary loss.
public T AuxiliaryLossWeight { get; set; }
Property Value
- T
Remarks
This weight controls how much contrastive loss contributes to the total loss. Typical values range from 0.1 to 1.0.
For Beginners: This controls how much we encourage good similarity learning.
Common values:
- 0.5 (default): Balanced contribution
- 0.1-0.3: Light contrastive emphasis
- 0.7-1.0: Strong contrastive emphasis
Higher values make the network focus more on learning good embeddings.
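A hypothetical sketch of how such a weight typically enters the total loss (the exact combination inside AiDotNet may differ):

```python
def total_loss(main_loss, contrastive_loss, auxiliary_loss_weight=0.5):
    # The auxiliary (contrastive) loss is scaled by the weight and added
    # to the main loss; a weight of 0 removes its influence entirely.
    return main_loss + auxiliary_loss_weight * contrastive_loss

print(total_loss(0.8, 0.4))                             # default weight 0.5 -> 1.0
print(total_loss(0.8, 0.4, auxiliary_loss_weight=1.0))  # strong contrastive emphasis
```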
ContrastiveMargin
Gets or sets the margin for contrastive loss.
public T ContrastiveMargin { get; set; }
Property Value
- T
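The margin only affects dissimilar pairs: once their embedding distance D exceeds the margin, the hinge term max(0, margin - D)² is zero and the pair is no longer pushed apart. A small plain-Python illustration of that hinge term (not the AiDotNet implementation):

```python
def dissimilar_penalty(distance, margin=1.0):
    # Hinge term of contrastive loss for a dissimilar pair:
    # pairs closer than the margin are penalized, pairs beyond it are not.
    return 0.5 * max(0.0, margin - distance) ** 2

print(dissimilar_penalty(0.2))  # well inside the margin: large penalty
print(dissimilar_penalty(0.9))  # almost at the margin: tiny penalty
print(dissimilar_penalty(1.5))  # beyond the margin: 0.0
```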
ParameterCount
Gets the total number of trainable parameters in the Siamese network.
public override int ParameterCount { get; }
Property Value
- int
Remarks
For Beginners: This property tells you how many numbers (parameters) define your neural network.
Neural networks learn by adjusting these parameters during training. The parameter count gives you an idea of how complex your model is:
- A network with more parameters can potentially learn more complex patterns
- A network with too many parameters might "memorize" the training data instead of learning general patterns
- More parameters require more training data and computational resources
For example, a Siamese network for face recognition might have millions of parameters to capture all the subtle features that distinguish different faces.
This property adds together:
- The number of parameters in the shared subnetwork (which processes each input)
- The number of parameters in the output layer (which compares the embeddings)
You might use this information to:
- Estimate how much memory your model will need
- Compare the complexity of different network architectures
- Determine if you have enough training data (typically you want many times more examples than parameters)
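As a rough illustration of how such a count adds up, here is the standard formula for fully connected layers, applied to hypothetical layer sizes (a generic sketch, not AiDotNet's internals):

```python
def dense_layer_params(inputs, outputs):
    # A fully connected layer has inputs*outputs weights plus one bias per output.
    return inputs * outputs + outputs

# Hypothetical shared subnetwork: 784 -> 256 -> 128 (embedding size 128).
subnetwork = dense_layer_params(784, 256) + dense_layer_params(256, 128)

# Hypothetical output layer comparing two 128-dim embeddings -> 1 similarity score.
output_layer = dense_layer_params(2 * 128, 1)

print(subnetwork + output_layer)  # total trainable parameters
```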
UseAuxiliaryLoss
Gets or sets whether auxiliary loss (contrastive/triplet loss) should be used during training.
public bool UseAuxiliaryLoss { get; set; }
Property Value
- bool
Remarks
Contrastive loss encourages similar pairs to have small distances and dissimilar pairs to have large distances. Triplet loss ensures that an anchor is closer to positive examples than negative examples by a margin.
For Beginners: This helps the Siamese network learn better similarity representations.
Contrastive loss works like this:
- Similar pairs should have embeddings close together
- Dissimilar pairs should have embeddings far apart
- Formula: L = Y * 0.5 * D² + (1 - Y) * 0.5 * max(0, margin - D)², where Y = 1 for similar pairs, Y = 0 for dissimilar pairs, and D = distance between embeddings
This helps the network:
- Learn meaningful similarity measures
- Create well-separated embedding spaces
- Improve discrimination between similar/dissimilar pairs
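The idea above can be written out directly. This is a plain-Python sketch of standard contrastive loss with Y = 1 for similar and Y = 0 for dissimilar pairs; AiDotNet computes this internally over the subnetwork's embeddings:

```python
import math

def contrastive_loss(emb_a, emb_b, y, margin=1.0):
    # Euclidean distance between the two embeddings.
    d = math.sqrt(sum((p - q) ** 2 for p, q in zip(emb_a, emb_b)))
    # Similar pairs (y=1) are penalized for being far apart;
    # dissimilar pairs (y=0) are penalized for being closer than the margin.
    return y * 0.5 * d ** 2 + (1 - y) * 0.5 * max(0.0, margin - d) ** 2

print(contrastive_loss([0.0, 0.0], [0.6, 0.8], y=1))  # similar pair far apart: penalized
print(contrastive_loss([0.0, 0.0], [0.6, 0.8], y=0))  # dissimilar pair at the margin: ~0
print(contrastive_loss([0.0, 0.0], [0.3, 0.4], y=0))  # dissimilar pair too close: penalized
```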
Methods
ComputeAuxiliaryLoss()
Computes the auxiliary loss (contrastive loss) for similarity learning.
public T ComputeAuxiliaryLoss()
Returns
- T
The computed contrastive auxiliary loss.
Remarks
This method computes contrastive loss to improve embedding quality. Formula: L = Y * 0.5 * D² + (1 - Y) * 0.5 * max(0, margin - D)², where Y = 1 for similar pairs, Y = 0 for dissimilar pairs, and D = Euclidean distance between embeddings.
For Beginners: This calculates how well the network separates similar from dissimilar pairs.
Contrastive loss works by:
- For similar pairs: Penalize large distances (pull them together)
- For dissimilar pairs: Penalize small distances (push them apart)
- Use a margin to define "far enough" for dissimilar pairs
This helps because:
- Creates well-organized embedding spaces
- Similar items cluster together
- Dissimilar items stay separated
- Improves the network's ability to judge similarity
The auxiliary loss is combined with the main loss during training.
CreateNewInstance()
Creates a new instance of the Siamese network with the same architecture.
protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()
Returns
- IFullModel<T, Tensor<T>, Tensor<T>>
A new instance of the Siamese network.
Remarks
This method creates a new Siamese network with the same architecture as the current instance. The new instance has freshly initialized parameters and is ready for training.
For Beginners: This creates a brand new Siamese network with the same structure.
Think of it like creating a copy of your current network's blueprint:
- It has the same subnetwork structure for processing inputs
- It processes the same types of inputs (like images of the same size)
- But it starts with fresh, untrained parameters
This is useful when you want to:
- Start over with a fresh network but keep the same design
- Create multiple networks with identical structures for comparison
- Train networks with different data but the same architecture
The new network will need to be trained from scratch, as it doesn't inherit any of the "knowledge" from the original network.
DeserializeNetworkSpecificData(BinaryReader)
Deserializes Siamese network-specific data from a binary reader.
protected override void DeserializeNetworkSpecificData(BinaryReader reader)
Parameters
reader (BinaryReader): The binary reader to read from.
Remarks
This method loads the state of a previously saved Siamese network from a binary stream, reconstructing both the shared subnetwork and the output layer.
For Beginners: This method loads a previously saved Siamese network from a file, restoring all its learned parameters so you can use it without retraining.
GetAuxiliaryLossDiagnostics()
Gets diagnostic information about the contrastive auxiliary loss.
public Dictionary<string, string> GetAuxiliaryLossDiagnostics()
Returns
- Dictionary<string, string>
A dictionary containing diagnostic information about contrastive learning.
Remarks
This method returns detailed diagnostics about contrastive loss, including the computed loss value, margin, weight, and configuration parameters. This information is useful for monitoring similarity learning and debugging.
For Beginners: This provides information about how well the network learns similarity.
The diagnostics include:
- Total contrastive loss (how well embeddings are organized)
- Contrastive margin (minimum distance for dissimilar pairs)
- Weight applied to the contrastive loss
- Whether contrastive learning is enabled
This helps you:
- Monitor embedding quality during training
- Debug issues with similarity learning
- Understand the impact of contrastive loss on performance
You can use this information to adjust margin and weight for better results.
GetDiagnostics()
Gets diagnostic information about this component's state and behavior. Overrides GetDiagnostics() to include auxiliary loss diagnostics.
public Dictionary<string, string> GetDiagnostics()
Returns
- Dictionary<string, string>
A dictionary containing diagnostic metrics including both base layer diagnostics and auxiliary loss diagnostics from GetAuxiliaryLossDiagnostics().
GetModelMetadata()
Gets metadata about the Siamese Network.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
A ModelMetadata<T> object containing information about the network.
Remarks
This method returns comprehensive metadata about the Siamese network, including information about its architecture, embedding size, and other relevant parameters.
For Beginners: This provides detailed information about your Siamese network, such as the size of embeddings and the structure of the subnetwork. This information is useful for documentation, debugging, and understanding the network's configuration.
InitializeLayers()
Initializes the layers of the neural network.
protected override void InitializeLayers()
Remarks
This method is overridden but empty because the layers are initialized in the constructor.
Predict(Tensor<T>)
Makes a prediction using the Siamese network to compare the similarity between inputs.
public override Tensor<T> Predict(Tensor<T> input)
Parameters
input (Tensor<T>): The input tensor containing pairs to compare. Expected shape: [batchSize, 2, ...dimensions].
Returns
- Tensor<T>
The similarity scores between each pair as a tensor with shape [batchSize, 1].
Remarks
The prediction process involves passing each input through the shared subnetwork to generate embeddings, then comparing these embeddings using the output layer to produce similarity scores.
For Beginners: This method takes pairs of inputs and tells you how similar they are to each other. Each input (like an image or text) is processed through the same network to create a compact representation (embedding). These representations are then compared to produce a similarity score between 0 (completely different) and 1 (identical).
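The expected input layout stacks each pair along a second axis. A small Python sketch (using nested lists to stand in for Tensor<T>) shows how a batch of pairs is arranged and how it splits back into the two inputs:

```python
# Two pairs of 3-element inputs, arranged as [batchSize, 2, features] = [2, 2, 3].
batch = [
    [[0.1, 0.2, 0.3], [0.1, 0.2, 0.3]],  # pair 0: identical items
    [[0.5, 0.5, 0.5], [0.9, 0.1, 0.0]],  # pair 1: different items
]

# Conceptually, Predict splits each pair and sends each half
# through the same shared subnetwork before comparing embeddings.
first_inputs = [pair[0] for pair in batch]
second_inputs = [pair[1] for pair in batch]

print(len(batch))       # batchSize = 2
print(first_inputs[1])  # [0.5, 0.5, 0.5]
```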
SerializeNetworkSpecificData(BinaryWriter)
Serializes Siamese network-specific data to a binary writer.
protected override void SerializeNetworkSpecificData(BinaryWriter writer)
Parameters
writer (BinaryWriter): The binary writer to write to.
Remarks
This method saves the state of the Siamese network to a binary stream, including the shared subnetwork and the output layer parameters.
For Beginners: This method saves your trained Siamese network to a file, allowing you to load it later without having to retrain it.
Train(Tensor<T>, Tensor<T>)
Trains the Siamese network on pairs of inputs with their expected similarity.
public override void Train(Tensor<T> input, Tensor<T> expectedOutput)
Parameters
input (Tensor<T>): The input tensor containing pairs of items. Expected shape: [batchSize, 2, ...dimensions].
expectedOutput (Tensor<T>): The expected similarity scores. Shape: [batchSize, 1].
Remarks
This method trains the Siamese network by processing pairs through the shared subnetwork, calculating the similarity between their embeddings, and updating the network parameters based on the difference between predicted and expected similarity scores.
For Beginners: This method teaches the network to recognize when two inputs are similar. You provide pairs of inputs along with how similar they should be (0 to 1). The network learns to produce embeddings that are close together for similar inputs and far apart for different inputs.
UpdateParameters(Vector<T>)
Updates the network parameters with new values.
public override void UpdateParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>): The vector containing all parameters for the network.
Remarks
For Beginners: This method updates the internal values (weights and biases) of the neural network during training.
The parameters vector contains all the numbers that define how the network processes inputs. These parameters are split into two parts:
- Parameters for the shared subnetwork (which processes each input)
- Parameters for the output layer (which compares the embeddings)
During training, these parameters are gradually adjusted to make the network better at determining whether two inputs are similar or different.
You typically won't call this method directly - it's used by the training algorithms that optimize the network.
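Conceptually, the flat parameter vector is consumed in order: one chunk for the shared subnetwork, the remainder for the output layer. A minimal Python sketch of that split (an assumed layout for illustration; the real ordering is defined by AiDotNet):

```python
def split_parameters(parameters, subnetwork_count):
    # The first subnetwork_count values belong to the shared subnetwork,
    # the rest to the output layer that compares the two embeddings.
    return parameters[:subnetwork_count], parameters[subnetwork_count:]

params = [0.1, -0.3, 0.7, 0.2, 0.05, -0.9]
sub_params, out_params = split_parameters(params, subnetwork_count=4)
print(sub_params)  # [0.1, -0.3, 0.7, 0.2]
print(out_params)  # [0.05, -0.9]
```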