Class Pix2Pix<T>

Namespace
AiDotNet.NeuralNetworks
Assembly
AiDotNet.dll

Represents a Pix2Pix GAN for paired image-to-image translation tasks.

public class Pix2Pix<T> : NeuralNetworkBase<T>, INeuralNetworkModel<T>, INeuralNetwork<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>, IInterpretableModel<T>, IInputGradientComputable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations.

Inheritance
Pix2Pix<T>
Implements
IFullModel<T, Tensor<T>, Tensor<T>>
IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>
IParameterizable<T, Tensor<T>, Tensor<T>>
ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>
IGradientComputable<T, Tensor<T>, Tensor<T>>

Remarks

Pix2Pix is a conditional GAN for paired image-to-image translation:

  • Uses a U-Net generator with skip connections
  • Uses a PatchGAN discriminator that classifies image patches
  • Combines adversarial loss with L1 reconstruction loss
  • Requires paired training data (input-output pairs)
  • Works for various tasks: edges to photo, day to night, sketch to image, etc.

For Beginners: Pix2Pix transforms one type of image to another.

Key features:

  • Learns from paired examples (input A becomes output B)
  • Generator: U-Net architecture preserves spatial information
  • Discriminator: PatchGAN focuses on local image patches
  • Loss: rewards output that both "looks real" (adversarial loss) and "matches the target" (L1 loss)

Example use cases:

  • Convert sketches to realistic photos
  • Colorize black-and-white images
  • Transform day scenes to night
  • Turn semantic label maps into photorealistic images
  • Convert maps to satellite images

Reference: Isola et al., "Image-to-Image Translation with Conditional Adversarial Networks" (2017)

Constructors

Pix2Pix(NeuralNetworkArchitecture<T>, NeuralNetworkArchitecture<T>, InputType, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>?, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>?, ILossFunction<T>?, double)

Initializes a new instance of the Pix2Pix<T> class.

public Pix2Pix(NeuralNetworkArchitecture<T> generatorArchitecture, NeuralNetworkArchitecture<T> discriminatorArchitecture, InputType inputType, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>? generatorOptimizer = null, IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>? discriminatorOptimizer = null, ILossFunction<T>? lossFunction = null, double l1Lambda = 100)

Parameters

generatorArchitecture NeuralNetworkArchitecture<T>

U-Net generator architecture.

discriminatorArchitecture NeuralNetworkArchitecture<T>

PatchGAN discriminator architecture.

inputType InputType

Input type.

generatorOptimizer IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>

Optional optimizer for the generator. If null, Adam optimizer is used.

discriminatorOptimizer IGradientBasedOptimizer<T, Tensor<T>, Tensor<T>>

Optional optimizer for the discriminator. If null, Adam optimizer is used.

lossFunction ILossFunction<T>

Optional loss function.

l1Lambda double

L1 loss coefficient. Default is 100.0.

Remarks

The Pix2Pix constructor initializes both the generator and discriminator networks along with their respective optimizers. The L1 lambda coefficient controls how strongly the output should match the target image.

For Beginners: This sets up Pix2Pix with sensible defaults.

Key parameters:

  • Generator/discriminator architectures define the network structures
  • Optimizers control how the networks learn
  • L1 lambda (100.0) controls how closely output matches target
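As a point of reference for what the L1 lambda weighs, the objective from the cited Isola et al. paper can be written as below. This is the paper's formulation (with the generator's noise input omitted for brevity), where λ is the L1 weight and the paper uses λ = 100. It is an assumption, not something this page states, that the l1Lambda parameter plays exactly this role in the implementation.

```latex
G^{*} = \arg\min_{G}\,\max_{D}\; \mathcal{L}_{\text{cGAN}}(G, D) + \lambda\,\mathcal{L}_{L1}(G),
\qquad
\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y}\big[\lVert y - G(x) \rVert_{1}\big]
```

Here x is the input image, y is the paired target image, and G(x) is the generator output. A larger λ pushes the generator toward pixel-accurate output; a smaller λ gives the adversarial term more influence.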

Properties

Discriminator

Gets the PatchGAN discriminator network.

public ConvolutionalNeuralNetwork<T> Discriminator { get; }

Property Value

ConvolutionalNeuralNetwork<T>

Remarks

PatchGAN classifies whether each N x N patch in an image is real or fake, rather than classifying the entire image. This encourages sharp high-frequency details and works well for image-to-image translation.

For Beginners: Discriminator checks local image quality.

Instead of:

  • "Is the whole image real?" (standard discriminator)

PatchGAN asks:

  • "Is this patch real? Is that patch real?" (many local checks)
  • This catches more detailed mistakes
  • Results in sharper, more realistic outputs
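The "many local checks" idea can be sketched in plain C#. The example below is illustrative only and is not AiDotNet's implementation: it assumes the discriminator has already produced a grid of per-patch "real" probabilities, and the names PatchGanSketch, ImageScore, and PatchLoss are hypothetical.

```csharp
using System;

// Illustrative sketch; not part of AiDotNet.
public static class PatchGanSketch
{
    // Averages the per-patch "real" probabilities into one image-level score.
    public static double ImageScore(double[,] patchProbabilities)
    {
        if (patchProbabilities.Length == 0)
            throw new ArgumentException("At least one patch score is required.", nameof(patchProbabilities));

        double sum = 0.0;
        foreach (double p in patchProbabilities)
            sum += p;
        return sum / patchProbabilities.Length;
    }

    // Mean binary cross-entropy over all patches against a single real/fake label.
    public static double PatchLoss(double[,] patchProbabilities, bool isReal)
    {
        if (patchProbabilities.Length == 0)
            throw new ArgumentException("At least one patch score is required.", nameof(patchProbabilities));

        const double eps = 1e-12; // guards against log(0)
        double sum = 0.0;
        foreach (double p in patchProbabilities)
        {
            double clamped = Math.Min(Math.Max(p, eps), 1.0 - eps);
            sum -= isReal ? Math.Log(clamped) : Math.Log(1.0 - clamped);
        }
        return sum / patchProbabilities.Length;
    }
}
```

Because every patch contributes its own term to the loss, a single unconvincing patch raises the loss even when the rest of the image scores well, which is what pushes the generator toward sharp local detail.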

Generator

Gets the U-Net generator network.

public ConvolutionalNeuralNetwork<T> Generator { get; }

Property Value

ConvolutionalNeuralNetwork<T>

ParameterCount

Gets the total number of trainable parameters in the Pix2Pix model.

public override int ParameterCount { get; }

Property Value

int

Methods

CreateNewInstance()

Creates a new instance of the same type as this neural network.

protected override IFullModel<T, Tensor<T>, Tensor<T>> CreateNewInstance()

Returns

IFullModel<T, Tensor<T>, Tensor<T>>

A new instance of the same neural network type.

Remarks

For Beginners: This creates a blank version of the same type of neural network.

It's used internally by methods like DeepCopy and Clone to create the right type of network before copying the data into it.

DeserializeNetworkSpecificData(BinaryReader)

Deserializes network-specific data that was not covered by the general deserialization process.

protected override void DeserializeNetworkSpecificData(BinaryReader reader)

Parameters

reader BinaryReader

The BinaryReader to read the data from.

Remarks

This method is called at the end of the general deserialization process to allow derived classes to read any additional data specific to their implementation.

For Beginners: If serialization is like packing a suitcase, this method unpacks its special compartment. After the main deserialization method has unpacked the common items (layers, parameters), this method lets each specific type of neural network unpack the unique items it stored during serialization.

GetModelMetadata()

Gets the metadata for this neural network model.

public override ModelMetadata<T> GetModelMetadata()

Returns

ModelMetadata<T>

A ModelMetadata<T> object containing information about the model.

InitializeLayers()

Initializes the layers of the neural network based on the architecture.

protected override void InitializeLayers()

Remarks

For Beginners: This method sets up all the layers in your neural network according to the architecture you've defined. It's like assembling the parts of your network before you can use it.

Predict(Tensor<T>)

Makes a prediction using the neural network.

public override Tensor<T> Predict(Tensor<T> input)

Parameters

input Tensor<T>

The input data to process.

Returns

Tensor<T>

The network's prediction.

Remarks

For Beginners: This is the main method you'll use to get results from your trained neural network. You provide some input data (like an image or text), and the network processes it through all its layers to produce an output (like a classification or prediction).

ResetOptimizerState()

Resets both optimizer states for a fresh training run.

public void ResetOptimizerState()

SerializeNetworkSpecificData(BinaryWriter)

Serializes network-specific data that is not covered by the general serialization process.

protected override void SerializeNetworkSpecificData(BinaryWriter writer)

Parameters

writer BinaryWriter

The BinaryWriter to write the data to.

Remarks

This method is called at the end of the general serialization process to allow derived classes to write any additional data specific to their implementation.

For Beginners: Think of this as packing a special compartment in your suitcase. While the main serialization method packs the common items (layers, parameters), this method allows each specific type of neural network to pack its own unique items that other networks might not have.
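The pairing between this method and DeserializeNetworkSpecificData can be sketched with a stand-alone type. This page does not document which fields Pix2Pix<T> actually writes, so the ExtraState class and its two fields below are hypothetical; the sketch only shows that reads must mirror the write order.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-in for network-specific state; not part of AiDotNet.
public sealed class ExtraState
{
    public double L1Lambda { get; set; }
    public int StepsTrained { get; set; }

    // Writes the fields in a fixed order.
    public void Write(BinaryWriter writer)
    {
        writer.Write(L1Lambda);
        writer.Write(StepsTrained);
    }

    // Reads the fields back in exactly the order they were written.
    public static ExtraState Read(BinaryReader reader)
    {
        double l1Lambda = reader.ReadDouble();
        int stepsTrained = reader.ReadInt32();
        return new ExtraState { L1Lambda = l1Lambda, StepsTrained = stepsTrained };
    }

    // Serializes to memory and reads the state back, for demonstration.
    public static ExtraState RoundTrip(ExtraState state)
    {
        using var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            state.Write(writer);
        }
        stream.Position = 0;
        using var reader = new BinaryReader(stream);
        return Read(reader);
    }
}
```

Swapping the two reads would not throw immediately; it would silently produce wrong values, which is why the write and read overrides are best kept side by side and changed together.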

Train(Tensor<T>, Tensor<T>)

Trains the neural network on a single input-output pair.

public override void Train(Tensor<T> input, Tensor<T> expectedOutput)

Parameters

input Tensor<T>

The input data.

expectedOutput Tensor<T>

The expected output for the given input.

Remarks

This method performs one training step on the neural network using the provided input and expected output. It updates the network's parameters to reduce the error between the network's prediction and the expected output.

For Beginners: This is how your neural network learns. You provide:

  • An input (what the network should process)
  • The expected output (what the correct answer should be)

The network then:

  1. Makes a prediction based on the input
  2. Compares its prediction to the expected output
  3. Calculates how wrong it was (the loss)
  4. Adjusts its internal values to do better next time

After training, you can get the loss value using the GetLastLoss() method to see how well the network is learning.

TrainStep(Tensor<T>, Tensor<T>)

Performs one training step for Pix2Pix.

public (T discriminatorLoss, T generatorLoss, T l1Loss) TrainStep(Tensor<T> inputImages, Tensor<T> targetImages)

Parameters

inputImages Tensor<T>

Input images (e.g., sketches, semantic maps).

targetImages Tensor<T>

Target output images (e.g., photos).

Returns

(T discriminatorLoss, T generatorLoss, T l1Loss)

Tuple of (discriminator loss, generator loss, L1 loss).

Remarks

This method implements the Pix2Pix training algorithm:

  1. Train discriminator on real and fake image pairs
  2. Train generator with combined adversarial and L1 loss
  3. The discriminator learns to distinguish real from fake
  4. The generator learns to both fool the discriminator and match the target

For Beginners: One training round for Pix2Pix.

The training process:

  • Discriminator learns to spot fake images
  • Generator learns to create realistic images that match the target
  • L1 loss ensures the output closely matches the expected result
  • Returns loss values for monitoring progress
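One way to monitor the returned losses is to average them over a pass through the paired data. The sketch below is a minimal, library-independent helper: PairedTrainingSketch and RunEpoch are hypothetical names, and the training step is passed in as a delegate so the example runs without AiDotNet.

```csharp
using System;
using System.Collections.Generic;

// Illustrative helper; not part of AiDotNet.
public static class PairedTrainingSketch
{
    // Runs one pass over the paired data and returns the mean of each reported loss.
    public static (double Discriminator, double Generator, double L1) RunEpoch<TImage>(
        IEnumerable<(TImage Input, TImage Target)> pairs,
        Func<TImage, TImage, (double, double, double)> trainStep)
    {
        double dSum = 0.0, gSum = 0.0, l1Sum = 0.0;
        int count = 0;

        foreach (var (input, target) in pairs)
        {
            // Each step reports (discriminator loss, generator loss, L1 loss).
            var (dLoss, gLoss, l1Loss) = trainStep(input, target);
            dSum += dLoss;
            gSum += gLoss;
            l1Sum += l1Loss;
            count++;
        }

        if (count == 0)
            throw new ArgumentException("No training pairs were supplied.", nameof(pairs));

        return (dSum / count, gSum / count, l1Sum / count);
    }
}
```

With a Pix2Pix<double> instance named model and a list of (Tensor<double>, Tensor<double>) pairs, a call such as PairedTrainingSketch.RunEpoch(pairs, model.TrainStep) should fit the TrainStep signature documented above, though that pairing is untested here. For other numeric types, the losses would need converting to double inside a wrapping lambda.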

Translate(Tensor<T>)

Translates input images to output images.

public Tensor<T> Translate(Tensor<T> inputImages)

Parameters

inputImages Tensor<T>

The input images to translate.

Returns

Tensor<T>

The translated output images.

UpdateParameters(Vector<T>)

Updates the parameters of all networks in the Pix2Pix GAN.

public override void UpdateParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

The new parameters vector containing parameters for all networks.