Class TimeEmbeddingLayer<T>

Namespace: AiDotNet.NeuralNetworks.Layers
Assembly: AiDotNet.dll

Represents a time embedding layer that encodes timesteps using sinusoidal embeddings for diffusion models.

public class TimeEmbeddingLayer<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable

Type Parameters

T

The numeric type used for calculations, typically float or double.

Inheritance
LayerBase<T>
TimeEmbeddingLayer<T>

Implements
ILayer<T>
IJitCompilable<T>
IDiagnosticsProvider
IWeightLoadable<T>
IDisposable

Remarks

The time embedding layer converts scalar timesteps into high-dimensional embeddings using sinusoidal functions, similar to positional encodings in transformers. This embedding is then projected through a small MLP to produce the final time conditioning vector used in diffusion U-Net blocks.

For Beginners: In diffusion models, the network needs to know "what time step are we at?"

  • At early timesteps (t near 0), images are clean and noise is minimal
  • At late timesteps (t near T), images are mostly noise
  • The network needs this information to know how much denoising to apply

This layer encodes the timestep number into a rich vector representation that:

  1. Uses sine and cosine functions at different frequencies (sinusoidal encoding)
  2. Passes through a small neural network (MLP) to learn task-specific representations
  3. Gets injected into every ResNet block of the U-Net

The sinusoidal encoding is inspired by transformer positional encodings (see the sketch after this list):

  • Low frequencies capture coarse time information
  • High frequencies capture fine-grained time details
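
For illustration, below is a minimal sketch of this kind of sinusoidal timestep encoding. The helper name (SinusoidalSketch.Encode) and the transformer-style frequency schedule are assumptions made for the example; the layer's internal implementation may use different constants.

using System;

// Minimal sketch: encode a scalar timestep into a sinusoidal feature vector.
// The frequency schedule follows the transformer convention (geometric spacing
// from low to high frequency); TimeEmbeddingLayer<T> may use different constants.
public static class SinusoidalSketch
{
    public static float[] Encode(int timestep, int embeddingDim)
    {
        var embedding = new float[embeddingDim];
        int half = embeddingDim / 2;

        for (int i = 0; i < half; i++)
        {
            // Low i -> low frequency (coarse time information);
            // high i -> high frequency (fine-grained time details).
            double frequency = Math.Exp(-Math.Log(10000.0) * i / half);
            embedding[i] = (float)Math.Sin(timestep * frequency);
            embedding[half + i] = (float)Math.Cos(timestep * frequency);
        }

        return embedding;
    }
}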

Constructors

TimeEmbeddingLayer(int, int, int)

Initializes a new instance of the TimeEmbeddingLayer<T> class.

public TimeEmbeddingLayer(int embeddingDim, int outputDim, int maxTimestep = 1000)

Parameters

embeddingDim int

The dimension of the sinusoidal embedding (typically model_dim / 4).

outputDim int

The dimension of the output embedding (typically model_dim * 4).

maxTimestep int

Maximum timestep value for normalization. Default: 1000.

Remarks

For Beginners: Create a time embedding layer with the specified dimensions; a construction sketch follows the lists below.

Common configurations:

  • embeddingDim = 64, outputDim = 256 for small models
  • embeddingDim = 128, outputDim = 512 for medium models
  • embeddingDim = 320, outputDim = 1280 for Stable Diffusion scale

The layer consists of:

  1. Sinusoidal encoding: timestep -> [embeddingDim] features
  2. Linear1 + SiLU: [embeddingDim] -> [outputDim]
  3. Linear2: [outputDim] -> [outputDim]
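
A hypothetical construction sketch using the configurations above (variable names are illustrative, and a using AiDotNet.NeuralNetworks.Layers directive is assumed):

// Medium model: 128-dimensional sinusoidal encoding projected to 512 features.
var timeEmbedding = new TimeEmbeddingLayer<float>(embeddingDim: 128, outputDim: 512);

// Stable Diffusion scale, with the default maximum timestep stated explicitly.
var sdTimeEmbedding = new TimeEmbeddingLayer<float>(
    embeddingDim: 320,
    outputDim: 1280,
    maxTimestep: 1000);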

Properties

SupportsGpuExecution

Gets whether this layer has a GPU execution implementation for inference.

protected override bool SupportsGpuExecution { get; }

Property Value

bool

Remarks

Override this to return true when the layer implements ForwardGpu(params IGpuTensor<T>[]). The actual CanExecuteOnGpu property combines this with engine availability.

For Beginners: This flag indicates if the layer has GPU code for the forward pass. Set this to true in derived classes that implement ForwardGpu.

SupportsJitCompilation

Gets whether this layer supports JIT compilation.

public override bool SupportsJitCompilation { get; }

Property Value

bool

SupportsTraining

Gets a value indicating whether this layer supports training.

public override bool SupportsTraining { get; }

Property Value

bool

Methods

Backward(Tensor<T>)

Performs the backward pass of the time embedding layer.

public override Tensor<T> Backward(Tensor<T> outputGradient)

Parameters

outputGradient Tensor<T>

The gradient of the loss with respect to the layer's output.

Returns

Tensor<T>

The gradient of the loss with respect to the layer's input (timesteps).

BackwardGpu(IGpuTensor<T>)

Performs the GPU-resident backward pass of the time embedding layer.

public override IGpuTensor<T> BackwardGpu(IGpuTensor<T> outputGradient)

Parameters

outputGradient IGpuTensor<T>

The GPU tensor containing the gradient of the loss with respect to the layer's output.

Returns

IGpuTensor<T>

The gradient of the loss with respect to the timestep input (typically zeros since sinusoidal embedding is fixed).

ExportComputationGraph(List<ComputationNode<T>>)

Exports the layer as a computation graph for JIT compilation.

public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)

Parameters

inputNodes List<ComputationNode<T>>

List of input nodes (expects one node containing timesteps).

Returns

ComputationNode<T>

A computation node representing the time embedding output.

Remarks

This method builds a computation graph for the time embedding (a schematic array-based sketch follows):

  1. Sinusoidal embedding of timesteps
  2. First linear layer (matrix multiply + bias)
  3. SiLU/Swish activation
  4. Second linear layer (matrix multiply + bias)
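
The sketch below traces the same four steps on plain arrays. The weight and bias parameters are passed in explicitly for illustration only (the layer stores its own learned parameters internally), and SinusoidalSketch.Encode refers to the helper sketched in the class remarks above.

using System;

public static class TimeEmbeddingPipelineSketch
{
    public static float[] Embed(
        int timestep,
        float[,] w1, float[] b1,   // shapes: [outputDim, embeddingDim], [outputDim]
        float[,] w2, float[] b2)   // shapes: [outputDim, outputDim], [outputDim]
    {
        // 1. Sinusoidal embedding of the timestep.
        float[] x = SinusoidalSketch.Encode(timestep, w1.GetLength(1));

        // 2. First linear layer (matrix multiply + bias).
        float[] h = Linear(w1, b1, x);

        // 3. SiLU/Swish activation: silu(v) = v * sigmoid(v).
        for (int i = 0; i < h.Length; i++)
            h[i] = h[i] / (1f + (float)Math.Exp(-h[i]));

        // 4. Second linear layer (matrix multiply + bias).
        return Linear(w2, b2, h);
    }

    private static float[] Linear(float[,] w, float[] b, float[] x)
    {
        var y = new float[w.GetLength(0)];
        for (int i = 0; i < y.Length; i++)
        {
            y[i] = b[i];
            for (int j = 0; j < x.Length; j++)
                y[i] += w[i, j] * x[j];
        }
        return y;
    }
}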

Forward(Tensor<T>)

Performs the forward pass of the time embedding layer.

public override Tensor<T> Forward(Tensor<T> input)

Parameters

input Tensor<T>

Input tensor containing timesteps. Shape: [batch] or [batch, 1].

Returns

Tensor<T>

Time embedding tensor with shape [batch, outputDim].
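
A minimal usage sketch of the forward pass is shown below; how the timesteps tensor is constructed depends on AiDotNet's tensor API, so the method simply accepts it as a parameter.

// Given a tensor of timesteps with shape [batch] (e.g. 4 values), produce
// conditioning vectors with shape [batch, outputDim] (here [4, 512]).
public static Tensor<float> EmbedTimesteps(Tensor<float> timesteps)
{
    var layer = new TimeEmbeddingLayer<float>(embeddingDim: 128, outputDim: 512);
    return layer.Forward(timesteps);
}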

ForwardGpu(params IGpuTensor<T>[])

Performs the forward pass of the layer on GPU.

public override IGpuTensor<T> ForwardGpu(params IGpuTensor<T>[] inputs)

Parameters

inputs IGpuTensor<T>[]

The GPU-resident input tensor(s).

Returns

IGpuTensor<T>

The GPU-resident output tensor.

Remarks

This method performs the layer's forward computation entirely on GPU. The input and output tensors remain in GPU memory, avoiding expensive CPU-GPU transfers.

For Beginners: This is like Forward() but runs on the graphics card.

The key difference:

  • Forward() uses CPU tensors that may be copied to/from GPU
  • ForwardGpu() keeps everything on GPU the whole time

Override this in derived classes that support GPU acceleration.

Exceptions

NotSupportedException

Thrown when the layer does not support GPU execution.

GetParameters()

Gets all trainable parameters of the layer as a single vector.

public override Vector<T> GetParameters()

Returns

Vector<T>

A single vector containing all of the layer's trainable parameters.

ResetState()

Resets the internal state of the layer.

public override void ResetState()

SetParameters(Vector<T>)

Sets all trainable parameters of the layer from a vector.

public override void SetParameters(Vector<T> parameters)

Parameters

parameters Vector<T>

A vector containing the new values for all of the layer's trainable parameters.

UpdateParameters(T)

Updates the parameters of the layer using the calculated gradients.

public override void UpdateParameters(T learningRate)

Parameters

learningRate T

The learning rate to use for the parameter updates.
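
An illustrative sketch of one manual training step tying Forward, Backward, and UpdateParameters together; the timestep input and the upstream gradient are assumed to come from the surrounding diffusion model, and the layer's internal gradient bookkeeping is not shown.

// One manual training step for the time-embedding MLP.
public static void TimeEmbeddingTrainStep(
    TimeEmbeddingLayer<float> layer,
    Tensor<float> timesteps,        // shape [batch]
    Tensor<float> upstreamGradient, // shape [batch, outputDim]
    float learningRate)
{
    // Forward pass: [batch] -> [batch, outputDim].
    Tensor<float> embedding = layer.Forward(timesteps);

    // Backward pass: computes gradients for the layer's weights and biases
    // and returns the gradient with respect to the timestep input.
    Tensor<float> inputGradient = layer.Backward(upstreamGradient);

    // Apply the computed gradients using the given learning rate.
    layer.UpdateParameters(learningRate);
}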