Class UpBlock<T>
Upsampling block for VAE decoder with transposed convolution and multiple ResBlocks.
public class UpBlock<T> : LayerBase<T>, ILayer<T>, IJitCompilable<T>, IDiagnosticsProvider, IWeightLoadable<T>, IDisposable
Type Parameters
T: The numeric type used for calculations.
- Inheritance
LayerBase<T> → UpBlock<T>
- Implements
ILayer<T>
Remarks
This implements an upsampling block following the Stable Diffusion VAE architecture:
- Transposed convolution (deconvolution) to increase spatial dimensions by 2x
- Multiple VAEResBlocks to process features at the upsampled resolution
For Beginners: An UpBlock is like a decompression stage in a decoder.
What it does:
- Increases spatial size by 2x using transposed convolution (decompression)
- Processes the upsampled features through multiple residual blocks
Example: 8x8 input -> 16x16 output (spatial dimensions doubled)
Why use transposed convolution instead of simple interpolation?
- Transposed conv is learnable (the network decides how to upsample)
- Simple interpolation (bilinear, nearest) has fixed behavior
- Learnable upsampling can generate sharper details
Structure:
input [B, C_in, H, W]
│
├─→ ConvTranspose (stride=2) ─→ upsample
│
↓
[B, C_out, 2*H, 2*W]
│
├─→ ResBlock → ResBlock → ... (numLayers blocks)
│
↓
output [B, C_out, 2*H, 2*W]
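The 2x spatial growth in the diagram follows from the standard output-size formula for a transposed convolution. The sketch below is only an illustration of that arithmetic: the kernel size, padding, and the helper name `TransposedConvShape` are assumptions, since this page states only that the stride is 2.

```csharp
using System;

// Hypothetical helper, not part of the library.
public static class TransposedConvShape
{
    // Standard output-size formula for a transposed convolution (dilation = 1):
    //   out = (in - 1) * stride - 2 * padding + kernel + outputPadding
    public static int OutputSize(int inputSize, int kernel, int stride, int padding, int outputPadding = 0)
    {
        if (inputSize < 1) throw new ArgumentOutOfRangeException(nameof(inputSize));
        return (inputSize - 1) * stride - 2 * padding + kernel + outputPadding;
    }

    public static void Main()
    {
        // Assumed hyperparameters (not taken from the UpBlock source):
        // kernel 4, stride 2, padding 1, which gives an exact 2x upsample.
        foreach (int size in new[] { 8, 16, 32 })
        {
            int upsampled = OutputSize(size, kernel: 4, stride: 2, padding: 1);
            Console.WriteLine($"{size}x{size} -> {upsampled}x{upsampled}");
        }
    }
}
```

Other combinations also double the size, for example kernel 3, stride 2, padding 1 with an output padding of 1; which one UpBlock uses internally is not stated here.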
Constructors
UpBlock(int, int, int, int, int, bool)
Initializes a new instance of the UpBlock class.
public UpBlock(int inChannels, int outChannels, int numLayers = 2, int numGroups = 32, int inputSpatialSize = 8, bool hasUpsample = true)
Parameters
inChannels (int): Number of input channels.
outChannels (int): Number of output channels.
numLayers (int): Number of residual blocks (default: 2).
numGroups (int): Number of groups for GroupNorm (default: 32).
inputSpatialSize (int): Spatial dimensions at input (default: 8).
hasUpsample (bool): Whether to include upsampling (default: true).
Remarks
For Beginners: Create an upsampling block for the VAE decoder.
Parameters explained:
- inChannels/outChannels: Feature depth before/after this block
- numLayers: More layers = more feature processing but slower
- hasUpsample: Set to false for the first decoder block to keep resolution
Typical usage in a decoder (mirror of encoder):
- Block 1: 512 -> 512, no upsample (8x8 -> 8x8)
- Block 2: 512 -> 256, upsample (8x8 -> 16x16)
- Block 3: 256 -> 128, upsample (16x16 -> 32x32)
- Block 4: 128 -> 128, upsample (32x32 -> 64x64)
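The four-block layout above can be checked by tracing channels and resolution through the stack. This is a standalone sketch, not library code: `BlockSpec` and `DecoderPlan` are hypothetical names, and each spec stands in for one `UpBlock<T>` constructed with the matching `inChannels`, `outChannels`, `inputSpatialSize`, and `hasUpsample` arguments.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical planning helper, not part of the library.
public static class DecoderPlan
{
    // Describes one decoder block by the constructor arguments that affect shape.
    public record BlockSpec(int InChannels, int OutChannels, bool HasUpsample);

    // Returns (channels, spatial size) after each block, starting from startSize.
    public static List<(int Channels, int Size)> Trace(IReadOnlyList<BlockSpec> blocks, int startSize)
    {
        var result = new List<(int Channels, int Size)>();
        int size = startSize;
        for (int i = 0; i < blocks.Count; i++)
        {
            // Each block must consume what the previous block produced.
            if (i > 0 && blocks[i].InChannels != blocks[i - 1].OutChannels)
                throw new ArgumentException($"Block {i + 1} expects {blocks[i].InChannels} channels.");
            if (blocks[i].HasUpsample) size *= 2; // upsampling doubles H and W
            result.Add((blocks[i].OutChannels, size));
        }
        return result;
    }

    public static void Main()
    {
        var blocks = new List<BlockSpec>
        {
            new(512, 512, false), // Block 1: keeps 8x8
            new(512, 256, true),  // Block 2: 8x8 -> 16x16
            new(256, 128, true),  // Block 3: 16x16 -> 32x32
            new(128, 128, true),  // Block 4: 32x32 -> 64x64
        };
        foreach (var (channels, size) in Trace(blocks, startSize: 8))
            Console.WriteLine($"{channels} channels at {size}x{size}");
    }
}
```

When building the real blocks, the size reported after block N is the value to pass as `inputSpatialSize` for block N+1.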
Properties
HasUpsample
Gets whether this block performs upsampling.
public bool HasUpsample { get; }
Property Value
- bool
InputChannels
Gets the number of input channels.
public int InputChannels { get; }
Property Value
- int
NumLayers
Gets the number of residual blocks.
public int NumLayers { get; }
Property Value
- int
OutputChannels
Gets the number of output channels.
public int OutputChannels { get; }
Property Value
- int
SupportsJitCompilation
Gets whether this layer supports JIT compilation.
public override bool SupportsJitCompilation { get; }
Property Value
- bool
True if the layer can be JIT compiled, false otherwise.
Remarks
This property indicates whether the layer has implemented ExportComputationGraph() and can benefit from JIT compilation. All layers MUST implement this property.
For Beginners: JIT compilation can make inference 5-10x faster by converting the layer's operations into optimized native code.
Layers should return false if they:
- Have not yet implemented a working ExportComputationGraph()
- Use dynamic operations that change based on input data
- Are too simple to benefit from JIT compilation
When false, the layer will use the standard Forward() method instead.
SupportsTraining
Gets a value indicating whether this layer supports training.
public override bool SupportsTraining { get; }
Property Value
- bool
true if the layer has trainable parameters and supports backpropagation; otherwise, false.
Remarks
This property indicates whether the layer can be trained through backpropagation. Layers with trainable parameters such as weights and biases typically return true, while layers that only perform fixed transformations (like pooling or activation layers) typically return false.
For Beginners: This property tells you if the layer can learn from data.
A value of true means:
- The layer has parameters that can be adjusted during training
- It will improve its performance as it sees more data
- It participates in the learning process
A value of false means:
- The layer doesn't have any adjustable parameters
- It performs the same operation regardless of training
- It doesn't need to learn (but may still be useful)
Methods
Backward(Tensor<T>)
Performs the backward pass through the up block.
public override Tensor<T> Backward(Tensor<T> outputGradient)
Parameters
outputGradient (Tensor<T>): Gradient of loss with respect to output.
Returns
- Tensor<T>
Gradient of loss with respect to input.
Deserialize(BinaryReader)
Loads the block's state from a binary reader.
public override void Deserialize(BinaryReader reader)
Parameters
reader (BinaryReader)
ExportComputationGraph(List<ComputationNode<T>>)
Exports the layer's computation graph for JIT compilation.
public override ComputationNode<T> ExportComputationGraph(List<ComputationNode<T>> inputNodes)
Parameters
inputNodes (List<ComputationNode<T>>): List to populate with input computation nodes.
Returns
- ComputationNode<T>
The output computation node representing the layer's operation.
Remarks
This method constructs a computation graph representation of the layer's forward pass that can be JIT compiled for faster inference. All layers MUST implement this method to support JIT compilation.
For Beginners: JIT (Just-In-Time) compilation converts the layer's operations into optimized native code for 5-10x faster inference.
To support JIT compilation, a layer must:
- Implement this method to export its computation graph
- Set SupportsJitCompilation to true
- Use ComputationNode and TensorOperations to build the graph
All layers are required to implement this method, even if they set SupportsJitCompilation = false.
Forward(Tensor<T>)
Performs the forward pass through the up block.
public override Tensor<T> Forward(Tensor<T> input)
Parameters
input (Tensor<T>): Input tensor with shape [batch, inChannels, H, W].
Returns
- Tensor<T>
Output tensor with shape [batch, outChannels, 2*H, 2*W] if hasUpsample is true; otherwise [batch, outChannels, H, W].
GetParameters()
Gets all trainable parameters as a single vector.
public override Vector<T> GetParameters()
Returns
- Vector<T>
GetResBlocks()
Gets the residual blocks for external access.
public IReadOnlyList<VAEResBlock<T>> GetResBlocks()
Returns
- IReadOnlyList<VAEResBlock<T>>
Read-only list of residual blocks.
ResetState()
Resets the internal state of the block.
public override void ResetState()
Serialize(BinaryWriter)
Saves the block's state to a binary writer.
public override void Serialize(BinaryWriter writer)
Parameters
writer (BinaryWriter)
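Serialize and Deserialize are meant to be used as a pair: whatever one writes, the other must read back in the same order. The sketch below shows that round-trip pattern with the standard System.IO types. The layout used (a length prefix followed by raw doubles) and the `WeightRoundTrip` helper are assumptions for illustration; the actual binary format of UpBlock is not documented on this page.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of the library.
public static class WeightRoundTrip
{
    // Writes a length prefix followed by each weight (assumed layout).
    public static void Write(BinaryWriter writer, double[] weights)
    {
        writer.Write(weights.Length);
        foreach (double w in weights) writer.Write(w);
    }

    // Reads the values back in exactly the order Write emitted them.
    public static double[] Read(BinaryReader reader)
    {
        int count = reader.ReadInt32();
        var weights = new double[count];
        for (int i = 0; i < count; i++) weights[i] = reader.ReadDouble();
        return weights;
    }

    // Serializes to an in-memory stream and deserializes again.
    public static double[] RoundTrip(double[] weights)
    {
        using var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            Write(writer, weights);
        }
        stream.Position = 0; // rewind before reading
        using var reader = new BinaryReader(stream);
        return Read(reader);
    }

    public static void Main()
    {
        double[] restored = RoundTrip(new[] { 0.5, -1.25, 3.0 });
        Console.WriteLine(string.Join(", ", restored));
    }
}
```

With the real class, the same pattern would pass the writer to `Serialize(BinaryWriter)` and later pass a reader over the same bytes to `Deserialize(BinaryReader)` on a block constructed with identical arguments.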
SetParameters(Vector<T>)
Sets all trainable parameters from a single vector.
public override void SetParameters(Vector<T> parameters)
Parameters
parameters (Vector<T>)
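GetParameters and SetParameters treat the block's weights as one flat vector, which implies packing several sub-layer parameter sets end to end and later unpacking them by the same offsets. The following is a minimal sketch of that idea using plain arrays. `ParameterPacking` is a hypothetical helper, and the order in which UpBlock concatenates its sub-layers is an assumption not confirmed here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the library.
public static class ParameterPacking
{
    // Concatenates all parts into one flat array, in list order.
    public static double[] Flatten(IReadOnlyList<double[]> parts)
    {
        int total = 0;
        foreach (var p in parts) total += p.Length;

        var flat = new double[total];
        int offset = 0;
        foreach (var p in parts)
        {
            Array.Copy(p, 0, flat, offset, p.Length);
            offset += p.Length;
        }
        return flat;
    }

    // Copies slices of the flat array back into the parts, in the same order.
    public static void Unflatten(double[] flat, IReadOnlyList<double[]> parts)
    {
        int total = 0;
        foreach (var p in parts) total += p.Length;
        if (flat.Length != total)
            throw new ArgumentException($"Expected {total} values but got {flat.Length}.");

        int offset = 0;
        foreach (var p in parts)
        {
            Array.Copy(flat, offset, p, 0, p.Length);
            offset += p.Length;
        }
    }

    public static void Main()
    {
        var parts = new List<double[]> { new[] { 1.0, 2.0 }, new[] { 3.0 } };
        double[] flat = Flatten(parts);
        flat[2] = 9.0;            // modify one entry of the flat view
        Unflatten(flat, parts);   // write it back into the second part
        Console.WriteLine(parts[1][0]);
    }
}
```

The length check in `Unflatten` mirrors a common failure mode: a vector saved from a block with different `numLayers` or channel counts will not fit.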
UpdateParameters(T)
Updates all learnable parameters using gradient descent.
public override void UpdateParameters(T learningRate)
Parameters
learningRate (T): The learning rate for the update.
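As a rough illustration of what a gradient descent update does, the sketch below applies the textbook rule `p = p - learningRate * gradient` to a flat array. It assumes plain gradient descent with no momentum or weight decay and uses a hypothetical `SgdStep` helper; how UpBlock stores its gradients between Backward and UpdateParameters is not described on this page.

```csharp
using System;

// Hypothetical helper, not part of the library.
public static class SgdStep
{
    // In-place update: each parameter moves against its gradient, scaled by the learning rate.
    public static void Update(double[] parameters, double[] gradients, double learningRate)
    {
        if (parameters.Length != gradients.Length)
            throw new ArgumentException("Parameters and gradients must have the same length.");

        for (int i = 0; i < parameters.Length; i++)
            parameters[i] -= learningRate * gradients[i];
    }

    public static void Main()
    {
        var parameters = new[] { 1.0, -2.0 };
        var gradients = new[] { 0.5, 1.0 };
        Update(parameters, gradients, learningRate: 0.1);
        Console.WriteLine(string.Join(", ", parameters));
    }
}
```

In a training loop the usual order is Forward, then Backward with the loss gradient, then UpdateParameters with the chosen learning rate.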