Class SwinTransformer<T>

Namespace: AiDotNet.ComputerVision.Detection.Backbones

Assembly: AiDotNet.dll

Swin Transformer backbone for hierarchical vision transformer feature extraction.

public class SwinTransformer<T> : BackboneBase<T>

Type Parameters

T: The numeric type used for calculations.

Inheritance: object → BackboneBase<T> → SwinTransformer<T>

Inherited Members: BackboneBase<T>.NumOps

BackboneBase<T>.IsTrainingMode

BackboneBase<T>.IsFrozen

BackboneBase<T>.SetTrainingMode(bool)

BackboneBase<T>.Freeze()

BackboneBase<T>.Unfreeze()

BackboneBase<T>.GetExpectedInputSize()

BackboneBase<T>.ValidateInput(Tensor<T>)

object.Equals(object)

object.Equals(object, object)

object.GetHashCode()

object.GetType()

object.MemberwiseClone()

object.ReferenceEquals(object, object)

object.ToString()

Remarks

For Beginners: Swin Transformer is a hierarchical vision transformer that uses shifted windows to keep attention computation efficient. Unlike ViT, which computes self-attention across all image patches at once, Swin computes attention within local windows and shifts the window partition between consecutive layers so that information can flow across window boundaries.

Key features:
- Hierarchical structure with patch merging (like CNN stages)
- Window-based multi-head self-attention for efficiency
- Shifted window partitioning for cross-window connections
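To make the window mechanics concrete, the following sketch computes how many windows a token grid splits into, the shift offset applied on alternate layers, and how many query-key pairs attention has to score with and without windows. WindowMath and its methods are hypothetical names for illustration and are not part of AiDotNet. The half-window shift follows the Swin paper; whether this class uses the same offset internally is an assumption.

```csharp
using System;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class WindowMath
{
    // Number of non-overlapping windows covering a height x width token grid.
    // Assumes both dimensions are divisible by windowSize.
    public static int WindowCount(int height, int width, int windowSize)
    {
        if (height % windowSize != 0 || width % windowSize != 0)
            throw new ArgumentException("Grid must be divisible by the window size.");
        return (height / windowSize) * (width / windowSize);
    }

    // Shift used on alternate layers in the Swin paper: half the window size, rounded down.
    public static int ShiftSize(int windowSize) => windowSize / 2;

    // Query-key pairs scored by global attention over every token in the grid.
    public static long GlobalAttentionPairs(int height, int width)
    {
        long tokens = (long)height * width;
        return tokens * tokens;
    }

    // Query-key pairs scored when attention is restricted to each window.
    public static long WindowAttentionPairs(int height, int width, int windowSize)
    {
        long tokensPerWindow = (long)windowSize * windowSize;
        return WindowCount(height, width, windowSize) * tokensPerWindow * tokensPerWindow;
    }
}
```

For a 56 x 56 token grid with the default window size of 7, this gives 64 windows and a shift of 3. Global attention would score 9,834,496 pairs, while windowed attention scores 153,664, a 64-fold reduction.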

Reference: Liu et al., "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows", ICCV 2021

Constructors

SwinTransformer(SwinVariant, int, int)

Creates a new Swin Transformer backbone.

public SwinTransformer(SwinVariant variant = SwinVariant.SwinTiny, int windowSize = 7, int inChannels = 3)

Parameters

variant SwinVariant: Swin variant (Tiny, Small, Base, Large).
windowSize int: Window size for attention (default 7).
inChannels int: Number of input channels (default 3 for RGB).
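A minimal construction sketch, using only the members documented on this page. It assumes that float is a supported choice for T and that SwinVariant is reachable from the namespace shown; neither is confirmed here, so adjust the using directives if the enum lives elsewhere.

```csharp
using System;
using AiDotNet.ComputerVision.Detection.Backbones;
// Assumption: SwinVariant is reachable from the namespace above;
// add its own using directive if it is declared elsewhere.

// Arguments mirror the documented defaults, written out for clarity.
// Assumption: float is a supported numeric type for T.
var backbone = new SwinTransformer<float>(
    variant: SwinVariant.SwinTiny,
    windowSize: 7,
    inChannels: 3);

// Inspect what the backbone reports about itself.
Console.WriteLine(backbone.Name);
Console.WriteLine($"Channels per level: {string.Join(", ", backbone.OutputChannels)}");
Console.WriteLine($"Strides per level:  {string.Join(", ", backbone.Strides)}");
Console.WriteLine($"Parameters: {backbone.GetParameterCount()}");
```

Printing OutputChannels and Strides before wiring the backbone into a detector is a quick way to confirm how many feature levels the chosen variant exposes.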

Properties

Name

Name of this backbone architecture.

public override string Name { get; }

Property Value

string

OutputChannels

Number of output channels for each feature level.

public override int[] OutputChannels { get; }

Property Value

int[]

Remarks

Modern detectors use multi-scale features. This array contains the number of channels at each scale, typically from high resolution (small objects) to low resolution (large objects).

Strides

The stride (downsampling factor) at each feature level.

public override int[] Strides { get; }

Property Value

int[]

Remarks

A stride of 8 means the feature map's height and width are each 1/8 of the input's. Common strides are [8, 16, 32] for 3-level feature pyramids.
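The relationship between stride and feature map size can be sketched as a small calculation. StrideMath is a hypothetical helper, not part of AiDotNet, and it assumes the input dimensions are divisible by every stride; how this backbone handles other sizes is not specified here.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class StrideMath
{
    // Spatial size (height, width) of the feature map at each level,
    // assuming the input dimensions are divisible by every stride.
    public static (int Height, int Width)[] FeatureMapSizes(
        int inputHeight, int inputWidth, int[] strides)
    {
        return strides
            .Select(s => (Height: inputHeight / s, Width: inputWidth / s))
            .ToArray();
    }
}
```

For a 640 x 640 input and strides of [8, 16, 32], the feature maps are 80 x 80, 40 x 40, and 20 x 20.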

Methods

ExtractFeatures(Tensor<T>)

Extracts multi-scale features from an input image tensor.

public override List<Tensor<T>> ExtractFeatures(Tensor<T> input)

Parameters

input Tensor<T>: Input image tensor with shape [batch, channels, height, width].

Returns

List<Tensor<T>>: List of feature maps at different scales, from highest to lowest resolution.

Remarks

For Beginners: This method runs the input image through the backbone and returns feature maps at multiple scales. Small objects need high-resolution features, while large objects are detected in low-resolution features.

GetParameterCount()

Gets the total number of parameters in the backbone.

public override long GetParameterCount()

Returns

long: Number of trainable parameters.

ReadParameters(BinaryReader)

Reads parameters from a binary reader for deserialization.

public override void ReadParameters(BinaryReader reader)

Parameters

reader BinaryReader: The binary reader to read from.

WriteParameters(BinaryWriter)

Writes all parameters to a binary writer for serialization.

public override void WriteParameters(BinaryWriter writer)

Parameters

writer BinaryWriter: The binary writer to write to.
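WriteParameters and ReadParameters can be paired through an in-memory buffer, for example to copy weights from one backbone instance to another. The sketch below takes the two operations as delegates so that it compiles on its own. ParameterTransfer is a hypothetical helper and is not part of AiDotNet.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not part of AiDotNet.
public static class ParameterTransfer
{
    // Serializes with `write` into an in-memory buffer, then feeds the same bytes to `read`.
    // Returns the number of bytes written.
    public static long RoundTrip(Action<BinaryWriter> write, Action<BinaryReader> read)
    {
        using var buffer = new MemoryStream();

        // leaveOpen: true keeps the buffer usable after the writer is disposed.
        using (var writer = new BinaryWriter(buffer, Encoding.UTF8, leaveOpen: true))
        {
            write(writer);
        }

        long bytesWritten = buffer.Length;
        buffer.Position = 0; // rewind so the reader starts at the first byte

        using (var reader = new BinaryReader(buffer, Encoding.UTF8, leaveOpen: true))
        {
            read(reader);
        }

        return bytesWritten;
    }
}
```

With two backbones named source and target, the call would be ParameterTransfer.RoundTrip(source.WriteParameters, target.ReadParameters). This assumes both were constructed with the same variant, window size, and input channels, since the reader presumably expects the layout the writer produced.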
