
Class ObjectDetectorBase<T>

Namespace
AiDotNet.ComputerVision.Detection.ObjectDetection
Assembly
AiDotNet.dll

Base class for all object detection models.

public abstract class ObjectDetectorBase<T>

Type Parameters

T

The numeric type used for calculations.

Inheritance
ObjectDetectorBase<T>

Remarks

For Beginners: An object detector takes an image and finds all objects in it, returning their locations (bounding boxes), types (class labels), and confidence scores. This base class provides the common structure and methods that all detection models share.

A typical detector has three parts:

- Backbone: Extracts features from the image
- Neck: Combines features at multiple scales
- Head: Produces final predictions (boxes, classes, scores)
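
The three-part structure described above can be sketched as a chain of functions. This is a toy illustration only: `Backbone`, `Neck`, `Head`, and `RunDetector` are hypothetical local functions operating on flat `double[]` arrays, not the AiDotNet types, and the arithmetic inside each stage is a placeholder chosen to keep the example small.

```csharp
using System;
using System.Linq;

// Hypothetical backbone: turns a flat "image" into features at two scales.
// Assumes the image has at least two values.
static double[][] Backbone(double[] image)
{
    double[] fine = image.Select(p => p / 255.0).ToArray();
    // Coarse scale: average each pair of neighboring values.
    var coarse = new double[fine.Length / 2];
    for (int i = 0; i < coarse.Length; i++)
        coarse[i] = (fine[2 * i] + fine[2 * i + 1]) / 2.0;
    return new[] { fine, coarse };
}

// Hypothetical neck: fuses the two scales by upsampling the coarse
// features and adding them to the fine features.
static double[] Neck(double[][] features)
{
    double[] fine = features[0], coarse = features[1];
    var fused = new double[fine.Length];
    for (int i = 0; i < fine.Length; i++)
        fused[i] = fine[i] + coarse[Math.Min(i / 2, coarse.Length - 1)];
    return fused;
}

// Hypothetical head: maps fused features to one score per position.
static double[] Head(double[] fused) => fused.Select(v => v / 2.0).ToArray();

// The detector is the composition of the three stages.
static double[] RunDetector(double[] image) => Head(Neck(Backbone(image)));

Console.WriteLine(string.Join(", ", RunDetector(new double[] { 255, 255, 0, 0 })));
// prints: 1, 1, 0, 0
```

A real detector replaces each stage with a neural network, but the data flow (image to multi-scale features, to fused features, to predictions) follows the same order.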

Constructors

ObjectDetectorBase(ObjectDetectionOptions<T>)

Creates a new object detector with the specified options.

protected ObjectDetectorBase(ObjectDetectionOptions<T> options)

Parameters

options ObjectDetectionOptions<T>

Configuration options for the detector.

Fields

IsTrainingMode

Whether the model is in training mode.

protected bool IsTrainingMode

Field Value

bool

Nms

NMS algorithm for removing duplicate detections.

protected readonly NMS<T> Nms

Field Value

NMS<T>

NumOps

Numeric operations for type T.

protected readonly INumericOperations<T> NumOps

Field Value

INumericOperations<T>

Options

Configuration options for this detector.

protected readonly ObjectDetectionOptions<T> Options

Field Value

ObjectDetectionOptions<T>

WeightDownloader

Weight downloader for fetching pre-trained weights.

protected readonly WeightDownloader WeightDownloader

Field Value

WeightDownloader

Properties

Backbone

The backbone network for feature extraction.

protected BackboneBase<T>? Backbone { get; set; }

Property Value

BackboneBase<T>

ClassNames

Class names for detection labels.

public string[] ClassNames { get; protected set; }

Property Value

string[]

Name

Name of this detector architecture.

public abstract string Name { get; }

Property Value

string

Neck

The neck module for feature fusion.

protected NeckBase<T>? Neck { get; set; }

Property Value

NeckBase<T>

Methods

Detect(Tensor<T>)

Detects objects in an image.

public virtual DetectionResult<T> Detect(Tensor<T> image)

Parameters

image Tensor<T>

Input image tensor with shape [batch, channels, height, width].

Returns

DetectionResult<T>

Detection results for each image in the batch.

Remarks

For Beginners: This is the main method you call to detect objects. Pass in an image (as a tensor) and get back a list of detected objects with their bounding boxes, class labels, and confidence scores.

Detect(Tensor<T>, double, double)

Detects objects in an image with custom thresholds.

public abstract DetectionResult<T> Detect(Tensor<T> image, double confidenceThreshold, double nmsThreshold)

Parameters

image Tensor<T>

Input image tensor.

confidenceThreshold double

Minimum confidence to keep a detection.

nmsThreshold double

IoU threshold for NMS.

Returns

DetectionResult<T>

Detection results.
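
To make the two thresholds concrete, here is a minimal sketch of confidence filtering followed by greedy non-maximum suppression. It does not use the library's `NMS<T>` class, whose API is not shown on this page; `IoU` and `FilterAndSuppress` are hypothetical helpers, and each box is assumed to be a `double[]` laid out as `[x1, y1, x2, y2, score]`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Intersection over union of two boxes laid out as [x1, y1, x2, y2, score].
static double IoU(double[] a, double[] b)
{
    double w = Math.Max(0.0, Math.Min(a[2], b[2]) - Math.Max(a[0], b[0]));
    double h = Math.Max(0.0, Math.Min(a[3], b[3]) - Math.Max(a[1], b[1]));
    double inter = w * h;
    double union = (a[2] - a[0]) * (a[3] - a[1])
                 + (b[2] - b[0]) * (b[3] - b[1]) - inter;
    return union <= 0.0 ? 0.0 : inter / union;
}

// Drops boxes scoring below confidenceThreshold, then keeps a box only if
// its IoU with every higher-scoring kept box is at most nmsThreshold.
static List<double[]> FilterAndSuppress(
    IEnumerable<double[]> boxes, double confidenceThreshold, double nmsThreshold)
{
    var kept = new List<double[]>();
    foreach (var box in boxes.Where(b => b[4] >= confidenceThreshold)
                             .OrderByDescending(b => b[4]))
    {
        if (kept.All(k => IoU(k, box) <= nmsThreshold))
            kept.Add(box);
    }
    return kept;
}
```

In this sketch, raising `confidenceThreshold` discards more low-scoring boxes, while lowering `nmsThreshold` suppresses overlapping boxes more aggressively.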

DetectBatch(Tensor<T>)

Detects objects in a batch of images.

public virtual BatchDetectionResult<T> DetectBatch(Tensor<T> images)

Parameters

images Tensor<T>

Batch of images with shape [batch, channels, height, width].

Returns

BatchDetectionResult<T>

Detection results for each image.

DetectBatch(Tensor<T>, double, double)

Detects objects in a batch of images with custom thresholds.

public virtual BatchDetectionResult<T> DetectBatch(Tensor<T> images, double confidenceThreshold, double nmsThreshold)

Parameters

images Tensor<T>

Batch of images.

confidenceThreshold double

Minimum confidence.

nmsThreshold double

NMS threshold.

Returns

BatchDetectionResult<T>

Batch detection results.

ExtractBatchItem(Tensor<T>, int)

Extracts a single image from a batch.

protected Tensor<T> ExtractBatchItem(Tensor<T> batch, int index)

Parameters

batch Tensor<T>

Batch of images.

index int

Index of the image to extract.

Returns

Tensor<T>

Single image tensor.
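
As an illustration of what extracting one image from a batch involves, the sketch below copies a single item out of a flat, row-major buffer with shape [batch, channels, height, width]. `ExtractItem` is a hypothetical helper working on `double[]`, not the library's `Tensor<T>`; whether the real method keeps a leading batch dimension of 1 is not stated on this page.

```csharp
using System;

// Copies image `index` out of a flat row-major [batch, channels, height, width] buffer.
static double[] ExtractItem(
    double[] batch, int batchSize, int channels, int height, int width, int index)
{
    if (index < 0 || index >= batchSize)
        throw new ArgumentOutOfRangeException(nameof(index));

    int itemLength = channels * height * width;
    if (batch.Length != batchSize * itemLength)
        throw new ArgumentException("Buffer length does not match the given shape.", nameof(batch));

    // In row-major layout each image occupies one contiguous block.
    var item = new double[itemLength];
    Array.Copy(batch, index * itemLength, item, 0, itemLength);
    return item;
}
```

Because the batch dimension comes first, each image is contiguous in memory and a single block copy is enough.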

ExtractBatchOutputs(List<Tensor<T>>, int)

Extracts the outputs for a single batch item from batch outputs.

protected virtual List<Tensor<T>> ExtractBatchOutputs(List<Tensor<T>> batchOutputs, int batchIndex)

Parameters

batchOutputs List<Tensor<T>>

Outputs for the whole batch.

batchIndex int

Index of the batch item to extract.

Returns

List<Tensor<T>>

Outputs for the specified batch item.

Forward(Tensor<T>)

Performs forward pass through the network.

protected abstract List<Tensor<T>> Forward(Tensor<T> input)

Parameters

input Tensor<T>

Input image tensor.

Returns

List<Tensor<T>>

Raw network outputs before post-processing.

GetCocoClassNames()

Gets the default COCO class names.

protected static string[] GetCocoClassNames()

Returns

string[]

Array of 80 COCO class names.

GetHeadParameterCount()

Gets the number of parameters in the detection head.

protected abstract long GetHeadParameterCount()

Returns

long

Number of parameters.

GetParameterCount()

Gets the total number of parameters in the model.

public virtual long GetParameterCount()

Returns

long

Number of trainable parameters.

LoadPretrainedWeightsAsync(CancellationToken)

Loads default pre-trained weights for this architecture and size.

public virtual Task LoadPretrainedWeightsAsync(CancellationToken cancellationToken = default)

Parameters

cancellationToken CancellationToken

Cancellation token.

Returns

Task

LoadWeightsAsync(string, CancellationToken)

Loads pre-trained weights from a file or URL.

public abstract Task LoadWeightsAsync(string pathOrUrl, CancellationToken cancellationToken = default)

Parameters

pathOrUrl string

Local file path or URL to weights.

cancellationToken CancellationToken

Cancellation token.

Returns

Task
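
Since `pathOrUrl` accepts either form, an implementation needs some way to tell the two apart. One possible check is sketched below using the standard `System.Uri` API; `LooksLikeUrl` is a hypothetical helper, and how AiDotNet actually distinguishes paths from URLs is not documented on this page.

```csharp
using System;

// Treats the argument as a URL only if it parses as an absolute http(s) URI.
// Everything else (relative paths, absolute file paths) is treated as a local file.
static bool LooksLikeUrl(string pathOrUrl) =>
    Uri.TryCreate(pathOrUrl, UriKind.Absolute, out var uri)
    && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);

Console.WriteLine(LooksLikeUrl("https://example.com/weights.bin"));
Console.WriteLine(LooksLikeUrl("weights/model.bin"));
```

Restricting the check to the `http` and `https` schemes keeps absolute file paths, which can also parse as URIs, on the local-file branch.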

Normalize(Tensor<T>)

Normalizes image values to the [0, 1] range.

protected virtual Tensor<T> Normalize(Tensor<T> image)

Parameters

image Tensor<T>

Input image with values in the range [0, 255].

Returns

Tensor<T>

Normalized image.
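
Mapping pixel values from [0, 255] to [0, 1] amounts to dividing each value by 255. The sketch below shows this on a plain `double[]`; `NormalizeToUnitRange` is a hypothetical helper, and the real method operates on `Tensor<T>` through `NumOps`, so its internals may differ.

```csharp
using System;

// Scales each pixel value from [0, 255] to [0, 1] without modifying the input.
static double[] NormalizeToUnitRange(double[] pixels)
{
    var result = new double[pixels.Length];
    for (int i = 0; i < pixels.Length; i++)
        result[i] = pixels[i] / 255.0;
    return result;
}
```

Because the method is virtual, a derived detector could override it, for example to apply a different scaling that its pre-trained weights expect.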

PostProcess(List<Tensor<T>>, int, int, double, double)

Post-processes raw network outputs into detections.

protected abstract List<Detection<T>> PostProcess(List<Tensor<T>> outputs, int imageWidth, int imageHeight, double confidenceThreshold, double nmsThreshold)

Parameters

outputs List<Tensor<T>>

Raw network outputs.

imageWidth int

Original image width.

imageHeight int

Original image height.

confidenceThreshold double

Minimum confidence threshold.

nmsThreshold double

NMS IoU threshold.

Returns

List<Detection<T>>

List of detections after NMS.
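
The `imageWidth` and `imageHeight` parameters suggest that predicted boxes are mapped back to the original image's coordinates. A minimal sketch of that mapping follows, assuming the image was stretched to the network input size with no letterbox padding; `ScaleBoxToImage` and the `[x1, y1, x2, y2]` layout are hypothetical, and a concrete detector may handle this step differently.

```csharp
using System;

// Maps a box from network-input coordinates back to original-image coordinates.
// Assumes a plain resize (independent x and y scaling, no padding).
static double[] ScaleBoxToImage(
    double[] box, int inputWidth, int inputHeight, int imageWidth, int imageHeight)
{
    double scaleX = (double)imageWidth / inputWidth;
    double scaleY = (double)imageHeight / inputHeight;
    return new[]
    {
        box[0] * scaleX, // x1
        box[1] * scaleY, // y1
        box[2] * scaleX, // x2
        box[3] * scaleY, // y2
    };
}
```

For example, a box predicted on a 640×640 input for a 1280×720 image is scaled by 2 horizontally and by 1.125 vertically.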

Preprocess(Tensor<T>)

Preprocesses an image for input to the network.

protected virtual Tensor<T> Preprocess(Tensor<T> image)

Parameters

image Tensor<T>

Raw image tensor.

Returns

Tensor<T>

Preprocessed tensor ready for the network.

ResizeImage(Tensor<T>, int, int)

Resizes an image tensor to the specified dimensions.

protected virtual Tensor<T> ResizeImage(Tensor<T> image, int targetHeight, int targetWidth)

Parameters

image Tensor<T>

Input image.

targetHeight int

Target height.

targetWidth int

Target width.

Returns

Tensor<T>

Resized image.
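
This page does not say which interpolation `ResizeImage` uses. As one simple possibility, the sketch below resizes a single-channel image with nearest-neighbor sampling; `ResizeNearest` is a hypothetical helper on a flat row-major `double[]`, not the library's implementation.

```csharp
using System;

// Nearest-neighbor resize of a single-channel, row-major [height, width] image.
static double[] ResizeNearest(double[] src, int srcHeight, int srcWidth, int dstHeight, int dstWidth)
{
    var dst = new double[dstHeight * dstWidth];
    for (int y = 0; y < dstHeight; y++)
    {
        // Map each target row to the source row it falls into.
        int sy = Math.Min(srcHeight - 1, y * srcHeight / dstHeight);
        for (int x = 0; x < dstWidth; x++)
        {
            int sx = Math.Min(srcWidth - 1, x * srcWidth / dstWidth);
            dst[y * dstWidth + x] = src[sy * srcWidth + sx];
        }
    }
    return dst;
}
```

Note the argument order: like `ResizeImage`, the sketch takes the target height before the target width.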

SaveWeights(string)

Saves model weights to a file.

public abstract void SaveWeights(string path)

Parameters

path string

File path to save the weights to.

SetTrainingMode(bool)

Sets the model to training or inference mode.

public virtual void SetTrainingMode(bool training)

Parameters

training bool

True for training mode, false for inference.