Interface ITextDetector<T>

Namespace: AiDotNet.Document.Interfaces

Assembly: AiDotNet.dll

Interface for text detection models that locate text regions in images.

public interface ITextDetector<T> : IDocumentModel<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>

Type Parameters

T: The numeric type used for calculations.

Inherited Members

IDocumentModel<T>.ExpectedImageSize

IDocumentModel<T>.MaxSequenceLength

IDocumentModel<T>.RequiresOCR

IDocumentModel<T>.SupportedDocumentTypes

IDocumentModel<T>.IsOnnxMode

IDocumentModel<T>.EncodeDocument(Tensor<T>)

IDocumentModel<T>.ValidateInputShape(Tensor<T>)

IDocumentModel<T>.GetModelSummary()

IFullModel<T, Tensor<T>, Tensor<T>>.DefaultLossFunction

IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>.Train(Tensor<T>, Tensor<T>)

IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>.Predict(Tensor<T>)

IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>.GetModelMetadata()

IModelSerializer.Serialize()

IModelSerializer.Deserialize(byte[])

IModelSerializer.SaveModel(string)

IModelSerializer.LoadModel(string)

ICheckpointableModel.SaveState(Stream)

ICheckpointableModel.LoadState(Stream)

IParameterizable<T, Tensor<T>, Tensor<T>>.GetParameters()

IParameterizable<T, Tensor<T>, Tensor<T>>.SetParameters(Vector<T>)

IParameterizable<T, Tensor<T>, Tensor<T>>.ParameterCount

IParameterizable<T, Tensor<T>, Tensor<T>>.WithParameters(Vector<T>)

IFeatureAware.GetActiveFeatureIndices()

IFeatureAware.SetActiveFeatureIndices(IEnumerable<int>)

IFeatureAware.IsFeatureUsed(int)

IFeatureImportance<T>.GetFeatureImportance()

ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>.DeepCopy()

ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>.Clone()

IGradientComputable<T, Tensor<T>, Tensor<T>>.ComputeGradients(Tensor<T>, Tensor<T>, ILossFunction<T>)

IGradientComputable<T, Tensor<T>, Tensor<T>>.ApplyGradients(Vector<T>, T)

IJitCompilable<T>.ExportComputationGraph(List<ComputationNode<T>>)

IJitCompilable<T>.SupportsJitCompilation

Extension Methods

DistributedExtensions.AsDistributedForHighBandwidth<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, ICommunicationBackend<T>)

DistributedExtensions.AsDistributedForLowBandwidth<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, ICommunicationBackend<T>)

DistributedExtensions.AsDistributed<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, ICommunicationBackend<T>)

DistributedExtensions.AsDistributed<T, TInput, TOutput>(IFullModel<T, TInput, TOutput>, IShardingConfiguration<T>)

Remarks

Text detection models find where text appears in an image without reading it. They output a bounding region (a polygon or a rectangle) around each text region.

For Beginners: Text detection is the first step in reading text from images. It's like highlighting all the places where text appears, but not actually reading it. After detection, a text recognizer reads the actual characters in each highlighted region.

Example usage:

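// Assumes `architecture` and `documentImage` (a Tensor<float>) are created elsewhere.
var detector = new DBNet<float>(architecture);
// Run detection with the detector's default confidence threshold.
var result = detector.DetectText(documentImage);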
foreach (var region in result.TextRegions)
{
    Console.WriteLine($"Found text at: {region.BoundingBox}");
}

Properties

MinTextHeight

Gets the minimum detectable text height in pixels.

int MinTextHeight { get; }

Property Value

int
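One way a caller might use MinTextHeight is to check whether the text in an image is tall enough to be detected, and estimate how much the image would need to be enlarged if it is not. The sketch below is illustrative only: `TextHeightSketch` and `RequiredUpscale` are hypothetical names, not part of AiDotNet, and whether upscaling actually improves detection depends on the model.

```csharp
using System;

// Hypothetical helper for illustration; not an AiDotNet type.
public static class TextHeightSketch
{
    // Returns the factor by which an image would need to be enlarged so that
    // text of the given pixel height reaches the detector's minimum height.
    // Returns 1.0 when the text is already tall enough.
    public static double RequiredUpscale(int textHeightPx, int minTextHeight)
    {
        if (textHeightPx <= 0)
            throw new ArgumentOutOfRangeException(nameof(textHeightPx));
        if (minTextHeight <= 0)
            throw new ArgumentOutOfRangeException(nameof(minTextHeight));

        return Math.Max(1.0, (double)minTextHeight / textHeightPx);
    }
}
```

In real code the second argument would come from `detector.MinTextHeight`; the text height would have to be estimated by the caller.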

SupportsPolygonOutput

Gets whether this detector outputs polygon regions rather than only axis-aligned rectangles.

bool SupportsPolygonOutput { get; }

Property Value

bool

SupportsRotatedText

Gets whether this detector can detect rotated text.

bool SupportsRotatedText { get; }

Property Value

bool

Methods

DetectText(Tensor<T>)

Detects text regions in an image.

TextDetectionResult<T> DetectText(Tensor<T> image)

Parameters

image Tensor<T>: The input image tensor.

Returns

TextDetectionResult<T>: Detection result with text region locations.

DetectText(Tensor<T>, double)

Detects text regions with a custom confidence threshold.

TextDetectionResult<T> DetectText(Tensor<T> image, double confidenceThreshold)

Parameters

image Tensor<T>: The input image tensor.
confidenceThreshold double: Minimum confidence for a detection, in the range 0 to 1.

Returns

TextDetectionResult<T>: Detection result with text region locations.
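To illustrate what a confidence threshold does, the sketch below filters a list of candidate regions by score. It is a minimal, self-contained illustration and not the library's implementation: `CandidateRegion` and `ThresholdSketch` are hypothetical stand-ins, and treating the threshold as inclusive (`>=`) is an assumption, since this page does not say how the boundary value is handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a detected region; not an AiDotNet type.
public record CandidateRegion(int X, int Y, int Width, int Height, double Confidence);

public static class ThresholdSketch
{
    // Keeps only candidates whose confidence is at least the threshold.
    // Assumption: the comparison is inclusive.
    public static List<CandidateRegion> Filter(
        IEnumerable<CandidateRegion> candidates, double confidenceThreshold)
    {
        if (confidenceThreshold < 0.0 || confidenceThreshold > 1.0)
            throw new ArgumentOutOfRangeException(
                nameof(confidenceThreshold), "Expected a value between 0 and 1.");

        return candidates.Where(c => c.Confidence >= confidenceThreshold).ToList();
    }
}
```

A higher threshold returns fewer, more certain regions; a lower one returns more regions at the cost of more false positives.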

GetProbabilityMap(Tensor<T>)

Gets the probability map showing text likelihood at each pixel.

Tensor<T> GetProbabilityMap(Tensor<T> image)

Parameters

image Tensor<T>: The input image tensor.

Returns

Tensor<T>: Probability map tensor with the same spatial dimensions as the input.
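A per-pixel probability map is commonly turned into a binary text mask by thresholding. The sketch below shows that step on a plain `double[,]` array so that it runs without the library. `ProbabilityMapSketch` and `ToMask` are hypothetical names, and how an actual detector converts its map into regions is not described on this page.

```csharp
using System;

// Hypothetical helper for illustration; not an AiDotNet type.
public static class ProbabilityMapSketch
{
    // Marks a pixel as text when its probability is at least the threshold.
    // The array is indexed as [row, column].
    public static bool[,] ToMask(double[,] probabilities, double threshold)
    {
        int height = probabilities.GetLength(0);
        int width = probabilities.GetLength(1);
        var mask = new bool[height, width];

        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                mask[y, x] = probabilities[y, x] >= threshold;
            }
        }

        return mask;
    }
}
```

With a real detector, the values would first have to be copied out of the tensor returned by `GetProbabilityMap`; how to do that depends on the `Tensor<T>` API and is left out here.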
