Class OCRBase<T>
- Namespace
- AiDotNet.ComputerVision.OCR
- Assembly
- AiDotNet.dll
Base class for OCR models.
public abstract class OCRBase<T>
Type Parameters
T
The numeric type used for calculations.
- Inheritance
- object
- OCRBase<T>
- Derived
- Inherited Members
Constructors
OCRBase(OCROptions<T>)
Creates a new OCR model.
protected OCRBase(OCROptions<T> options)
Parameters
options OCROptions<T>
Fields
CharToIndex
Character to index mapping.
protected readonly Dictionary<char, int> CharToIndex
Field Value
- Dictionary<char, int>
DefaultCharacterSet
Default character set for recognition.
protected static readonly string DefaultCharacterSet
Field Value
- string
IndexToChar
Index to character mapping.
protected readonly Dictionary<int, char> IndexToChar
Field Value
- Dictionary<int, char>
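The two mapping fields are inverse lookup tables over a character set. The following is a minimal sketch of how such tables could be built, not the library's actual initialization. The class name CharMapSketch, the method BuildMaps, and the firstIndex parameter (which reserves lower indices for special tokens such as a CTC blank) are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of AiDotNet.
public static class CharMapSketch
{
    // Builds forward and reverse lookup tables from a character set.
    // Assumption: indices start at firstIndex so that lower indices
    // stay free for special tokens (for example a CTC blank at 0).
    public static (Dictionary<char, int> CharToIndex, Dictionary<int, char> IndexToChar)
        BuildMaps(string characterSet, int firstIndex = 1)
    {
        var charToIndex = new Dictionary<char, int>();
        var indexToChar = new Dictionary<int, char>();
        int next = firstIndex;

        foreach (char c in characterSet)
        {
            // Skip duplicates so both tables stay one-to-one.
            if (charToIndex.ContainsKey(c))
            {
                continue;
            }

            charToIndex[c] = next;
            indexToChar[next] = c;
            next++;
        }

        return (charToIndex, indexToChar);
    }
}
```

Keeping the two dictionaries in lockstep means encoding a label and decoding a prediction always agree on the same index for a given character.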
NumOps
protected readonly INumericOperations<T> NumOps
Field Value
- INumericOperations<T>
Options
protected readonly OCROptions<T> Options
Field Value
- OCROptions<T>
Properties
Name
Name of this OCR model.
public abstract string Name { get; }
Property Value
- string
VocabularySize
Gets the vocabulary size (number of classes).
public int VocabularySize { get; }
Property Value
- int
Methods
ComputeConfidence(Tensor<T>, string)
Computes confidence from logits.
protected T ComputeConfidence(Tensor<T> logits, string decodedText)
Parameters
logits Tensor<T>
decodedText string
Returns
- T
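This page does not say how the confidence value is derived. One common convention, shown here purely as an assumption, is to average the highest softmax probability across time steps. The sketch below uses plain double arrays in place of Tensor<T> and ignores the decodedText argument. ConfidenceSketch and MeanMaxProbability are hypothetical names.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class ConfidenceSketch
{
    // logits[t][c] is the raw score of class c at time step t.
    // Returns the mean over time steps of the largest softmax probability,
    // or 0 when there are no time steps.
    public static double MeanMaxProbability(double[][] logits)
    {
        if (logits.Length == 0)
        {
            return 0.0;
        }

        double total = 0.0;
        foreach (double[] step in logits)
        {
            // Find the largest logit for numerical stability.
            double max = double.NegativeInfinity;
            foreach (double v in step)
            {
                if (v > max) max = v;
            }

            // Softmax denominator computed on shifted logits.
            double sum = 0.0;
            foreach (double v in step)
            {
                sum += Math.Exp(v - max);
            }

            // The largest class has shifted logit 0, so its probability is 1 / sum.
            total += 1.0 / sum;
        }

        return total / logits.Length;
    }
}
```

Under this assumption a value near 1 means the model was decisive at every step, while a value near 1 / VocabularySize means the scores were close to uniform.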
DecodeAttention(Tensor<T>, int)
Decodes attention-based output to text.
protected string DecodeAttention(Tensor<T> logits, int endTokenId)
Parameters
logits Tensor<T>
endTokenId int
Returns
- string
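Attention decoders usually emit one token per step and stop at an end-of-sequence token. The sketch below shows that greedy procedure under stated assumptions: logits are a plain [step][class] array rather than Tensor<T>, and the index-to-character table is passed in explicitly. AttentionDecodeSketch and GreedyDecode are hypothetical names, and the library's own implementation may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of AiDotNet.
public static class AttentionDecodeSketch
{
    // Picks the highest-scoring class at each step and stops at endTokenId.
    public static string GreedyDecode(
        double[][] logits,
        IReadOnlyDictionary<int, char> indexToChar,
        int endTokenId)
    {
        var text = new StringBuilder();

        foreach (double[] step in logits)
        {
            // Argmax over the class dimension.
            int best = 0;
            for (int c = 1; c < step.Length; c++)
            {
                if (step[c] > step[best]) best = c;
            }

            // The end token terminates the sequence and is not emitted.
            if (best == endTokenId)
            {
                break;
            }

            // Indices without a character (special tokens) are skipped.
            if (indexToChar.TryGetValue(best, out char ch))
            {
                text.Append(ch);
            }
        }

        return text.ToString();
    }
}
```

Unlike CTC decoding, repeated predictions are kept here, because each attention step corresponds to one output character.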
DecodeCTC(Tensor<T>)
Decodes CTC output to text.
protected string DecodeCTC(Tensor<T> logits)
Parameters
logits Tensor<T>
Returns
- string
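Greedy CTC decoding takes the best class at each time step, collapses consecutive repeats, and drops the blank token. The sketch below illustrates that standard procedure with plain arrays. It assumes the blank sits at index 0, which this page does not confirm, and CtcSketch and GreedyDecode are hypothetical names rather than library members.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of AiDotNet.
public static class CtcSketch
{
    // logits[t][c] is the score of class c at time step t.
    // Assumption: blankIndex defaults to 0.
    public static string GreedyDecode(
        double[][] logits,
        IReadOnlyDictionary<int, char> indexToChar,
        int blankIndex = 0)
    {
        var text = new StringBuilder();
        int previous = -1; // no class seen yet

        foreach (double[] step in logits)
        {
            // Argmax over the class dimension.
            int best = 0;
            for (int c = 1; c < step.Length; c++)
            {
                if (step[c] > step[best]) best = c;
            }

            // Emit only when the class is not blank and differs from the
            // previous step, which collapses runs of the same prediction.
            if (best != blankIndex && best != previous
                && indexToChar.TryGetValue(best, out char ch))
            {
                text.Append(ch);
            }

            previous = best;
        }

        return text.ToString();
    }
}
```

A blank between two identical predictions is what allows doubled letters: the per-step sequence a, a, blank, a, b, b decodes to "aab".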
GetParameterCount()
Gets the total parameter count.
public abstract long GetParameterCount()
Returns
- long
LoadWeightsAsync(string, CancellationToken)
Loads pretrained weights.
public abstract Task LoadWeightsAsync(string pathOrUrl, CancellationToken cancellationToken = default)
Parameters
pathOrUrl string
cancellationToken CancellationToken
Returns
- Task
PreprocessCrop(Tensor<T>)
Preprocesses a text crop for recognition.
protected virtual Tensor<T> PreprocessCrop(Tensor<T> crop)
Parameters
crop Tensor<T>
Returns
- Tensor<T>
Recognize(Tensor<T>)
Recognizes text in an image.
public abstract OCRResult<T> Recognize(Tensor<T> image)
Parameters
image Tensor<T>
Input image tensor [batch, channels, height, width].
Returns
- OCRResult<T>
OCR result with recognized text.
RecognizeText(Tensor<T>)
Recognizes text in a cropped text region.
public abstract (string text, T confidence) RecognizeText(Tensor<T> croppedImage)
Parameters
croppedImage Tensor<T>
Cropped text region tensor.
Returns
- (string text, T confidence)
Recognized text and confidence.
ResizeBilinear(Tensor<T>, int, int)
Resizes tensor using bilinear interpolation.
protected Tensor<T> ResizeBilinear(Tensor<T> input, int targetH, int targetW)
Parameters
input Tensor<T>
targetH int
targetW int
Returns
- Tensor<T>
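Bilinear interpolation computes each output pixel as a weighted blend of the four nearest input pixels. The sketch below works on a single-channel double[,] image rather than Tensor<T>. It assumes corner-aligned sampling, where the first and last output pixels map exactly onto the first and last input pixels. The library may use a different coordinate convention, and BilinearSketch and Resize are hypothetical names.

```csharp
using System;

// Hypothetical helper, not part of AiDotNet.
public static class BilinearSketch
{
    // Resizes a [height, width] image to [targetH, targetW].
    // Assumption: corner-aligned coordinate mapping.
    public static double[,] Resize(double[,] input, int targetH, int targetW)
    {
        int h = input.GetLength(0);
        int w = input.GetLength(1);
        if (h == 0 || w == 0 || targetH <= 0 || targetW <= 0)
        {
            throw new ArgumentException("Input and target sizes must be positive.");
        }

        var output = new double[targetH, targetW];

        // Source step per output pixel. A single-pixel target samples index 0.
        double scaleY = targetH > 1 ? (double)(h - 1) / (targetH - 1) : 0.0;
        double scaleX = targetW > 1 ? (double)(w - 1) / (targetW - 1) : 0.0;

        for (int y = 0; y < targetH; y++)
        {
            double srcY = y * scaleY;
            int y0 = Math.Min((int)Math.Floor(srcY), h - 1);
            int y1 = Math.Min(y0 + 1, h - 1);
            double wy = srcY - y0; // vertical blend weight

            for (int x = 0; x < targetW; x++)
            {
                double srcX = x * scaleX;
                int x0 = Math.Min((int)Math.Floor(srcX), w - 1);
                int x1 = Math.Min(x0 + 1, w - 1);
                double wx = srcX - x0; // horizontal blend weight

                // Blend horizontally on both rows, then vertically.
                double top = input[y0, x0] * (1 - wx) + input[y0, x1] * wx;
                double bottom = input[y1, x0] * (1 - wx) + input[y1, x1] * wx;
                output[y, x] = top * (1 - wy) + bottom * wy;
            }
        }

        return output;
    }
}
```

For a multi-channel tensor the same loop would run once per channel, since interpolation acts only on the spatial dimensions.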
SaveWeights(string)
Saves model weights.
public abstract void SaveWeights(string path)
Parameters
path string