Interface ITableExtractor<T>

Namespace
AiDotNet.Document.Interfaces
Assembly
AiDotNet.dll

Interface for table detection and structure recognition models.

public interface ITableExtractor<T> : IDocumentModel<T>, IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>

Type Parameters

T

The numeric type used for calculations.

Remarks

Table extraction models detect tables in documents and extract their structure (rows, columns, cells) along with the content in each cell.

For Beginners: Documents often contain tables with important data. Table extraction helps computers understand where tables are, how they're structured, and what data they contain. This is useful for extracting financial data, product catalogs, or any tabular information.

Example usage:

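// Assumes tableExtractor is an ITableExtractor<T> and documentImage is a Tensor<T>.
var tables = tableExtractor.DetectTables(documentImage);
foreach (var table in tables)
{
    // Run structure recognition on the cropped image of each detected table.
    var structure = tableExtractor.RecognizeStructure(table.Image);
    Console.WriteLine($"Table with {structure.NumRows} rows and {structure.NumColumns} columns");
}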

Properties

SupportsBorderedTables

Gets whether this model supports bordered tables.

bool SupportsBorderedTables { get; }

Property Value

bool

SupportsBorderlessTables

Gets whether this model supports borderless tables.

bool SupportsBorderlessTables { get; }

Property Value

bool

SupportsMergedCells

Gets whether this model can detect merged cells (row/column spans).

bool SupportsMergedCells { get; }

Property Value

bool
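
The three capability flags above can be checked before running a model on a document. The following is a minimal sketch, not part of the library: the class TableCapabilityCheck, the method FindUnsupported, and the need* parameters are hypothetical names introduced for illustration. The helper takes plain bool values so that it compiles without any AiDotNet types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of AiDotNet): compares what a caller needs
// against the three capability flags read from an extractor.
public static class TableCapabilityCheck
{
    // Returns a description of each requirement the flags do not cover.
    // An empty list means the extractor reports support for everything requested.
    public static List<string> FindUnsupported(
        bool supportsBordered, bool supportsBorderless, bool supportsMergedCells,
        bool needBordered, bool needBorderless, bool needMergedCells)
    {
        var unsupported = new List<string>();
        if (needBordered && !supportsBordered) unsupported.Add("bordered tables");
        if (needBorderless && !supportsBorderless) unsupported.Add("borderless tables");
        if (needMergedCells && !supportsMergedCells) unsupported.Add("merged cells");
        return unsupported;
    }
}
```

With an extractor instance, the first three arguments would be tableExtractor.SupportsBorderedTables, tableExtractor.SupportsBorderlessTables, and tableExtractor.SupportsMergedCells. A non-empty result suggests choosing a different model before calling DetectTables.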

Methods

DetectTables(Tensor<T>)

Detects tables in a document image.

IEnumerable<TableRegion<T>> DetectTables(Tensor<T> documentImage)

Parameters

documentImage Tensor<T>

The document image tensor.

Returns

IEnumerable<TableRegion<T>>

Sequence of detected table regions, each with a bounding box.

ExportTables(Tensor<T>, TableExportFormat)

Exports detected tables to a specific format.

string ExportTables(Tensor<T> documentImage, TableExportFormat format)

Parameters

documentImage Tensor<T>

The document image tensor.

format TableExportFormat

Output format (CSV, JSON, HTML, Markdown, Excel).

Returns

string

Serialized table data in the specified format.

ExtractTableContent(Tensor<T>)

Extracts table content as structured data.

IEnumerable<List<List<string>>> ExtractTableContent(Tensor<T> documentImage)

Parameters

documentImage Tensor<T>

The document image tensor.

Returns

IEnumerable<List<List<string>>>

Tables as lists of rows, where each row is a list of cell contents.

Remarks

This is a convenience method that combines table detection, structure recognition, and OCR to return the final table content.
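
To illustrate the shape of the return value, the sketch below consumes one table (a list of rows, each a list of cell strings) and turns it into CSV text. It is an illustration only: TableContentFormatter, EscapeCsv, and ToCsv are hypothetical names, and the quoting rules follow common CSV conventions rather than anything this interface specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of AiDotNet) for the List<List<string>> shape
// that ExtractTableContent yields for each table.
public static class TableContentFormatter
{
    // Wraps a cell in quotes when it contains a comma, quote, or line break,
    // and doubles any embedded quotes.
    public static string EscapeCsv(string cell)
    {
        cell ??= string.Empty;
        bool needsQuotes = cell.IndexOfAny(new[] { ',', '"', '\n', '\r' }) >= 0;
        return needsQuotes ? "\"" + cell.Replace("\"", "\"\"") + "\"" : cell;
    }

    // Joins cells with commas and rows with newlines.
    public static string ToCsv(List<List<string>> table)
    {
        return string.Join("\n",
            table.Select(row => string.Join(",", row.Select(cell => EscapeCsv(cell)))));
    }
}
```

Assuming tableExtractor and documentImage are set up as in the interface-level example, each table from tableExtractor.ExtractTableContent(documentImage) could be passed to TableContentFormatter.ToCsv. For output produced by the library itself, ExportTables is the documented route.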

RecognizeStructure(Tensor<T>)

Recognizes the structure of a table (rows, columns, cells).

TableStructureResult<T> RecognizeStructure(Tensor<T> tableImage)

Parameters

tableImage Tensor<T>

Cropped table image tensor, such as the image of a region returned by DetectTables.

Returns

TableStructureResult<T>

Table structure with cell positions and spans.