Class TokenizationResult

Namespace
AiDotNet.Tokenization.Models
Assembly
AiDotNet.dll

Represents the result of tokenizing text, including token IDs, tokens, and attention masks.

public class TokenizationResult
Inheritance
object
TokenizationResult

Constructors

TokenizationResult()

Creates an empty tokenization result.

public TokenizationResult()

TokenizationResult(List<string>, List<int>)

Creates a tokenization result with the specified tokens and IDs.

public TokenizationResult(List<string> tokens, List<int> tokenIds)

Parameters

tokens List<string>

The tokens (subword strings).

tokenIds List<int>

The token IDs, one for each entry in tokens.

Exceptions

ArgumentException

Thrown when tokens and tokenIds have different counts.
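
The following is a minimal sketch of both constructor outcomes, using only the members documented on this page. The token strings and ID values are hypothetical placeholders; in practice a tokenizer would produce them.

```csharp
using System;
using System.Collections.Generic;
using AiDotNet.Tokenization.Models;

public static class ConstructionSketch
{
    public static void Main()
    {
        // Hypothetical tokens and IDs; a real tokenizer would supply these.
        var tokens = new List<string> { "hello", "world", "!" };
        var tokenIds = new List<int> { 7592, 2088, 999 };

        // Both lists have three entries, so construction should succeed.
        var result = new TokenizationResult(tokens, tokenIds);
        Console.WriteLine(result.Tokens.Count);

        try
        {
            // Two tokens but only one ID: documented to throw ArgumentException.
            _ = new TokenizationResult(
                new List<string> { "hello", "world" },
                new List<int> { 7592 });
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine($"Rejected mismatched input: {ex.Message}");
        }
    }
}
```

Catching ArgumentException at the call site is one option; validating the two counts before calling the constructor avoids the exception path altogether.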

Properties

AttentionMask

Gets or sets the attention mask (1 for real tokens, 0 for padding).

public List<int> AttentionMask { get; set; }

Property Value

List<int>
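
To illustrate the 1/0 convention, here is a sketch that right-pads a result to a fixed length and rebuilds the mask to match. PadTo is a hypothetical helper, not part of AiDotNet, and the pad token ID of 0 is an assumption; the sketch also assumes TokenIds holds no padding before the call.

```csharp
using System;
using System.Collections.Generic;
using AiDotNet.Tokenization.Models;

public static class PaddingSketch
{
    // Hypothetical helper (not an AiDotNet API): right-pads TokenIds to
    // targetLength and writes a matching AttentionMask.
    // Assumes result.TokenIds contains only real tokens on entry.
    public static void PadTo(TokenizationResult result, int targetLength, int padTokenId)
    {
        int realCount = result.TokenIds.Count;
        var paddedIds = new List<int>(result.TokenIds);
        var mask = new List<int>();

        // 1 marks each real token.
        for (int i = 0; i < realCount; i++)
        {
            mask.Add(1);
        }

        // 0 marks each padding position appended after the real tokens.
        for (int i = realCount; i < targetLength; i++)
        {
            paddedIds.Add(padTokenId);
            mask.Add(0);
        }

        result.TokenIds = paddedIds;
        result.AttentionMask = mask;
    }

    public static void Main()
    {
        var result = new TokenizationResult(
            new List<string> { "hello", "world" },
            new List<int> { 7592, 2088 }); // placeholder IDs

        PadTo(result, targetLength: 4, padTokenId: 0); // pad ID 0 is an assumption

        Console.WriteLine(string.Join(",", result.TokenIds));
        Console.WriteLine(string.Join(",", result.AttentionMask));
    }
}
```

This page does not say whether Tokens should also receive padding entries, so the sketch leaves Tokens untouched. If targetLength is not greater than the current count, the helper adds no padding and the mask is all ones.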

Length

Gets the number of tokens (excluding padding).

public int Length { get; }

Property Value

int

Metadata

Gets or sets additional metadata.

public Dictionary<string, object> Metadata { get; set; }

Property Value

Dictionary<string, object>

Offsets

Gets or sets the character-level offsets (start and end positions in the source text) for each token.

public List<(int Start, int End)> Offsets { get; set; }

Property Value

List<(int Start, int End)>
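
As a sketch of how offsets can map tokens back to the source text, the example below sets Offsets by hand and slices the original string. It assumes that End is exclusive (a half-open range), which this page does not confirm, and the token IDs are placeholders.

```csharp
using System;
using System.Collections.Generic;
using AiDotNet.Tokenization.Models;

public static class OffsetsSketch
{
    public static void Main()
    {
        const string text = "hello world";

        var result = new TokenizationResult(
            new List<string> { "hello", "world" },
            new List<int> { 1, 2 }); // placeholder IDs

        // Set manually for illustration. Assumption: End is exclusive,
        // so (0, 5) covers "hello" and (6, 11) covers "world".
        result.Offsets = new List<(int Start, int End)> { (0, 5), (6, 11) };

        for (int i = 0; i < result.Offsets.Count; i++)
        {
            var (start, end) = result.Offsets[i];

            // Recover the span of source text that produced this token.
            string span = text.Substring(start, end - start);
            Console.WriteLine($"{result.Tokens[i]} -> \"{span}\"");
        }
    }
}
```

If the library turns out to use inclusive end positions, the Substring length would need to be end - start + 1 instead.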

PositionIds

Gets or sets the position IDs for positional embeddings.

public List<int> PositionIds { get; set; }

Property Value

List<int>

TokenIds

Gets or sets the token IDs.

public List<int> TokenIds { get; set; }

Property Value

List<int>

TokenTypeIds

Gets or sets the token type IDs (for models that support multiple segments).

public List<int> TokenTypeIds { get; set; }

Property Value

List<int>

Tokens

Gets or sets the actual tokens (subword strings).

public List<string> Tokens { get; set; }

Property Value

List<string>

TotalLength

Gets the total number of token IDs (including padding).

public int TotalLength { get; }

Property Value

int