Interface IVocabulary
- Namespace: AiDotNet.Tokenization.Interfaces
- Assembly: AiDotNet.dll
Interface for vocabulary management: a two-way mapping between tokens and integer token IDs.
public interface IVocabulary
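To make the contract concrete, here is a minimal in-memory sketch of a class that satisfies the members documented on this page. `SimpleVocabulary`, the `"[UNK]"` default, sequential ID assignment, the duplicate handling in `AddToken`, and the behavior of `Clear()` are all assumptions made for illustration; they are not taken from AiDotNet's own implementation. The interface is re-declared locally only so the snippet compiles on its own.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Local copy of the documented members so the snippet compiles standalone.
// In a real project, reference AiDotNet and remove this declaration.
public interface IVocabulary
{
    IReadOnlyDictionary<int, string> IdToToken { get; }
    int Size { get; }
    IReadOnlyDictionary<string, int> TokenToId { get; }
    int AddToken(string token);
    void AddTokens(IEnumerable<string> tokens);
    void Clear();
    bool ContainsId(int id);
    bool ContainsToken(string token);
    IEnumerable<string> GetAllTokens();
    string? GetToken(int id);
    int GetTokenId(string token);
}

// Hypothetical implementation for illustration; not the class shipped in AiDotNet.
public sealed class SimpleVocabulary : IVocabulary
{
    private readonly Dictionary<string, int> _tokenToId = new Dictionary<string, int>();
    private readonly Dictionary<int, string> _idToToken = new Dictionary<int, string>();
    private readonly string _unknownToken;

    public SimpleVocabulary(string unknownToken = "[UNK]")
    {
        _unknownToken = unknownToken;
        AddToken(unknownToken); // assumption: the unknown token is always registered (ID 0)
    }

    public IReadOnlyDictionary<int, string> IdToToken => _idToToken;
    public IReadOnlyDictionary<string, int> TokenToId => _tokenToId;
    public int Size => _tokenToId.Count;

    // Assumption: re-adding a known token returns its existing ID.
    public int AddToken(string token)
    {
        if (_tokenToId.TryGetValue(token, out var existing)) return existing;
        int id = _tokenToId.Count; // next sequential ID
        _tokenToId[token] = id;
        _idToToken[id] = token;
        return id;
    }

    public void AddTokens(IEnumerable<string> tokens)
    {
        foreach (var token in tokens) AddToken(token);
    }

    // Assumption: clearing keeps the unknown token so GetTokenId can still fall back to it.
    public void Clear()
    {
        _tokenToId.Clear();
        _idToToken.Clear();
        AddToken(_unknownToken);
    }

    public bool ContainsId(int id) => _idToToken.ContainsKey(id);
    public bool ContainsToken(string token) => _tokenToId.ContainsKey(token);
    public IEnumerable<string> GetAllTokens() => _tokenToId.Keys;

    // Returns null for an ID that was never assigned.
    public string? GetToken(int id) => _idToToken.TryGetValue(id, out var token) ? token : null;

    // Falls back to the unknown token's ID for an unregistered token.
    public int GetTokenId(string token) =>
        _tokenToId.TryGetValue(token, out var id) ? id : _tokenToId[_unknownToken];
}

public static class Demo
{
    public static void Main()
    {
        IVocabulary vocab = new SimpleVocabulary();
        vocab.AddToken("hello");
        vocab.AddTokens(new[] { "world", "hello" });      // "hello" is already present
        Console.WriteLine(vocab.Size);                     // prints 3
        Console.WriteLine(vocab.GetTokenId("missing"));    // prints 0 (the unknown token ID)
        Console.WriteLine(vocab.GetToken(99) ?? "(null)"); // prints (null)
    }
}
```

Keeping both dictionaries in sync inside `AddToken` is what lets `IdToToken` and `TokenToId` be exposed as read-only views without copying.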
Properties
IdToToken
Gets the ID-to-token mapping.
IReadOnlyDictionary<int, string> IdToToken { get; }
Property Value
- IReadOnlyDictionary<int, string>
Size
Gets the vocabulary size.
int Size { get; }
Property Value
- int
TokenToId
Gets the token-to-ID mapping.
IReadOnlyDictionary<string, int> TokenToId { get; }
Property Value
- IReadOnlyDictionary<string, int>
Methods
AddToken(string)
Adds a token to the vocabulary.
int AddToken(string token)
Parameters
- token string
The token to add.
Returns
- int
The token ID.
AddTokens(IEnumerable<string>)
Adds multiple tokens to the vocabulary.
void AddTokens(IEnumerable<string> tokens)
Parameters
- tokens IEnumerable<string>
The tokens to add.
Clear()
Clears the vocabulary.
void Clear()
ContainsId(int)
Checks if a token ID exists in the vocabulary.
bool ContainsId(int id)
Parameters
- id int
The token ID to check.
Returns
- bool
True if the token ID exists, false otherwise.
ContainsToken(string)
Checks if a token exists in the vocabulary.
bool ContainsToken(string token)
Parameters
- token string
The token to check.
Returns
- bool
True if the token exists, false otherwise.
GetAllTokens()
Gets all tokens in the vocabulary.
IEnumerable<string> GetAllTokens()
Returns
- IEnumerable<string>
All tokens.
GetToken(int)
Gets the token for a given token ID.
string? GetToken(int id)
Parameters
- id int
The token ID.
Returns
- string?
The token, or null if not found.
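Because `GetToken(int)` can return null, callers that decode a sequence of IDs need a policy for missing entries. The sketch below substitutes a placeholder. `Decode` and the `"<missing>"` placeholder are hypothetical names, and the lookup is passed as a delegate so the snippet does not depend on any concrete vocabulary class; with a real instance you would pass `vocab.GetToken`.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public static class DecodeSketch
{
    // Hypothetical helper: maps IDs back to tokens, using a placeholder
    // whenever the lookup returns null.
    public static List<string> Decode(
        Func<int, string?> getToken, IEnumerable<int> ids, string placeholder = "<missing>")
    {
        return ids.Select(id => getToken(id) ?? placeholder).ToList();
    }

    public static void Main()
    {
        // Stand-in with the same contract as GetToken(int): null when the ID is absent.
        var idToToken = new Dictionary<int, string> { [0] = "[UNK]", [1] = "hello" };
        Func<int, string?> getToken = id => idToToken.TryGetValue(id, out var t) ? t : null;

        Console.WriteLine(string.Join(" ", Decode(getToken, new[] { 1, 7, 0 })));
        // prints: hello <missing> [UNK]
    }
}
```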
GetTokenId(string)
Gets the token ID for a given token.
int GetTokenId(string token)
Parameters
- token string
The token.
Returns
- int
The token ID, or the unknown token ID if not found.
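Since `GetTokenId(string)` returns the unknown token ID for unregistered tokens, the return value alone cannot tell a genuinely unknown token apart from the unknown token itself. A caller that needs that distinction can check `ContainsToken(string)` first, as in this sketch. `Encode` is a hypothetical helper, and the lookups are passed as delegates (in practice `vocab.ContainsToken` and `vocab.GetTokenId`); the dictionary and the unknown ID of 0 are stand-ins for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class EncodeSketch
{
    // Hypothetical helper: encodes tokens and counts how many were out of vocabulary.
    public static (List<int> Ids, int UnknownCount) Encode(
        Func<string, bool> containsToken,
        Func<string, int> getTokenId,
        IEnumerable<string> tokens)
    {
        var ids = new List<int>();
        int unknownCount = 0;
        foreach (var token in tokens)
        {
            if (!containsToken(token)) unknownCount++; // would map to the unknown ID
            ids.Add(getTokenId(token));
        }
        return (ids, unknownCount);
    }

    public static void Main()
    {
        // Stand-ins with the documented contracts; unknown ID 0 is an assumption.
        var tokenToId = new Dictionary<string, int> { ["[UNK]"] = 0, ["hello"] = 1 };
        const int unknownId = 0;

        var (ids, unknownCount) = Encode(
            tokenToId.ContainsKey,
            t => tokenToId.TryGetValue(t, out var id) ? id : unknownId,
            new[] { "hello", "[UNK]", "zzz" });

        Console.WriteLine(string.Join(",", ids)); // prints: 1,0,0
        Console.WriteLine(unknownCount);          // prints: 1
    }
}
```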