Interface IAutoregressiveMultimodalModel<T>

Namespace
AiDotNet.Interfaces
Assembly
AiDotNet.dll

Defines the contract for autoregressive multimodal generation models that can generate tokens from any modality in an interleaved fashion.

public interface IAutoregressiveMultimodalModel<T> : IUnifiedMultimodalModel<T>

Type Parameters

T

The numeric type used for calculations.

Inherited Members

Remarks

This interface represents models such as CM3Leon and Chameleon, which use a single vocabulary shared across all modalities and generate content token by token regardless of modality.
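The token-by-token generation described above can be sketched as a plain loop. This is a minimal illustration, not the library's implementation: `AutoregressiveLoopSketch`, the `nextToken` delegate (standing in for a model call such as GenerateNextToken), `endTokenId`, and `maxNewTokens` are all hypothetical names. The interface itself does not define an end-of-sequence token.

```csharp
using System;
using System.Collections.Generic;

public static class AutoregressiveLoopSketch
{
    // nextToken is a hypothetical stand-in for a model call that maps the
    // tokens produced so far to the next token ID.
    public static List<int> Generate(
        IEnumerable<int> prompt,
        Func<IReadOnlyList<int>, int> nextToken,
        int endTokenId,      // assumed stop token; not part of the interface
        int maxNewTokens)
    {
        var sequence = new List<int>(prompt);
        for (int step = 0; step < maxNewTokens; step++)
        {
            // Each new token is predicted from everything generated so far.
            int token = nextToken(sequence);
            sequence.Add(token);
            if (token == endTokenId) break;
        }
        return sequence;
    }
}
```

Because every modality shares one vocabulary, the same loop would emit text, image, or audio tokens without special cases. Only the decoding step afterwards needs to know which modality each ID belongs to.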

Properties

ModalityTokenCounts

Gets the number of tokens reserved for each modality.

IReadOnlyDictionary<ModalityType, int> ModalityTokenCounts { get; }

Property Value

IReadOnlyDictionary<ModalityType, int>
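One way per-modality counts could map onto a unified vocabulary is as contiguous ID blocks. The sketch below assumes that layout purely for illustration; the interface does not state how IDs are arranged or that the counts sum to VocabularySize. `DemoModality` and `VocabularyLayoutSketch` are hypothetical names, and `DemoModality` is a stand-in for the library's ModalityType, whose members are not listed on this page.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the library's ModalityType.
public enum DemoModality { Text, Image, Audio }

public static class VocabularyLayoutSketch
{
    // Assigns each modality a contiguous [Start, End) block of token IDs,
    // in the order the counts are supplied. Contiguity is an assumption.
    public static Dictionary<DemoModality, (int Start, int End)> BuildRanges(
        IEnumerable<KeyValuePair<DemoModality, int>> orderedCounts)
    {
        var ranges = new Dictionary<DemoModality, (int Start, int End)>();
        int next = 0;
        foreach (var pair in orderedCounts)
        {
            if (pair.Value < 0)
                throw new ArgumentOutOfRangeException(nameof(orderedCounts));
            ranges[pair.Key] = (next, next + pair.Value);
            next += pair.Value;
        }
        return ranges;
    }

    // Finds which modality owns a token ID under that layout.
    public static DemoModality ModalityOf(
        IReadOnlyDictionary<DemoModality, (int Start, int End)> ranges, int tokenId)
    {
        foreach (var pair in ranges)
        {
            if (tokenId >= pair.Value.Start && tokenId < pair.Value.End)
                return pair.Key;
        }
        throw new ArgumentOutOfRangeException(nameof(tokenId));
    }
}
```

Under this layout the end of the last block equals the total vocabulary size, which is the relationship one would expect between ModalityTokenCounts and VocabularySize if no IDs are shared or reserved.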

VocabularySize

Gets the size of the unified vocabulary, counting the tokens of every modality.

int VocabularySize { get; }

Property Value

int

Methods

ComputeLoss(IEnumerable<MultimodalInput<T>>, IEnumerable<MultimodalOutput<T>>)

Computes the loss for next-token prediction.

T ComputeLoss(IEnumerable<MultimodalInput<T>> inputs, IEnumerable<MultimodalOutput<T>> targets)

Parameters

inputs IEnumerable<MultimodalInput<T>>

Input sequence.

targets IEnumerable<MultimodalOutput<T>>

Target outputs.

Returns

T

Cross-entropy loss.
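For reference, next-token cross-entropy is the mean negative log-likelihood of the target token at each position. The sketch below computes it from raw scores using plain `double` arrays; `NextTokenLossSketch` and its methods are hypothetical helpers, and how an implementation of this interface maps MultimodalInput<T> and MultimodalOutput<T> to scores and target IDs is not shown here.

```csharp
using System;
using System.Linq;

public static class NextTokenLossSketch
{
    // Log-softmax, subtracting the maximum first for numerical stability.
    public static double[] LogSoftmax(double[] logits)
    {
        double max = logits.Max();
        double logSum = Math.Log(logits.Sum(v => Math.Exp(v - max)));
        return logits.Select(v => v - max - logSum).ToArray();
    }

    // Mean negative log-likelihood of the target token over all positions.
    public static double CrossEntropy(double[][] logitsPerPosition, int[] targetIds)
    {
        if (logitsPerPosition.Length != targetIds.Length || targetIds.Length == 0)
            throw new ArgumentException("Need exactly one target per position.");

        double total = 0.0;
        for (int i = 0; i < targetIds.Length; i++)
        {
            total -= LogSoftmax(logitsPerPosition[i])[targetIds[i]];
        }
        return total / targetIds.Length;
    }
}
```

With uniform scores over a vocabulary of size V, this evaluates to ln V, which is a handy sanity check for an untrained model.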

Detokenize(IEnumerable<int>)

Detokenizes token IDs back to multimodal outputs.

IEnumerable<MultimodalOutput<T>> Detokenize(IEnumerable<int> tokenIds)

Parameters

tokenIds IEnumerable<int>

Token IDs to decode.

Returns

IEnumerable<MultimodalOutput<T>>

Decoded multimodal outputs.

GenerateNextToken(IEnumerable<MultimodalInput<T>>, double)

Generates the next token given the context.

(int TokenId, ModalityType Modality) GenerateNextToken(IEnumerable<MultimodalInput<T>> context, double temperature = 1)

Parameters

context IEnumerable<MultimodalInput<T>>

Previous tokens/inputs.

temperature double

Sampling temperature. Defaults to 1.

Returns

(int TokenId, ModalityType Modality)

Next token ID and its modality.
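Temperature sampling is conventionally done by dividing the scores by the temperature, applying softmax, and drawing from the resulting distribution. The sketch below shows that convention with plain `double` arrays. It is an assumption that implementations of this interface sample this way; `TemperatureSamplingSketch` and its methods are hypothetical.

```csharp
using System;
using System.Linq;

public static class TemperatureSamplingSketch
{
    // Softmax of logits / temperature. Lower temperatures sharpen the
    // distribution; higher temperatures flatten it.
    public static double[] Probabilities(double[] logits, double temperature)
    {
        if (temperature <= 0)
            throw new ArgumentOutOfRangeException(nameof(temperature));

        double[] scaled = logits.Select(v => v / temperature).ToArray();
        double max = scaled.Max();
        double[] exps = scaled.Select(v => Math.Exp(v - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }

    // Draws one token ID by walking the cumulative distribution.
    public static int Sample(double[] logits, double temperature, Random rng)
    {
        double[] probs = Probabilities(logits, temperature);
        double draw = rng.NextDouble();
        double cumulative = 0.0;
        for (int i = 0; i < probs.Length; i++)
        {
            cumulative += probs[i];
            if (draw < cumulative) return i;
        }
        // Rounding can leave the cumulative sum slightly below 1.
        return probs.Length - 1;
    }
}
```

A temperature of 1 leaves the distribution unchanged, which matches the default in the signature above.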

GetNextTokenLogits(IEnumerable<MultimodalInput<T>>)

Gets the log probabilities of all tokens for the next position.

Vector<T> GetNextTokenLogits(IEnumerable<MultimodalInput<T>> context)

Parameters

context IEnumerable<MultimodalInput<T>>

Context sequence.

Returns

Vector<T>

Log probabilities for all tokens.
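If the returned values are log probabilities as described, exponentiating them recovers probabilities, and the largest entry gives the greedy choice. The sketch below uses a plain `double[]` in place of Vector<T>, whose API is not shown on this page; `LogProbabilitySketch` and its methods are hypothetical helpers.

```csharp
using System;
using System.Linq;

public static class LogProbabilitySketch
{
    // Converts log probabilities back to probabilities.
    public static double[] ToProbabilities(double[] logProbs)
    {
        return logProbs.Select(v => Math.Exp(v)).ToArray();
    }

    // Greedy decoding: index of the largest entry (first one wins on ties).
    public static int ArgMax(double[] values)
    {
        if (values.Length == 0)
            throw new ArgumentException("Expected at least one value.", nameof(values));

        int best = 0;
        for (int i = 1; i < values.Length; i++)
        {
            if (values[i] > values[best]) best = i;
        }
        return best;
    }
}
```

Since the logarithm is monotonic, the greedy choice is the same whether it is taken over raw scores, log probabilities, or probabilities.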

Tokenize(MultimodalInput<T>)

Tokenizes input into unified vocabulary tokens.

IEnumerable<int> Tokenize(MultimodalInput<T> input)

Parameters

input MultimodalInput<T>

Input to tokenize.

Returns

IEnumerable<int>

Token IDs in the unified vocabulary.