Enum MidiTokenizer.TokenizationStrategy

Namespace
AiDotNet.Tokenization.Specialized
Assembly
AiDotNet.dll

MIDI tokenization strategies that control how musical notes are converted to tokens.

public enum MidiTokenizer.TokenizationStrategy

Fields

CPWord = 1

Compound Word (CPWord): Combines note attributes into single compound tokens. More compact vocabulary, better for sequence models with limited context windows.

REMI = 0

REvamped MIDI-derived events (REMI): Emits Bar, Position, Pitch, Velocity, and Duration as separate tokens. The most expressive of the three strategies; it keeps the timing and dynamics (velocity) of each note.

SimpleNote = 2

Simple Note: Basic pitch-duration pairs without velocity or position tracking. Simplest representation, ideal for melody extraction and basic music generation.
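
The numeric values listed above matter in a few practical situations: default initialization, storing the strategy as an integer, and reading it from configuration. The following is a minimal sketch using only standard .NET enum APIs. The `TokenizationStrategy` declared here is a local stand-in that mirrors the documented fields so the example compiles without referencing AiDotNet.dll; in real code the nested type `MidiTokenizer.TokenizationStrategy` would be used instead.

```csharp
using System;

// Local stand-in (an assumption for illustration) mirroring the documented fields.
public enum TokenizationStrategy
{
    REMI = 0,
    CPWord = 1,
    SimpleNote = 2
}

public static class Program
{
    public static void Main()
    {
        // REMI has the value 0, so an unassigned field or `default` yields REMI.
        TokenizationStrategy unset = default;
        Console.WriteLine(unset); // prints "REMI"

        // Casting exposes the underlying integer, e.g. for serialization.
        Console.WriteLine((int)TokenizationStrategy.SimpleNote); // prints "2"

        // Parse a name read from configuration; ignoreCase accepts "cpword".
        if (Enum.TryParse("cpword", ignoreCase: true, out TokenizationStrategy parsed))
        {
            Console.WriteLine(parsed); // prints "CPWord"
        }

        // Integers without a matching field can be rejected before casting.
        Console.WriteLine(Enum.IsDefined(typeof(TokenizationStrategy), 3)); // prints "False"
    }
}
```

Because REMI is the zero value, code that never sets a strategy explicitly ends up with REMI.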

Remarks

For Beginners: Choose your strategy based on your use case:

- Use REMI for tasks requiring full musical expression (composition, arrangement)
- Use CPWord for models that benefit from smaller vocabularies (faster training)
- Use SimpleNote for melody-focused tasks where dynamics don't matter
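
The guidance above can be written down as a small decision helper. This is only a sketch of one way to read the remarks: `StrategyChooser`, `Choose`, and both parameters are hypothetical names that are not part of AiDotNet, and the `TokenizationStrategy` enum is a local stand-in mirroring the documented fields so the example compiles on its own.

```csharp
using System;

// Local stand-in (an assumption for illustration) mirroring the documented fields.
public enum TokenizationStrategy
{
    REMI = 0,
    CPWord = 1,
    SimpleNote = 2
}

// Hypothetical helper, not part of the library.
public static class StrategyChooser
{
    public static TokenizationStrategy Choose(bool melodyOnly, bool preferSmallVocabulary)
    {
        // Melody-focused tasks where dynamics don't matter: SimpleNote.
        if (melodyOnly)
        {
            return TokenizationStrategy.SimpleNote;
        }

        // Models that benefit from a smaller vocabulary: CPWord.
        // Otherwise fall back to REMI for full musical expression.
        return preferSmallVocabulary
            ? TokenizationStrategy.CPWord
            : TokenizationStrategy.REMI;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(StrategyChooser.Choose(melodyOnly: true, preferSmallVocabulary: false));  // prints "SimpleNote"
        Console.WriteLine(StrategyChooser.Choose(melodyOnly: false, preferSmallVocabulary: true));  // prints "CPWord"
        Console.WriteLine(StrategyChooser.Choose(melodyOnly: false, preferSmallVocabulary: false)); // prints "REMI"
    }
}
```

The melody-only check comes first because SimpleNote discards velocity and position entirely, so it only fits when that information is not needed; the vocabulary preference then separates CPWord from REMI.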