Enum MidiTokenizer.TokenizationStrategy
- Namespace: AiDotNet.Tokenization.Specialized
- Assembly: AiDotNet.dll
MIDI tokenization strategies that control how musical notes are converted to tokens.
public enum MidiTokenizer.TokenizationStrategy
Fields
CPWord = 1
Compound Word (CPWord): Combines note attributes into single compound tokens. More compact vocabulary, better for sequence models with limited context windows.
REMI = 0
Revamped MIDI (REMI): Position, Bar, Pitch, Velocity, Duration as separate tokens. Most expressive strategy, preserves all timing and dynamic information.
SimpleNote = 2
Simple Note: Basic pitch-duration pairs without velocity or position tracking. Simplest representation, ideal for melody extraction and basic music generation.
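Because REMI has the value 0, it is also the value an uninitialized field of this enum type holds. The sketch below shows one way to read a strategy from a configuration string. It declares a local stand-in enum that mirrors the documented members so the snippet compiles on its own; `StrategyParsing` and `ParseOrDefault` are hypothetical names, not part of AiDotNet. In real code you would reference `MidiTokenizer.TokenizationStrategy` directly.

```csharp
using System;

// Local stand-in mirroring the documented members and values (assumption:
// real code would use MidiTokenizer.TokenizationStrategy from AiDotNet).
public enum TokenizationStrategy
{
    REMI = 0,
    CPWord = 1,
    SimpleNote = 2
}

// Hypothetical helper, not part of the library.
public static class StrategyParsing
{
    // Returns the parsed strategy, or REMI (the zero value) when the text
    // is not a defined member name or numeric value.
    public static TokenizationStrategy ParseOrDefault(string text)
    {
        // Enum.TryParse also accepts numeric strings such as "7",
        // so Enum.IsDefined filters out values with no named member.
        if (Enum.TryParse(text, ignoreCase: true, out TokenizationStrategy parsed)
            && Enum.IsDefined(typeof(TokenizationStrategy), parsed))
        {
            return parsed;
        }

        return default; // REMI, because REMI = 0
    }
}
```

Falling back to REMI here is a design choice for the example only; a caller that prefers to fail loudly on bad configuration could throw instead.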
Remarks
For Beginners: Choose your strategy based on your use case:
- Use REMI for tasks requiring full musical expression (composition, arrangement)
- Use CPWord for models that benefit from smaller vocabularies (faster training)
- Use SimpleNote for melody-focused tasks where dynamics don't matter
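The guidance above can be sketched as a small selection function. Everything here other than the three strategy names is an assumption made for illustration: the `MusicTask` enum and the `StrategyChooser` class are hypothetical, and the local `TokenizationStrategy` enum is a stand-in that mirrors the documented values so the snippet is self-contained.

```csharp
using System;

// Local stand-in mirroring the documented enum (assumption: real code
// would use MidiTokenizer.TokenizationStrategy from AiDotNet).
public enum TokenizationStrategy { REMI = 0, CPWord = 1, SimpleNote = 2 }

// Hypothetical task categories invented for this example.
public enum MusicTask
{
    Composition,
    Arrangement,
    CompactModelTraining,
    MelodyExtraction
}

// Hypothetical helper, not part of the library.
public static class StrategyChooser
{
    // Maps a task to the strategy suggested in the remarks.
    public static TokenizationStrategy Choose(MusicTask task) => task switch
    {
        // Full musical expression: keep timing and dynamics.
        MusicTask.Composition or MusicTask.Arrangement => TokenizationStrategy.REMI,
        // Compound tokens for models that benefit from a compact representation.
        MusicTask.CompactModelTraining => TokenizationStrategy.CPWord,
        // Pitch-duration pairs only; dynamics are discarded.
        MusicTask.MelodyExtraction => TokenizationStrategy.SimpleNote,
        _ => throw new ArgumentOutOfRangeException(nameof(task), task, "Unknown task")
    };
}
```

A call such as `StrategyChooser.Choose(MusicTask.MelodyExtraction)` yields `SimpleNote`; how the result is then passed to the tokenizer depends on the `MidiTokenizer` API, which this page does not describe.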