Class MusicGenOptions
Configuration options for MusicGen text-to-music generation.
public class MusicGenOptions
- Inheritance
- object
- MusicGenOptions
Remarks
MusicGen is Meta's text-to-music generation model. It creates music from text descriptions using a single-stage transformer language model that operates over EnCodec audio codes.
For Beginners: MusicGen turns a written description into actual music.
Example prompts:
- "Upbeat electronic dance music with heavy bass"
- "Calm acoustic guitar melody with soft drums"
- "Epic orchestral piece with dramatic strings"
- "Lo-fi hip hop beats for studying"
Tips for good prompts:
- Be specific about genre, instruments, and mood
- Include tempo hints (fast, slow, moderate)
- Mention energy level (energetic, calm, building)
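As a rough illustration of how these options fit together, the fragment below sets a handful of the properties documented on this page with an object initializer. It is a sketch only: the namespace import is omitted because this page does not show it, the file paths are hypothetical placeholders, and `MusicGenModelSize.Medium` is inferred from the ModelSize remarks.

```csharp
// Sketch only: add the using directive for the namespace that contains
// MusicGenOptions (not shown on this page). All paths are hypothetical.
var options = new MusicGenOptions
{
    ModelSize = MusicGenModelSize.Medium,   // documented default
    DurationSeconds = 10.0,                 // must not exceed MaxDurationSeconds
    GuidanceScale = 3.0,                    // documented default
    Temperature = 1.0,                      // documented default
    TopK = 250,                             // documented default
    TopP = 0.0,                             // 0.0 keeps top-k sampling only
    Seed = 42,                              // fixed seed for repeatable output
    TextEncoderPath = "models/text_encoder.onnx",      // hypothetical path
    LanguageModelPath = "models/language_model.onnx",  // hypothetical path
    EnCodecDecoderPath = "models/encodec_decoder.onnx" // hypothetical path
};
```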
Properties
CodebookSize
Gets or sets the codebook vocabulary size.
public int CodebookSize { get; set; }
Property Value
- int
Remarks
Must match the EnCodec model configuration. The default of 2048 is standard for MusicGen.
DropoutRate
Gets or sets the dropout rate for training.
public double DropoutRate { get; set; }
Property Value
- double
DurationSeconds
Gets or sets the default duration of generated music in seconds.
public double DurationSeconds { get; set; }
Property Value
- double
EnCodecDecoderPath
Gets or sets the path to the EnCodec decoder ONNX model.
public string? EnCodecDecoderPath { get; set; }
Property Value
- string?
GuidanceScale
Gets or sets the classifier-free guidance scale.
public double GuidanceScale { get; set; }
Property Value
- double
Remarks
Controls how closely the model follows the text prompt:
- Low (1.0-2.0): More variation, less prompt adherence
- Default (3.0): Good balance
- High (4.0-7.0): Stricter prompt following, less creativity
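The effect of the guidance scale can be sketched with the usual classifier-free guidance blend, where the model is evaluated once with the prompt and once without it. The class and method names below are hypothetical, and this page does not confirm that the library applies guidance in exactly this form.

```csharp
using System;

// Hypothetical helper, not part of the documented API.
public static class GuidanceSketch
{
    // Standard classifier-free guidance blend:
    // guided = uncond + scale * (cond - uncond).
    // A scale of 1.0 returns the conditional logits unchanged.
    public static double[] ApplyGuidance(double[] cond, double[] uncond, double scale)
    {
        if (cond.Length != uncond.Length)
            throw new ArgumentException("Logit arrays must have the same length.");

        var guided = new double[cond.Length];
        for (int i = 0; i < cond.Length; i++)
            guided[i] = uncond[i] + scale * (cond[i] - uncond[i]);
        return guided;
    }
}
```

Larger scales push the logits further in the direction the prompt favors, which is why high values follow the prompt more strictly at the cost of variety.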
LanguageModelPath
Gets or sets the path to the language model ONNX model.
public string? LanguageModelPath { get; set; }
Property Value
- string?
MaxDurationSeconds
Gets or sets the maximum duration in seconds.
public double MaxDurationSeconds { get; set; }
Property Value
- double
Remarks
MusicGen can generate up to 30 seconds of audio. Longer durations require more memory and compute time.
MaxTextLength
Gets or sets the maximum text sequence length.
public int MaxTextLength { get; set; }
Property Value
- int
ModelSize
Gets or sets the model size variant.
public MusicGenModelSize ModelSize { get; set; }
Property Value
- MusicGenModelSize
Remarks
Different sizes trade off quality against speed. The default is Medium, which balances both well.
NumCodebooks
Gets or sets the number of EnCodec codebooks to use.
public int NumCodebooks { get; set; }
Property Value
- int
Remarks
More codebooks give higher quality but slower generation. The default of 4 is used by standard MusicGen models.
OnnxOptions
Gets or sets the ONNX execution options.
public OnnxModelOptions OnnxOptions { get; set; }
Property Value
- OnnxModelOptions
SampleRate
Gets or sets the output sample rate in Hz.
public int SampleRate { get; set; }
Property Value
- int
Remarks
MusicGen uses 32 kHz by default, matching the EnCodec codec. This produces high-quality audio suitable for music.
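To make the relationship between duration and sample rate concrete, the sketch below computes how many output samples per channel a request would produce. The helper is hypothetical, and clamping the requested duration to the maximum is an assumption about how DurationSeconds and MaxDurationSeconds interact, not documented behavior.

```csharp
using System;

// Hypothetical helper, not part of the documented API.
public static class DurationSketch
{
    // Samples per channel = duration (s) * sample rate (Hz).
    // Assumption: durations above the maximum are clamped rather than rejected.
    public static int SampleCount(double durationSeconds, double maxDurationSeconds, int sampleRate)
    {
        double clamped = Math.Min(Math.Max(durationSeconds, 0.0), maxDurationSeconds);
        return (int)Math.Round(clamped * sampleRate);
    }
}
```

For example, 10 seconds at 32000 Hz is 320000 samples per channel, and the 30-second limit corresponds to 960000.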
Seed
Gets or sets the random seed for reproducibility.
public int? Seed { get; set; }
Property Value
- int?
Remarks
Set to a specific value to generate the same music each time. Leave as null for random generation.
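The nullable seed maps naturally onto .NET's `System.Random`, which yields the same sequence for the same seed. The helper below is a hypothetical sketch of that pattern; this page does not show how the library itself consumes Seed.

```csharp
using System;

// Hypothetical helper, not part of the documented API.
public static class SeedSketch
{
    // A concrete seed gives a repeatable generator;
    // null falls back to a non-deterministic one.
    public static Random CreateRandom(int? seed) =>
        seed.HasValue ? new Random(seed.Value) : new Random();
}
```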
Stereo
Gets or sets whether to generate stereo audio.
public bool Stereo { get; set; }
Property Value
- bool
Remarks
For best results, use the Stereo model variant. With a non-stereo model, the mono output is duplicated to both stereo channels.
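The mono fallback described above amounts to copying each sample into both channels. The sketch below assumes an interleaved left/right layout and a `float[]` buffer; both are illustrative assumptions, since this page does not specify the output buffer format.

```csharp
// Hypothetical helper, not part of the documented API.
public static class StereoSketch
{
    // Copies each mono sample into the left and right slots of an
    // interleaved buffer: [L0, R0, L1, R1, ...].
    public static float[] DuplicateMonoToStereo(float[] mono)
    {
        var stereo = new float[mono.Length * 2];
        for (int i = 0; i < mono.Length; i++)
        {
            stereo[2 * i] = mono[i];      // left channel
            stereo[2 * i + 1] = mono[i];  // right channel
        }
        return stereo;
    }
}
```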
Temperature
Gets or sets the sampling temperature.
public double Temperature { get; set; }
Property Value
- double
Remarks
Controls randomness in generation:
- Lower (0.5-0.8): More predictable, stable output
- Default (1.0): Balanced creativity
- Higher (1.2-2.0): More creative but potentially less coherent
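The ranges above follow from how temperature rescales logits before they are turned into probabilities. The sketch below shows a temperature-scaled softmax; the class and method names are hypothetical and are not taken from the library.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the documented API.
public static class TemperatureSketch
{
    // Divides logits by the temperature, then applies a softmax.
    // The maximum logit is subtracted first for numerical stability.
    public static double[] SoftmaxWithTemperature(double[] logits, double temperature)
    {
        if (temperature <= 0)
            throw new ArgumentOutOfRangeException(nameof(temperature));

        double max = logits.Max();
        double[] exps = logits.Select(l => Math.Exp((l - max) / temperature)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }
}
```

Temperatures below 1.0 concentrate probability on the most likely token, while temperatures above 1.0 flatten the distribution and let less likely tokens through.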
TextEncoderPath
Gets or sets the path to the text encoder ONNX model.
public string? TextEncoderPath { get; set; }
Property Value
- string?
TopK
Gets or sets the top-k sampling parameter.
public int TopK { get; set; }
Property Value
- int
Remarks
Limits sampling to the K most likely tokens. The default of 250 works well for diverse music generation.
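Top-k filtering can be sketched as masking every logit below the k-th largest one, so that a later softmax assigns those tokens zero probability. The names below are hypothetical, and the library's own implementation may differ in detail.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the documented API.
public static class TopKSketch
{
    // Keeps the k largest logits and sets the rest to negative infinity.
    // Ties at the threshold are all kept, so slightly more than k may survive.
    public static double[] TopKFilter(double[] logits, int k)
    {
        if (k <= 0 || k >= logits.Length)
            return (double[])logits.Clone();  // nothing to filter

        double threshold = logits.OrderByDescending(x => x).ElementAt(k - 1);
        return logits
            .Select(x => x >= threshold ? x : double.NegativeInfinity)
            .ToArray();
    }
}
```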
TopP
Gets or sets the top-p (nucleus) sampling parameter.
public double TopP { get; set; }
Property Value
- double
Remarks
If greater than 0, nucleus sampling is used instead of top-k. A value of 0.0 disables nucleus sampling, so only top-k is used.
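Nucleus sampling keeps the smallest set of tokens whose cumulative probability reaches p and renormalizes over that set. The sketch below assumes the input is already a probability distribution; the names are hypothetical, and treating `p <= 0` as "disabled" mirrors the remark above rather than a confirmed implementation.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the documented API.
public static class TopPSketch
{
    // Keeps the most likely tokens until their cumulative probability
    // reaches p, zeroes the rest, and renormalizes.
    public static double[] TopPFilter(double[] probs, double p)
    {
        if (p <= 0.0)
            return (double[])probs.Clone();  // nucleus sampling disabled

        int[] order = Enumerable.Range(0, probs.Length)
            .OrderByDescending(i => probs[i])
            .ToArray();

        var kept = new double[probs.Length];
        double cumulative = 0.0;
        foreach (int i in order)
        {
            kept[i] = probs[i];
            cumulative += probs[i];
            if (cumulative >= p) break;  // nucleus is complete
        }

        double total = kept.Sum();
        return kept.Select(x => x / total).ToArray();
    }
}
```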
UseDelayPattern
Gets or sets whether to use the delay pattern.
public bool UseDelayPattern { get; set; }
Property Value
- bool
Remarks
The delay pattern is MusicGen's key innovation:
- Generates codebooks with temporal offset
- Reduces effective sequence length
- Improves generation efficiency
Should be true for standard MusicGen operation.
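The temporal offset can be illustrated by shifting codebook k to the right by k steps. With K codebooks and T frames this takes T + K - 1 decoding steps, compared with K × T if the codebooks were flattened into one sequence. The sketch below is a hypothetical illustration of that layout; the padding token value and the array shape are assumptions, not the library's internal representation.

```csharp
// Hypothetical helper, not part of the documented API.
public static class DelayPatternSketch
{
    // codes has shape [codebooks, frames]. Codebook cb is delayed by cb
    // steps; positions with no source frame are filled with padToken.
    public static int[,] ApplyDelayPattern(int[,] codes, int padToken)
    {
        int codebooks = codes.GetLength(0);
        int frames = codes.GetLength(1);
        int steps = frames + codebooks - 1;

        var delayed = new int[codebooks, steps];
        for (int cb = 0; cb < codebooks; cb++)
        {
            for (int step = 0; step < steps; step++)
            {
                int source = step - cb;  // frame that lands on this step
                delayed[cb, step] = (source >= 0 && source < frames)
                    ? codes[cb, source]
                    : padToken;
            }
        }
        return delayed;
    }
}
```

With the default of 4 codebooks, each decoding step predicts one token per codebook, with each codebook trailing the previous one by a single frame.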