Class MusicGenOptions

Namespace: AiDotNet.Audio.MusicGen

Assembly: AiDotNet.dll

Configuration options for MusicGen text-to-music generation.

public class MusicGenOptions

Inheritance: object → MusicGenOptions

Inherited Members:

object.Equals(object)
object.Equals(object, object)
object.GetHashCode()
object.GetType()
object.MemberwiseClone()
object.ReferenceEquals(object, object)
object.ToString()

Remarks

MusicGen is Meta's text-to-music generation model. It creates music from text descriptions using a single-stage transformer language model that operates over EnCodec audio codes.

For Beginners: MusicGen generates actual music from descriptions:

Example prompts:

"Upbeat electronic dance music with heavy bass"
"Calm acoustic guitar melody with soft drums"
"Epic orchestral piece with dramatic strings"
"Lo-fi hip hop beats for studying"

Tips for good prompts:

Be specific about genre, instruments, and mood
Include tempo hints (fast, slow, moderate)
Mention energy level (energetic, calm, building)
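
As a rough illustration of how the options on this page fit together, the fragment below sets a handful of them through an object initializer. It assumes the class has a public parameterless constructor (none is listed on this page), and the file paths are placeholders rather than real model locations.

```csharp
using AiDotNet.Audio.MusicGen;

// Assumption: MusicGenOptions has a public parameterless constructor.
var options = new MusicGenOptions
{
    // Placeholder paths; point these at your own ONNX files.
    TextEncoderPath = "models/text_encoder.onnx",
    LanguageModelPath = "models/language_model.onnx",
    EnCodecDecoderPath = "models/encodec_decoder.onnx",

    DurationSeconds = 10.0, // length of the clip to generate
    GuidanceScale = 3.0,    // documented default balance
    Temperature = 1.0,      // documented default randomness
    TopK = 250,             // documented default top-k
    Seed = 42               // fixed seed for repeatable output
};
```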

Properties

CodebookSize

Gets or sets the codebook vocabulary size.

public int CodebookSize { get; set; }

Property Value

int

Remarks

Must match the EnCodec model configuration. Default of 2048 is standard for MusicGen.

DropoutRate

Gets or sets the dropout rate for training.

public double DropoutRate { get; set; }

Property Value

double

DurationSeconds

Gets or sets the default duration of generated music in seconds.

public double DurationSeconds { get; set; }

Property Value

double

EnCodecDecoderPath

Gets or sets the path to the EnCodec decoder ONNX model.

public string? EnCodecDecoderPath { get; set; }

Property Value

string

GuidanceScale

Gets or sets the classifier-free guidance scale.

public double GuidanceScale { get; set; }

Property Value

double

Remarks

Controls how closely the model follows the text prompt:

Low (1.0-2.0): More variation, less prompt adherence
Default (3.0): Good balance
High (4.0-7.0): Stricter prompt following, less creativity
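
To make the effect of the scale concrete, here is a minimal sketch of the usual classifier-free guidance blend, where the model is run once with the prompt and once without it. The class and method names are hypothetical and this is not taken from the library's source; it only shows the arithmetic that a guidance scale normally drives.

```csharp
using System;

public static class GuidanceSketch
{
    // Blends conditional and unconditional logits with classifier-free guidance.
    // scale = 1.0 returns the conditional logits unchanged; larger values push
    // the result further toward tokens the prompt favors.
    public static double[] ApplyGuidance(double[] conditional, double[] unconditional, double scale)
    {
        if (conditional.Length != unconditional.Length)
            throw new ArgumentException("Logit arrays must have the same length.");

        var guided = new double[conditional.Length];
        for (int i = 0; i < guided.Length; i++)
        {
            guided[i] = unconditional[i] + scale * (conditional[i] - unconditional[i]);
        }
        return guided;
    }
}
```

With a scale of 3.0, a token whose conditional logit exceeds its unconditional logit by 1 ends up 3 above the unconditional value, which is why higher scales follow the prompt more strictly.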

LanguageModelPath

Gets or sets the path to the language model ONNX model.

public string? LanguageModelPath { get; set; }

Property Value

string

MaxDurationSeconds

Gets or sets the maximum duration in seconds.

public double MaxDurationSeconds { get; set; }

Property Value

double

Remarks

MusicGen can generate up to 30 seconds of audio. Longer durations require more memory and compute time.
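
The sketch below shows one plausible way a caller might clamp a requested duration to the maximum and estimate how many generation steps it implies. The 50 frames per second figure is an assumption about the codec that is not stated on this page, and the helper names are hypothetical.

```csharp
using System;

public static class DurationSketch
{
    // Assumption: 50 token frames per second of audio (not stated on this page).
    public const int AssumedFramesPerSecond = 50;

    // Clamps the requested duration to the allowed maximum and returns
    // the number of frames needed to cover it.
    public static int FramesFor(double durationSeconds, double maxDurationSeconds)
    {
        if (durationSeconds <= 0)
            throw new ArgumentOutOfRangeException(nameof(durationSeconds));

        double clamped = Math.Min(durationSeconds, maxDurationSeconds);
        return (int)Math.Ceiling(clamped * AssumedFramesPerSecond);
    }
}
```

Because the frame count grows linearly with duration, memory and compute cost rise with it, which matches the remark above.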

MaxTextLength

Gets or sets the maximum text sequence length.

public int MaxTextLength { get; set; }

Property Value

int

ModelSize

Gets or sets the model size variant.

public MusicGenModelSize ModelSize { get; set; }

Property Value

MusicGenModelSize

Remarks

Different sizes trade off quality vs speed. Default is Medium which balances both well.

NumCodebooks

Gets or sets the number of EnCodec codebooks to use.

public int NumCodebooks { get; set; }

Property Value

int

Remarks

More codebooks give higher quality but slower generation. The default of 4 is used by standard MusicGen models.

OnnxOptions

Gets or sets the ONNX execution options.

public OnnxModelOptions OnnxOptions { get; set; }

Property Value

OnnxModelOptions

SampleRate

Gets or sets the output sample rate in Hz.

public int SampleRate { get; set; }

Property Value

int

Remarks

MusicGen outputs audio at 32 kHz by default, matching the sample rate of the EnCodec codec. This rate is suitable for music.

Seed

Gets or sets the random seed for reproducibility.

public int? Seed { get; set; }

Property Value

int?

Remarks

Set to a specific value to generate the same music each time. Null for random generation.
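
A nullable seed is typically turned into a random source along these lines. This is a minimal sketch with a hypothetical helper name; how the library itself consumes Seed is not shown on this page.

```csharp
using System;

public static class SeedSketch
{
    // Creates the random source: seeded when a value is given, otherwise unseeded.
    public static Random CreateRandom(int? seed) =>
        seed.HasValue ? new Random(seed.Value) : new Random();
}
```

Two sources created with the same seed produce the same sequence of draws, which is what makes sampling repeatable.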

Stereo

Gets or sets whether to generate stereo audio.

public bool Stereo { get; set; }

Property Value

bool

Remarks

Requires the Stereo model variant for best results. When using non-stereo models, mono output is duplicated to stereo.
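
The mono fallback described above amounts to copying each sample into both channels. The sketch below assumes an interleaved left/right layout, which this page does not specify, and uses a hypothetical helper name.

```csharp
public static class StereoSketch
{
    // Duplicates each mono sample into left and right channels,
    // interleaved as L, R, L, R, ... (layout is an assumption).
    public static float[] MonoToStereo(float[] mono)
    {
        var stereo = new float[mono.Length * 2];
        for (int i = 0; i < mono.Length; i++)
        {
            stereo[2 * i] = mono[i];     // left
            stereo[2 * i + 1] = mono[i]; // right
        }
        return stereo;
    }
}
```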

Temperature

Gets or sets the sampling temperature.

public double Temperature { get; set; }

Property Value

double

Remarks

Controls randomness in generation:

Lower (0.5-0.8): More predictable, stable output
Default (1.0): Balanced creativity
Higher (1.2-2.0): More creative but potentially less coherent
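
Temperature is conventionally applied by dividing the logits before the softmax. The following is a small sketch of that step with hypothetical names, not the library's actual sampling code.

```csharp
using System;
using System.Linq;

public static class TemperatureSketch
{
    // Converts logits to probabilities after dividing by the temperature.
    public static double[] Softmax(double[] logits, double temperature)
    {
        if (temperature <= 0)
            throw new ArgumentOutOfRangeException(nameof(temperature));

        double max = logits.Max(); // subtract the max for numerical stability
        double[] exps = logits.Select(l => Math.Exp((l - max) / temperature)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }
}
```

Lower temperatures concentrate probability on the most likely token, while higher temperatures flatten the distribution.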

TextEncoderPath

Gets or sets the path to the text encoder ONNX model.

public string? TextEncoderPath { get; set; }

Property Value

string

TopK

Gets or sets the top-k sampling parameter.

public int TopK { get; set; }

Property Value

int

Remarks

Limits sampling to the top K most likely tokens. Default of 250 works well for diverse music generation.
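
Top-k filtering can be sketched as masking everything below the k-th largest logit. The names here are hypothetical and the snippet is illustrative rather than the library's implementation.

```csharp
using System;
using System.Linq;

public static class TopKSketch
{
    // Keeps the k highest logits and masks the rest to negative infinity,
    // so they receive zero probability after softmax.
    // Ties at the threshold are all kept, so slightly more than k may survive.
    public static double[] FilterTopK(double[] logits, int k)
    {
        if (k <= 0 || k >= logits.Length)
            return (double[])logits.Clone();

        double threshold = logits.OrderByDescending(x => x).ElementAt(k - 1);
        return logits.Select(x => x >= threshold ? x : double.NegativeInfinity).ToArray();
    }
}
```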

TopP

Gets or sets the top-p (nucleus) sampling parameter.

public double TopP { get; set; }

Property Value

double

Remarks

If greater than 0, nucleus sampling is used instead of top-k. A value of 0.0 disables nucleus sampling, so only top-k is used.
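
Nucleus sampling restricts the candidates to the smallest set of tokens whose probabilities add up to at least the top-p value. The sketch below shows that selection with hypothetical names; it is not drawn from the library's source.

```csharp
using System;
using System.Linq;

public static class TopPSketch
{
    // Returns the indices of the smallest set of tokens whose cumulative
    // probability reaches topP, ordered from most to least likely.
    public static int[] NucleusIndices(double[] probabilities, double topP)
    {
        int[] ordered = Enumerable.Range(0, probabilities.Length)
            .OrderByDescending(i => probabilities[i])
            .ToArray();

        double cumulative = 0.0;
        int count = 0;
        foreach (int index in ordered)
        {
            cumulative += probabilities[index];
            count++;
            if (cumulative >= topP) break;
        }
        return ordered.Take(count).ToArray();
    }
}
```

Unlike top-k, the number of surviving tokens adapts to the shape of the distribution: a confident model keeps few candidates, an uncertain one keeps many.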

UseDelayPattern

Gets or sets whether to use the delay pattern.

public bool UseDelayPattern { get; set; }

Property Value

bool

Remarks

The delay pattern is MusicGen's key innovation:

Generates codebooks with temporal offset
Reduces effective sequence length
Improves generation efficiency

Should be true for standard MusicGen operation.
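
The temporal offset can be pictured as shifting codebook k right by k steps, so K codebooks over T frames take T + K - 1 steps instead of the K × T steps a fully flattened sequence would need. The sketch below illustrates that layout; the padding marker and all names are hypothetical.

```csharp
public static class DelayPatternSketch
{
    // Hypothetical marker for positions that hold no token.
    public const int Pad = -1;

    // codes[k][t] is the token of codebook k at frame t.
    // Returns a grid of T + K - 1 steps where codebook k is shifted right by k.
    public static int[][] ApplyDelay(int[][] codes)
    {
        int numCodebooks = codes.Length;
        int frames = codes[0].Length;
        int steps = frames + numCodebooks - 1;

        var delayed = new int[numCodebooks][];
        for (int k = 0; k < numCodebooks; k++)
        {
            delayed[k] = new int[steps];
            for (int s = 0; s < steps; s++)
            {
                int t = s - k; // frame that lands on step s for codebook k
                delayed[k][s] = (t >= 0 && t < frames) ? codes[k][t] : Pad;
            }
        }
        return delayed;
    }
}
```

For two codebooks with frames 1, 2, 3 and 4, 5, 6, the delayed rows are 1, 2, 3, Pad and Pad, 4, 5, 6, so each step predicts one token per codebook in parallel.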
