Class WhisperOptions
Configuration options for the Whisper speech recognition model.
public class WhisperOptions
- Inheritance
WhisperOptions
Remarks
Whisper is a speech recognition model developed by OpenAI that can transcribe audio in multiple languages and perform translation.
For Beginners: Whisper comes in different sizes (tiny to large). Smaller models are faster but less accurate. Larger models are more accurate but slower.
- Tiny: ~39M parameters, fastest, good for quick transcription
- Base: ~74M parameters, balanced speed/accuracy
- Small: ~244M parameters, good accuracy
- Medium: ~769M parameters, high accuracy
- Large: ~1.5B parameters, best accuracy, slowest
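To give a feel for what these parameter counts mean in practice, the sketch below estimates the size of the weights for each model. It assumes every parameter is stored as a 4-byte float32 value. That is an assumption for illustration only: the ONNX files this library downloads may use a different precision or quantization. The helper name `Float32Megabytes` is hypothetical.

```csharp
using System;

// Megabytes needed to hold the weights, assuming 4 bytes (float32) per parameter.
// One million parameters * 4 bytes = 4 MB (decimal megabytes).
static double Float32Megabytes(double millionsOfParameters) => millionsOfParameters * 4;

// Approximate parameter counts (in millions) taken from the list above.
var sizes = new (string Name, double Millions)[]
{
    ("Tiny", 39), ("Base", 74), ("Small", 244), ("Medium", 769), ("Large", 1500),
};

foreach (var (name, millions) in sizes)
{
    Console.WriteLine($"{name}: ~{Float32Megabytes(millions):N0} MB of float32 weights");
}
```

The estimate covers weights only. Runtime memory use is higher because activations and decoding state also need space.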
Properties
BeamSize
Gets or sets the beam size for beam search decoding. Larger values explore more candidate transcriptions, which usually improves accuracy but slows decoding.
public int BeamSize { get; set; }
DecoderModelPath
Gets or sets the path to the decoder ONNX model. If null, the model will be downloaded automatically.
public string? DecoderModelPath { get; set; }
EncoderModelPath
Gets or sets the path to the encoder ONNX model. If null, the model will be downloaded automatically.
public string? EncoderModelPath { get; set; }
Language
Gets or sets the language code for transcription (e.g., "en", "es", "fr"). Null for auto-detection.
public string? Language { get; set; }
MaxAudioLengthSeconds
Gets or sets the maximum length of audio to process, in seconds. Whisper processes audio in 30-second chunks.
public int MaxAudioLengthSeconds { get; set; }
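The relationship between the audio length limit and the 30-second chunking can be sketched as simple arithmetic. The helper `CountChunks` is hypothetical. The sketch also assumes that audio longer than `MaxAudioLengthSeconds` is truncated, which this page does not state.

```csharp
using System;

// Number of 30-second windows needed to cover the audio after capping it
// at maxAudioLengthSeconds (truncation is an assumed behavior).
static int CountChunks(double audioSeconds, int maxAudioLengthSeconds)
{
    const int ChunkSeconds = 30; // Whisper's fixed window length
    double processed = Math.Min(audioSeconds, maxAudioLengthSeconds);
    return (int)Math.Ceiling(processed / ChunkSeconds);
}

Console.WriteLine(CountChunks(95, 300)); // 95 s of audio needs 4 windows
Console.WriteLine(CountChunks(95, 60));  // capped at 60 s, so 2 windows
```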
MaxTokens
Gets or sets the maximum number of tokens to generate.
public int MaxTokens { get; set; }
ModelSize
Gets or sets the model size to use.
public WhisperModelSize ModelSize { get; set; }
NumMels
Gets or sets the number of mel filterbank channels. Whisper uses 80 mel channels.
public int NumMels { get; set; }
OnnxOptions
Gets or sets the ONNX execution options.
public OnnxModelOptions OnnxOptions { get; set; }
ReturnTimestamps
Gets or sets whether to return timestamps with the transcription.
public bool ReturnTimestamps { get; set; }
SampleRate
Gets or sets the sample rate expected by the model, in hertz. Whisper expects 16 kHz audio.
public int SampleRate { get; set; }
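As a worked example, the sample rate, the 30-second chunk length, and the mel channel count together determine the size of the model's input. The hop length of 160 samples comes from OpenAI's reference Whisper implementation. Whether this library uses the same value is an assumption.

```csharp
using System;

const int sampleRate = 16_000; // Hz, as documented for SampleRate
const int chunkSeconds = 30;   // Whisper's window length
const int numMels = 80;        // as documented for NumMels
const int hopLength = 160;     // samples per spectrogram frame (assumed, from the reference implementation)

int samplesPerChunk = sampleRate * chunkSeconds;  // 16,000 * 30 = 480,000
int framesPerChunk = samplesPerChunk / hopLength; // 480,000 / 160 = 3,000

Console.WriteLine($"samples per chunk: {samplesPerChunk}");
Console.WriteLine($"mel spectrogram shape: {numMels} x {framesPerChunk}");
```

Audio recorded at another rate, such as 44.1 kHz, would need to be resampled to 16 kHz before it matches this layout.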
Temperature
Gets or sets the temperature for sampling. Lower values make output more deterministic.
public double Temperature { get; set; }
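The effect of temperature can be illustrated with generic temperature-scaled softmax. This is a minimal sketch of the general technique, not necessarily how this library samples tokens. The function name `SoftmaxWithTemperature` and the example logits are hypothetical, and the temperature is assumed to be greater than zero.

```csharp
using System;
using System.Linq;

// Divides the logits by the temperature, then applies a numerically stable softmax.
static double[] SoftmaxWithTemperature(double[] logits, double temperature)
{
    double[] scaled = logits.Select(l => l / temperature).ToArray();
    double max = scaled.Max(); // subtracting the max avoids overflow in Exp
    double[] exps = scaled.Select(s => Math.Exp(s - max)).ToArray();
    double sum = exps.Sum();
    return exps.Select(e => e / sum).ToArray();
}

double[] logits = { 2.0, 1.0, 0.5 };
foreach (double t in new[] { 1.0, 0.5, 0.1 })
{
    double[] p = SoftmaxWithTemperature(logits, t);
    Console.WriteLine($"T={t}: probability of the top token = {p[0]:F3}");
}
```

As the temperature falls, probability mass concentrates on the highest-scoring token, so sampling picks it almost every time.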
Translate
Gets or sets whether to translate the output to English. If true, non-English audio is translated to English instead of being transcribed in its original language.
public bool Translate { get; set; }
WordTimestamps
Gets or sets whether to include word-level timestamps.
public bool WordTimestamps { get; set; }
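The properties above might be combined with an object initializer, as in the following sketch. The stand-in type declarations at the bottom exist only so the snippet compiles on its own. They mirror the signatures documented on this page, and should be removed when referencing the real library. The enum member name `WhisperModelSize.Base` and all numeric values are assumptions chosen for illustration, not documented defaults.

```csharp
using System;

// Configure transcription of Spanish audio with segment-level timestamps.
var options = new WhisperOptions
{
    ModelSize = WhisperModelSize.Base, // assumed enum member name
    Language = "es",                   // null would request auto-detection
    Translate = false,                 // keep the output in the source language
    BeamSize = 5,                      // illustrative value
    Temperature = 0.0,                 // illustrative value
    ReturnTimestamps = true,
    WordTimestamps = false,
    EncoderModelPath = null,           // null: the model is downloaded automatically
    DecoderModelPath = null,
};

Console.WriteLine(
    $"{options.ModelSize}, language={options.Language ?? "auto"}, beam={options.BeamSize}");

// Stand-ins so the sketch is self-contained; not the library's actual definitions.
public enum WhisperModelSize { Tiny, Base, Small, Medium, Large }

public class WhisperOptions
{
    public WhisperModelSize ModelSize { get; set; }
    public string? Language { get; set; }
    public bool Translate { get; set; }
    public int BeamSize { get; set; }
    public double Temperature { get; set; }
    public bool ReturnTimestamps { get; set; }
    public bool WordTimestamps { get; set; }
    public string? EncoderModelPath { get; set; }
    public string? DecoderModelPath { get; set; }
}
```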