Interface IAudioEventDetector<T>
- Namespace: AiDotNet.Interfaces
- Assembly: AiDotNet.dll
Interface for audio event detection models that identify specific sounds/events in audio.
public interface IAudioEventDetector<T> : IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>
Type Parameters
- T: The numeric type used for calculations.
Remarks
Audio event detection identifies when specific sounds occur in an audio stream. Unlike classification, which assigns one label to an entire clip, event detection finds multiple events along with their timestamps.
For Beginners: Event detection is like having a listener who notes down every distinct sound they hear and when it happened.
How it works:
- Audio is analyzed in overlapping windows
- Each window is classified for the presence of various events
- Consecutive detections are merged into event segments
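The merging step can be sketched in isolation. The helper below is hypothetical and not part of AiDotNet: it takes per-frame probabilities for a single event type as plain doubles, assumes a frame counts as active when its probability is at or above the threshold, and assumes frames are spaced one time-resolution step apart. The library's actual merging logic may differ.

```csharp
using System;
using System.Collections.Generic;

public static class EventSegmentSketch
{
    // Hypothetical helper: turns per-frame probabilities for ONE event type
    // into (Start, End) segments expressed in seconds.
    public static List<(double Start, double End)> MergeFrames(
        IReadOnlyList<double> frameProbabilities,
        double threshold,
        double timeResolution)
    {
        var segments = new List<(double Start, double End)>();
        int runStart = -1; // index of the first frame in the current run, -1 if none

        for (int i = 0; i < frameProbabilities.Count; i++)
        {
            // Assumption: a frame is "active" when probability >= threshold.
            bool active = frameProbabilities[i] >= threshold;

            if (active && runStart < 0)
            {
                runStart = i; // a new run of consecutive detections begins
            }
            else if (!active && runStart >= 0)
            {
                // The run ended on the previous frame; close the segment.
                segments.Add((runStart * timeResolution, i * timeResolution));
                runStart = -1;
            }
        }

        // Close a run that extends to the end of the audio.
        if (runStart >= 0)
        {
            segments.Add((runStart * timeResolution,
                          frameProbabilities.Count * timeResolution));
        }

        return segments;
    }
}
```

With probabilities 0.1, 0.8, 0.9, 0.2, 0.7, a threshold of 0.5, and a 0.5-second step, frames 1 and 2 merge into one segment and frame 4 forms a second one.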
Types of events:
- Environmental: Car horn, dog bark, siren, glass breaking
- Speech: Laughter, cough, scream, applause
- Music: Drum hit, guitar strum, piano note
- Industrial: Machine alarm, tool sounds
Use cases:
- Security/surveillance (detect gunshots, breaking glass)
- Smart home (detect doorbell, smoke alarm, baby crying)
- Wildlife monitoring (detect animal calls)
- Content moderation (detect inappropriate sounds)
- Accessibility (alert deaf users to sounds)
Challenges:
- Overlapping events (multiple sounds at once)
- Variable event duration (short beep vs long siren)
- Background noise interference
This interface extends IFullModel<T, TInput, TOutput> with both TInput and TOutput set to Tensor<T>, for Tensor-based audio processing.
Properties
IsOnnxMode
Gets whether this model is running in ONNX inference mode.
bool IsOnnxMode { get; }
Property Value
- bool
SampleRate
Gets the expected sample rate for input audio.
int SampleRate { get; }
Property Value
- int
SupportedEvents
Gets the list of event types this model can detect.
IReadOnlyList<string> SupportedEvents { get; }
Property Value
- IReadOnlyList<string>
TimeResolution
Gets the time resolution for event detection in seconds.
double TimeResolution { get; }
Property Value
- double
Methods
Detect(Tensor<T>)
Detects audio events in the audio stream.
AudioEventResult<T> Detect(Tensor<T> audio)
Parameters
- audio (Tensor<T>): Audio waveform tensor [samples] or [channels, samples].
Returns
- AudioEventResult<T>
Event detection result with detected events.
Remarks
For Beginners: This is the main method for detecting events.
- Pass in audio
- Get back a list of detected sounds and when they occurred
Detect(Tensor<T>, T)
Detects audio events in the audio stream with a custom threshold.
AudioEventResult<T> Detect(Tensor<T> audio, T threshold)
Parameters
- audio (Tensor<T>): Audio waveform tensor [samples] or [channels, samples].
- threshold (T): Detection threshold (0.0 to 1.0). Lower values make detection more sensitive.
Returns
- AudioEventResult<T>
Event detection result with detected events.
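To illustrate why a lower threshold is more sensitive, here is a minimal sketch that counts how many scores pass a given threshold. It uses plain double values instead of the generic T, and it assumes a detection fires when the score is at or above the threshold; the exact comparison used by an implementation is not stated on this page. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class ThresholdSketch
{
    // Hypothetical helper: counts scores that would pass the threshold.
    public static int CountDetections(IReadOnlyList<double> scores, double threshold)
    {
        // The documented range for the threshold is 0.0 to 1.0.
        if (threshold < 0.0 || threshold > 1.0)
        {
            throw new ArgumentOutOfRangeException(
                nameof(threshold), "Threshold must be between 0.0 and 1.0.");
        }

        int count = 0;
        foreach (double score in scores)
        {
            // Assumption: a detection fires when score >= threshold.
            if (score >= threshold)
            {
                count++;
            }
        }
        return count;
    }
}
```

For the scores 0.2, 0.45, 0.6, 0.9, a threshold of 0.5 passes two of them, while lowering it to 0.3 passes three.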
DetectAsync(Tensor<T>, CancellationToken)
Detects audio events asynchronously.
Task<AudioEventResult<T>> DetectAsync(Tensor<T> audio, CancellationToken cancellationToken = default)
Parameters
- audio (Tensor<T>): Audio waveform tensor.
- cancellationToken (CancellationToken): Cancellation token for async operation.
Returns
- Task<AudioEventResult<T>>
Event detection result.
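One common way to use the cancellation token is to bound detection time. The sketch below is a generic, hypothetical wrapper: the delegate stands in for a call such as `token => detector.DetectAsync(audio, token)`, and the helper name is not part of AiDotNet. Whether and how quickly cancellation takes effect depends on how the implementation observes the token.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncDetectSketch
{
    // Hypothetical helper: runs an async operation with a token that is
    // cancelled automatically once the timeout elapses.
    public static async Task<TResult> RunWithTimeoutAsync<TResult>(
        Func<CancellationToken, Task<TResult>> operation,
        TimeSpan timeout)
    {
        // CancellationTokenSource(TimeSpan) schedules cancellation after the delay.
        using var cts = new CancellationTokenSource(timeout);

        // If the operation honours the token, an OperationCanceledException
        // propagates to the caller when the timeout is reached.
        return await operation(cts.Token).ConfigureAwait(false);
    }
}
```

Callers would typically wrap the call in a try/catch for OperationCanceledException and treat it as "no result within the time budget".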
DetectSpecific(Tensor<T>, IReadOnlyList<string>)
Detects specific events only.
AudioEventResult<T> DetectSpecific(Tensor<T> audio, IReadOnlyList<string> eventTypes)
Parameters
- audio (Tensor<T>): Audio waveform tensor.
- eventTypes (IReadOnlyList<string>): Event types to detect.
Returns
- AudioEventResult<T>
Event detection result filtered to specified types.
Remarks
For Beginners: Use this when you only care about specific sounds.
- DetectSpecific(audio, ["dog_bark", "siren"]) only looks for dog barks and sirens
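Before calling DetectSpecific, it can help to check the requested names against SupportedEvents. The sketch below is a hypothetical helper that works on plain string lists, so it does not depend on the library. It assumes event names must match exactly (ordinal comparison); how an implementation handles unknown or differently cased names is not documented here.

```csharp
using System;
using System.Collections.Generic;

public static class EventTypeFilterSketch
{
    // Hypothetical helper: splits requested event names into those present
    // in the supported list and those that are not.
    public static (IReadOnlyList<string> Supported, IReadOnlyList<string> Unsupported) Partition(
        IReadOnlyList<string> supportedEvents,
        IEnumerable<string> requested)
    {
        // Assumption: names are compared exactly, including case.
        var known = new HashSet<string>(supportedEvents, StringComparer.Ordinal);
        var supported = new List<string>();
        var unsupported = new List<string>();

        foreach (string name in requested)
        {
            (known.Contains(name) ? supported : unsupported).Add(name);
        }

        return (supported, unsupported);
    }
}
```

The Supported list could then be passed as the eventTypes argument, and the Unsupported list logged or reported to the caller.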
DetectSpecific(Tensor<T>, IReadOnlyList<string>, T)
Detects specific events only, with a custom threshold.
AudioEventResult<T> DetectSpecific(Tensor<T> audio, IReadOnlyList<string> eventTypes, T threshold)
Parameters
- audio (Tensor<T>): Audio waveform tensor.
- eventTypes (IReadOnlyList<string>): Event types to detect.
- threshold (T): Detection threshold.
Returns
- AudioEventResult<T>
Event detection result filtered to specified types.
GetEventProbabilities(Tensor<T>)
Gets frame-level event probabilities.
Tensor<T> GetEventProbabilities(Tensor<T> audio)
Parameters
- audio (Tensor<T>): Audio waveform tensor.
Returns
- Tensor<T>
Tensor of event probabilities [time_frames, num_events].
Remarks
Useful for visualization or custom post-processing of detections.
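As one example of custom post-processing, the sketch below finds the peak probability and its time for each event. It assumes the [time_frames, num_events] probabilities have already been copied into a `double[,]` (the Tensor<T> accessors are not shown on this page, so that step is omitted) and that frame `f` starts at `f * timeResolution` seconds. The class and method names are hypothetical.

```csharp
using System;

public static class ProbabilityMatrixSketch
{
    // Hypothetical helper: for each event column, returns the highest
    // probability and the start time (in seconds) of the frame where it occurs.
    public static (double Peak, double TimeSeconds)[] PeakPerEvent(
        double[,] probabilities,
        double timeResolution)
    {
        int frames = probabilities.GetLength(0); // rows: time frames
        int events = probabilities.GetLength(1); // columns: event types

        if (frames == 0)
        {
            throw new ArgumentException("At least one time frame is required.", nameof(probabilities));
        }

        var result = new (double Peak, double TimeSeconds)[events];

        for (int e = 0; e < events; e++)
        {
            int bestFrame = 0;
            for (int f = 1; f < frames; f++)
            {
                // Strict comparison keeps the earliest frame on ties.
                if (probabilities[f, e] > probabilities[bestFrame, e])
                {
                    bestFrame = f;
                }
            }

            result[e] = (probabilities[bestFrame, e], bestFrame * timeResolution);
        }

        return result;
    }
}
```

The column index `e` corresponds to a position in the event list only if the implementation orders columns the same way as SupportedEvents, which should be confirmed before relying on it.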
StartStreamingSession()
Starts a streaming session for real-time event detection.
IStreamingEventDetectionSession<T> StartStreamingSession()
Returns
- IStreamingEventDetectionSession<T>
Streaming session for real-time detection.
StartStreamingSession(int, T)
Starts a streaming session for real-time event detection with a custom sample rate and detection threshold.
IStreamingEventDetectionSession<T> StartStreamingSession(int sampleRate, T threshold)
Parameters
- sampleRate (int): Sample rate of incoming audio.
- threshold (T): Detection threshold.
Returns
- IStreamingEventDetectionSession<T>
Streaming session for real-time detection.