Interface IAudioEventDetector<T>

Namespace
AiDotNet.Interfaces
Assembly
AiDotNet.dll

Interface for audio event detection models that identify specific sounds/events in audio.

public interface IAudioEventDetector<T> : IFullModel<T, Tensor<T>, Tensor<T>>, IModel<Tensor<T>, Tensor<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Tensor<T>, Tensor<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Tensor<T>, Tensor<T>>>, IGradientComputable<T, Tensor<T>, Tensor<T>>, IJitCompilable<T>

Type Parameters

T

The numeric type used for calculations.

Remarks

Audio event detection identifies when specific sounds occur in an audio stream. Unlike classification, which assigns one label to an entire clip, event detection finds multiple events along with their timestamps.

For Beginners: Event detection is like having a listener who notes down every distinct sound they hear and when it happened.

How it works:

  1. Audio is analyzed in overlapping windows
  2. Each window is classified for the presence of various events
  3. Consecutive detections are merged into event segments
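
As an illustration of step 3, the following is a minimal sketch of merging consecutive per-window detections into segments. It is not AiDotNet code: `EventSegmentSketch`, `MergeFrames`, and the `hopSeconds` parameter (the time between the starts of successive windows) are hypothetical names, and the per-window probabilities for a single event type are assumed to have been computed already.

```csharp
using System;
using System.Collections.Generic;

public static class EventSegmentSketch
{
    // Hypothetical helper, not part of AiDotNet: merges runs of consecutive
    // windows whose probability meets the threshold into (Start, End) segments.
    // Assumption: window i starts at i * hopSeconds.
    public static List<(double Start, double End)> MergeFrames(
        double[] frameProbabilities, double threshold, double hopSeconds)
    {
        var segments = new List<(double Start, double End)>();
        int runStart = -1; // index of the first window in the current run, or -1

        for (int i = 0; i < frameProbabilities.Length; i++)
        {
            bool active = frameProbabilities[i] >= threshold;
            if (active && runStart < 0)
            {
                runStart = i; // a new run begins at this window
            }
            else if (!active && runStart >= 0)
            {
                // The run ended just before window i.
                segments.Add((runStart * hopSeconds, i * hopSeconds));
                runStart = -1;
            }
        }

        // Close a run that extends to the last window.
        if (runStart >= 0)
        {
            segments.Add((runStart * hopSeconds, frameProbabilities.Length * hopSeconds));
        }
        return segments;
    }
}
```

A real detector may also smooth the probabilities or discard very short segments before reporting events; this sketch omits both.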

Types of events:

  • Environmental: Car horn, dog bark, siren, glass breaking
  • Human: Laughter, cough, scream, applause
  • Music: Drum hit, guitar strum, piano note
  • Industrial: Machine alarm, tool sounds

Use cases:

  • Security/surveillance (detect gunshots, breaking glass)
  • Smart home (detect doorbell, smoke alarm, baby crying)
  • Wildlife monitoring (detect animal calls)
  • Content moderation (detect inappropriate sounds)
  • Accessibility (alert deaf users to sounds)

Challenges:

  • Overlapping events (multiple sounds at once)
  • Variable event duration (short beep vs long siren)
  • Background noise interference

This interface extends IFullModel<T, Tensor<T>, Tensor<T>>, so both model input and output are tensors.

Properties

IsOnnxMode

Gets whether this model is running in ONNX inference mode.

bool IsOnnxMode { get; }

Property Value

bool

SampleRate

Gets the expected sample rate for input audio.

int SampleRate { get; }

Property Value

int

SupportedEvents

Gets the list of event types this model can detect.

IReadOnlyList<string> SupportedEvents { get; }

Property Value

IReadOnlyList<string>

TimeResolution

Gets the time resolution for event detection in seconds.

double TimeResolution { get; }

Property Value

double
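
To show how SampleRate and TimeResolution might be combined, here is a small sketch that maps a frame index to a timestamp and estimates how many frames a clip produces. The class and method names are hypothetical, and the sketch assumes that TimeResolution is the spacing between successive frames and that partial trailing frames are dropped; an actual implementation may pad or round differently.

```csharp
using System;

public static class TimeResolutionSketch
{
    // Assumption: frame i starts at i * timeResolution seconds.
    public static double FrameToSeconds(int frameIndex, double timeResolution)
        => frameIndex * timeResolution;

    // Assumption: frames tile the clip and a partial trailing frame is dropped.
    public static int EstimateFrameCount(int sampleCount, int sampleRate, double timeResolution)
    {
        double durationSeconds = (double)sampleCount / sampleRate;
        return (int)Math.Floor(durationSeconds / timeResolution);
    }
}
```

For example, under these assumptions a 2-second clip at a 16000 Hz sample rate with a 0.5-second resolution would yield 4 frames.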

Methods

Detect(Tensor<T>)

Detects audio events in the audio stream.

AudioEventResult<T> Detect(Tensor<T> audio)

Parameters

audio Tensor<T>

Audio waveform tensor with shape [samples] or [channels, samples].

Returns

AudioEventResult<T>

Event detection result with detected events.

Remarks

For Beginners: This is the main method for detecting events.

  • Pass in audio
  • Get back a list of detected sounds and when they occurred

Detect(Tensor<T>, T)

Detects audio events in the audio stream with a custom threshold.

AudioEventResult<T> Detect(Tensor<T> audio, T threshold)

Parameters

audio Tensor<T>

Audio waveform tensor with shape [samples] or [channels, samples].

threshold T

Detection threshold in the range 0.0 to 1.0. Lower values make detection more sensitive.

Returns

AudioEventResult<T>

Event detection result with detected events.

DetectAsync(Tensor<T>, CancellationToken)

Detects audio events asynchronously.

Task<AudioEventResult<T>> DetectAsync(Tensor<T> audio, CancellationToken cancellationToken = default)

Parameters

audio Tensor<T>

Audio waveform tensor.

cancellationToken CancellationToken

Cancellation token for the asynchronous operation.

Returns

Task<AudioEventResult<T>>

Event detection result.
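
One common way to use the cancellation token is to bound how long a detection may run. The sketch below is a generic .NET pattern, not AiDotNet code: `DetectTimeoutSketch` and `RunWithTimeoutAsync` are hypothetical names, and the detection call is passed in as a delegate so the example stays self-contained.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class DetectTimeoutSketch
{
    // Hypothetical helper: runs an async detection delegate and cancels it
    // once `timeout` has elapsed, surfacing that case as a TimeoutException.
    public static async Task<TResult> RunWithTimeoutAsync<TResult>(
        Func<CancellationToken, Task<TResult>> detectAsync, TimeSpan timeout)
    {
        // The token source cancels itself after the timeout.
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            return await detectAsync(cts.Token);
        }
        catch (OperationCanceledException) when (cts.IsCancellationRequested)
        {
            throw new TimeoutException($"Detection did not finish within {timeout}.");
        }
    }
}
```

Given a detector instance, the delegate could be written as `ct => detector.DetectAsync(audio, ct)`. Whether a particular implementation checks the token during inference is not specified here, so the timeout only takes effect if the implementation honors cancellation.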

DetectSpecific(Tensor<T>, IReadOnlyList<string>)

Detects specific events only.

AudioEventResult<T> DetectSpecific(Tensor<T> audio, IReadOnlyList<string> eventTypes)

Parameters

audio Tensor<T>

Audio waveform tensor.

eventTypes IReadOnlyList<string>

Event types to detect.

Returns

AudioEventResult<T>

Event detection result filtered to specified types.

Remarks

For Beginners: Use this when you only care about specific sounds.

  • DetectSpecific(audio, ["dog_bark", "siren"]) only looks for dog barks and sirens
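
How an implementation treats event names that are not in SupportedEvents is not specified here, so a caller may want to check the requested names first. The following is a minimal sketch of such a check; `EventTypeFilterSketch` and `SplitBySupport` are hypothetical names, and the case-sensitive comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class EventTypeFilterSketch
{
    // Hypothetical helper, not part of AiDotNet: splits the requested event
    // names into those the model lists as supported and those it does not.
    public static (List<string> Known, List<string> Unknown) SplitBySupport(
        IReadOnlyList<string> supportedEvents, IReadOnlyList<string> requested)
    {
        // Assumption: event names are compared case-sensitively.
        var supported = new HashSet<string>(supportedEvents, StringComparer.Ordinal);
        var known = new List<string>();
        var unknown = new List<string>();

        foreach (string name in requested)
        {
            (supported.Contains(name) ? known : unknown).Add(name);
        }
        return (known, unknown);
    }
}
```

The `Known` list can then be passed as the eventTypes argument, while `Unknown` can be logged or reported to the user.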

DetectSpecific(Tensor<T>, IReadOnlyList<string>, T)

Detects specific events only, with a custom threshold.

AudioEventResult<T> DetectSpecific(Tensor<T> audio, IReadOnlyList<string> eventTypes, T threshold)

Parameters

audio Tensor<T>

Audio waveform tensor.

eventTypes IReadOnlyList<string>

Event types to detect.

threshold T

Detection threshold in the range 0.0 to 1.0. Lower values make detection more sensitive.

Returns

AudioEventResult<T>

Event detection result filtered to specified types.

GetEventProbabilities(Tensor<T>)

Gets frame-level event probabilities.

Tensor<T> GetEventProbabilities(Tensor<T> audio)

Parameters

audio Tensor<T>

Audio waveform tensor.

Returns

Tensor<T>

Tensor of event probabilities with shape [time_frames, num_events].

Remarks

Useful for visualization or custom post-processing of detections.
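
As one example of custom post-processing, the sketch below finds the peak probability and the frame where it occurs for each event. It is not AiDotNet code: a plain `double[,]` array stands in for the returned Tensor<T>, the names `EventProbabilitySketch` and `PeakPerEvent` are hypothetical, and the columns are assumed to follow the same order as the supplied event names.

```csharp
using System;
using System.Collections.Generic;

public static class EventProbabilitySketch
{
    // Hypothetical helper, not part of AiDotNet. `probabilities` is a plain
    // [time_frames, num_events] array standing in for the returned tensor.
    // Assumption: column e corresponds to eventNames[e].
    public static Dictionary<string, (double Peak, int Frame)> PeakPerEvent(
        double[,] probabilities, IReadOnlyList<string> eventNames)
    {
        int frames = probabilities.GetLength(0);
        int events = probabilities.GetLength(1);
        if (events != eventNames.Count)
            throw new ArgumentException("One name is required per event column.");

        var peaks = new Dictionary<string, (double Peak, int Frame)>();
        for (int e = 0; e < events; e++)
        {
            // With zero frames, the peak stays at negative infinity and the frame at -1.
            double best = double.NegativeInfinity;
            int bestFrame = -1;
            for (int t = 0; t < frames; t++)
            {
                if (probabilities[t, e] > best)
                {
                    best = probabilities[t, e];
                    bestFrame = t;
                }
            }
            peaks[eventNames[e]] = (best, bestFrame);
        }
        return peaks;
    }
}
```

The same loop structure can be adapted to other per-event summaries, such as the mean probability or the number of frames above a threshold.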

StartStreamingSession()

Starts a streaming session for real-time event detection.

IStreamingEventDetectionSession<T> StartStreamingSession()

Returns

IStreamingEventDetectionSession<T>

Streaming session for real-time detection.

StartStreamingSession(int, T)

Starts a streaming session for real-time event detection with custom settings.

IStreamingEventDetectionSession<T> StartStreamingSession(int sampleRate, T threshold)

Parameters

sampleRate int

Sample rate of incoming audio.

threshold T

Detection threshold.

Returns

IStreamingEventDetectionSession<T>

Streaming session for real-time detection.