Interface IDiagnosticsProvider

Namespace: AiDotNet.Interfaces

Assembly: AiDotNet.dll

Interface for components that provide diagnostic information for monitoring and debugging.

public interface IDiagnosticsProvider

Remarks

This interface enables neural network components (layers, networks, loss functions, etc.) to provide detailed diagnostic information about their internal state and behavior. This is particularly useful for:

Monitoring training progress
Debugging model behavior
Performance analysis and optimization
Understanding model decisions (explainability)

For Beginners: Think of this as a "health report" interface for neural network components.

Just like you might want to check various health metrics for your body (heart rate, blood pressure, etc.), you want to monitor various metrics for your neural network components during training and inference.

Real-world analogy: Imagine you're driving a car. Your dashboard shows:

Speed (how fast you're going)
RPM (engine revolutions)
Fuel level (remaining energy)
Temperature (engine heat)

Similarly, a neural network layer might report:

Activation statistics (min, max, mean values)
Gradient flow (how well training signals propagate)
Resource utilization (memory usage, computation time)
Layer-specific metrics (attention weights, expert usage, etc.)

This information helps you understand:

Is my model training properly?
Are there any bottlenecks or issues?
Which parts of the model are most active?
Is the model behaving as expected?

Industry Best Practices:

Consistent Keys: Use standardized key names across similar components
Meaningful Values: Provide human-readable string representations
Hierarchical Organization: Use prefixes to group related metrics (e.g., "activation.mean", "activation.std")
Efficient Computation: Diagnostics should be cheap to compute or cached
Optional Depth: Consider providing basic and detailed diagnostic modes
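
The practices above can be sketched as a small helper. This is an illustrative sketch, not part of AiDotNet: `DiagnosticKeys`, `DiagnosticFormat`, and the `detailed` flag are hypothetical names introduced here. It shows shared key constants with hierarchical prefixes, culture-invariant number formatting, and an optional detailed mode.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Demo: build a detailed report for a few sample activations.
double[] activations = { 0.0, 0.5, 1.0, 0.0 };
foreach (var entry in DiagnosticFormat.Describe(activations, detailed: true))
{
    Console.WriteLine($"{entry.Key}: {entry.Value}");
}

// Hypothetical shared key names, so that similar components report the same keys.
public static class DiagnosticKeys
{
    public const string ActivationMean = "activation.mean";
    public const string ActivationStd = "activation.std";
    public const string ActivationMin = "activation.min";
    public const string ActivationMax = "activation.max";
}

// Hypothetical formatting helper (an assumption, not an AiDotNet type).
public static class DiagnosticFormat
{
    // Invariant culture keeps "0.375" from becoming "0,375" on some machines.
    public static string Number(double value) =>
        value.ToString("0.###", CultureInfo.InvariantCulture);

    // Basic mode reports mean and standard deviation; detailed mode adds min and max.
    public static Dictionary<string, string> Describe(IReadOnlyList<double> values, bool detailed)
    {
        var result = new Dictionary<string, string>();
        if (values.Count == 0)
        {
            return result;
        }

        double mean = values.Average();
        double variance = values.Sum(v => (v - mean) * (v - mean)) / values.Count;

        result[DiagnosticKeys.ActivationMean] = Number(mean);
        result[DiagnosticKeys.ActivationStd] = Number(Math.Sqrt(variance));

        if (detailed)
        {
            result[DiagnosticKeys.ActivationMin] = Number(values.Min());
            result[DiagnosticKeys.ActivationMax] = Number(values.Max());
        }

        return result;
    }
}
```

Keeping the key strings in one place makes it harder for two layers to drift apart (for example "activation.mean" in one and "mean_activation" in another), which matters once metrics from many components are logged side by side.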

Implementation Example:

public class DenseLayer<T> : LayerBase<T>, IDiagnosticsProvider
{
    // Output of the most recent forward pass; null until the layer has run.
    private Tensor<T>? _lastActivations;

    public Dictionary<string, string> GetDiagnostics()
    {
        var diagnostics = new Dictionary<string, string>();

        // Activation statistics are only available after a forward pass.
        // ComputeMean, ComputeStd, and ComputeSparsity are helper methods not shown here.
        if (_lastActivations != null)
        {
            diagnostics["activation.mean"] = ComputeMean(_lastActivations).ToString();
            diagnostics["activation.std"] = ComputeStd(_lastActivations).ToString();
            diagnostics["activation.sparsity"] = ComputeSparsity(_lastActivations).ToString();
        }

        // Static properties can always be reported.
        diagnostics["parameter.count"] = ParameterCount.ToString();
        diagnostics["layer.type"] = "Dense";

        return diagnostics;
    }
}

// In monitoring code:
foreach (var layer in network.Layers)
{
    if (layer is IDiagnosticsProvider diagnosticLayer)
    {
        var metrics = diagnosticLayer.GetDiagnostics();
        LogMetrics(metrics);
    }
}

Methods

GetDiagnostics()

Gets diagnostic information about this component's state and behavior.

Dictionary<string, string> GetDiagnostics()

Returns

Dictionary<string, string>: A dictionary containing diagnostic metrics. Keys should be descriptive and use consistent naming conventions (e.g., "activation.mean", "gradient.norm"). Values should be human-readable string representations of the metrics.

Remarks

This method should return diagnostic information that is useful for understanding the component's current state. The specific metrics returned depend on the component type:

Layers: Activation statistics, gradient flow, sparsity, etc.
Networks: Aggregate metrics, layer-by-layer summaries
Loss Functions: Loss components, regularization terms
Optimizers: Learning rates, momentum values, update statistics
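
For the "Networks" case in the list above, one possible approach is to merge each child's diagnostics under a per-layer prefix and add a few aggregate entries. The following is a minimal sketch under stated assumptions: the interface is re-declared locally so the example compiles without AiDotNet, and `FakeLayer`, `SimpleNetwork`, the `layer{i}.` prefix scheme, and the `network.*` keys are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Demo: two diagnostic layers and one component that reports nothing.
var network = new SimpleNetwork(new object[]
{
    new FakeLayer("Dense", 128),
    "a layer without diagnostics",
    new FakeLayer("Dense", 10),
});
foreach (var entry in network.GetDiagnostics())
{
    Console.WriteLine($"{entry.Key}: {entry.Value}");
}

// Local stand-in matching the documented signature (assumption: lets the sketch compile alone).
public interface IDiagnosticsProvider
{
    Dictionary<string, string> GetDiagnostics();
}

// Hypothetical layer that reports two fixed metrics.
public sealed class FakeLayer : IDiagnosticsProvider
{
    private readonly string _type;
    private readonly int _parameterCount;

    public FakeLayer(string type, int parameterCount)
    {
        _type = type;
        _parameterCount = parameterCount;
    }

    public Dictionary<string, string> GetDiagnostics() => new Dictionary<string, string>
    {
        ["layer.type"] = _type,
        ["parameter.count"] = _parameterCount.ToString(CultureInfo.InvariantCulture),
    };
}

// Hypothetical network that summarizes its layers.
public sealed class SimpleNetwork : IDiagnosticsProvider
{
    private readonly IReadOnlyList<object> _layers;

    public SimpleNetwork(IReadOnlyList<object> layers) => _layers = layers;

    public Dictionary<string, string> GetDiagnostics()
    {
        var report = new Dictionary<string, string>();
        int providers = 0;

        for (int i = 0; i < _layers.Count; i++)
        {
            // Components that do not implement the interface are skipped.
            if (_layers[i] is IDiagnosticsProvider provider)
            {
                providers++;
                foreach (var entry in provider.GetDiagnostics())
                {
                    // Prefix each key with the layer index to keep entries distinct.
                    report[$"layer{i}.{entry.Key}"] = entry.Value;
                }
            }
        }

        report["network.layer_count"] = _layers.Count.ToString(CultureInfo.InvariantCulture);
        report["network.diagnostic_layer_count"] = providers.ToString(CultureInfo.InvariantCulture);
        return report;
    }
}
```

Prefixing by index is only one choice; a layer name would read better in logs if the component exposes one.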

For Beginners: This method returns a report card with various metrics.

The returned dictionary is like a set of labeled measurements:

Keys: What you're measuring (e.g., "activation.mean", "activation.sparsity")
Values: The measurement results as strings (e.g., "0.42", "0.85")

Example for a Dense layer:

{
    "activation.mean": "0.342",
    "activation.std": "0.156",
    "activation.min": "-0.82",
    "activation.max": "1.24",
    "activation.sparsity": "0.23",
    "gradient.norm": "0.042",
    "weights.norm": "15.6",
    "layer.output_size": "256"
}

You can use this information to:

Detect training issues: If a layer's activations are all zero, it is passing no signal forward (for example, ReLU units stuck at zero)
Tune hyperparameters: If gradient norms are very large or very small, consider lowering or raising the learning rate
Monitor convergence: Track metrics over time to see if training is progressing
Compare experiments: See how different configurations affect internal behavior
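
The first two uses above can be sketched as a simple check over the returned dictionary. This is only an illustration: `DiagnosticChecks` is a hypothetical helper, the thresholds are arbitrary placeholders rather than recommended values, and the keys are the ones from the Dense layer example above, which a given component may or may not report.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Demo: a report where every activation is zero but the gradient looks normal.
var sample = new Dictionary<string, string>
{
    ["activation.min"] = "0",
    ["activation.max"] = "0",
    ["gradient.norm"] = "0.042",
};
foreach (var warning in DiagnosticChecks.FindWarnings(sample))
{
    Console.WriteLine(warning);
}

// Hypothetical helper (an assumption, not an AiDotNet type).
public static class DiagnosticChecks
{
    // Illustrative thresholds only; suitable values depend on the model.
    private const double VanishingGradientNorm = 1e-7;
    private const double ExplodingGradientNorm = 1e3;

    public static List<string> FindWarnings(IReadOnlyDictionary<string, string> diagnostics)
    {
        var warnings = new List<string>();

        // Min and max both zero means every activation is zero.
        if (TryGetNumber(diagnostics, "activation.min", out double min) &&
            TryGetNumber(diagnostics, "activation.max", out double max) &&
            min == 0.0 && max == 0.0)
        {
            warnings.Add("All activations are zero.");
        }

        if (TryGetNumber(diagnostics, "gradient.norm", out double norm))
        {
            if (double.IsNaN(norm) || norm > ExplodingGradientNorm)
            {
                warnings.Add("Gradient norm is very large or NaN; consider a smaller learning rate.");
            }
            else if (norm < VanishingGradientNorm)
            {
                warnings.Add("Gradient norm is close to zero; training signal may not be reaching this layer.");
            }
        }

        return warnings;
    }

    // Missing keys and non-numeric values are ignored rather than treated as errors.
    private static bool TryGetNumber(
        IReadOnlyDictionary<string, string> diagnostics, string key, out double value)
    {
        value = 0.0;
        return diagnostics.TryGetValue(key, out var text) &&
               double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out value);
    }
}
```

Because values are strings, a consumer has to parse them back into numbers; values such as "85% sparse" would be skipped by this check, which is one argument for keeping numeric metrics in a plain numeric format.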

Common patterns:

// Log diagnostics periodically during training
if (epoch % 10 == 0)
{
    foreach (var layer in network.Layers)
    {
        if (layer is IDiagnosticsProvider diag)
        {
            Console.WriteLine($"Layer {layer.Name}:");
            foreach (var (key, value) in diag.GetDiagnostics())
            {
                Console.WriteLine($"  {key}: {value}");
            }
        }
    }
}

Performance Note: Diagnostic computation should be efficient. If expensive calculations are needed, consider caching results or computing them only when diagnostics are requested. Diagnostics should not significantly impact training performance.
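
One way to follow this note is to do no statistical work during the forward pass and to compute the report lazily, reusing it until new activations arrive. The sketch below assumes a simplified layer that works on `double[]`; `CachedDiagnosticsLayer`, `Forward`, and `ComputeCount` are hypothetical names, and the interface is re-declared locally so the example compiles without AiDotNet.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Demo: two diagnostic requests after one forward pass trigger one computation.
var layer = new CachedDiagnosticsLayer();
layer.Forward(new[] { 0.0, 2.0, 4.0 });
layer.GetDiagnostics();
layer.GetDiagnostics(); // served from the cache
Console.WriteLine(layer.ComputeCount); // prints 1

// Local stand-in matching the documented signature (assumption: lets the sketch compile alone).
public interface IDiagnosticsProvider
{
    Dictionary<string, string> GetDiagnostics();
}

// Hypothetical layer that caches its diagnostic report.
public sealed class CachedDiagnosticsLayer : IDiagnosticsProvider
{
    private double[] _lastActivations = Array.Empty<double>();
    private Dictionary<string, string> _cached = new Dictionary<string, string>();
    private bool _isStale = true;

    // Counts how often statistics were actually recomputed (for demonstration only).
    public int ComputeCount { get; private set; }

    // The hot path only stores a reference and marks the cache as stale.
    public void Forward(double[] activations)
    {
        _lastActivations = activations;
        _isStale = true;
    }

    public Dictionary<string, string> GetDiagnostics()
    {
        if (_isStale)
        {
            ComputeCount++;
            _cached = new Dictionary<string, string>();

            if (_lastActivations.Length > 0)
            {
                double mean = _lastActivations.Average();
                double sparsity = _lastActivations.Count(v => v == 0.0) / (double)_lastActivations.Length;

                _cached["activation.mean"] = mean.ToString("0.###", CultureInfo.InvariantCulture);
                _cached["activation.sparsity"] = sparsity.ToString("0.###", CultureInfo.InvariantCulture);
            }

            _isStale = false;
        }

        // Return a copy so callers cannot modify the cached report.
        return new Dictionary<string, string>(_cached);
    }
}
```

The trade-off is that the layer keeps a reference to its last activations between calls; if that memory matters, a component could instead accumulate running statistics during the forward pass.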
