Class BasicStats<T>

Namespace
AiDotNet.Statistics
Assembly
AiDotNet.dll

Provides a collection of basic statistical measures for a set of numeric values.

public class BasicStats<T>

Type Parameters

T

The numeric type used for calculations (e.g., float, double).

Inheritance
BasicStats<T>

Remarks

BasicStats calculates and stores a comprehensive set of descriptive statistics for a collection of values, including measures of central tendency, dispersion, and distribution shape. These statistics provide insights into the characteristics of the data distribution.

For Beginners: Think of BasicStats as a calculator that analyzes a set of numbers and tells you their important patterns and characteristics.

It answers questions like:

  • What's the typical value? (Mean, Median)
  • How spread out are the values? (Variance, StandardDeviation)
  • What's the range of values? (Min, Max, InterquartileRange)
  • Is the distribution skewed or symmetric? (Skewness)
  • Are there unusual extreme values? (Kurtosis)

For example, if you have test scores from a class:

  • Mean tells you the average score
  • StandardDeviation tells you how much scores vary from that average
  • Skewness might reveal if more students scored above or below average

These statistics help you understand your data at a glance without having to examine every value.

Properties

FirstQuartile

Gets the first quartile (25th percentile) of the dataset.

public T FirstQuartile { get; }

Property Value

T

Remarks

The first quartile is the value below which 25% of the observations in the dataset fall. It represents the median of the lower half of the data.

For Beginners: The first quartile is the value that 25% of your data falls below.

If you divide your sorted data into four equal parts, the first quartile is the value at the boundary of the first and second parts.

For example, in a class of 20 students, the first quartile test score is the score that 5 students scored below and 15 students scored above.
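Several conventions exist for computing quartiles on small datasets, and this page does not say which one BasicStats uses. The sketch below assumes linear interpolation between the two closest ranks, a common choice. The names QuartileSketch and Percentile are hypothetical helpers for illustration and are not part of AiDotNet.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class QuartileSketch
{
    // Percentile by linear interpolation between the two closest ranks.
    // Assumption: BasicStats<T> may use a different quartile convention.
    public static double Percentile(double[] values, double p)
    {
        if (values.Length == 0)
            throw new ArgumentException("values must not be empty");
        if (p < 0.0 || p > 1.0)
            throw new ArgumentOutOfRangeException(nameof(p), "p must be in [0, 1]");

        double[] sorted = values.OrderBy(v => v).ToArray();

        // Fractional index into the sorted data.
        double position = p * (sorted.Length - 1);
        int lower = (int)Math.Floor(position);
        int upper = (int)Math.Ceiling(position);
        double fraction = position - lower;

        return sorted[lower] + fraction * (sorted[upper] - sorted[lower]);
    }
}
```

With this convention, the values 1 through 9 give a first quartile of 3 (Percentile(values, 0.25)) and a third quartile of 7 (Percentile(values, 0.75)). Other conventions can give slightly different results on the same data.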

InterquartileRange

Gets the interquartile range (IQR) of the dataset.

public T InterquartileRange { get; }

Property Value

T

Remarks

The interquartile range is the difference between the third and first quartiles. It measures the width of the interval that contains the middle 50% of the data, and it is a robust measure of dispersion that is less sensitive to outliers than the range or standard deviation.

For Beginners: The interquartile range (IQR) measures the spread of the middle 50% of your data.

It's calculated as: IQR = Third Quartile - First Quartile

The IQR is useful because:

  • It ignores the extreme values (potential outliers)
  • It gives you the range where the "typical" values fall
  • It's used to identify outliers (values more than 1.5 × IQR below the first quartile or above the third quartile)

For example, if test scores have an IQR of 15 points, it means the middle 50% of students' scores span a 15-point range.
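The 1.5 × IQR outlier rule can be sketched as follows. This is an illustration of the general rule, not AiDotNet code: IqrSketch and Outliers are hypothetical names, and the quartiles are passed in so the sketch does not depend on any particular quartile convention.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class IqrSketch
{
    // Returns the values that lie outside [q1 - 1.5 * IQR, q3 + 1.5 * IQR].
    public static double[] Outliers(double[] values, double q1, double q3)
    {
        double iqr = q3 - q1;
        double lowerFence = q1 - 1.5 * iqr;
        double upperFence = q3 + 1.5 * iqr;
        return values.Where(v => v < lowerFence || v > upperFence).ToArray();
    }
}
```

For test scores with a first quartile of 70 and a third quartile of 85, the IQR is 15, so the fences sit at 47.5 and 107.5. Calling Outliers on the scores [45, 72, 80, 88, 110] with those quartiles returns 45 and 110.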

Kurtosis

Gets the kurtosis of the dataset, a measure of how heavy the tails of the distribution are.

public T Kurtosis { get; }

Property Value

T

MAD

Gets the median absolute deviation (MAD) of the dataset.

public T MAD { get; }

Property Value

T

Remarks

The median absolute deviation is the median of the absolute deviations from the data's median. It is a robust measure of variability that is less sensitive to outliers than the standard deviation.

For Beginners: MAD is another way to measure spread that's less affected by outliers.

To calculate MAD:

  1. Find the median of your data
  2. Calculate how far each value is from the median (absolute deviation)
  3. Find the median of those distances

MAD is useful when your data has outliers that might skew other measures like standard deviation. Think of it as measuring the "typical" distance from the center, ignoring extreme values.
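The three steps above can be sketched as follows. MadSketch and its methods are hypothetical names for illustration and are not part of AiDotNet. The sketch returns the unscaled MAD; whether BasicStats applies a scaling factor is not stated on this page.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class MadSketch
{
    // Median of a non-empty array: middle value, or average of the two middle values.
    public static double Median(double[] values)
    {
        if (values.Length == 0)
            throw new ArgumentException("values must not be empty");
        double[] sorted = values.OrderBy(v => v).ToArray();
        int mid = sorted.Length / 2;
        return sorted.Length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Unscaled median absolute deviation.
    public static double Mad(double[] values)
    {
        double median = Median(values);                      // Step 1: median of the data
        double[] deviations = values
            .Select(v => Math.Abs(v - median))               // Step 2: distance from the median
            .ToArray();
        return Median(deviations);                           // Step 3: median of the distances
    }
}
```

For [1, 2, 3, 4, 100], the median is 3, the distances are [2, 1, 0, 1, 97], and the MAD is 1. The extreme value 100 has no effect on the result.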

Max

Gets the maximum value in the dataset.

public T Max { get; }

Property Value

T

Remarks

The maximum is the largest value in the dataset. It represents the upper bound of the data range.

For Beginners: The maximum is simply the largest number in your data.

For the numbers [5, 12, 3, 8, 9], the maximum is 12.

Together with the minimum, it tells you the full range of your data.

Mean

Gets the arithmetic mean (average) of the values.

public T Mean { get; }

Property Value

T

Remarks

The mean is calculated by summing all values and dividing by the number of values. It represents the central tendency of the data, but can be sensitive to outliers.

For Beginners: The mean is the average value - add up all the numbers and divide by how many there are.

For example, for the numbers [2, 4, 6, 8, 10]:

  • Sum: 2 + 4 + 6 + 8 + 10 = 30
  • Count: 5
  • Mean: 30 ÷ 5 = 6

The mean gives you the "center" of your data, but can be pulled in the direction of very large or small values.
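A minimal sketch of the calculation, which also shows how a single large value pulls the mean. MeanSketch is a hypothetical helper for illustration and is not part of AiDotNet.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class MeanSketch
{
    // Arithmetic mean: sum of the values divided by their count.
    public static double Mean(double[] values)
    {
        if (values.Length == 0)
            throw new ArgumentException("values must not be empty");
        return values.Sum() / values.Length;
    }
}
```

Mean applied to [2, 4, 6, 8, 10] gives 6. Replacing the 10 with 100 gives [2, 4, 6, 8, 100], whose mean is 24, even though four of the five values are unchanged.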

Median

Gets the median value of the dataset.

public T Median { get; }

Property Value

T

Remarks

The median is the middle value when the data is sorted in ascending order. For an even number of values, it is the average of the two middle values. The median is less sensitive to outliers than the mean.

For Beginners: The median is the middle value when you arrange all numbers in order.

To find the median:

  1. Sort all numbers from smallest to largest
  2. If there's an odd number of values, take the middle one
  3. If there's an even number, take the average of the two middle values

For example:

  • For [3, 5, 8, 9, 12] (odd count), the median is 8
  • For [3, 5, 8, 9, 12, 15] (even count), the median is (8 + 9) ÷ 2 = 8.5

The median is often better than the mean for describing "typical" values when your data has outliers.
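The procedure above can be sketched as follows. MedianSketch is a hypothetical helper for illustration and is not part of AiDotNet.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class MedianSketch
{
    public static double Median(double[] values)
    {
        if (values.Length == 0)
            throw new ArgumentException("values must not be empty");

        // Step 1: sort from smallest to largest.
        double[] sorted = values.OrderBy(v => v).ToArray();
        int mid = sorted.Length / 2;

        // Step 2: odd count, take the middle value.
        if (sorted.Length % 2 == 1)
            return sorted[mid];

        // Step 3: even count, average the two middle values.
        return (sorted[mid - 1] + sorted[mid]) / 2.0;
    }
}
```

Median returns 8 for [3, 5, 8, 9, 12] and 8.5 for [3, 5, 8, 9, 12, 15]. It also returns 8 for [3, 5, 8, 9, 120], which shows why the median is a good choice when the data contains outliers.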

Min

Gets the minimum value in the dataset.

public T Min { get; }

Property Value

T

Remarks

The minimum is the smallest value in the dataset. It represents the lower bound of the data range.

For Beginners: The minimum is simply the smallest number in your data.

For the numbers [5, 12, 3, 8, 9], the minimum is 3.

It's useful to know the extreme values in your data, especially when looking for outliers or setting valid ranges.

N

Gets the number of values in the dataset.

public int N { get; }

Property Value

int

Remarks

N represents the count of values in the dataset. It is used in various statistical calculations and indicates the sample size.

For Beginners: N is just the count of how many numbers are in your data.

For the numbers [5, 12, 3, 8, 9], N is 5.

The sample size is important because larger samples generally give more reliable statistics.

Skewness

Gets the skewness of the dataset, a measure of the asymmetry of the distribution.

public T Skewness { get; }

Property Value

T

StandardDeviation

Gets the standard deviation of the values, a measure of dispersion.

public T StandardDeviation { get; }

Property Value

T

Remarks

Standard deviation is the square root of the variance. It measures the amount of variation or dispersion in the dataset. It is in the same units as the original data, making it more interpretable than variance.

For Beginners: Standard deviation is the most common way to measure how spread out your data is.

It's the square root of the variance, which puts it back in the same units as your original numbers.

For example:

  • Low standard deviation: Most values are close to the average
  • High standard deviation: Values are widely spread from the average

If you're looking at test scores with a mean of 75 and a standard deviation of 5, and the scores are roughly bell-shaped, about two-thirds of them fall between 70 and 80.
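A minimal sketch of the calculation, assuming the population form (dividing by the number of values). This page does not say whether BasicStats divides by N or by N - 1. SpreadSketch and its methods are hypothetical helpers for illustration and are not part of AiDotNet.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class SpreadSketch
{
    // Square root of the average squared distance from the mean.
    // Assumption: population form (divide by the count, not count - 1).
    public static double PopulationStandardDeviation(double[] values)
    {
        if (values.Length == 0)
            throw new ArgumentException("values must not be empty");
        double mean = values.Average();
        double variance = values.Select(v => (v - mean) * (v - mean)).Average();
        return Math.Sqrt(variance);
    }

    // Counts the values that lie within one standard deviation of the mean.
    public static int CountWithinOneSd(double[] values)
    {
        double mean = values.Average();
        double sd = PopulationStandardDeviation(values);
        return values.Count(v => Math.Abs(v - mean) <= sd);
    }
}
```

For [2, 4, 4, 4, 5, 5, 7, 9], the mean is 5 and the standard deviation is 2, so 6 of the 8 values fall between 3 and 7.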

ThirdQuartile

Gets the third quartile (75th percentile) of the dataset.

public T ThirdQuartile { get; }

Property Value

T

Remarks

The third quartile is the value below which 75% of the observations in the dataset fall. It represents the median of the upper half of the data.

For Beginners: The third quartile is the value that 75% of your data falls below.

If you divide your sorted data into four equal parts, the third quartile is the value at the boundary of the third and fourth parts.

For example, in a class of 20 students, the third quartile test score is the score that 15 students scored below and 5 students scored above.

Variance

Gets the variance of the values, a measure of dispersion.

public T Variance { get; }

Property Value

T

Remarks

Variance measures how far the values in the dataset are spread around the mean. It is calculated as the average of the squared differences from the mean. Larger variance indicates greater dispersion in the data.

For Beginners: Variance measures how spread out the numbers are from the average.

To calculate variance:

  1. Find how far each number is from the mean
  2. Square each of those differences (to make everything positive)
  3. Find the average of those squared differences

Higher variance means values are more spread out; lower variance means they're more clustered around the mean.
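The steps above can be sketched as follows. The description on this page matches the population variance (divide by the count). Many libraries instead report the sample variance (divide by the count minus one), and this page does not say which one BasicStats uses, so the sketch offers both. VarianceSketch is a hypothetical helper for illustration and is not part of AiDotNet.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of AiDotNet.
public static class VarianceSketch
{
    // sample = false: divide by N (population variance, as described above).
    // sample = true:  divide by N - 1 (sample variance).
    public static double Variance(double[] values, bool sample = false)
    {
        int divisor = sample ? values.Length - 1 : values.Length;
        if (divisor <= 0)
            throw new ArgumentException("not enough values");

        double mean = values.Average();

        // Steps 1 and 2: distance from the mean, squared.
        double sumOfSquares = values.Sum(v => (v - mean) * (v - mean));

        // Step 3: average of the squared distances.
        return sumOfSquares / divisor;
    }
}
```

For [2, 4, 6, 8, 10], the mean is 6 and the squared differences are 16, 4, 0, 4, and 16, which sum to 40. The population variance is 40 ÷ 5 = 8, and the sample variance is 40 ÷ 4 = 10.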

Methods

Empty()

Creates an empty BasicStats object with all statistics set to their default values.

public static BasicStats<T> Empty()

Returns

BasicStats<T>

An empty BasicStats object.

Remarks

This factory method creates and returns a BasicStats object with no input values. All statistics are initialized to their default values (typically zero). This is useful when you need to initialize a statistics object before data is available.

For Beginners: This creates an empty statistics object with no data.

Use this method when:

  • You need a placeholder statistics object to fill in later
  • You're initializing a statistics object but don't have the data yet
  • You need a "zero" baseline to compare against

All the statistics in this empty object will be set to zero or their equivalent default values.

GetMetric(MetricType)

Gets the value of a specific metric based on the provided MetricType.

public T GetMetric(MetricType metricType)

Parameters

metricType MetricType

The type of metric to retrieve.

Returns

T

The value of the requested metric.

Remarks

This method allows you to retrieve any of the calculated statistics by specifying the desired metric type. It provides a flexible way to access individual metrics without needing separate properties for each.

For Beginners: This method is like a vending machine for statistics.

You tell it which statistic you want (using the MetricType), and it gives you the value. For example:

  • If you ask for MetricType.Mean, it gives you the average
  • If you ask for MetricType.StandardDeviation, it gives you the standard deviation

This is useful when you want to work with different statistics in a flexible way, especially if you don't know in advance which statistic you'll need.
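The lookup-by-enum pattern can be illustrated with a small stand-in. DemoMetric and DemoStats below are hypothetical types written for this sketch. They are not the AiDotNet MetricType or BasicStats types, and the way the sketch handles a missing metric (throwing ArgumentException) is an assumption, not documented library behavior.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration only; these are NOT the AiDotNet types.
public enum DemoMetric { Mean, Max, Kurtosis }

public sealed class DemoStats
{
    private readonly Dictionary<DemoMetric, double> _values =
        new Dictionary<DemoMetric, double>();

    public DemoStats(double mean, double max)
    {
        _values[DemoMetric.Mean] = mean;
        _values[DemoMetric.Max] = max;
        // Kurtosis is deliberately left out to show a missing metric.
    }

    // True when a value was stored for the requested metric.
    public bool HasMetric(DemoMetric metric) => _values.ContainsKey(metric);

    // Returns the stored value, or throws if the metric is missing (an assumption).
    public double GetMetric(DemoMetric metric) =>
        _values.TryGetValue(metric, out double value)
            ? value
            : throw new ArgumentException($"Metric {metric} is not available.");
}
```

With var stats = new DemoStats(6.0, 10.0), stats.GetMetric(DemoMetric.Mean) returns 6, and stats.HasMetric(DemoMetric.Kurtosis) returns false. The caller chooses the metric at run time by passing an enum value instead of naming a property in code.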

HasMetric(MetricType)

Checks if a specific metric is available in this BasicStats instance.

public bool HasMetric(MetricType metricType)

Parameters

metricType MetricType

The type of metric to check for.

Returns

bool

True if the metric is available, false otherwise.

Remarks

For Beginners: This method allows you to check if a particular metric is available before trying to get its value. It's useful when you're not sure if a specific metric was calculated for this set of basic stats.

For example:

if (stats.HasMetric(MetricType.Mean))
{
    var meanValue = stats.GetMetric(MetricType.Mean);
    // Use meanValue...
}

This prevents errors that might occur if you try to access a metric that wasn't calculated.