Class ResidualAnalysisFitDetectorOptions
Configuration options for the Residual Analysis Fit Detector, which evaluates model fit quality by comparing statistics of the prediction residuals (their mean, their standard deviation, MAPE, and R²) against configurable thresholds.
public class ResidualAnalysisFitDetectorOptions
Inheritance
- ResidualAnalysisFitDetectorOptions
Remarks
Residual analysis is a critical technique in regression modeling that examines the differences between observed values and predicted values (residuals) to assess model fit quality. This class provides configuration options for threshold values used to determine whether a model's residuals indicate a good fit. The detector evaluates several statistical measures including the mean of residuals, standard deviation of residuals, Mean Absolute Percentage Error (MAPE), and the coefficient of determination (R²). By adjusting these thresholds, users can control how strictly the detector evaluates model fit according to their specific requirements and domain knowledge.
For Beginners: This class helps you decide if your prediction model is doing a good job.
When a model makes predictions, it's rarely perfect. The differences between what your model predicted and the actual values are called "residuals." Analyzing these residuals helps determine if your model is working well.
Think of it like this:
- You have a weather app that predicts temperatures
- Some days it predicts 75°F when the actual temperature is 73°F (residual of 73 - 75 = -2°F)
- Other days it predicts 68°F when the actual temperature is 72°F (residual of 72 - 68 = +4°F)
- By analyzing all these differences, you can tell if your model is reliable
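The weather example can be sketched in code. This is a minimal illustration, not the library's implementation: the class name `ResidualExample` and the method `Residuals` are hypothetical, and the sign convention (actual minus predicted) is taken from the two bullets above.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of the documented library.
public static class ResidualExample
{
    // Residual = actual - predicted, matching the signs in the weather example.
    public static double[] Residuals(double[] actual, double[] predicted)
    {
        if (actual.Length != predicted.Length)
            throw new ArgumentException("Arrays must have the same length.");

        // Pair each actual value with its prediction and subtract.
        return actual.Zip(predicted, (a, p) => a - p).ToArray();
    }
}
```

Calling `ResidualExample.Residuals(new[] { 73.0, 72.0 }, new[] { 75.0, 68.0 })` yields the residuals -2 and 4 from the example.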
This class lets you set thresholds for different statistical measures:
- How close the average residual should be to zero
- How consistent the residuals should be (not too scattered)
- How small the percentage errors should be
- How much of the data variation your model explains
If your model's residuals stay within these thresholds, it passes the "fit test" and is considered reliable for making predictions.
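As a rough sketch of how these thresholds might be set, the fragment below tightens the three thresholds whose defaults are documented on this page. It assumes the class has a public parameterless constructor and that its namespace is already imported; neither is stated here.

```csharp
// Assumption: the library's namespace is imported and the class can be
// constructed without arguments. Only properties documented on this page are used.
var options = new ResidualAnalysisFitDetectorOptions
{
    MeanThreshold = 0.05, // stricter than the documented default of 0.1
    StdThreshold = 0.1,   // stricter than the documented default of 0.2
    MapeThreshold = 0.05  // stricter than the documented default of 0.1
    // R2Threshold is left untouched; this page does not document its default.
};
```

How the options object is then passed to the detector is not covered on this page.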
Properties
MapeThreshold
Gets or sets the threshold for the Mean Absolute Percentage Error (MAPE).
public double MapeThreshold { get; set; }
Property Value
- double
A double value between 0 and 1, defaulting to 0.1 (representing 10%).
Remarks
The Mean Absolute Percentage Error (MAPE) measures the average of the absolute percentage errors between predicted and actual values. It expresses accuracy as a percentage, making it scale-independent and thus useful for comparing model performance across different datasets. This threshold determines the maximum acceptable MAPE for the model to be considered well-fitted. A smaller threshold enforces a stricter requirement for percentage accuracy, while a larger threshold allows larger percentage errors. The default value of 0.1 (representing 10%) provides a moderate constraint that is suitable for many applications but may need adjustment based on the specific domain and accuracy requirements.
For Beginners: This setting controls how large your model's percentage errors can be.
MAPE (Mean Absolute Percentage Error) measures errors as percentages rather than absolute values:
- It calculates: |actual - predicted| / |actual| for each prediction
- Then takes the average of these percentage errors
- This makes it easier to understand errors across different scales
The MapeThreshold value (default 0.1) means:
- The average percentage error should be no more than 10%
- Lower values (like 0.05): Stricter requirement, predictions must be within 5% on average
- Higher values (like 0.2): More lenient, allowing predictions to be off by 20% on average
For example, if predicting house prices:
- A $200,000 house predicted as $220,000 has a 10% error
- A $500,000 house predicted as $550,000 has a 10% error
- With the default threshold of 0.1, these would be right at the acceptable limit
MAPE is especially useful when your data spans different scales, as it normalizes errors to percentages rather than absolute values.
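The calculation described above can be sketched as follows. The names `MapeExample`, `Mape`, and `PassesMape` are hypothetical, and the comparison `mape <= threshold` is an assumption about what "within the threshold" means; the detector's actual comparison is not documented here. Note that MAPE is undefined when any actual value is zero, so the sketch rejects that case.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of the documented library.
public static class MapeExample
{
    // Mean of |actual - predicted| / |actual|, returned as a fraction (0.1 = 10%).
    public static double Mape(double[] actual, double[] predicted)
    {
        if (actual.Length == 0 || actual.Length != predicted.Length)
            throw new ArgumentException("Arrays must be non-empty and of equal length.");
        if (actual.Any(a => a == 0.0))
            throw new ArgumentException("MAPE is undefined when an actual value is zero.");

        return actual.Zip(predicted, (a, p) => Math.Abs(a - p) / Math.Abs(a)).Average();
    }

    // Assumption: a model passes when its MAPE does not exceed the threshold.
    public static bool PassesMape(double mape, double mapeThreshold) => mape <= mapeThreshold;
}
```

For the house-price example, `MapeExample.Mape(new[] { 200000.0, 500000.0 }, new[] { 220000.0, 550000.0 })` gives a MAPE of 0.1 (10%), which sits at the default limit.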
MeanThreshold
Gets or sets the threshold for the mean (average) of residuals.
public double MeanThreshold { get; set; }
Property Value
- double
A double value between 0 and 1, defaulting to 0.1.
Remarks
The mean of residuals measures the average difference between predicted and actual values. In an ideal model, this value should be close to zero, indicating that the model does not systematically overpredict or underpredict. This threshold determines how close to zero the mean residual must be for the model to be considered well-fitted. A smaller threshold enforces a stricter requirement for the model to have balanced residuals, while a larger threshold allows more systematic bias in the predictions. The default value of 0.1 provides a moderate constraint that is suitable for many applications but may need adjustment based on the specific domain and data characteristics.
For Beginners: This setting controls how close the average error should be to zero.
In an ideal prediction model:
- Some predictions are too high (negative residuals, since residual = actual - predicted)
- Some predictions are too low (positive residuals)
- These errors should balance out, with an average close to zero
The MeanThreshold value (default 0.1) determines how close to zero this average must be:
- Lower values (like 0.05): Stricter requirement, the model must have very balanced errors
- Higher values (like 0.2): More lenient, allowing some systematic bias in predictions
For example, if your model consistently predicts temperatures that are 2 degrees too high, its mean residual would be -2 (a magnitude of 2). With the default threshold of 0.1, this might be considered too biased, depending on the scale of your data.
Adjust this threshold based on how important it is that your model doesn't consistently over-predict or under-predict values.
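A minimal sketch of the mean-residual check follows. It uses residual = actual - predicted, as in the class-level weather example, and assumes the magnitude of the mean is what gets compared to the threshold. The names `MeanResidualExample`, `MeanResidual`, and `PassesMean` are hypothetical, and whether the library rescales residuals before comparing is not stated on this page.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of the documented library.
public static class MeanResidualExample
{
    // Average of (actual - predicted) over all observations.
    public static double MeanResidual(double[] actual, double[] predicted)
    {
        if (actual.Length == 0 || actual.Length != predicted.Length)
            throw new ArgumentException("Arrays must be non-empty and of equal length.");

        return actual.Zip(predicted, (a, p) => a - p).Average();
    }

    // Assumption: the check is on the magnitude of the mean, so bias in
    // either direction counts against the model.
    public static bool PassesMean(double meanResidual, double meanThreshold) =>
        Math.Abs(meanResidual) <= meanThreshold;
}
```

A model that always predicts 2 degrees too high, for example actual values 70, 72, 68 against predictions 72, 74, 70, has a mean residual of -2 under this convention and would not pass a threshold of 0.1 on raw values.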
R2Threshold
Gets or sets the threshold for the coefficient of determination (R²).
public double R2Threshold { get; set; }
Property Value
- double
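The coefficient of determination (R²) named in the class remarks is conventionally computed as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below shows that standard formula only; the names `R2Example` and `RSquared` are hypothetical, and how the detector compares the result with R2Threshold is not documented on this page.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of the documented library.
public static class R2Example
{
    // R² = 1 - SS_res / SS_tot, the share of variation in the actual values
    // that the predictions account for.
    public static double RSquared(double[] actual, double[] predicted)
    {
        if (actual.Length == 0 || actual.Length != predicted.Length)
            throw new ArgumentException("Arrays must be non-empty and of equal length.");

        double mean = actual.Average();
        double ssRes = actual.Zip(predicted, (a, p) => (a - p) * (a - p)).Sum();
        double ssTot = actual.Sum(a => (a - mean) * (a - mean));

        if (ssTot == 0.0)
            throw new ArgumentException("R² is undefined when all actual values are identical.");

        return 1.0 - ssRes / ssTot;
    }
}
```

Perfect predictions give an R² of 1, while always predicting the mean of the actual values gives 0.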
StdThreshold
Gets or sets the threshold for the standard deviation of residuals.
public double StdThreshold { get; set; }
Property Value
- double
A double value between 0 and 1, defaulting to 0.2.
Remarks
The standard deviation of residuals measures how widely the residuals are dispersed from their mean. A lower standard deviation indicates that the residuals are clustered more tightly around the mean, suggesting more consistent prediction errors. This threshold determines the maximum acceptable standard deviation for the model to be considered well-fitted. A smaller threshold enforces a stricter requirement for consistent prediction errors, while a larger threshold allows more variability in the residuals. The default value of 0.2 provides a moderate constraint that is suitable for many applications but may need adjustment based on the specific domain and data characteristics.
For Beginners: This setting controls how consistent your model's errors should be.
Standard deviation measures how scattered or spread out your errors are:
- Low standard deviation: Errors are consistently similar in size
- High standard deviation: Some errors are very small while others are very large
The StdThreshold value (default 0.2) determines how consistent these errors must be:
- Lower values (like 0.1): Stricter requirement, errors must be very consistent
- Higher values (like 0.3): More lenient, allowing some predictions to be much further off
For example, if your weather model is usually within 1-2 degrees but occasionally off by 10 degrees, it would have a high standard deviation of residuals. With the default threshold of 0.2, this might be considered too inconsistent.
Adjust this threshold based on how important it is that your model makes consistently reliable predictions versus occasionally having larger errors.
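The spread of the residuals can be sketched as below. This version computes the population standard deviation (dividing by n); whether the library divides by n or n - 1, and whether it rescales residuals first, is not stated on this page. The names `StdResidualExample` and `StdDev` are hypothetical.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration; not part of the documented library.
public static class StdResidualExample
{
    // Population standard deviation of the residuals (divides by n).
    public static double StdDev(double[] residuals)
    {
        if (residuals.Length == 0)
            throw new ArgumentException("At least one residual is required.");

        double mean = residuals.Average();
        double variance = residuals.Sum(r => (r - mean) * (r - mean)) / residuals.Length;
        return Math.Sqrt(variance);
    }
}
```

For residuals of 1, -1, 2, -2, and 10, which resemble the weather model that is usually close but occasionally far off, this gives a standard deviation of about 4.24, well above a threshold of 0.2 on raw values.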