Class OrthogonalRegressionOptions<T>
Configuration options for Orthogonal Regression (also known as Total Least Squares), which minimizes the perpendicular distances from data points to the fitted model, accounting for errors in both dependent and independent variables.
public class OrthogonalRegressionOptions<T> : RegressionOptions<T>
Type Parameters
T
The numeric data type used for calculations, typically float or double.
- Inheritance
RegressionOptions<T>
OrthogonalRegressionOptions<T>
Remarks
Orthogonal Regression differs from standard regression techniques by considering measurement errors in both the predictor (independent) and response (dependent) variables. While ordinary least squares regression minimizes vertical distances from points to the regression line, orthogonal regression minimizes perpendicular distances, making it more appropriate when both variables contain measurement error or uncertainty. This approach is particularly valuable in fields like physics, chemistry, and engineering where measurement instruments may introduce errors in all variables. The algorithm typically employs singular value decomposition or iterative methods to find the optimal solution.
For Beginners: Orthogonal Regression is a special type of regression that treats all variables fairly when finding patterns.
In standard regression:
- We assume that only the y-variable (what we're predicting) contains errors
- We minimize the vertical distances from points to the line
Imagine measuring the heights and weights of people:
- Standard regression assumes heights are measured perfectly, only weights have errors
- Orthogonal regression recognizes that both height AND weight measurements have errors
This matters because:
- When both variables have measurement errors, standard regression can give biased results
- Orthogonal regression fits a line that's "fair" to both variables
- The line minimizes the perpendicular distance from points to the line, not just vertical distance
This technique is especially useful in scientific applications where:
- All measurements come from instruments with known error rates
- We're looking for true physical relationships rather than just predictions
- The variables play symmetrical roles rather than strictly "input" and "output"
This class lets you configure how the orthogonal regression algorithm works, controlling its precision, computational limits, and data preprocessing.
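To make the difference concrete, the sketch below fits a single-predictor line both ways: the ordinary least squares slope (vertical distances) and the closed-form total least squares slope (perpendicular distances). This is an illustrative, self-contained derivation with made-up sample data, not the library's internal algorithm:

using System;
using System.Linq;

class OrthogonalFitDemo
{
    static void Main()
    {
        // Made-up sample where both variables carry measurement error.
        double[] x = { 1.0, 2.1, 2.9, 4.2, 5.0 };
        double[] y = { 2.2, 3.9, 6.1, 8.0, 9.8 };

        double xBar = x.Average(), yBar = y.Average();
        double sxx = x.Sum(v => (v - xBar) * (v - xBar));
        double syy = y.Sum(v => (v - yBar) * (v - yBar));
        double sxy = x.Zip(y, (a, b) => (a - xBar) * (b - yBar)).Sum();

        // Ordinary least squares: minimizes vertical distances.
        double olsSlope = sxy / sxx;

        // Total least squares (orthogonal) slope, closed form for one predictor:
        // minimizes perpendicular distances, so it sits slightly steeper here.
        double tlsSlope = (syy - sxx + Math.Sqrt((syy - sxx) * (syy - sxx) + 4 * sxy * sxy))
                          / (2 * sxy);
        double tlsIntercept = yBar - tlsSlope * xBar;

        Console.WriteLine($"OLS slope: {olsSlope:F4}");
        Console.WriteLine($"TLS slope: {tlsSlope:F4}, intercept: {tlsIntercept:F4}");
    }
}

Both slopes agree on noiseless data; the gap between them grows as the predictor's measurement error grows, which is exactly the bias orthogonal regression avoids.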
Properties
DecompositionType
Gets or sets the matrix decomposition type to use when solving the linear system.
public MatrixDecompositionType DecompositionType { get; set; }
Property Value
- MatrixDecompositionType
The matrix decomposition type, defaulting to SVD (singular value decomposition).
Remarks
The decomposition type determines how the system of linear equations is solved during optimization. SVD (Singular Value Decomposition) is particularly well-suited for orthogonal regression as it naturally handles the total least squares formulation.
For Beginners: This setting controls the mathematical method used to solve equations during model fitting. The default SVD method is ideal for orthogonal regression as it handles measurement errors in all variables properly.
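For example, a minimal sketch that sets the decomposition explicitly (the enum member name Svd is an assumption; use whichever member your version of MatrixDecompositionType defines):

var options = new OrthogonalRegressionOptions<double>
{
    // SVD solves the total least squares system directly,
    // which is why it is the default for this model.
    DecompositionType = MatrixDecompositionType.Svd
};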
MaxIterations
Gets or sets the maximum number of iterations allowed for the optimization algorithm.
public int MaxIterations { get; set; }
Property Value
- int
The maximum number of iterations, defaulting to 100.
Remarks
This parameter sets an upper limit on how many iterations the optimization algorithm will perform when fitting the orthogonal regression model. Each iteration refines the model parameters to better minimize the sum of squared perpendicular distances. The algorithm may terminate earlier if convergence is achieved based on the tolerance value. Orthogonal regression often requires an iterative approach, especially for nonlinear models. The appropriate number of iterations depends on the complexity of the relationship, the number of data points, and the initial parameter values.
For Beginners: This setting limits how many attempts the algorithm makes to improve its solution before stopping.
The default value of 100 means:
- The algorithm will make at most 100 attempts to refine the solution
- It might stop earlier if it reaches the desired precision (set by Tolerance)
Imagine polishing a surface:
- Each iteration is like one pass with the polishing cloth
- You want enough passes to get a good finish, but not waste time after it's already smooth
You might want more iterations (like 500) if:
- You're working with complex relationships
- You notice the algorithm is still improving significantly at 100 iterations
- You have a very strict tolerance setting
You might want fewer iterations (like 50) if:
- You need faster results
- Your data is well-behaved and converges quickly
- You're doing preliminary analysis
This setting works together with Tolerance - the algorithm stops when either:
- It reaches the maximum number of iterations, OR
- The improvement between iterations becomes smaller than the tolerance
For most applications, the default of 100 iterations provides a good balance between thorough optimization and reasonable computation time.
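As a minimal sketch (assuming double for T), raising the limit for a harder fit while keeping the default tolerance:

var options = new OrthogonalRegressionOptions<double>
{
    MaxIterations = 500, // allow more refinement passes for a complex relationship
    Tolerance = 1e-6     // still stop early once per-iteration improvement is negligible
};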
ScaleVariables
Gets or sets whether to standardize variables before fitting the model.
public bool ScaleVariables { get; set; }
Property Value
- bool
Whether to scale variables, defaulting to true.
Remarks
This parameter determines whether the input variables should be standardized (scaled to have zero mean and unit variance) before fitting the orthogonal regression model. Standardization is particularly important for orthogonal regression because the method minimizes perpendicular distances, which are directly affected by the scale of each variable. Without standardization, variables with larger scales would dominate the optimization. By default, this is set to true, which ensures that variables with different units or scales contribute equally to the model fit. The standardization is reversed when making predictions, so outputs are returned in the original scale.
For Beginners: This setting determines whether the algorithm should adjust all variables to a similar scale before finding the best fit.
The default value of true means:
- Before fitting, all variables are rescaled to have similar ranges
- This ensures that no variable dominates just because it uses larger numbers
For example, if you're relating:
- Age (typically 0-100 years) and
- Income (typically thousands or tens of thousands of dollars)
Without scaling:
- Income would dominate the calculations because its numbers are much larger
- The resulting line might fit the income well but ignore patterns in age
With scaling (the default):
- Both variables are adjusted to similar ranges (typically mean 0, variance 1)
- The resulting line treats both variables fairly
You might want to set this to false if:
- Your variables are already on the same scale
- You specifically want variables with larger values to have more influence
- You have domain-specific reasons to preserve the original scales
In most cases, leaving this set to true is recommended, especially when variables have different units or widely different numerical ranges.
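For instance, when every variable is already expressed in the same units, internal standardization can be skipped (a minimal sketch, assuming double for T):

var options = new OrthogonalRegressionOptions<double>
{
    // Both variables are in meters, so their raw scales are already comparable.
    ScaleVariables = false
};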
Tolerance
Gets or sets the convergence tolerance that determines when the iterative optimization algorithm should stop.
public double Tolerance { get; set; }
Property Value
- double
The convergence tolerance, defaulting to 0.000001 (1e-6).
Remarks
This parameter defines the threshold for determining when the optimization has converged. The algorithm will stop when the improvement between consecutive iterations falls below this tolerance value. A smaller tolerance requires more precision in the parameter estimates, potentially leading to better model fit but requiring more iterations. This is particularly important in orthogonal regression where finding the optimal solution often requires iterative approaches. The appropriate value depends on the scale of your variables and the required precision of your model.
For Beginners: This setting controls how precise the algorithm should be before deciding it has found the best solution.
The default value of 0.000001 (one millionth) means:
- If consecutive iterations of the algorithm improve the solution by less than one millionth
- The algorithm decides it's "close enough" and stops
Think of it like measuring ingredients for a recipe:
- A small tolerance is like measuring to the nearest milligram (very precise)
- A larger tolerance is like measuring to the nearest gram (less precise)
You might want a smaller value (like 1e-8) if:
- Your application requires extremely high precision
- You have well-conditioned data with minimal noise
- You're using the results for sensitive scientific calculations
You might want a larger value (like 1e-4) if:
- You need faster computations
- Your data contains substantial noise anyway
- You're doing exploratory analysis rather than final modeling
Finding the right tolerance balances precision with computational efficiency. Too strict (too small) and the algorithm might take unnecessarily long; too loose (too large) and you might get suboptimal results.
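As a closing sketch (assuming double for T), two presets that trade precision against speed using the properties described above:

// Exploratory pass: loose tolerance, fewer iterations, faster results.
var exploratory = new OrthogonalRegressionOptions<double>
{
    Tolerance = 1e-4,
    MaxIterations = 50
};

// Final fit: strict tolerance, with enough iterations to actually reach it.
var finalFit = new OrthogonalRegressionOptions<double>
{
    Tolerance = 1e-8,
    MaxIterations = 500
};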