Class QuantileRegressionForests<T>
- Namespace
- AiDotNet.Regression
- Assembly
- AiDotNet.dll
Implements Quantile Regression Forests, an extension of Random Forests that can predict conditional quantiles of the target variable, not just the conditional mean.
public class QuantileRegressionForests<T> : AsyncDecisionTreeRegressionBase<T>, IAsyncTreeBasedModel<T>, ITreeBasedRegression<T>, INonLinearRegression<T>, IRegression<T>, IFullModel<T, Matrix<T>, Vector<T>>, IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Matrix<T>, Vector<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>, IGradientComputable<T, Matrix<T>, Vector<T>>, IJitCompilable<T>
Type Parameters
- T: The numeric data type used for calculations (e.g., float, double).
- Inheritance
-
AsyncDecisionTreeRegressionBase<T>
QuantileRegressionForests<T>
- Implements
-
IAsyncTreeBasedModel<T>, ITreeBasedRegression<T>, INonLinearRegression<T>, IRegression<T>, IFullModel<T, Matrix<T>, Vector<T>>, IModel<Matrix<T>, Vector<T>, ModelMetadata<T>>, IModelSerializer, ICheckpointableModel, IParameterizable<T, Matrix<T>, Vector<T>>, IFeatureAware, IFeatureImportance<T>, ICloneable<IFullModel<T, Matrix<T>, Vector<T>>>, IGradientComputable<T, Matrix<T>, Vector<T>>, IJitCompilable<T>
Remarks
Quantile Regression Forests extend the Random Forests algorithm to estimate the full conditional distribution of the response variable, not just its mean. This allows for prediction of any quantile of the response variable, providing a more complete picture of the relationship between predictors and the response.
The algorithm works by building multiple decision trees on bootstrap samples of the training data, similar to Random Forests. However, instead of averaging the predictions, it uses the empirical distribution of the predictions from all trees to estimate quantiles.
For Beginners: While standard Random Forests tell you the average prediction, Quantile Regression Forests can tell you about the entire range of possible outcomes. For example, they can predict not just the expected value, but also the 10th percentile (a pessimistic scenario) or the 90th percentile (an optimistic scenario). This is particularly useful when you need to understand the uncertainty in your predictions or when the relationship between variables varies across different parts of the distribution.
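To make the contrast concrete, here is a language-agnostic sketch in Python, independent of AiDotNet, with made-up per-tree predictions. A standard Random Forest collapses the trees into one mean; a Quantile Regression Forest keeps the whole set of per-tree predictions and reads percentiles off the sorted values:

```python
# Hypothetical predictions from a 10-tree forest for a single example.
tree_predictions = [4.1, 3.8, 5.2, 4.4, 6.0, 3.9, 4.7, 5.5, 4.2, 4.9]

# A standard Random Forest reduces the trees to a single mean prediction.
mean_prediction = sum(tree_predictions) / len(tree_predictions)

# Quantile Regression Forests instead keep the empirical distribution,
# so any percentile can be read off the sorted predictions.
ordered = sorted(tree_predictions)
pessimistic = ordered[int(0.1 * len(ordered))]  # roughly the 10th percentile
optimistic = ordered[int(0.9 * len(ordered))]   # roughly the 90th percentile
```

The spread between `pessimistic` and `optimistic` is what standard Random Forests discard: a direct measure of how uncertain the forest is about this example.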
Constructors
QuantileRegressionForests(QuantileRegressionForestsOptions, IRegularization<T, Matrix<T>, Vector<T>>?)
Initializes a new instance of the QuantileRegressionForests class with the specified options and regularization.
public QuantileRegressionForests(QuantileRegressionForestsOptions options, IRegularization<T, Matrix<T>, Vector<T>>? regularization = null)
Parameters
- options QuantileRegressionForestsOptions: Configuration options for the Quantile Regression Forests model.
- regularization IRegularization<T, Matrix<T>, Vector<T>>: Regularization method to prevent overfitting. If null, no regularization will be applied.
Remarks
The constructor initializes the model with the provided options and sets up the random number generator.
For Beginners: This constructor sets up the Quantile Regression Forests model with your specified settings. The options control things like how many trees to build, how deep each tree can be, and how many features to consider at each split. Regularization is an optional technique to prevent the model from becoming too complex and overfitting to the training data.
Properties
MaxDepth
Gets the maximum depth of the trees in the forest.
public override int MaxDepth { get; }
Property Value
- int
The maximum depth specified in the options.
NumberOfTrees
Gets the number of trees in the forest.
public override int NumberOfTrees { get; }
Property Value
- int
The number of trees specified in the options.
Methods
CalculateFeatureImportancesAsync(int)
Asynchronously calculates the importance of each feature in the model.
protected override Task CalculateFeatureImportancesAsync(int numFeatures)
Parameters
- numFeatures int: The number of features in the input data.
Returns
- Task
A task that represents the asynchronous calculation operation.
Remarks
This method calculates feature importances by averaging the importances across all trees in the forest.
For Beginners: Feature importance tells you which input variables have the most influence on the predictions. In Quantile Regression Forests, this is calculated by averaging the feature importances from all the individual trees. Higher values indicate more important features.
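The averaging described above can be sketched in plain Python; this is an illustration of the idea, not the AiDotNet implementation, and the per-tree importance values are invented:

```python
# Each trained tree reports one importance value per feature.
# The forest's importance for a feature is the mean across all trees.

def average_importances(per_tree_importances):
    """Average feature importances across trees (a list of equal-length lists)."""
    num_trees = len(per_tree_importances)
    num_features = len(per_tree_importances[0])
    return [
        sum(tree[f] for tree in per_tree_importances) / num_trees
        for f in range(num_features)
    ]

# Hypothetical importances from three trees over two features.
per_tree = [[0.7, 0.3], [0.6, 0.4], [0.8, 0.2]]
forest_importances = average_importances(per_tree)  # roughly [0.7, 0.3]
```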
CreateNewInstance()
Creates a new instance of the Quantile Regression Forests model with the same configuration.
protected override IFullModel<T, Matrix<T>, Vector<T>> CreateNewInstance()
Returns
- IFullModel<T, Matrix<T>, Vector<T>>
A new instance of the Quantile Regression Forests model.
Remarks
This method creates a deep copy of the current model, including its configuration options, trained trees, feature importances, and regularization settings. The new instance is completely independent of the original, allowing modifications without affecting the original model.
For Beginners: This method creates an exact copy of your trained model.
Think of it like making a perfect clone of your forest model:
- It copies all the configuration settings (number of trees, max depth, etc.)
- It duplicates all the individual decision trees that make up the forest
- It preserves the feature importance values that show which inputs matter most
- It maintains all regularization settings that help prevent overfitting
Creating a copy is useful when you want to:
- Create a backup before further modifying the model
- Create variations of the same model for different purposes
- Share the model with others while keeping your original intact
Exceptions
- InvalidOperationException
Thrown when the creation fails or required components are null.
Deserialize(byte[])
Deserializes the model from a byte array.
public override void Deserialize(byte[] modelData)
Parameters
- modelData byte[]: The byte array containing the serialized model data.
Remarks
This method reconstructs the model's parameters from a serialized byte array, including options, feature importances, and all trees in the forest.
For Beginners: Deserialization is the opposite of serialization - it takes the saved model data and reconstructs the model's internal state. This allows you to load a previously trained model and use it to make predictions without having to retrain it. It's like loading a saved game to continue where you left off.
GetModelMetadata()
Gets metadata about the model.
public override ModelMetadata<T> GetModelMetadata()
Returns
- ModelMetadata<T>
A ModelMetadata object containing information about the model.
Remarks
This method returns metadata about the model, including its type, number of trees, maximum depth, and feature importances.
For Beginners: Model metadata provides information about the model itself, rather than the predictions it makes. This includes details about how the model is configured (like how many trees it uses and how deep they are) and information about the importance of different features. This can help you understand which input variables are most influential in making predictions.
PredictAsync(Matrix<T>)
Asynchronously makes predictions for the given input data.
public override Task<Vector<T>> PredictAsync(Matrix<T> input)
Parameters
- input Matrix<T>: The input features matrix where each row is an example and each column is a feature.
Returns
- Task<Vector<T>>
A task that represents the asynchronous prediction operation, containing a vector of predicted values.
Remarks
This method predicts the median (0.5 quantile) of the conditional distribution for each input example.
For Beginners: After training, this method is used to make predictions on new data. By default, it predicts the median value (the middle of the distribution), which is often a good central estimate. If you need a different percentile, you can use the PredictQuantileAsync method instead.
PredictQuantileAsync(Matrix<T>, double)
Asynchronously predicts a specific quantile of the target variable for the given input data.
public Task<Vector<T>> PredictQuantileAsync(Matrix<T> input, double quantile)
Parameters
- input Matrix<T>: The input features matrix where each row is an example and each column is a feature.
- quantile double: The quantile to predict, a value between 0 and 1.
Returns
- Task<Vector<T>>
A task that represents the asynchronous prediction operation, containing a vector of predicted quantile values.
Remarks
This method predicts the specified quantile of the conditional distribution for each input example. The steps are:
1. Validate that the quantile is between 0 and 1.
2. Apply regularization to the input matrix.
3. Get predictions from all trees in parallel.
4. For each input example: (a) sort the predictions from all trees, then (b) select the value at the position corresponding to the specified quantile.
5. Apply regularization to the quantile predictions.
For Beginners: This method predicts a specific percentile of the possible outcomes for each example in your input data. For instance, if you specify quantile=0.5, it predicts the median (middle value); if you specify quantile=0.9, it predicts the value below which 90% of the outcomes would fall. This is useful for understanding the range of possible outcomes and the uncertainty in your predictions.
Exceptions
- ArgumentException
Thrown when the quantile is not between 0 and 1.
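The sorting-and-selection at the heart of this method can be sketched in Python, independent of AiDotNet (the function name, the made-up predictions, and the validation step are illustrative only; the regularization steps are omitted):

```python
def predict_quantile(per_tree_predictions, quantile):
    """per_tree_predictions[i] holds every tree's prediction for example i."""
    if not 0.0 <= quantile <= 1.0:
        raise ValueError("quantile must be between 0 and 1")
    results = []
    for predictions in per_tree_predictions:
        ordered = sorted(predictions)  # sort this example's tree predictions
        # Pick the position in the sorted list that matches the quantile,
        # clamped so quantile=1.0 selects the last element.
        index = min(int(quantile * len(ordered)), len(ordered) - 1)
        results.append(ordered[index])
    return results

# Two examples, five trees each (made-up values).
preds = [[2.0, 1.5, 2.5, 1.8, 2.2],
         [7.0, 6.5, 8.0, 7.5, 6.8]]
medians = predict_quantile(preds, 0.5)
upper = predict_quantile(preds, 0.9)
```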
Serialize()
Serializes the model to a byte array.
public override byte[] Serialize()
Returns
- byte[]
A byte array containing the serialized model data.
Remarks
This method serializes the model's parameters, including options, feature importances, and all trees in the forest.
For Beginners: Serialization converts the model's internal state into a format that can be saved to disk or transmitted over a network. This allows you to save a trained model and load it later without having to retrain it. Think of it like saving your progress in a video game.
TrainAsync(Matrix<T>, Vector<T>)
Asynchronously trains the Quantile Regression Forests model on the provided data.
public override Task TrainAsync(Matrix<T> x, Vector<T> y)
Parameters
- x Matrix<T>: The input features matrix where each row is a training example and each column is a feature.
- y Vector<T>: The target values vector corresponding to each training example.
Returns
- Task
A task that represents the asynchronous training operation.
Remarks
This method builds multiple decision trees in parallel, each trained on a bootstrap sample of the training data. The steps are:
1. Clear any existing trees.
2. For each tree: (a) create a new decision tree with the specified options, (b) generate a bootstrap sample of the training data, and (c) train the tree on the bootstrap sample.
3. Calculate feature importances by averaging across all trees.
For Beginners: Training is the process where the model learns from your data. The algorithm builds multiple decision trees, each on a slightly different version of your data (created by random sampling with replacement). Each tree learns to predict the target variable based on the features. By building many trees and combining their predictions, the model can capture complex relationships and provide estimates of different quantiles (percentiles) of the target variable.
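The bootstrap-and-train loop can be sketched in Python; this is a conceptual illustration, not the AiDotNet implementation, and the stand-in "tree" (which just memorizes the mean of its sample) is invented for the example:

```python
import random

def bootstrap_sample(x_rows, y_values, rng):
    """Draw len(x_rows) (row, target) pairs with replacement."""
    n = len(x_rows)
    indices = [rng.randrange(n) for _ in range(n)]
    return [x_rows[i] for i in indices], [y_values[i] for i in indices]

def train_forest(x_rows, y_values, num_trees, train_tree, seed=0):
    """Train num_trees models, each on its own bootstrap sample."""
    rng = random.Random(seed)
    trees = []
    for _ in range(num_trees):
        xs, ys = bootstrap_sample(x_rows, y_values, rng)
        trees.append(train_tree(xs, ys))
    return trees

# A stand-in "tree" that just memorizes the mean of its bootstrap targets;
# a real forest would fit a full decision tree here.
def toy_tree(xs, ys):
    return sum(ys) / len(ys)

forest = train_forest([[1.0], [2.0], [3.0]], [10.0, 20.0, 30.0],
                      num_trees=5, train_tree=toy_tree)
```

Because each tree sees a slightly different resampling of the data, the trained "trees" disagree with one another, and it is exactly this disagreement that the quantile prediction step turns into an empirical distribution.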