Module std_dev::regression

Available on crate feature regression only.

Various regression models to fit the best line to your data. All written to be understandable.

Vocabulary:

  • predictors - the independent values (usually denoted x) from which we want an equation to compute the outcomes.
  • outcomes - the dependent variables, usually denoted y or f(x).
  • model - to create an equation which optimally fits the data (different methods can optimize for different priorities).

The *Coefficients structs implement Predictive, which calculates the predicted outcomes using the model and their determination, and Display, which can be used to show the equations.

Linear regressions are often used by other regression methods. All linear regressions therefore implement the LinearEstimator trait. You can use the *Linear structs to choose which method to use.

Info on implementation

Details and comments on implementation can be found as docs under each item.

Power & exponent

See derived for the implementations.

I reverse the exponentiation (by taking the logarithm) to get a linear model. Then, I solve it using the method linked above. Finally, I transform the returned coefficients to fit the target model.

This is not ideal, as taking the logarithm reduces the errors of large values compared to those of small values. I have plans to address this bias in the future. The current behaviour is, however, probably still the desired one, as small values are often relatively more important than larger ones.
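
The linearization described above can be sketched as follows. This is a minimal illustration, not the crate's implementation: the function names fit_line, fit_power and fit_exponential are hypothetical, the linear step is a plain ordinary least squares fit, and all inputs that are passed to a logarithm are assumed to be strictly positive.

```rust
/// Ordinary least squares fit of y = k*x + m. Returns (k, m).
/// Hypothetical helper for this sketch, not part of the crate's API.
fn fit_line(x: &[f64], y: &[f64]) -> (f64, f64) {
    assert_eq!(x.len(), y.len());
    let n = x.len() as f64;
    let mean_x = x.iter().sum::<f64>() / n;
    let mean_y = y.iter().sum::<f64>() / n;
    let mut sxy = 0.0;
    let mut sxx = 0.0;
    for (xi, yi) in x.iter().zip(y) {
        sxy += (xi - mean_x) * (yi - mean_y);
        sxx += (xi - mean_x) * (xi - mean_x);
    }
    let k = sxy / sxx;
    (k, mean_y - k * mean_x)
}

/// Fits y = a * x^b by fitting the line ln y = ln a + b * ln x.
/// Assumes every x and y is strictly positive. Returns (a, b).
fn fit_power(x: &[f64], y: &[f64]) -> (f64, f64) {
    let ln_x: Vec<f64> = x.iter().map(|v| v.ln()).collect();
    let ln_y: Vec<f64> = y.iter().map(|v| v.ln()).collect();
    let (b, ln_a) = fit_line(&ln_x, &ln_y);
    // Transform the linear coefficients back to the target model.
    (ln_a.exp(), b)
}

/// Fits y = a * b^x by fitting the line ln y = ln a + x * ln b.
/// Assumes every y is strictly positive. Returns (a, b).
fn fit_exponential(x: &[f64], y: &[f64]) -> (f64, f64) {
    let ln_y: Vec<f64> = y.iter().map(|v| v.ln()).collect();
    let (ln_b, ln_a) = fit_line(x, &ln_y);
    (ln_a.exp(), ln_b.exp())
}
```

Because the fit minimizes squared errors of ln y rather than of y, a residual on a large outcome counts for less than the same absolute residual on a small outcome, which is the bias mentioned above.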

Many programs (including LibreOffice Calc) simply discard negative and zero values. I chose to go the explicit route and add additional terms to satisfy the requirements. This is naturally a fallback, and needing it should be a warning sign that your data is bad.

The docs under these methods describe the calculations and how the data is handled.

Re-exports

Modules

  • arbitrary_linear_algebra (crate feature arbitrary-precision)
    This module enables the use of rug::Float inside of nalgebra.
  • A random binary searching n-variable optimizer.
  • Estimators derived from others, usually LinearEstimator.
  • Assumes the fitness function has a minimal slope when the value is optimal (e.g. (x-4.).abs() will not work, since its slope is constant and then changes sign)
  • The models (functions) we can use regression to optimize for.
  • ols (crate feature ols)
    Ordinary least squares implementation.
  • random_subset_regression (crate feature random_subset_regression)
    Improves speed of regression by only taking a few points into account.
  • Spiral estimator, a robust sampling estimator. This should be more robust than theil_sen.
  • Theil-Sen estimator, a robust linear (also implemented as polynomial) estimator. Up to ~27% of values can be outliers - erroneous data far from the otherwise good data - without large effects on the result.
  • Traits and coefficients of trigonometric functions.
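
To illustrate why the Theil-Sen estimator listed above tolerates outliers, here is a minimal sketch of the textbook linear variant: the slope is the median of all pairwise slopes, and the intercept is the median of the remaining offsets. The function names median and theil_sen are hypothetical, and the crate's own implementation may well differ (for example in how it handles polynomials or large inputs).

```rust
/// Median of a non-empty slice. Sorts the slice in place.
/// Hypothetical helper for this sketch.
fn median(values: &mut [f64]) -> f64 {
    assert!(!values.is_empty());
    values.sort_by(|a, b| a.total_cmp(b));
    let n = values.len();
    if n % 2 == 1 {
        values[n / 2]
    } else {
        (values[n / 2 - 1] + values[n / 2]) / 2.0
    }
}

/// Textbook Theil-Sen fit of y = k*x + m. Returns (k, m).
/// Needs at least two points with distinct x values.
fn theil_sen(x: &[f64], y: &[f64]) -> (f64, f64) {
    assert_eq!(x.len(), y.len());
    // Collect the slope of the line through every pair of points.
    let mut slopes = Vec::new();
    for i in 0..x.len() {
        for j in (i + 1)..x.len() {
            let dx = x[j] - x[i];
            if dx != 0.0 {
                slopes.push((y[j] - y[i]) / dx);
            }
        }
    }
    let k = median(&mut slopes);
    // With the slope fixed, take the median offset as the intercept.
    let mut intercepts: Vec<f64> = x.iter().zip(y).map(|(xi, yi)| yi - k * xi).collect();
    let m = median(&mut intercepts);
    (k, m)
}
```

A single erroneous point only affects the pairwise slopes it takes part in, so as long as most pairs consist of good points, the median slope stays close to the true one.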

Structs

Traits

Functions