Various regression models to fit the best line to your data. All written to be understandable.
- Predictors - the independent values (usually denoted `x`) from which we want an equation to get the:
- Outcomes - the dependent variables, usually denoted `y`.
- Model - an equation which fits the data optimally (different priorities can be optimized for).
Linear regressions are often used by the other regression methods. All linear regressions therefore implement the `LinearEstimator` trait. You can use the `*Linear` structs to choose which method to use.
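To illustrate what a linear estimator computes, here is a minimal ordinary least squares line fit in plain Rust. The function name `ols_line` is made up for this sketch and is not part of this crate's API.

```rust
/// Ordinary least squares fit of a line `y = a + b*x`.
/// Returns `(a, b)` (intercept, slope). Assumes `x.len() == y.len() >= 2`
/// and that not all `x` values are equal.
fn ols_line(x: &[f64], y: &[f64]) -> (f64, f64) {
    let n = x.len() as f64;
    let mean_x = x.iter().sum::<f64>() / n;
    let mean_y = y.iter().sum::<f64>() / n;
    // slope = Σ(x-x̄)(y-ȳ) / Σ(x-x̄)²
    let numerator: f64 = x
        .iter()
        .zip(y)
        .map(|(xi, yi)| (xi - mean_x) * (yi - mean_y))
        .sum();
    let denominator: f64 = x.iter().map(|xi| (xi - mean_x).powi(2)).sum();
    let slope = numerator / denominator;
    let intercept = mean_y - slope * mean_x;
    (intercept, slope)
}

fn main() {
    // Points on the line y = 1 + 2x.
    let x = [0.0, 1.0, 2.0, 3.0];
    let y = [1.0, 3.0, 5.0, 7.0];
    let (a, b) = ols_line(&x, &y);
    println!("y ≈ {a:.2} + {b:.2}x");
}
```

The library's own linear estimators compute the same kind of `(intercept, slope)` result, each with different trade-offs in speed and robustness.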
Details and comments on implementation can be found as docs under each item.
See the `derived` module for the implementations.
I reverse the exponentiation to get a linear model, solve it using the method linked above, and then transform the returned variables to fit the target model.

This is not ideal, as taking the logarithm shrinks the errors of large values compared to those of small values. I have plans to address this bias in the future. The current behaviour is, however, probably still the desired one, as small values are often more important relative to larger ones.

Many programs (including LibreOffice Calc) simply discard negative & zero values. I chose the explicit route instead and add additional terms to satisfy the requirements. This is naturally a fallback, and should be a warning sign that your data is bad.
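The transformation described above can be sketched for an exponential model `y = a * b^x`: taking the logarithm of the outcomes gives the linear relation `ln y = ln a + x * ln b`, which a plain line fit solves; exponentiating the line's coefficients recovers `a` and `b`. This is an illustrative sketch (assuming all outcomes are positive, with no fallback terms), not this crate's implementation.

```rust
/// Fit `y = a * b^x` by fitting a line to `(x, ln y)` and transforming
/// the coefficients back. All `y` must be positive here.
fn exponential_fit(x: &[f64], y: &[f64]) -> (f64, f64) {
    let ln_y: Vec<f64> = y.iter().map(|v| v.ln()).collect();
    // Ordinary least squares on (x, ln y).
    let n = x.len() as f64;
    let mx = x.iter().sum::<f64>() / n;
    let my = ln_y.iter().sum::<f64>() / n;
    let numerator: f64 = x
        .iter()
        .zip(&ln_y)
        .map(|(xi, yi)| (xi - mx) * (yi - my))
        .sum();
    let denominator: f64 = x.iter().map(|xi| (xi - mx).powi(2)).sum();
    let slope = numerator / denominator;
    let intercept = my - slope * mx;
    // ln y = ln a + x ln b  =>  a = e^intercept, b = e^slope
    (intercept.exp(), slope.exp())
}

fn main() {
    // Points on y = 2 * 3^x.
    let x = [0.0, 1.0, 2.0, 3.0];
    let y = [2.0, 6.0, 18.0, 54.0];
    let (a, b) = exponential_fit(&x, &y);
    println!("y ≈ {a:.2} * {b:.2}^x");
}
```

The bias mentioned above is visible here: the fit minimizes error in `ln y`, so a miss of 1 at `y = 1000` counts far less than a miss of 1 at `y = 2`.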
Under these modules, the calculations are implemented, along with how the data is handled.
pub use binary_search::Options as BinarySearchOptions;
pub use derived::exponential_ols;
pub use derived::power_ols;
pub use gradient_descent::ParallelOptions as GradientDescentParallelOptions;
pub use gradient_descent::SimultaneousOptions as GradientDescentSimultaneousOptions;
pub use ols::OlsEstimator;
pub use spiral::SpiralLinear;
pub use spiral::SpiralLogisticWithCeiling;
pub use theil_sen::LinearTheilSen;
pub use theil_sen::PolynomialTheilSen;
pub use trig::*;
- A randomized binary-searching n-variable optimizer.
- Estimators derived from others, usually by transforming the data.
- Assumes the fitness function has a minimal slope when the value is optimal (e.g. `(x - 4.).abs()` will not work, since its slope is constant and only changes sign).
- The models (functions) we can use regression to optimize for.
- `ols` - Ordinary least squares implementation.
- `random_subset_regression` - Improves the speed of regression by only taking a subset of the points into account.
- Spiral estimator, a robust sampling estimator.
- Theil-Sen estimator, a robust linear (also implemented as polynomial) estimator. Up to ~27% of the values can be outliers (erroneous data far from the otherwise good data) without large effects on the result.
- Traits and coefficients of trigonometric functions.
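The slope assumption noted for the optimizers above can be demonstrated with a tiny finite-difference descent. This is only an illustration of the principle, not this crate's optimizer.

```rust
/// Minimize a 1-D fitness function with naive gradient descent using a
/// finite-difference slope. This works when the slope shrinks toward
/// the optimum, as with `(x - 4.).powi(2)`. A function like
/// `(x - 4.).abs()` has a constant-magnitude slope that only flips
/// sign, so the step size never decays and the search oscillates
/// around the optimum instead of settling on it.
fn descend(f: impl Fn(f64) -> f64, mut x: f64) -> f64 {
    let (h, rate) = (1e-6, 0.1);
    for _ in 0..200 {
        // Central-difference estimate of the slope at x.
        let slope = (f(x + h) - f(x - h)) / (2.0 * h);
        x -= rate * slope;
    }
    x
}

fn main() {
    let best = descend(|x| (x - 4.).powi(2), 0.0);
    println!("minimum near x = {best:.4}");
}
```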
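The Theil-Sen estimator listed above is conceptually simple: the slope is the median of the slopes between every pair of points, which is what makes it resistant to outliers. A minimal sketch (not this crate's implementation):

```rust
/// Theil-Sen line fit: slope = median of all pairwise slopes,
/// intercept = median of `y - slope * x`.
fn theil_sen(x: &[f64], y: &[f64]) -> (f64, f64) {
    let mut slopes = Vec::new();
    for i in 0..x.len() {
        for j in (i + 1)..x.len() {
            // Skip vertical pairs to avoid division by zero.
            if x[j] != x[i] {
                slopes.push((y[j] - y[i]) / (x[j] - x[i]));
            }
        }
    }
    let slope = median(&mut slopes);
    let mut residuals: Vec<f64> =
        x.iter().zip(y).map(|(xi, yi)| yi - slope * xi).collect();
    let intercept = median(&mut residuals);
    (intercept, slope)
}

fn median(values: &mut [f64]) -> f64 {
    values.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let mid = values.len() / 2;
    if values.len() % 2 == 1 {
        values[mid]
    } else {
        (values[mid - 1] + values[mid]) / 2.0
    }
}

fn main() {
    // y = 1 + 2x, except for one wild outlier at x = 4.
    let x = [0.0, 1.0, 2.0, 3.0, 4.0];
    let y = [1.0, 3.0, 5.0, 7.0, 100.0];
    let (a, b) = theil_sen(&x, &y);
    println!("y ≈ {a:.2} + {b:.2}x");
}
```

Despite the outlier at `(4, 100)`, the median of the pairwise slopes still recovers the true line, which an ordinary least squares fit would not.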
- Generic model. This enables easily handling results from several models.
- The coefficients of an exponential function.
- The coefficients of a line.
- The coefficients of a logistic function.
- The length of the inner vector is `degree + 1`.
- The coefficients of a power (also called growth) function.
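The `degree + 1` coefficient layout mentioned above can be evaluated with Horner's method. The lowest-degree-first ordering and the function name are assumptions made for this sketch.

```rust
/// Evaluate a polynomial given its coefficients, lowest degree first:
/// `coefficients[0] + coefficients[1] * x + coefficients[2] * x² + …`.
/// A polynomial of degree `n` therefore needs `n + 1` coefficients.
fn evaluate(coefficients: &[f64], x: f64) -> f64 {
    // Horner's method: fold from the highest-degree coefficient down.
    coefficients.iter().rev().fold(0.0, |acc, &c| acc * x + c)
}

fn main() {
    // Degree 2 => 3 coefficients: 1 + 2x + 3x²
    let poly = [1.0, 2.0, 3.0];
    println!("p(2) = {}", evaluate(&poly, 2.0));
}
```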
- Helper trait to make the R² method take a generic iterator.
- Implemented by all estimators yielding an exponential regression.
- Implemented by all estimators yielding a linear two-variable regression (a line).
- Implemented by all estimators yielding a logistic regression.
- Implemented by all estimators yielding a polynomial regression.
- Implemented by all estimators yielding a power regression.
- Something that can predict the outcome from a predictor.
- Finds the model that best fits the input data. This is done using heuristics and by testing several methods.
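The R² mentioned above is the standard coefficient of determination, `1 - SS_res / SS_tot`; a sketch (the function name and signature are made up for the illustration, not this crate's trait method):

```rust
/// Coefficient of determination: 1 - SS_res / SS_tot.
/// 1.0 means the predictions match the outcomes exactly; lower values
/// (including negatives) mean a worse fit than the outcomes' mean.
fn r_squared(outcomes: &[f64], predicted: &[f64]) -> f64 {
    let mean = outcomes.iter().sum::<f64>() / outcomes.len() as f64;
    // Sum of squared residuals between outcomes and predictions.
    let ss_res: f64 = outcomes
        .iter()
        .zip(predicted)
        .map(|(y, p)| (y - p).powi(2))
        .sum();
    // Total sum of squares around the mean.
    let ss_tot: f64 = outcomes.iter().map(|y| (y - mean).powi(2)).sum();
    1.0 - ss_res / ss_tot
}

fn main() {
    let outcomes = [1.0, 2.0, 3.0, 4.0];
    let predicted = [1.0, 2.0, 3.0, 4.0];
    println!("R² = {}", r_squared(&outcomes, &predicted));
}
```

Heuristic model selection can then compare such goodness-of-fit scores across the candidate models.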