# Module std_dev::regression::spiral

source · **crate feature**only.

`regression`

## Expand description

Spiral estimator, a robust sampling estimator.
This should be more robust than `theil_sen`

.

This is a brainchild of this library’s lead developer Icelk.

The `spiral::Options`

implement most of the estimators.

## Advantages

You supply a `fitness_function`

to all functions which tells the algorithm which lines are
good. The magnitude is irrelevant, only order is considered. The algorithm tries to *minimize*
the returned value. **This allows you to choose the desired properties of resulting
line/polynomial, without checking all possible values.**

## Caveats

The polynomial regression implementation only allows for degrees 1 & 2. See details for more info on this.

The sampling technique means this might miss the right point to close in on. Therefore, I highly recommend using the higher quality options.

### Robustness

Since this uses a fitness function, the robustness is determined by that. Using the “default”
`manhattan_distance`

gives good results (think least squares, but without the squared
importance of errors). This is what the implementations for `spiral::Options`

does.

Since this tests a wide range of possibilities before deciding on one, it’s very likely we don’t get trapped in a local maxima.

## Performance

The functions are `O(fitness function)`

where `O(fitness function)`

is the time
complexity of your `fitness_function`

. That’s often `O(n)`

as you’d probably in some way
sum up the points relative to the model.

This puts the algorithm similar to `ols`

, but with much worse (read: 4x-100x) performance.
This may be justified by the advantages.
It scales much better than `theil_sen`

and is more robust, but when the count of points is
small, `theil_sen`

is faster.

## Details

The idea is to make a phase space
of the parameters to a line (`y=(slope)*x + (y-intersect)`

). We then traverse the phase space
with a logarithmic spiral
and sample points (we start at the angle θ e.g. `-12π`

and go to a max value, e.g. `12π`

)
on an interval. When the range of the spiral has been sampled, we choose the best point and
create a spiral there. Depending on how far out the new point, scale the whole spiral’s size
(by `distance.sqrt().sqrt()`

). Then repeat.

Parameters are chosen for an optimal spiral. The logarithmic spiral was chosen due to the distribution of unknown numbers (which the coefficients of the line are). There’s generally more numbers in the range 0..100 than in 100..200. Therefore, we search more numbers in 0..100.

We can do this in 3 dimensions by creating a 3d spiral. For this, I used a
spherical spiral
where the radius grows with the angle of the spiral, calculated by `e^(θk)`

where `k`

is a
parameter, similar to how a logarithmic spiral is created.

On implementing third degree polynomials, can we get a spiral on a hypersphere? Or do we just need a even distribution?

See `spiral::Options`

for more info on the parameters.

## Structs

- Options for the spiral.
`LinearEstimator`

for the spiral estimator using a fitness function and`Options`

provided by you.`O(fitness function)`

- Implements
`LogisticEstimator`

with a known ceiling for the input values. Uses manhattan distance as the fitness function.

## Functions

- Samples points on a spherical spiral in the phase space of all second degree polynomials. As θ (the angle) increases, the imaginary sphere’s size is increased. This gives a good distribution of sample points in 3d space.
- Samples points on a logarithmic spiral in the phase space of all possible straight lines.