ls_fft description:

This package is intended to calculate one-dimensional real or complex FFTs
with high accuracy and good efficiency even for lengths containing large
prime factors.
The code is written in C, but a Fortran wrapper exists as well.

Before any FFT is executed, a plan must be generated for it. Plan creation
is designed to be fast, so that there is no significant overhead if the
plan is only used once or a few times.

The main component of the code is a C port of Swarztrauber's FFTPACK
(http://www.netlib.org/fftpack/), which was originally done by Pekka Janhunen
and reformatted by Joerg Arndt.

I added a few digits to the floating-point constants to achieve higher
precision, split the complex transforms into separate routines for forward
and backward transforms (to increase performance a bit), and replaced
the iterative twiddling factor calculation in radfg() and radbg() by an exact
calculation of the factors.

Since FFTPACK becomes quite slow for FFT lengths with large prime factors
(in the worst case of prime lengths it reaches O(n*n) complexity), I
implemented Bluestein's algorithm, which computes a FFT of length n by
several FFTs of length n2>=2*n and a convolution. Since n2 can be chosen
to be highly composite, this algorithm is more efficient if n has large
prime factors. The longer FFTs themselves are then computed using the FFTPACK
routines.
Bluestein's algorithm was implemented according to the description at
http://en.wikipedia.org/wiki/Bluestein's_FFT_algorithm.

Thread-safety:
All routines can be called concurrently; all information needed by ls_fft
is stored in the plan variable. However, using the same plan variable on
multiple threads simultaneously is not supported and will lead to data
corruption.
