Fourier
Fourier offers an consistent interface for Fourrier Transform
libraries on 1, 2 and 3-dimensional Matrices. Its behaviour is consistent
of that of the famous FFTW library for
Fast Fourier Transforms, even when other libraries are used.
Currently, Fourrier supports two libraries: MIT's FFTW and NVidia's
GPU-accelerated CUFFT.
Regardless of what library is used, the output is guaranteed to be compatible
with FFTW's
specifications.
Fourier can be used with either Matrix, boost's
multi_array,
or with Blitz++'s
Array through BlitzArray.
There are two possible styles of using the frontend; through a function which
executes a single transform, or through the creation of a so-called plan
in FFTW-style that can be executed multiple times. The latter allows greater
control over the computation.
Note: When using real-to-complex transforms, the last dimension of the
data must be even!
Compatibility with FFTW and CUFFT
The frontend in Fourier makes use of templates, but neither the FFTW library nor CUFFT
do. The FFTW, by default, assumes the use of doubles. By
linking the correct libraries, also floats and long doubles are
supported. The CUFFT requires linking with its libraries, which supports both
floats and doubles, provided that your hardware supports this.
Parallelism in Detail - FFTW only
This section applies uniquely to the FTTW, as CUFFT is inherently parallelized.
Multi-threaded execution within the FFTW can be enabled by
defining USE_THREADS. By default FFTW uses the systems' thread
libraries, for example pthreads. Alternatively, the FFTW supports OpenMP
natively, provided that the libraries are to be compiled with support for it.
Multi-threading using
OpenMP in the Fourier frontend can
be enabled by appropriate compiler switches, that also enable the use of the
OMPTL in the code. The FFTW is in
itself not thread-safe, but Fourier handles the appropriate protection
of critical sections when the program is parallelized using OpenMP.
It is recommended to use OpenMP and make sure the FFTW is compiled with support
for OpenMP, see the
relevant documentation.
For parallelism, i.e. if you have defined USE_THREADS, you
need to additionally link in the appropriate versions of the thread libraries of
FFTW. If FFTW was compiled without OpenMP, i.e. configured
without --with-openmp, you need to additionally link to the thread
library of your system, for example
Pthread.
Linking
Linking with the FFTW
By default, linking with the fftw is required:
-lfftw3
If, FFT's on floats are used, the float version of
the FFTW is required:
-lfftw3f
Similar for long double:
-lfftw3l
Linking with multi-threaded FFTW compiled without OpenMP support
The following compiler-flags for gcc would be required for a program
using both floats and long doubles and multi-threading using
native threads:
-DUSE_THREADS
The corresponding linker-flags are:
-lfftw3 -lfftw3f -lfftw3l -lfftw3f_threads -lfftw3_threads -lfftw3l_threads
-lpthread
Note that you may still compile your own program with OpenMP.
Linking with multi-threaded FFTW compiled with OpenMP support
The following compiler-flags for gcc would be required for a program
using both floats and long doubles and multi-threading using only
OpenMP:
-DUSE_THREADS
The corresponding linker-flags are:
-lfftw3f -lfftw3 -lfftw3l -lfftw3f_threads -lfftw3_threads -lfftw3l_threads
Linking with CUFFT
When USE_CUFFT is defined during compilation, linking with the cufft and cuda
runtime libraries is required:
-lcufft -lcudart
Also do not forget to inform your compiler of the link paths to these libraries if needed.
Interface
Note: the typename matrix_type is used to denote either one of
the accepted matrix types.
Convenient Interface
The convenient interface offers a single funtion that can handle transforms of
arrays of upto three dimensions.
Note that the inverse transform is normalized!
bool doDFT(const ArrayIn<Tin, N> &in,
ArrayOut<Tout, N> &out,
bool forward = true,
unsigned threads = 1u) |
Computes the Discrete Fourier Transform of the data in in,
and stores the result in out. If forward is true,
the forward transform is computed, otherwise the inverse transform. The
inverse transform is normalized. The number of threads is only relevant
if USE_THREADS is defined, otherwise it is ignored.
|
Detailed Interface
It is possible to have more control over the processing by using the parts of
the interface that mimic the FFTW in greater detail. This is particularly
interesting if you need to compute many transforms of the same type, i.e.
the same dimensionality, size, et cetera.
The primary interface to emulate the concept of creating plans and executing
them is through a template class DFT:
template <typename float_type, std::size_t N>
class DFT;
Any instanciated class DFT<float_type, N> offers the following
interface:
DFT(ArrayIn<std::complex<float_type>, N, Aux> &in,
ArrayOut<std::complex<float_type>, N, Aux> &out,
bool forward = true, unsigned flags = FFTW_ESTIMATE,
unsigned threads = 0u) |
Constructor. Takes as input a matrix in, as output a
matrix out. By default, when forward is true, a forward
transform is planned, otherwise an inverse transform is planned.
Flags can be passed to the FFTW by the parameter flags. The
planning will be done for with the specified number of threads.
If threads is zero (the default), a default number of threads
will be used.
|
DFT(ArrayIn<float_type, N, Aux> &in,
ArrayOut<std::complex<T>, N, Aux> &out,
unsigned flags = FFTW_ESTIMATE, unsigned threads = 0u) |
Constructor. Takes as input a matrix in, as output a
matrix out. By default, when forward is true, a forward
transform is planned, otherwise an inverse transform is planned.
Flags can be passed to the FFTW by the parameter flags. The
planning will be done for with the specified number of threads.
If threads is zero (the default), a default number of threads
will be used.
|
DFT(ArrayIn<std::complex<float_type>, N, Aux> &in,
ArrayOut<float_type, N, Aux> &out,
bool forward = true, unsigned flags = FFTW_ESTIMATE,
unsigned threads = 0u) |
Constructor. Takes as input a matrix in, as output a
matrix out. By default, when forward is true, a forward
transform is planned, otherwise an inverse transform is planned.
Flags can be passed to the FFTW by the parameter flags. The
planning will be done for with the specified number of threads.
If threads is zero (the default), a default number of threads
will be used.
|
void execute() |
Execute the planned transform.
|
void execute(std::string wisdom) |
Execute the planned transform using wisdom from the file wisdom.
|
static unsigned optimalSize(const unsigned ext) |
Given a size of a dimension of a matrix, returns a suggestion
for a size, possibly larger, that might give better performance.
|
Example
The data-layout of real-to-complex transforms and vice versa is akin to that of
the FFTW, as explained in the
manual. The following example shows how to print data as it would appear
when using Matlab's fft2 command.
A real-to-complex transform:
#include <iostream>
#include <cvmlcpp/base/Enums>
#include <cvmlcpp/base/Matrix>
#include <cvmlcpp/signal/Fourier>
double data [] = {
-0.144892, -0.0888301, 0.00954482, 0, 0,
0.00130495, -0.125482, -0.0517049, 0, 0,
-0.0155715, 0.209789, 0.297849, 0, 0,
0, 0, 0, 0, 0};
const std::size_t size = 20;
using namespace cvmlcpp;
int main()
{
// Create a Matrix with data
std::size_t dims [] = {5, 4}; // last dimension must be even
Matrix<double, 2> in(dims, data, data+size);
// Print the data Matrix
for (unsigned x = 0; x < in.extent(X); ++x)
{
for (unsigned y = 0; y < in.extent(Y); ++y)
std::cout << in[x][y] << " ";
std::cout << std::endl;
}
std::cout << std::endl;
// Create an empty array. It will automatically be resized.
Matrix<std::complex<double>, 2> out;
// The Fourier Transform
doDFT(in, out);
// Print the Complex Matrix
for (unsigned x = 0; x < out.extent(X); ++x)
{
/*
* The Complex Matrix contains almost half of the elements
* of the original array. The Nth-1 dimension has shrunk from
* 'k' to 'k/2+1'. The "missing" elements are simply the
* complex conjugated of the given elements.
*/
for (unsigned y = 0; y < out.extent(Y); ++y)
std::cout << out[x][y] << " ";
// Print the "missing" elements.
for (unsigned y = 1; y < out.extent(Y); ++y)
std::cout << std::conj(out[x][out.extent(Y)-y]) << " ";
std::cout << std::endl;
}
std::cout << std::endl;
return 0;
}
|