Fourier

Fourier offers an consistent interface for Fourrier Transform libraries on 1, 2 and 3-dimensional Matrices. Its behaviour is consistent of that of the famous FFTW library for Fast Fourier Transforms, even when other libraries are used.

Currently, Fourrier supports two libraries: MIT's FFTW and NVidia's GPU-accelerated CUFFT. Regardless of what library is used, the output is guaranteed to be compatible with FFTW's specifications.

Fourier can be used with either Matrix, boost's multi_array, or with Blitz++'s Array through BlitzArray.

There are two possible styles of using the frontend; through a function which executes a single transform, or through the creation of a so-called plan in FFTW-style that can be executed multiple times. The latter allows greater control over the computation.

Note: When using real-to-complex transforms, the last dimension of the data must be even!

Compatibility with FFTW and CUFFT

The frontend in Fourier makes use of templates, but neither the FFTW library nor CUFFT do. The FFTW, by default, assumes the use of doubles. By linking the correct libraries, also floats and long doubles are supported. The CUFFT requires linking with its libraries, which supports both floats and doubles, provided that your hardware supports this.

Parallelism in Detail - FFTW only

This section applies uniquely to the FTTW, as CUFFT is inherently parallelized.

Multi-threaded execution within the FFTW can be enabled by defining USE_THREADS. By default FFTW uses the systems' thread libraries, for example pthreads. Alternatively, the FFTW supports OpenMP natively, provided that the libraries are to be compiled with support for it.

Multi-threading using OpenMP in the Fourier frontend can be enabled by appropriate compiler switches, that also enable the use of the OMPTL in the code. The FFTW is in itself not thread-safe, but Fourier handles the appropriate protection of critical sections when the program is parallelized using OpenMP.

It is recommended to use OpenMP and make sure the FFTW is compiled with support for OpenMP, see the relevant documentation.

For parallelism, i.e. if you have defined USE_THREADS, you need to additionally link in the appropriate versions of the thread libraries of FFTW. If FFTW was compiled without OpenMP, i.e. configured without --with-openmp, you need to additionally link to the thread library of your system, for example Pthread.

Linking

Linking with the FFTW

By default, linking with the fftw is required:

-lfftw3

If, FFT's on floats are used, the float version of the FFTW is required:

-lfftw3f
Similar for long double:
-lfftw3l

Linking with multi-threaded FFTW compiled without OpenMP support

The following compiler-flags for gcc would be required for a program using both floats and long doubles and multi-threading using native threads:

-DUSE_THREADS
The corresponding linker-flags are:
-lfftw3 -lfftw3f -lfftw3l -lfftw3f_threads -lfftw3_threads -lfftw3l_threads
-lpthread
Note that you may still compile your own program with OpenMP.

Linking with multi-threaded FFTW compiled with OpenMP support

The following compiler-flags for gcc would be required for a program using both floats and long doubles and multi-threading using only OpenMP:
-DUSE_THREADS
The corresponding linker-flags are:
-lfftw3f -lfftw3 -lfftw3l -lfftw3f_threads -lfftw3_threads -lfftw3l_threads

Linking with CUFFT

When USE_CUFFT is defined during compilation, linking with the cufft and cuda runtime libraries is required:

-lcufft -lcudart
Also do not forget to inform your compiler of the link paths to these libraries if needed.

Interface

Note: the typename matrix_type is used to denote either one of the accepted matrix types.

Convenient Interface

The convenient interface offers a single funtion that can handle transforms of arrays of upto three dimensions.
Note that the inverse transform is normalized!

  bool doDFT(const ArrayIn<Tin, N> &in,    
	     ArrayOut<Tout, N> &out,
	     bool forward = true,
	     unsigned threads = 1u)  
Computes the Discrete Fourier Transform of the data in in, and stores the result in out. If forward is true, the forward transform is computed, otherwise the inverse transform. The inverse transform is normalized. The number of threads is only relevant if USE_THREADS is defined, otherwise it is ignored.

Detailed Interface

It is possible to have more control over the processing by using the parts of the interface that mimic the FFTW in greater detail. This is particularly interesting if you need to compute many transforms of the same type, i.e. the same dimensionality, size, et cetera.

The primary interface to emulate the concept of creating plans and executing them is through a template class DFT:

template <typename float_type, std::size_t N>
class DFT;
Any instanciated class DFT<float_type, N> offers the following interface:
  DFT(ArrayIn<std::complex<float_type>, N, Aux> &in, 
      ArrayOut<std::complex<float_type>, N, Aux> &out,
      bool forward = true, unsigned flags = FFTW_ESTIMATE,
      unsigned threads = 0u)  
Constructor. Takes as input a matrix in, as output a matrix out. By default, when forward is true, a forward transform is planned, otherwise an inverse transform is planned. Flags can be passed to the FFTW by the parameter flags. The planning will be done for with the specified number of threads. If threads is zero (the default), a default number of threads will be used.
  DFT(ArrayIn<float_type, N, Aux> &in, 
      ArrayOut<std::complex<T>, N, Aux> &out,
      unsigned flags = FFTW_ESTIMATE, unsigned threads = 0u)  
Constructor. Takes as input a matrix in, as output a matrix out. By default, when forward is true, a forward transform is planned, otherwise an inverse transform is planned. Flags can be passed to the FFTW by the parameter flags. The planning will be done for with the specified number of threads. If threads is zero (the default), a default number of threads will be used.
  DFT(ArrayIn<std::complex<float_type>, N, Aux> &in,
      ArrayOut<float_type, N, Aux> &out,
      bool forward = true, unsigned flags = FFTW_ESTIMATE,
      unsigned threads = 0u)  
Constructor. Takes as input a matrix in, as output a matrix out. By default, when forward is true, a forward transform is planned, otherwise an inverse transform is planned. Flags can be passed to the FFTW by the parameter flags. The planning will be done for with the specified number of threads. If threads is zero (the default), a default number of threads will be used.
  void execute()  
Execute the planned transform.
  void execute(std::string wisdom)  
Execute the planned transform using wisdom from the file wisdom.
  static unsigned optimalSize(const unsigned ext)  
Given a size of a dimension of a matrix, returns a suggestion for a size, possibly larger, that might give better performance.

Example

The data-layout of real-to-complex transforms and vice versa is akin to that of the FFTW, as explained in the manual. The following example shows how to print data as it would appear when using Matlab's fft2 command.

A real-to-complex transform:
#include <iostream>

#include <cvmlcpp/base/Enums>
#include <cvmlcpp/base/Matrix>
#include <cvmlcpp/signal/Fourier>

double data [] = {
-0.144892,  -0.0888301,  0.00954482, 0,  0,
0.00130495,  -0.125482,  -0.0517049, 0,  0,
-0.0155715,  0.209789,  0.297849, 0,  0,
0,  0,  0,  0,  0};

const std::size_t size = 20;

using namespace cvmlcpp;

int main()
{
	// Create a Matrix with data
	std::size_t dims [] = {5, 4}; // last dimension must be even
	Matrix<double, 2> in(dims, data, data+size);

	// Print the data Matrix
	for (unsigned x = 0; x < in.extent(X); ++x)
	{
		for (unsigned y = 0; y < in.extent(Y); ++y)
			std::cout << in[x][y] << "   ";
		std::cout << std::endl;
	}
	std::cout << std::endl;

	// Create an empty array. It will automatically be resized.
	Matrix<std::complex<double>, 2> out;

	// The Fourier Transform
	doDFT(in, out);

	// Print the Complex Matrix
	for (unsigned x = 0; x < out.extent(X); ++x)
	{
		/*
		 * The Complex Matrix contains almost half of the elements
		 * of the original array. The Nth-1 dimension has shrunk from
		 * 'k' to 'k/2+1'. The "missing" elements are simply the
		 * complex conjugated of the given elements.
		 */
		for (unsigned y = 0; y < out.extent(Y); ++y)
			std::cout << out[x][y] << "   ";

		// Print the "missing" elements.
		for (unsigned y = 1; y < out.extent(Y); ++y)
			std::cout << std::conj(out[x][out.extent(Y)-y]) << " ";
		std::cout << std::endl;
	}
	std::cout << std::endl;

	return 0;
}