Published August 29, 2024 | Version v2.2.1
Software Open

comp-avg: Compare Averages of time series and more

Authors/Creators

  • 1. University of Bonn

Description

A light-weight tool to get the most important info from a time series and to COMPare the AVeraGe of different time series.

Get the average, standard deviation, standard error, integrated autocorrelation time and corresponding errors quickly and reliably with a simple bash command.

For questions concerning the code contact ostmeyer@hiskp.uni-bonn.de.

Installation

The entire analysis is implemented in C with a bash front-end.

After downloading the code, enter the newly created directory and type

make

to create the executable. The tool is now ready to be used. It is recommended, however, to copy the two files comp-avg and summary into a directory that is part of your environment $PATH, e.g. ~/bin/.

The default code depends on FFTW3. If you do not have FFTW3 installed and cannot install it easily, compile with

make MYFFT=-DSTANDALONE

instead. This uses a built-in FFT implementation. It is slower than FFTW3, but the difference is typically unimportant.
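
The recommended copy step can be scripted. The following is a minimal sketch, assuming that make was run in the source directory and that the target directory (here ~/bin) is on your $PATH. The function name install_comp_avg is purely illustrative and not part of the package.

```shell
# Hypothetical helper (not shipped with comp-avg): copy the two files
# comp-avg and summary from a build directory into a directory on $PATH.
install_comp_avg() {
    src=${1:-.}             # directory in which `make` was run
    dest=${2:-"$HOME/bin"}  # target directory, assumed to be on $PATH
    mkdir -p "$dest" && cp "$src/comp-avg" "$src/summary" "$dest/"
}

# Usage, from inside the source directory after `make`:
#   install_comp_avg . ~/bin
```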

Usage

A detailed description of the options and results is displayed by (see the Help Message section for the output):

comp-avg -h

There are four ways in which comp-avg can be used. In all cases it accepts any number of numeric values (scientific notation is supported) separated by arbitrary white space characters (unless otherwise specified with the -d option). In a file with several columns you can select one of them using -f. See included foo.txt and bar.txt files containing random numbers for examples.

1. Pipe the time series into comp-avg, e.g.

echo {1..1000} | comp-avg -v
# lines   mean    std. dev    error      t_int      t_max    t_int_err    err_err
# 1000    500.5   288.819     130.657    102.427    115      36.8677      23.5144

2. Analyse a single file (e.g. containing three columns)

comp-avg -vt3 foo.txt
# index   mean          std. dev    error        t_int       t_max    t_int_err    err_err
# 0       -0.0586172    1.01318     0.0840894    0.347892    1        0.0758117    0.00916228
# 1       -0.104779     1.01967     0.10424      0.527818    1        0.117064     0.0115596
# 2       0.0584949     0.928331    0.092652     0.503082    1        0.110622     0.0101865

3. Compare two files (e.g. the second columns of both)

comp-avg -vf2 foo.txt bar.txt
# lines   mean         std. dev    error        t_int       t_max    t_int_err    err_err       file
# 100     -0.104779    1.01967     0.10424      0.527818    1        0.117064     0.0115596     foo.txt
# 200     0.0550419    0.946643    0.0669586    0.502825    1        0.0811432    0.00540271    bar.txt
# 1.28 sigma deviation.

Here the relative deviation |mu1-mu2|/sqrt(Delta1^2+Delta2^2) is quoted (with means mu1,mu2 and errors Delta1,Delta2) assuming both time series are uncorrelated.

4. Print the full autocorrelation function, e.g.

echo {1..4} |comp-avg -anm 0
# 1
# 0.888889
# 0.733333
# 0.533333
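
The deviation quoted in the two-file comparison (usage 3) can be recomputed by hand from the printed means and errors. The sketch below does this with awk; sigma_dev is a hypothetical helper name, and the inputs are the rounded values from the example output above.

```shell
# Hypothetical helper: relative deviation |mu1 - mu2| / sqrt(Delta1^2 + Delta2^2)
# of two means with their errors, assuming the two series are uncorrelated.
# Arguments: mu1 Delta1 mu2 Delta2
sigma_dev() {
    awk -v m1="$1" -v e1="$2" -v m2="$3" -v e2="$4" 'BEGIN {
        d = m1 - m2
        if (d < 0) d = -d
        printf "%.2f\n", d / sqrt(e1 * e1 + e2 * e2)
    }'
}

# Means and errors of foo.txt and bar.txt from the example output of usage 3
sigma_dev -0.104779 0.10424 0.0550419 0.0669586
```

With these rounded inputs the sketch prints 1.29, which agrees with the 1.28 reported by comp-avg up to the last printed digit; the difference presumably comes from how the tool formats the number.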

Why an FFT?

The autocorrelation function, which is required for a reliable estimate of the error on the average via the integrated autocorrelation time, can be estimated time slice by time slice (done by comp-avg -n). If autocorrelations are long and O(n) time slices are needed, this leads to a runtime in O(n^2). This is where the Fast Fourier Transform (FFT) comes into play: it allows the entire autocorrelation function to be calculated in O(n log(n)) runtime.
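
A time-slice-by-time-slice estimate can be sketched in a few lines of awk. The estimator used here, c(t) = 1/(n-t) * sum_i (x_i - mu)(x_{i+t} - mu) normalised by c(0), is an assumption inferred from the example output of usage 4, not taken from the comp-avg source; acf_direct is a hypothetical name. The nested loop makes the O(n^2) cost explicit when all n time slices are computed.

```shell
# Hypothetical helper: normalised autocorrelation rho(t) = c(t)/c(0),
# computed time slice by time slice for a known mean mu (first argument).
# Values are read from STDIN, separated by white space.
acf_direct() {
    awk -v mu="$1" '
        { for (k = 1; k <= NF; k++) x[n++] = $k - mu }
        END {
            for (t = 0; t < n; t++) {          # one pass per time slice
                c = 0
                for (i = 0; i + t < n; i++) c += x[i] * x[i + t]
                c /= (n - t)                    # number of pairs at distance t
                if (t == 0) c0 = c
                printf "%.6g\n", c / c0
            }
        }'
}

echo 1 2 3 4 | acf_direct 0
```

For the input 1 2 3 4 with true mean 0 this prints 1, 0.888889, 0.733333 and 0.533333, matching the output of comp-avg -anm 0 shown in usage 4.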

The FFT-based autocorrelation calculation is the default because it guarantees runtime in O(n log(n)). For time series with short autocorrelation this might not be the best option and comp-avg -n could be faster, ideally in O(n).

Note also that the FFT assumes periodic boundary conditions. In most cases this introduces a negligible error, but technically it is not correct. comp-avg -n does not have this problem.
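
To see what periodic boundary conditions mean for the estimate, the following sketch evaluates the circular autocorrelation directly, i.e. the sum over (x_i - mu)(x_{(i+t) mod n} - mu) that an FFT of the unpadded series yields via the convolution theorem. It only illustrates the wrap-around effect; acf_periodic is a hypothetical helper, and whether the FFT path of comp-avg produces exactly these numbers has not been checked here.

```shell
# Hypothetical helper: normalised autocorrelation with periodic boundary
# conditions, i.e. the index i + t wraps around at the end of the series.
# First argument is the known mean mu; values are read from STDIN.
acf_periodic() {
    awk -v mu="$1" '
        { for (k = 1; k <= NF; k++) x[n++] = $k - mu }
        END {
            for (t = 0; t < n; t++) {
                c = 0
                for (i = 0; i < n; i++) c += x[i] * x[(i + t) % n]
                if (t == 0) c0 = c
                printf "%.6g\n", c / c0
            }
        }'
}

echo 1 2 3 4 | acf_periodic 0
```

For 1 2 3 4 with true mean 0 this prints 1, 0.8, 0.733333 and 0.8 instead of the values 1, 0.888889, 0.733333 and 0.533333 shown in usage 4. Such a short, monotonically increasing series is a deliberately extreme case; for long series with decaying correlations the wrap-around contribution is much smaller.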

Reference

All the algorithms used here are explained in:

U. Wolff, "Monte Carlo errors with less errors", Computer Physics Communications 156, 143–153 (2004).
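
A relation from this reference that connects the columns of the output is err^2 = 2 * t_int * c(0) / n, where c(0) is the variance of the series. The sketch below checks it against the example outputs in the Usage section. It assumes, as inferred from those numbers rather than from the source code, that c(0) is normalised by n while the printed standard deviation is normalised by n-1; err_from_tint is a hypothetical helper name.

```shell
# Hypothetical helper: standard error from n, the printed standard deviation
# and t_int, using err = sqrt(2 * t_int / n) * sd * sqrt((n - 1) / n).
# The factor sqrt((n - 1) / n) converts the sample standard deviation into
# the n-normalised one (an assumption inferred from the example outputs).
# Arguments: n sd t_int
err_from_tint() {
    awk -v n="$1" -v sd="$2" -v tint="$3" 'BEGIN {
        printf "%.4g\n", sd * sqrt((n - 1) / n) * sqrt(2 * tint / n)
    }'
}

err_from_tint 1000 288.819 102.427   # usage 1, printed error: 130.657
err_from_tint 100 1.01318 0.347892   # usage 2, index 0, printed error: 0.0840894
```

To four significant digits this gives 130.7 and 0.08409, in agreement with the printed errors; more digits cannot be compared reliably because the inputs are themselves rounded.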

Stable Releases

v2.1.1 first version made publicly available with reasonably comprehensive documentation.

v2.2.0 introduced option to print the full autocorrelation function.

v2.2.1 introduced option to add the true mean.

Help Message

Calculate most important statistical information from one or more time-series:
 n, mu, sd, err, t_int, t_max, t_int_err, err_err

 n:             No. of lines
 mu:            mean
 sd:            standard deviation
 err:           standard error
 t_int:         integrated autocorrelation time
 t_max:         stopping time for summation of autocorrelation
 t_int_err:     error on tau_int
 err_err:       error on the error

 The time series can be provided as a file or piped from STDIN.
 If two files are provided, results are calculated for both and the relative deviation is given.

 -e: using only every ...-th line
 -s: summing autocovariance only up to first zero-crossing, default is Ulli Wolff method
 -n: no FFT, calculate covariance time slice by time slice instead
 -f: use ...-th field/column only
 -d: separate fields by delimiter ..., default is tab
 -p: output with machine precision, default is 5 digits
 -t: number of observables, input has to contain one column of each observable, n replaced by index (0..) of observable
 -y: symmetrise, only if '-t' option is used for correlator type data
 -a: print full autocorrelation function instead of standard output
 -m: add true mean ... if known
 -v: verbose, display header line
 -h: help, this message

Files

j-ostmeyer/comp-avg-v2.2.1.zip (16.8 kB)
md5:6a841d57860a206e770de69575d9b519