comp-avg: Compare Averages of time series and more
Description
A lightweight tool that extracts the most important statistics from a time series and COMPares the AVeraGes of different time series.
Get the average, standard deviation, standard error, integrated autocorrelation time and corresponding errors quickly and reliably with a simple bash command.
For questions concerning the code contact ostmeyer@hiskp.uni-bonn.de.
Installation
The entire analysis is implemented in C with a bash front-end.
After downloading the code, enter the newly created directory and type
make
to create the executable. The tool is now ready to use. It is recommended, however, to copy the two files comp-avg and summary into a directory listed in your $PATH, e.g. ~/bin/.
The default code depends on FFTW3. If you do not have FFTW3 installed and cannot install it easily, compile with
make MYFFT=-DSTANDALONE
instead. This uses a built-in FFT implementation. It is slower than FFTW3, but the difference is typically unimportant.
Usage
A detailed description of the options and results is displayed by the following command (see the Help Message section for its output):
comp-avg -h
There are four ways in which comp-avg can be used. In all cases it accepts any number of numeric values (scientific notation is supported) separated by arbitrary whitespace characters (unless otherwise specified with the -d option). In a file with several columns you can select one of them using -f. See the included files foo.txt and bar.txt, which contain random numbers, for examples.
1. Pipe the time series into comp-avg, e.g.
echo {1..1000} | comp-avg -v
# lines mean std. dev error t_int t_max t_int_err err_err
# 1000 500.5 288.819 130.657 102.427 115 36.8677 23.5144
2. Analyse a single file (e.g. containing three columns)
comp-avg -vt3 foo.txt
# index mean std. dev error t_int t_max t_int_err err_err
# 0 -0.0586172 1.01318 0.0840894 0.347892 1 0.0758117 0.00916228
# 1 -0.104779 1.01967 0.10424 0.527818 1 0.117064 0.0115596
# 2 0.0584949 0.928331 0.092652 0.503082 1 0.110622 0.0101865
3. Compare two files (e.g. the second columns of both)
comp-avg -vf2 foo.txt bar.txt
# lines mean std. dev error t_int t_max t_int_err err_err file
# 100 -0.104779 1.01967 0.10424 0.527818 1 0.117064 0.0115596 foo.txt
# 200 0.0550419 0.946643 0.0669586 0.502825 1 0.0811432 0.00540271 bar.txt
# 1.28 sigma deviation.
Here the relative deviation |mu1-mu2|/sqrt(Delta1^2+Delta2^2) is quoted (with means mu1,mu2 and errors Delta1,Delta2) assuming both time series are uncorrelated.
4. Print the full autocorrelation function, e.g.
echo {1..4} | comp-avg -anm 0
# 1
# 0.888889
# 0.733333
# 0.533333
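The columns in the example outputs above appear to be related in a simple way. The following Python sketch is not taken from the comp-avg source; it assumes that the reported standard deviation uses the n-1 normalisation and that the squared error equals 2 * t_int times the biased variance divided by n (the convention in which t_int is 1/2 for uncorrelated data). Under these assumptions, which are inferred only from the printed numbers, it closely reproduces the error column. It also evaluates the relative deviation |mu1-mu2|/sqrt(Delta1^2+Delta2^2) for the foo.txt and bar.txt example. The function names are hypothetical.

```python
import math

def standard_error(n, sd, t_int):
    """Error on the mean from the sample standard deviation and t_int.

    Assumption (inferred from the example outputs, not from the source code):
    sd is normalised with n-1 and error^2 = 2 * t_int * biased_variance / n.
    """
    biased_var = sd ** 2 * (n - 1) / n
    return math.sqrt(2.0 * t_int * biased_var / n)

def sigma_deviation(mu1, err1, mu2, err2):
    """Relative deviation of two means, assuming uncorrelated time series."""
    return abs(mu1 - mu2) / math.hypot(err1, err2)

# Input values copied from the example outputs in this section.
err_pipe = standard_error(1000, 288.819, 102.427)   # echo {1..1000}
err_foo = standard_error(100, 1.01967, 0.527818)    # foo.txt, second column
err_bar = standard_error(200, 0.946643, 0.502825)   # bar.txt, second column
dev = sigma_deviation(-0.104779, 0.10424, 0.0550419, 0.0669586)

print(err_pipe, err_foo, err_bar, dev)
```

Because the inputs are the rounded values printed by comp-avg, the last digits of the results can differ slightly from the quoted ones.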
Why an FFT?
A reliable estimate of the error on the average requires the integrated autocorrelation time and hence the autocorrelation function. The latter can be estimated time slice by time slice (done by comp-avg -n). If autocorrelations are long and O(n) time slices are needed, this leads to a runtime in O(n^2). This is where the fast Fourier transform (FFT) comes into play: it computes the entire autocorrelation function in O(n log(n)) runtime.
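The time-slice-by-time-slice estimate can be sketched in a few lines of Python. This is an illustration, not the C code of comp-avg: the function name and the 1/(n-t) normalisation are assumptions, chosen because they reproduce the values printed for `echo {1..4} | comp-avg -anm 0` in the Usage section. Each lag t costs O(n-t) operations, so evaluating all lags costs O(n^2), while stopping at a small maximal lag keeps the cost close to O(n).

```python
def autocorrelation_direct(x, mean=None, max_lag=None):
    """Normalised autocorrelation, computed lag by lag (no FFT).

    mean:    known true mean; if None, the sample mean is used.
    max_lag: largest lag to evaluate; defaults to len(x) - 1.
    Assumption: each lag t is averaged over its n - t available pairs.
    """
    n = len(x)
    if mean is None:
        mean = sum(x) / n
    if max_lag is None:
        max_lag = n - 1
    d = [v - mean for v in x]

    def autocov(t):
        # Average of d[i] * d[i + t] over all pairs that fit into the series.
        return sum(d[i] * d[i + t] for i in range(n - t)) / (n - t)

    c0 = autocov(0)
    return [autocov(t) / c0 for t in range(max_lag + 1)]

rho = autocorrelation_direct([1, 2, 3, 4], mean=0.0)
print(" ".join(f"{r:.6g}" for r in rho))
# prints: 1 0.888889 0.733333 0.533333
```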
The FFT-based calculation is the default because it guarantees a runtime in O(n log(n)). For time series with short autocorrelation times it might not be the best option, and comp-avg -n could be faster, ideally running in O(n).
Note also that the FFT assumes periodic boundary conditions. In most cases this introduces a negligible error, but strictly speaking it is not correct. comp-avg -n does not have this problem.
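The effect of the periodic boundary conditions can be made explicit with a small Python sketch. It is only an illustration under stated assumptions and not the comp-avg implementation: it uses a simple recursive radix-2 FFT (so the length must be a power of two, a restriction FFTW3 does not have), and all function names are hypothetical. The FFT route yields the circular autocovariance, in which the end of the series is paired with its beginning; the open-boundary sum omits these wrap-around pairs.

```python
import cmath

def fft(a, inverse=False):
    """Unnormalised recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def circular_autocov_fft(x):
    """Autocovariance with periodic boundaries via the power spectrum."""
    n = len(x)
    mean = sum(x) / n
    spectrum = fft([complex(v - mean) for v in x])
    power = [complex(abs(c) ** 2) for c in spectrum]
    # One factor 1/n normalises the inverse FFT, the other averages over i.
    return [c.real / n / n for c in fft(power, inverse=True)]

def circular_autocov_direct(x):
    """Same quantity as an explicit O(n^2) sum with wrapped indices."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[i] * d[(i + t) % n] for i in range(n)) / n for t in range(n)]

def open_autocov(x):
    """Open boundaries: pairs that would wrap around are left out."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[i] * d[i + t] for i in range(n - t)) / n for t in range(n)]

series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
via_fft = circular_autocov_fft(series)
via_sum = circular_autocov_direct(series)
no_wrap = open_autocov(series)
print(via_fft[1], via_sum[1], no_wrap[1])
```

The first two printed values agree up to floating-point rounding, while the third differs by the wrap-around term. The steadily rising series used here makes that difference large; for a long series whose autocorrelation decays quickly the wrap-around pairs contribute very little.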
Reference
All the algorithms used here are explained in:
U. Wolff, "Monte Carlo errors with less errors", Computer Physics Communications 156, 143–153 (2004).
Stable Releases
v2.1.1 first version made publicly available with reasonably comprehensive documentation.
v2.2.0 introduced option to print the full autocorrelation function.
v2.2.1 introduced option to add the true mean.
Help Message
Calculate most important statistical information from one or more time-series:
n, mu, sd, err, t_int, t_max, t_int_err, err_err
n: No. of lines
mu: mean
sd: standard deviation
err: standard error
t_int: integrated autocorrelation time
t_max: stopping time for summation of autocorrelation
t_int_err: error on tau_int
err_err: error on the error
The time series can be provided as a file or piped from STDIN.
If two files are provided, results are calculated for both and the relative deviation is given.
-e: using only every ...-th line
-s: summing autocovariance only up to first zero-crossing, default is Ulli Wolff method
-n: no FFT, calculate covariance time slice by time slice instead
-f: use ...-th field/column only
-d: separate fields by delimiter ..., default is tab
-p: output with machine precision, default is 5 digits
-t: number of observables, input has to contain one column of each observable, n replaced by index (0..) of observable
-y: symmetrise, only if '-t' option is used for correlator type data
-a: print full autocorrelation function instead of standard output
-m: add true mean ... if known
-v: verbose, display header line
-h: help, this message
Files
j-ostmeyer/comp-avg-v2.2.1.zip (16.8 kB)
md5:6a841d57860a206e770de69575d9b519
Additional details
Related works
- Is supplement to: Software, https://github.com/j-ostmeyer/comp-avg/tree/v2.2.1