gvar - Gaussian Random Variables¶
Introduction¶
Objects of type gvar.GVar represent gaussian random variables,
which are specified by a mean and standard deviation. They are created
using gvar.gvar(): for example,
>>> x = gvar.gvar(10,3) # 0 +- 3
>>> y = gvar.gvar(12,4) # 2 +- 4
>>> z = x + y # 2 +- 5
>>> print(z)
22.0(5.0)
>>> print(z.mean)
22.0
>>> print(z.sdev)
5.0
This module contains a variety of tools for creating and manipulating gaussian random variables, including:
mean(g)— extract means.
sdev(g)— extract standard deviations.
var(g)— extract variances.
fmt(g)— replace allgvar.GVars in array/dictionary by string representations.
tabulate(g)— tabulate entries in array/dictionary ofgvar.GVars.
correlate(g, corr)— add correlations togvar.GVars in array/dictionaryg.
chi2(g1, g2)—chi**2ofg1-g2.
equivalent(g1, g2)—gvar.GVars the same ing1andg2?
evalcov(g)— compute covariance matrix.
evalcov_blocks(g)— compute diagonal blocks of covariance matrix.
evalcorr(g)— compute correlation matrix.
fmt_values(g)— list values for printing.
fmt_errorbudget(g)— create error-budget table for printing.
fmt_chi2(f)— format chi**2 information in f as string for printing.class
BufferDict— ordered dictionary with data buffer.class
class
PDFStatistics— statistical analysis of moments of a random variable.class
PDFHistogram— tool for building PDF histograms.
dump(g, outputfile)— serialize a collection ofgvar.GVars in file.
dumps(g)— serialize a collection ofgvar.GVars in a string.
load(inputfile)— reconstitute a collection ofgvar.GVars from a file.
loads(inputstr)— reconstitute a collection ofgvar.GVars from a string.
disassemble(g)— low-level routine to disassemble a collection ofgvar.GVars.
reassemble(data,cov)— low-level routine to reassemble a collection ofgvar.GVars.
raniter(g,N)— iterator for random numbers.
ranseed(seed)— seed random number generator.
bootstrap_iter(g,N)— bootstrap iterator.
svd(g, svdcut)— SVD modification of correlation matrix.
dataset.bin_data(data)— bin random sample data.
dataset.avg_data(data)— estimate means of random sample data.
dataset.bootstrap_iter(data,N)— bootstrap random sample data.class
dataset.Dataset— class for collecting random sample data.
Functions¶
The function used to create Gaussian variable objects is:
-
gvar.gvar(...)¶ Create one or more new
gvar.GVars.Each of the following creates new
gvar.GVars:-
gvar.gvar(x, xsdev) Returns a
gvar.GVarwith meanxand standard deviationxsdev. Returns an array ofgvar.GVars ifxandxsdevare arrays with the same shape; the shape of the result is the same as the shape ofx. Returns agvar.BufferDictifxandxsdevare dictionaries with the same keys and layout; the result has the same keys and layout asx.
-
gvar.gvar(x, xcov) Returns an array of
gvar.GVars with means given by arrayxand a covariance matrix given by arrayxcov, wherexcov.shape = 2*x.shape; the result has the same shape asx. Returns agvar.BufferDictifxandxcovare dictionaries, where the keys inxcovare(k1,k2)for any keysk1andk2inx. Returns a singlegvar.GVarifxis a number andxcovis a one-by-one matrix. The layout forxcovis compatible with that produced bygvar.evalcov()for a singlegvar.GVar, an array ofgvar.GVars, or a dictionary whose values aregvar.GVars and/or arrays ofgvar.GVars. Thereforegvar.gvar(gvar.mean(g), gvar.evalcov(g))createsgvar.GVars with the same means and covariance matrix as thegvar.GVars ingprovidedgis a singlegvar.GVar, or an array or dictionary ofgvar.GVars.
-
gvar.gvar((x, xsdev)) Returns a
gvar.GVarwith meanxand standard deviationxsdev.
-
gvar.gvar(xstr) Returns a
gvar.GVarcorresponding to stringxstrwhich is either of the form"xmean +- xsdev"or"x(xerr)"(seeGVar.fmt()).
-
gvar.gvar(xgvar) Returns
gvar.GVarxgvarunchanged.
-
gvar.gvar(xdict) Returns a dictionary (
BufferDict)bwhereb[k] = gvar(xdict[k])for every key in dictionaryxdict. The values inxdict, therefore, can be strings, tuples orgvar.GVars (see above), or arrays of these.
-
gvar.gvar(xarray) Returns an array
ahaving the same shape asxarraywhere every elementa[i...] = gvar(xarray[i...]). The values inxarray, therefore, can be strings, tuples orgvar.GVars (see above).
gvar.gvaris actually an object of typegvar.GVarFactory.-
The following function is useful for constructing new functions that
can accept gvar.GVars as arguments:
-
gvar.gvar_function(x, f, dfdx)¶ Create a
gvar.GVarfor function f(x) given f and df/dx at x.This function creates a
gvar.GVarcorresponding to a function ofgvar.GVarsxwhose value isfand whose derivatives with respect to eachxare given bydfdx. Herexcan be a singlegvar.GVar, an array ofgvar.GVars (for a multidimensional function), or a dictionary whose values aregvar.GVars or arrays ofgvar.GVars, whiledfdxmust be a float, an array of floats, or a dictionary whose values are floats or arrays of floats, respectively.This function is useful for creating functions that can accept
gvar.GVars as arguments. For example,import math import gvar as gv def sin(x): if isinstance(x, gv.GVar): f = math.sin(x.mean) dfdx = math.cos(x.mean) return gv.gvar_function(x, f, dfdx) else: return math.sin(x)
creates a version of
sin(x)that works with either floats orgvar.GVars as its argument. This particular function is unnecessary since it is already provided bygvar.- Parameters
- Returns
A
gvar.GVarrepresenting the function’s value atx.
Means, standard deviations, variances, formatted strings, covariance
matrices and correlation/comparison information can be extracted from arrays
(or dictionaries) of gvar.GVars using:
-
gvar.mean(g)¶ Extract means from
gvar.GVars ing.gcan be agvar.GVar, an array ofgvar.GVars, or a dictionary containinggvar.GVars or arrays ofgvar.GVars. Result has the same layout asg.Elements of
gthat are notgvar.GVars are left unchanged.
-
gvar.sdev(g)¶ Extract standard deviations from
gvar.GVars ing.gcan be agvar.GVar, an array ofgvar.GVars, or a dictionary containinggvar.GVars or arrays ofgvar.GVars. Result has the same layout asg.The deviation is set to 0.0 for elements of
gthat are notgvar.GVars.
-
gvar.var(g)¶ Extract variances from
gvar.GVars ing.gcan be agvar.GVar, an array ofgvar.GVars, or a dictionary containinggvar.GVars or arrays ofgvar.GVars. Result has the same layout asg.The variance is set to 0.0 for elements of
gthat are notgvar.GVars.
-
gvar.fmt(g, ndecimal=None, sep='')¶ Format
gvar.GVars ing.gcan be agvar.GVar, an array ofgvar.GVars, or a dictionary containinggvar.GVars or arrays ofgvar.GVars. Eachgvar.GVargiingis replaced by the string generated bygi.fmt(ndecimal,sep). Result has same structure asg.
-
gvar.tabulate(g, ncol=1, headers=True, offset='', ndecimal=None)¶ Tabulate contents of an array or dictionary of
gvar.GVars.Given an array
gofgvar.GVars or a dictionary whose values aregvar.GVars or arrays ofgvar.GVars,gvar.tabulate(g)returns a string containing a table of the values ofg’s entries. For example, the codeimport collections import gvar as gv g = collections.OrderedDict() g['scalar'] = gv.gvar('10.3(1)') g['vector'] = gv.gvar(['0.52(3)', '0.09(10)', '1.2(1)']) g['tensor'] = gv.gvar([ ['0.01(50)', '0.001(20)', '0.033(15)'], ['0.001(20)', '2.00(5)', '0.12(52)'], ['0.007(45)', '0.237(4)', '10.23(75)'], ]) print(gv.tabulate(g, ncol=2))
prints the following table:
key/index value key/index value --------------------------- --------------------------- scalar 10.30 (10) 1,0 0.001 (20) vector 0 0.520 (30) 1,1 2.000 (50) 1 0.09 (10) 1,2 0.12 (52) 2 1.20 (10) 2,0 0.007 (45) tensor 0,0 0.01 (50) 2,1 0.2370 (40) 0,1 0.001 (20) 2,2 10.23 (75) 0,2 0.033 (15)
- Parameters
g – Array of
gvar.GVars (any shape) or dictionary whose values aregvar.GVars or arrays ofgvar.GVars (any shape).ncol – The table is split over
ncolcolumns of key/index values plusgvar.GVarvalues. Default value is 1.headers – Prints standard header on table if
True; omits the header ifFalse. Ifheadersis a 2-tuple, thenheaders[0]is used in the header over the indices/keys andheaders[1]over thegvar.GVarvalues. (Default isTrue.)offset (str) – String inserted at the beginning of each line in the table. Default is
''.ndecimal – Number of digits displayed after the decimal point. Default is
ndecimal=Nonewhich adjusts table entries to show 2 digits of error.
-
gvar.correlate(g, corr)¶ Add correlations to uncorrelated
gvar.GVars ing.This method creates correlated
gvar.GVars from uncorrelatedgvar.GVarsg, using the correlations specified incorr.Note that correlations initially present in
g, if any, are ignored.Examples
A typical application involves the construction of correlated
gvar.GVars give the means and standard deviations, together with a correlation matrix:>>> import gvar as gv >>> g = gv.gvar(['1(1)', '2(10)']) >>> print(gv.evalcorr(g)) # uncorrelated [[ 1. 0.] [ 0. 1.]] >>> g = gv.correlate(g, [[1. , 0.1], [0.1, 1.]]) >>> print(gv.evalcorr(g)) # correlated [[ 1. 0.1] [ 0.1 1. ]]
This also works when
gandcorrare dictionaries:>>> g = gv.gvar(dict(a='1(1)', b='2(10)')) >>> print(gv.evalcorr(g)) {('a', 'a'): array([[ 1.]]),('a', 'b'): array([[ 0.]]),('b', 'a'): array([[ 0.]]),('b', 'b'): array([[ 1.]])} >>> corr = {} >>> corr['a', 'a'] = 1.0 >>> corr['a', 'b'] = 0.1 >>> corr['b', 'a'] = 0.1 >>> corr['b', 'b'] = 1.0 >>> g = correlate(g, corr) >>> print(gv.evalcorr(g)) {('a', 'a'): array([[ 1.]]),('a', 'b'): array([[ 0.1]]),('b', 'a'): array([[ 0.1]]),('b', 'b'): array([[ 1.]])}
-
gvar.evalcov(g)¶ Compute covariance matrix for elements of array/dictionary
g.If
gis an array ofgvar.GVars,evalcovreturns the covariance matrix as an array with shapeg.shape+g.shape. Ifgis a dictionary whose values aregvar.GVars or arrays ofgvar.GVars, the result is a doubly-indexed dictionary wherecov[k1,k2]is the covariance forg[k1]andg[k2].
-
gvar.evalcov_blocks(g)¶ Evaluate covariance matrix for elements of
g.Evaluates the covariance matrices for
gvar.GVars stored in array or dictionary of arrays/gvar.GVarsg. The covariance matrix is decomposed into its block diagonal components, and a list of tuples(idx,bcov)is returned wherebcovis a diagonal block of the covariance matrix andidxan array containing the corresponding indices ing.flatfor that block. So to reassemble the blocks into a single matrixcov, for example, one would use:import numpy as np cov = np.empty((len(g), len(g)), float) for idx, bcov in evalcov_block(g): cov[idx[:, None], idx] = bcov
gvar.evalcov_blocks()is particularly useful when the covariance matrix is sparse; only nonzero elements are retained.
-
gvar.evalcorr(g)¶ Compute correlation matrix for elements of array/dictionary
g.If
gis an array ofgvar.GVars,evalcorrreturns the correlation matrix as an array with shapeg.shape+g.shape. Ifgis a dictionary whose values aregvar.GVars or arrays ofgvar.GVars, the result is a doubly-indexed dictionary wherecorr[k1,k2]is the correlation forg[k1]andg[k2].The correlation matrix is related to the covariance matrix by:
corr[i,j] = cov[i,j] / (cov[i,i] * cov[j,j]) ** 0.5
Return
Trueifgvar.GVars ing1uncorrelated with those ing2.g1andg2can begvar.GVars, arrays ofgvar.GVars, or dictionaries containinggvar.GVars or arrays ofgvar.GVars. ReturnsTrueif either ofg1org2isNone.
-
gvar.chi2(g1, g2, svdcut=1e-12)¶ Compute chi**2 of
g1-g2.chi**2is a measure of whether the multi-dimensional Gaussian distributionsg1andg2(dictionaries or arrays) agree with each other — that is, do their means agree within errors for corresponding elements. The probability is high ifchi2(g1,g2)/chi2.dofis of order 1 or smaller.Usually
g1andg2are dictionaries with the same keys, whereg1[k]andg2[k]aregvar.GVars or arrays ofgvar.GVars having the same shape. Alternativelyg1andg2can begvar.GVars, or arrays ofgvar.GVars having the same shape.One of
g1org2can contain numbers instead ofgvar.GVars, in which casechi**2is a measure of the likelihood that the numbers came from the distribution specified by the other argument.One or the other of
g1org2can be missing keys, or missing elements from arrays. Only the parts ofg1andg2that overlap are used. Also settingg2=Noneis equivalent to replacing its elements by zeros.chi**2is computed from the inverse of the covariance matrix ofg1-g2. The matrix inversion can be sensitive to roundoff errors. In such cases, SVD cuts can be applied by setting parameterssvdcut; see the documentation forgvar.svd(), which is used to apply the cut.The return value is the
chi**2. Extra attributes attached to this value give additional information:dof — Number of degrees of freedom (that is, the number of variables compared).
Q — The probability that the
chi**2could have been larger, by chance, even ifg1andg2agree. Values smaller than 0.1 or so suggest that they do not agree. Also called the p-value.
-
gvar.fmt_chi2(f)¶ Return string containing
chi**2/dof,dofandQfromf.Assumes
fhas attributeschi2,dofandQ. The logarithm of the Bayes factor will also be printed iffhas attributelogGBF.
gvar.GVars are compared by:
-
gvar.equivalent(g1, g2, rtol=1e-10, atol=1e-10)¶ Determine whether
g1andg2contain equivalentgvar.GVars.Compares sums and differences of
gvar.GVars stored ing1andg2to see if they agree within tolerances. Operationally, agreement means that:abs(diff) < abs(summ) / 2 * rtol + atol
where
diffandsummare the difference and sum of the mean values (g.mean) or derivatives (g.der) associated with each pair ofgvar.GVars.gvar.GVars that are equivalent are effectively interchangeable with respect to both their means and also their covariances with any othergvar.GVar(including ones not ing1andg2).g1andg2can be individualgvar.GVars or arrays ofgvar.GVars or dictionaries whose values aregvar.GVars and/or arrays ofgvar.GVars. Comparisons are made only for shared keys when they are dictionaries. Array dimensions must match betweeng1andg2, but the shapes can be different; comparisons are made for the parts of the arrays that overlap in shape.- Parameters
g1 – A
gvar.GVaror an array ofgvar.GVars or a dictionary ofgvar.GVars and/or arrays ofgvar.GVars.g2 – A
gvar.GVaror an array ofgvar.GVars or a dictionary ofgvar.GVars and/or arrays ofgvar.GVars.rtol – Relative tolerance with which mean values and derivatives must agree with each other. Default is
1e-10.atol – Absolute tolerance within which mean values and derivatives must agree with each other. Default is
1e-10.
gvar.GVars can be stored (serialized) and retrieved from files (or strings) using:
-
gvar.dump(g, outputfile, method='pickle')¶ Serialize a collection
gofgvar.GVars into fileoutputfile.The
gvar.GVars are recovered usinggvar.load().Three serialization methods are available:
pickle,json, andyaml(provided theyamlmodule is installed).jsoncan have trouble with dictionaries whose keys are not strings. A workaround is used here that succeeds providedeval(repr(k)) == kfor every keyk, which is true for strings and lots of other types of key. Usepicklewhere the workaround fails.- Parameters
g – A
gvar.GVar, array ofgvar.GVars, or dictionary whose values aregvar.GVars and/or arrays ofgvar.GVars.outputfile – The name of a file or a file object in which the serialized
gvar.GVars are stored.method (str) – Serialization method, which should be one of
['pickle', 'json', 'yaml']. Default is'pickle'.
-
gvar.dumps(g, method='pickle')¶ Serialize a collection
gofgvar.GVars into a string.The
gvar.GVars are recovered usinggvar.loads().Three serialization methods are available:
pickle,json, andyaml(provided theyamlmodule is installed).jsoncan have trouble with dictionaries whose keys are not strings. A workaround is used here that succeeds providedeval(repr(k)) == kfor every keyk, which is true for strings and lots of other types of key. Usepicklewhere the workaround fails.
-
gvar.load(inputfile, method=None)¶ Load and return serialized
gvar.GVars from fileinputfile.This function recovers
gvar.GVars pickled withgvar.dump().- Parameters
inputfile – The name of the file or a file object in which the serialized
gvar.GVars are stored.method (str or None) – Serialization method, which should be one of
['pickle', 'json', 'yaml']. Ifmethod=None, then each method is tried in turn.
- Returns
The reconstructed
gvar.GVar, or array or dictionary ofgvar.GVars.
-
gvar.loads(inputstring, method=None)¶ Load and return serialized
gvar.GVars from stringinputstring.This function recovers
gvar.GVars pickled withgvar.dumps().- Parameters
inputstring – A string containing
gvar.GVars serialized usinggvar.dumps().method (str or None) – Serialization method, which should be one of
['pickle', 'json', 'yaml']. Ifmethod=None, then each method is tried in turn.
- Returns
The reconstructed
gvar.GVar, or array or dictionary ofgvar.GVars.
-
gvar.disassemble(g)¶ Disassemble collection
gofgvar.GVars.Disassembles collection
gofgvar.GVars into components that can be pickled or otherwise stored. The output is reassembled bygvar.reassemble().
-
gvar.reassemble(data, cov=gvar.gvar.cov)¶ Convert data (from disassemble) back into
gvar.GVars.- Parameters
data (BufferDict, array) – Disassembled collection of
gvar.GVars fromgvar.disassemble()that are to be reassembled.cov (gvar.smat) – Covariance matrix corresponding to the
gvar.GVars indata. (Default isgvar.gvar.cov.)
gvar.GVars contain information about derivatives with respect to the independent
gvar.GVars from which they were constructed. This information can be extracted using:
-
gvar.deriv(g, x)¶ Compute first derivatives wrt
xofgvar.GVars ing.gcan be agvar.GVar, an array ofgvar.GVars, or a dictionary containinggvar.GVars or arrays ofgvar.GVars. Result has the same layout asg.xmust be an primarygvar.GVar, which is agvar.GVarcreated by a call togvar.gvar()(e.g.,x = gvar.gvar(xmean, xsdev)) or a functionf(x)of such agvar.GVar. (More precisely,x.dermust have only one nonzero entry.)
The following functions are used to generate random arrays or dictionaries
from the distribution defined by array (or dictionary) g of gvar.GVars.
The random numbers incorporate any correlations implied by the gs.
-
gvar.raniter(g, n=None, svdcut=1e-12)¶ Return iterator for random samples from distribution
gThe gaussian variables (
gvar.GVarobjects) in array (or dictionary)gcollectively define a multidimensional gaussian distribution. The iterator defined byraniter()generates an array (or dictionary) containing random numbers drawn from that distribution, with correlations intact.The layout for the result is the same as for
g. So an array of the same shape is returned ifgis an array. Whengis a dictionary, individual entriesg[k]may begvar.GVars or arrays ofgvar.GVars, with arbitrary shapes.raniter()also works whengis a singlegvar.GVar, in which case the resulting iterator returns random numbers drawn from the distribution specified byg.- Parameters
g – An array (or dictionary) of objects of type
gvar.GVar; or agvar.GVar.n (int or
None) – Maximum number of random iterations. Settingn=None(the default) implies there is no maximum number.svdcut (float or
None) – If positive, replace eigenvalueseigofg’s correlation matrix withmax(eig, svdcut * max_eig)wheremax_eigis the largest eigenvalue; if negative, discard eigenmodes with eigenvalues smaller than|svdcut| * max_eig. Default is1e-12.
- Returns
An iterator that returns random arrays or dictionaries with the same shape as
gdrawn from the Gaussian distribution defined byg.
-
gvar.sample(g, svdcut=1e-12)¶ Generate random sample from distribution
g.Equivalent to
next(gvar.raniter(g, svdcut=svdcut)).- Parameters
g – An array or dictionary of objects of type
gvar.GVar; or agvar.GVar.svdcut (float or
None) – If positive, replace eigenvalueseigofg’s correlation matrix withmax(eig, svdcut * max_eig)wheremax_eigis the largest eigenvalue; if negative, discard eigenmodes with eigenvalues smaller than|svdcut| * max_eig. Default is1e-12.
- Returns
A random array or dictionary, with the same shape as
g, drawn from the Gaussian distribution defined byg.
-
gvar.bootstrap_iter(g, n=None, svdcut=1e-12)¶ Return iterator for bootstrap copies of
g.The gaussian variables (
gvar.GVarobjects) in array (or dictionary)gcollectively define a multidimensional gaussian distribution. The iterator created bybootstrap_iter()generates an array (or dictionary) of newgvar.GVars whose covariance matrix is the same asg’s but whose means are drawn at random from the originalgdistribution. This is a bootstrap copy of the original distribution. Each iteration of the iterator has different means (but the same covariance matrix).bootstrap_iter()also works whengis a singlegvar.GVar, in which case the resulting iterator returns bootstrap copies of theg.- Parameters
g (array or dictionary or BufferDict) – An array (or dictionary) of objects of type
gvar.GVar.n – Maximum number of random iterations. Setting
n=None(the default) implies there is no maximum number.svdcut (
Noneor number) – If positive, replace eigenvalueseigofg’s correlation matrix withmax(eig, svdcut * max_eig)wheremax_eigis the largest eigenvalue; if negative, discard eigenmodes with eigenvalues smaller than|svdcut| * max_eig. Default is1e-12.
- Returns
An iterator that returns bootstrap copies of
g.
This function is used to seed the random number generator used by gvar:
-
gvar.ranseed(a)¶ Seed random number generators with tuple
seed.Argument
seedis an integer or atupleof integers that is used to seed the random number generators used bynumpyandrandom(and therefore bygvar). Reusing the sameseedresults in the same set of random numbers.ranseedgenerates its own seed when called without an argument or withseed=None. This seed is stored inranseed.seedand also returned by the function. The seed can be used to regenerate the same set of random numbers at a later time.- Parameters
seed (int, tuple, or None) – Seed for generator. Generates a random tuple if
None.- Returns
The seed used to reseed the generator.
The following two functions that are useful for tabulating results
and for analyzing where the errors in a gvar.GVar constructed from
other gvar.GVars come from:
-
gvar.fmt_errorbudget(outputs, inputs, ndecimal=2, percent=True, verify=False, colwidth=10)¶ Tabulate error budget for
outputs[ko]due toinputs[ki].For each output
outputs[ko],fmt_errorbudgetcomputes the contributions tooutputs[ko]’s standard deviation coming from thegvar.GVars collected ininputs[ki]. This is done for each key combination(ko,ki)and the results are tabulated with columns and rows labeled bykoandki, respectively. If agvar.GVarininputs[ki]is correlated with othergvar.GVars, the contribution from the others is included in thekicontribution as well (since contributions from correlatedgvar.GVars cannot be distinguished). The table is returned as a string.- Parameters
outputs – Dictionary of
gvar.GVars for which an error budget is computed.inputs – Dictionary of:
gvar.GVars, arrays/dictionaries ofgvar.GVars, or lists ofgvar.GVars and/or arrays/dictionaries ofgvar.GVars.fmt_errorbudgettabulates the parts of the standard deviations of eachoutputs[ko]due to eachinputs[ki].ndecimal (
int) – Number of decimal places displayed in table.percent (boolean) – Tabulate % errors if
percent is True; otherwise tabulate the errors themselves.colwidth (positive integer or None) – Width of each column. This is set automatically, to accommodate label widths, if
colwidth=None(default).verify (boolean) – If
True, a warning is issued if: 1) different inputs are correlated (and therefore double count errors); or 2) the sum (in quadrature) of partial errors is not equal to the total error to within 0.1% of the error (and the error budget is incomplete or overcomplete). No checking is done ifverify==False(default).
- Returns
A table (
str) containing the error budget. Output variables are labeled by the keys inoutputs(columns); sources of uncertainty are labeled by the keys ininputs(rows).
-
gvar.fmt_values(outputs, ndecimal=None)¶ Tabulate
gvar.GVars inoutputs.- Parameters
outputs – A dictionary of
gvar.GVarobjects.ndecimal (
intorNone) – Format valuesvusingv.fmt(ndecimal).
- Returns
A table (
str) containing values and standard deviations for variables inoutputs, labeled by the keys inoutputs.
The following function applies an SVD cut to the correlation matrix
of a set of gvar.GVars:
-
gvar.svd(g, svdcut=1e-12, wgts=False, add_svdnoise=False)¶ Apply SVD cuts to collection of
gvar.GVars ing.Standard usage is, for example,
svdcut = ... gmod = svd(g, svdcut=svdcut)
where
gis an array ofgvar.GVars or a dictionary containinggvar.GVars and/or arrays ofgvar.GVars. Whensvdcut>0,gmodis a copy ofgwhosegvar.GVars have been modified to make their correlation matrix less singular than that of the originalg: each eigenvalueeigof the correlation matrix is replaced bymax(eig, svdcut * max_eig)wheremax_eigis the largest eigenvalue. This SVD cut, which is applied separately to each block-diagonal sub-matrix of the correlation matrix, increases the variance of the eigenmodes with eigenvalues smaller thansvdcut * max_eig.The modification of
g’s covariance matrix is implemented by adding (tog) a set ofgvar.GVars with zero means:gmod = g + gmod.svdcorrection
where
gmod.svdcorrectionis an array/dictionary containing thegvar.GVars. If parameteradd_svdnoise=True, noise is included ingmod.svdcorrection,gmod.svdcorrection += gv.sample(gmod.svdcorrection),
before it is added to
g. The noise can be useful for testing fits and other applications.When
svdcutis negative, eigenmodes of the correlation matrix whose eigenvalues are smaller than|svdcut| * max_eigare dropped from the new matrix and the corresponding components ofgare zeroed out (that is, replaced by 0(0)) ingmod.There is an additional parameter
wgtsingvar.svd()whose default value isFalse. Settingwgts=1orwgts=-1instead causesgvar.svd()to return a tuple(gmod, i_wgts)wheregmodis the modified copy ofg, andi_wgtscontains a spectral decomposition of the covariance matrix corresponding to the modified correlation matrix ifwgts=1, or a decomposition of its inverse ifwgts=-1. The first entryi, wgts = i_wgts[0]specifies the diagonal part of the matrix:iis a list of the indices ingmod.flatcorresponding to diagonal elements, andwgts ** 2gives the corresponding matrix elements. The second and subsequent entries,i, wgts = i_wgts[n]forn > 0, each correspond to block-diagonal sub-matrices, whereiis the list of indices corresponding to the block, andwgts[j]are eigenvectors of the sub-matrix rescaled so thatnumpy.sum(numpy.outer(wi, wi) for wi in wgts[j]
is the sub-matrix (
wgts=1) or its inverse (wgts=-1).To compute the inverse of the covariance matrix from
i_wgts, for example, one could use code like:gmod, i_wgts = svd(g, svdcut=svdcut, wgts=-1) inv_cov = numpy.zeros((n, n), float) i, wgts = i_wgts[0] # 1x1 sub-matrices if len(i) > 0: inv_cov[i, i] = numpy.array(wgts) ** 2 for i, wgts in i_wgts[1:]: # nxn sub-matrices (n>1) for w in wgts: inv_cov[i[:, None], i] += numpy.outer(w, w)
This sets
inv_covequal to the inverse of the covariance matrix of thegmods. Similarly, we can compute the expectation value,u.dot(inv_cov.dot(v)), between two vectors (numpyarrays) using:result = 0.0 i, wgts = i_wgts[0] # 1x1 sub-matrices if len(i) > 0: result += numpy.sum((u[i] * wgts) * (v[i] * wgts)) for i, wgts in i_wgts[1:]: # nxn sub-matrices (n>1) result += numpy.sum(wgts.dot(u[i]) * wgts.dot(v[i]))
where
resultis the desired expectation value.- Parameters
g – An array of
gvar.GVars or a dicitionary whose values aregvar.GVars and/or arrays ofgvar.GVars.svdcut (None or float) – If positive, replace eigenvalues
eigof the correlation matrix withmax(eig, svdcut * max_eig)wheremax_eigis the largest eigenvalue; if negative, discard eigenmodes with eigenvalues smaller than|svdcut| * max_eig. Note|svdcut| < 1. Default is 1e-12.wgts – Setting
wgts=1causesgvar.svd()to compute and return a spectral decomposition of the covariance matrix of the modifiedgvar.GVars,gmod. Settingwgts=-1results in a decomposition of the inverse of the covariance matrix. The default value isFalse, in which case onlygmodis returned.add_svdnoise – If
True, noise is added to the SVD correction (see above).
- Returns
A copy
gmodofgwhose correlation matrix is modified by SVD cuts. Ifwgtsis notFalse, a tuple(g, i_wgts)is returned wherei_wgtscontains a spectral decomposition ofgmod’s covariance matrix or its inverse.
Data from the SVD analysis is stored in
gmod:-
gmod.svdcut¶ SVD cut used to create
gmod.
-
gmod.dof¶ Number of independent degrees of freedom left after the SVD cut. This is the same as the number initially unless
svdcut < 0in which case it may be smaller.
-
gmod.nmod¶ Number of modes whose eignevalue was modified by the SVD cut.
-
gmod.nblocks¶ A dictionary where
gmod.nblocks[s]contains the number of block-diagonals-by-ssub-matrices in the correlation matrix.
-
gmod.eigen_range¶ Ratio of the smallest to largest eigenvalue before SVD cuts are applied (but after rescaling).
-
gmod.logdet¶ Logarithm of the determinant of the covariance matrix after SVD cuts are applied (excluding any omitted modes when
svdcut < 0and any diagonal zero modes).
-
gmod.svdcorrection¶ Array or dictionary containing the SVD corrections added to
gto creategmod:gmod = g + gmod.svdcorrection.
This function is useful when the correlation matrix is singular or almost singular, and its inverse is needed (as in curve fitting).
The following function can be used to rebuild collections of gvar.GVars,
ignoring all correlations with other variables. It can also be used to
introduce correlations between uncorrelated variables.
-
gvar.rebuild(g, gvar=gvar, corr=0.0)¶ Rebuild
gstripping correlations with variables not ing.gis either an array ofgvar.GVars or a dictionary containinggvar.GVars and/or arrays ofgvar.GVars.rebuild(g)creates a new collectiongvar.GVars with the same layout, means and covariance matrix as those ing, but discarding all correlations with variables not ing.If
corris nonzero,rebuildwill introduce correlations wherever there aren’t any usingcov[i,j] -> corr * sqrt(cov[i,i]*cov[j,j])
wherever
cov[i,j]==0.0initially. Positive values forcorrintroduce positive correlations, negative values anti-correlations.Parameter
gvarspecifies a function for creating newgvar.GVars that replacesgvar.gvar()(the default).- Parameters
g (array or dictionary) –
gvar.GVars to be rebuilt.gvar (
gvar.GVarFactoryorNone) – Replacement forgvar.gvar()to use in rebuilding. Default isgvar.gvar().corr (number) – Size of correlations to introduce where none exist initially.
- Returns
Array or dictionary (gvar.BufferDict) of
gvar.GVars (same layout asg) where all correlations with variables other than those ingare erased.
The following functions creates new functions that generate gvar.GVars (to
replace gvar.gvar()):
-
gvar.switch_gvar()¶ Switch
gvar.gvar()to newgvar.GVarFactory.- Returns
New
gvar.gvar().
-
gvar.restore_gvar()¶ Restore previous
gvar.gvar().- Returns
Previous
gvar.gvar().
-
gvar.gvar_factory(cov=None)¶ Return new function for creating
gvar.GVars (to replacegvar.gvar()).If
covis specified, it is used as the covariance matrix for newgvar.GVars created by the function returned bygvar_factory(cov). Otherwise a new covariance matrix is created internally.
gvar.GVars created by different functions cannot be combined in arithmetic
expressions (the error message “Incompatible GVars.” results).
gvar.GVar Objects¶
The fundamental class for representing Gaussian variables is:
-
class
gvar.GVar¶ The basic attributes are:
-
mean¶ Mean value.
-
sdev¶ Standard deviation.
-
var¶ Variance.
Two methods allow one to isolate the contributions to the variance or standard deviation coming from other
gvar.GVars:-
partialvar(*args)¶ Compute partial variance due to
gvar.GVars inargs.This method computes the part of
self.vardue to thegvar.GVars inargs. Ifargs[i]is correlated with othergvar.GVars, the variance coming from these is included in the result as well. (This last convention is necessary because variances associated with correlatedgvar.GVars cannot be disentangled into contributions corresponding to each variable separately.)
-
partialsdev(*args)¶ Compute partial standard deviation due to
gvar.GVars inargs.This method computes the part of
self.sdevdue to thegvar.GVars inargs. Ifargs[i]is correlated with othergvar.GVars, the standard deviation coming from these is included in the result as well. (This last convention is necessary because variances associated with correlatedgvar.GVars cannot be disentangled into contributions corresponding to each variable separately.)
Partial derivatives of the
gvar.GVarwith respect to the independentgvar.GVars from which it was constructed are given by:-
deriv(x)¶ Derivative of
selfwith respest to primarygvar.GVarx.All
gvar.GVars are constructed from primarygvar.GVars.self.deriv(x)returns the partial derivative ofselfwith respect to primarygvar.GVarx, holding all of the other primarygvar.GVars constant.
There are two methods for converting
selfinto a string, for printing:-
__str__()¶ Return string representation of
self.The representation is designed to show at least one digit of the mean and two digits of the standard deviation. For cases where mean and standard deviation are not too different in magnitude, the representation is of the form
'mean(sdev)'. When this is not possible, the string has the form'mean +- sdev'.
-
fmt(ndecimal=None, sep='')¶ Convert to string with format:
mean(sdev).Leading zeros in the standard deviation are omitted: for example,
25.67 +- 0.02becomes25.67(2). Parameterndecimalspecifies how many digits follow the decimal point in the mean. Parametersepis a string that is inserted between themeanand the(sdev). IfndecimalisNone(default), it is set automatically to the larger ofint(2-log10(self.sdev))or0; this will display at least two digits of error. Very large or very small numbers are written with exponential notation whenndecimalisNone.Setting
ndecimal < 0returnsmean +- sdev.
Two attributes and a method make reference to the original variables from which
selfis derived:-
dotder(v)¶ Return the dot product of
self.derandv.
-
gvar.BufferDict Objects¶
gvar.BufferDict objects are ordered dictionaries that are heavily used
in gvar’s implementation. They provide the most flexible
representation for multi-dimensional Gaussian distributions.
gvar.BufferDict objects differ from ordinary dictionaries in two respects. The
first difference is that the dictionary’s values must be scalars or numpy
arrays (any shape) of scalars. The scalars can be ordinary integers or floats,
but the dictionary was designed especially for gvar.GVars.
The dictionary’s values are packed into different parts of a single
one-dimensional array or buffer. Items can be added one at a time as in other
dictionaries,
>>> import gvar as gv
>>> b = gv.BufferDict()
>>> b['s'] = 0.0
>>> b['v'] = [1., 2.]
>>> b['t'] = [[3., 4.], [5., 6.]]
>>> print(b)
{'s': 0.0,'v': array([1., 2.]),'t': array([[3., 4.],
[5., 6.]])}
but the values can also be accessed all at once through the buffer:
>>> print(b.buf)
[0. 1. 2. 3. 4. 5. 6.]
The size, shape, and type of the data associated with a given key cannot be
changed by later assignments. For example, here b['s'] = [10., 20.] would
generate a ValueError exception. The actual value can, of course,
be modified: b['s'] = 22. is fine.
The second difference between gvar.BufferDicts and other dictionaries is
illustrated by the following code:
>>> b = gv.BufferDict()
>>> b['log(a)'] = gv.gvar('1(1)')
>>> print(b)
{'log(a)': 1.0(1.0)}
>>> print(b['a'], b['log(a)'])
2.7(2.7) 1.0(1.0)
Even though 'a' is not a key in the dictionary, b['a'] is still
defined: it equals exp(b['log(a)']). This feature is used to provide
(limited) support for non-Gaussian distributions. Here gvar.BufferDict b
specifies a distribution that is Gaussain for p['log(a)'], and
therefore log-normal for p['a']. Thus, for example,
>>> [x['a'] for x in gv.raniter(b, n=4)]
[2.1662650927997817, 2.3350022125310317, 8.732161128765775, 3.578188553455522]
creates a list of four random numbers drawn from a log-normal distribution.
Note that 'a' in this
example is not a key in dictionary b, even though both b['a']
and b.get('a') return values:
>>> print('a' in b)
False
>>> print('log(a)' in b)
True
>>> print(list(b))
['log(a)']
>>> print(b['a'], b.get('a'))
2.7(2.7) 2.7(2.7)
This functionality is used routinely by other modules (e.g., lsqfit).
The supported distributions, and methods for adding new ones, are
described in the documentation for gvar.BufferDict.add_distribution(),
below.
-
class
gvar.BufferDict¶ Ordered dictionary whose data are packed into a 1-d buffer.
gvar.BufferDicts can be created in the usual way dictionaries are created:>>> b = BufferDict() >>> b['a'] = 1. >>> b['b'] = 2 >>> print(b) {'a': 1.0,'b':2.0} >>> b = BufferDict(a=1., b=2.) >>> b = BufferDict([('a',1.), ('b',2.)])
They can also be created from other dictionaries or
gvar.BufferDicts:>>> c = BufferDict(b) >>> print(c) {'a': 1.0,'b': 2.0} >>> c = BufferDict(b, keys=['b']) >>> print(c) {'b': 2.0}
The values in a
gvar.BufferDictare scalars or arrays of a scalar type (gvar.GVar,float,int, etc.). The data type is normally inferred (dynamically) from the data itself, but can be specified when creating thegvar.BufferDictfrom another dictionary or list, using keyworddtype:>>> b = BufferDict(dict(a=1.2), dtype=int) >>> print(b, b.dtype) {'a': 1} int64
Some simple arithemetic is allowed between two
gvar.BufferDicts, say,g1andg2provided they have the same keys and array shapes. So, for example:>>> a = BufferDict(a=1., b=[2., 3.]) >>> b = BufferDict(a=10., b=[20., 30.]) >>> print(a + b) {'a': 11.0,'b': array([22., 33.])}
Subtraction is also defined, as are multiplication and division by scalars. The corresponding
+=,-=,*=,/=operators are supported, as are unary+and-.Finally a
gvar.BufferDictcan be cloned from another one but with a different buffer (containing different values):>>> b = BufferDict(a=1., b=2.) >>> c = BufferDict(b, buf=[10, 20]) >>> print(c) {'a': 10,'b': 20}
The main attributes are:
-
size¶ Size of buffer array.
-
flat¶ Buffer array iterator.
-
dtype¶ Data type of buffer array elements.
-
buf¶ The (1d) buffer array. Allows direct access to the buffer: for example,
self.buf[i] = new_valsets the value of thei-thelement in the buffer to valuenew_val. Settingself.buf = nbufreplaces the old buffer by new buffernbuf. This only works ifnbufis a one-dimensionalnumpyarray having the same length as the old buffer, sincenbufitself is used as the new buffer (not a copy).
-
shape¶ Always equal to
None. This attribute is included sincegvar.BufferDicts share several attributes withnumpyarrays to simplify coding that might support either type. Being dictionaries they do not have shapes in the sense ofnumpyarrays (hence the shape isNone).
In addition to standard dictionary methods, the main methods here are:
-
flatten()¶ Copy of buffer array.
-
slice(k)¶ Return slice/index in
self.flatcorresponding to keyk.
-
slice_shape(k)¶ Return tuple
(slice/index, shape)corresponding to keyk.
-
has_dictkey(k)¶ Returns
Trueifself[k]is defined;Falseotherwise.Note that
kmay be a key or it may be related to a related key associated with a non-Gaussian distribution (e.g.,'log(k)'; seegvar.BufferDict.add_distribution()for more information).
-
static
add_distribution(name, invfcn)¶ Add new parameter distribution.
gvar.BufferDicts can be used to represent a restricted class of non-Gaussian distributions. For example, the codeimport gvar as gv gv.BufferDict.add_distribution('log', gv.exp)
enables the use of log-normal distributions for parameters. So defining, for example,
b = gv.BufferDict() b['log(a)'] = gv.gvar('1(1)')
means that
b['a']has a value (equal toexp(b['log(a)']) even though'a'is not a key in the dictionary.The distributions available by default correspond to:
gv.BufferDict.add_distribution('log', gv.exp) gv.BufferDict.add_distribution('sqrt', gv.square) gv.BufferDict.add_distribution('erfinv', gv.erf)
Additional distributions can be added by specifying:
- Parameters
name (str) – Distributions’ function name.
invfcn (callable) – Inverse of the transformation function.
-
static
del_distribution(invfcn)¶ Delete
gvar.BufferDictdistributionname.
-
gvar.SVD Objects¶
SVD analysis is handled by the following class:
-
class
gvar.SVD(mat, svdcut=None, svdnum=None, compute_delta=False, rescale=False)¶ SVD decomposition of a pos. sym. matrix.
SVDis a function-class that computes the eigenvalues and eigenvectors of a positive symmetric matrixmat. Eigenvalues that are small (or negative, because of roundoff) can be eliminated or modified using svd cuts. Typical usage is:>>> mat = [[1.,.25],[.25,2.]] >>> s = SVD(mat) >>> print(s.val) # eigenvalues [ 0.94098301 2.05901699] >>> print(s.vec[0]) # 1st eigenvector (for s.val[0]) [ 0.97324899 -0.22975292] >>> print(s.vec[1]) # 2nd eigenvector (for s.val[1]) [ 0.22975292 0.97324899] >>> s = SVD(mat,svdcut=0.6) # force s.val[i]>=s.val[-1]*0.6 >>> print(s.val) [ 1.2354102 2.05901699] >>> print(s.vec[0]) # eigenvector unchanged [ 0.97324899 -0.22975292] >>> s = SVD(mat) >>> w = s.decomp(-1) # decomposition of inverse of mat >>> invmat = sum(numpy.outer(wj,wj) for wj in w) >>> print(numpy.dot(mat,invmat)) # should be unit matrix [[ 1.00000000e+00 2.77555756e-17] [ 1.66533454e-16 1.00000000e+00]]
Input parameters are:
- Parameters
mat (2-d sequence (
numpy.arrayorlistor …)) – Positive, symmetric matrix.svdcut (
Noneor number(|svdcut|<=1).) – If positive, replace eigenvalues ofmatwithsvdcut*(max eigenvalue); if negative, discard eigenmodes with eigenvalues smaller thansvdcuttimes the maximum eigenvalue.svdnum (
Noneor int) – If positive, keep only the modes with the largestsvdnumeigenvalues; ignore if set toNone.compute_delta (boolean) – Compute
delta(see below) ifTrue; setdelta=Noneotherwise.rescale – Rescale the input matrix to make its diagonal elements equal to +-1.0 before diagonalizing.
The results are accessed using:
-
val¶ An ordered array containing the eigenvalues of
mat. Note thatval[i]<=val[i+1].
-
vec¶ Eigenvectors
vec[i]corresponding to the eigenvaluesval[i].
-
valmin¶ Minimum eigenvalue allowed in the modified matrix.
-
valorig¶ Eigenvalues of original matrix.
-
D¶ The diagonal matrix used to precondition the input matrix if
rescale==True. The matrix diagonalized isD M DwhereMis the input matrix.Dis stored as a one-dimensional vector of diagonal elements.DisNoneifrescale==False.
-
nmod¶ The first
nmodeigenvalues inself.valwere modified by the SVD cut (equals 0 unlesssvdcut > 0).
-
eigen_range¶ Ratio of the smallest to the largest eigenvector in the unconditioned matrix (after rescaling if
rescale=True)
-
delta¶ A vector of
gvars whose means are zero and whose covariance matrix is what was added tomatto condition its eigenvalues. IsNoneifsvdcut<0orcompute_delta==False.
-
decomp(n)¶ Vector decomposition of input matrix raised to power
n.Computes vectors
w[i]such thatmat**n = sum_i numpy.outer(w[i],w[i])
where
matis the original input matrix tosvd. This decomposition cannot be computed if the input matrix was rescaled (rescale=True) except forn=1andn=-1.- Parameters
n (number) – Power of input matrix.
- Returns
Array
wof vectors.