VineCopulas: an open-source Python package for vine copula modelling
Description
A copula method can be used to describe the dependency structure between several random
variables. Copula methods are used widely in various research fields across different disciplines,
ranging from finance to the bio-geophysical sciences (Dißmann et al., 2013; Klein et al.,
2020; Mitskopoulos et al., 2022). Some other multivariate distributions, for instance a
multivariate normal distribution, impose a highly symmetric dependency structure and require
the univariate and multivariate marginal distributions to be of the same type. Copulas, in
contrast, model the dependence between multiple random variables separately from their
marginal distributions (Czado & Nagler, 2021; Sklar, 1959).
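The separation of dependence and marginals follows from Sklar's theorem, which can be written as below. The symbols are notation introduced here for illustration and do not appear in the description: F is the joint distribution function, F_i the marginal distribution functions, C the copula, and f, f_i, c the corresponding densities (assuming continuous marginals).

```latex
F(x_1, \dots, x_d) = C\bigl(F_1(x_1), \dots, F_d(x_d)\bigr),
\qquad
f(x_1, \dots, x_d) = c\bigl(F_1(x_1), \dots, F_d(x_d)\bigr) \prod_{i=1}^{d} f_i(x_i)
```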
Once a copula has been fitted, it can be used to generate random samples of the data,
as well as conditional samples. For example, if a copula has been fit between
people’s height and weight, this copula can create random correlated samples of both variables
as well as conditional samples, e.g., samples of weight given a specific height.
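The height and weight example can be sketched with a bivariate Gaussian copula using only the Python standard library. This is a minimal illustration and does not use the VineCopulas API; the dependence parameter `rho` and the normal marginals `HEIGHT` and `WEIGHT` are assumed values chosen for the example, not fitted to any data.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf and inverse cdf


def sample_gaussian_copula(rho, n, rng):
    """Draw n pairs (u, v) on the unit square from a Gaussian copula."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Correlate the second normal draw with the first.
        z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        pairs.append((STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)))
    return pairs


def sample_conditional(rho, u_given, n, rng):
    """Draw n values of v conditional on a fixed u under a Gaussian copula."""
    z1 = STD_NORMAL.inv_cdf(u_given)
    out = []
    for _ in range(n):
        z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        out.append(STD_NORMAL.cdf(z2))
    return out


# Hypothetical marginals for illustration only (not fitted to any data).
HEIGHT = NormalDist(mu=170.0, sigma=10.0)  # cm
WEIGHT = NormalDist(mu=70.0, sigma=12.0)   # kg

rng = random.Random(0)
rho = 0.6  # assumed dependence parameter

# Joint samples: copula draws mapped through the inverse marginal cdfs.
joint = [(HEIGHT.inv_cdf(u), WEIGHT.inv_cdf(v))
         for u, v in sample_gaussian_copula(rho, 2000, rng)]

# Conditional samples: weight given a height of 185 cm.
u_height = HEIGHT.cdf(185.0)
weights_given_height = [WEIGHT.inv_cdf(v)
                        for v in sample_conditional(rho, u_height, 2000, rng)]
```

Because the copula works on the unit square, the marginals can be swapped for any other distributions without changing the dependence model.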
Although copulas are an excellent tool to model dependencies in bivariate data (data with two
variables), only a limited number of copulas, for example the Gaussian and Student-t copulas,
are capable of modelling larger multivariate datasets. When modelling the
dependencies between a large number of different variables, a more flexible multivariate
modelling tool may be required that does not assume a single copula to capture all the
individual dependencies. To this end, vine copulas have been proposed as a method to
construct a multivariate model with the use of bivariate copulas as building blocks (Aas et al.,
2009; Bedford & Cooke, 2001, 2002; Joe, 1997).
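For three variables, one such pair-copula construction can be written as follows. The notation is introduced here for illustration: f_i and F_i are the marginal densities and distribution functions, c_{12} and c_{23} are bivariate copula densities, and c_{13|2} is the copula density of variables 1 and 3 conditional on variable 2. Other orderings of the variables give equally valid decompositions.

```latex
f(x_1, x_2, x_3) =
  f_1(x_1)\, f_2(x_2)\, f_3(x_3)
  \cdot c_{12}\bigl(F_1(x_1), F_2(x_2)\bigr)
  \cdot c_{23}\bigl(F_2(x_2), F_3(x_3)\bigr)
  \cdot c_{13|2}\bigl(F_{1|2}(x_1 \mid x_2),\, F_{3|2}(x_3 \mid x_2)\bigr)
```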
In the previous example related to height and weight, a vine copula could be used to also model
age in relation to height and weight. Like bivariate copulas, vine copulas allow the user to
generate random and conditional samples (Cooke et al., 2015). However, to draw conditional
samples from a vine copula for a specific variable, the vine copula has to be structured in such
a way that the order in which the samples are generated draws the variable of interest last,
i.e. the sample is conditioned on the preceding samples of other variables. For example, if one
wants to generate a conditional sample of height, the samples of age and weight have to be
provided first. Additionally, while it is more common to use copulas for continuous data, such
as weight and height, methods have been developed to also allow for discrete data, such as
age, to be modelled (Mitskopoulos et al., 2022).
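The ordering requirement can be illustrated with a three-variable vine in which height is drawn last, given age and weight. The sketch below assumes Gaussian pair copulas throughout and works on the copula scale (values in the unit interval); it uses only the Python standard library, not the VineCopulas API, and the parameters `rho12`, `rho23` and `rho13_2` are hypothetical.

```python
import math
import random
from statistics import NormalDist

N = NormalDist()  # standard normal


def h(u, v, rho):
    """Conditional cdf of u given v under a Gaussian pair copula."""
    return N.cdf((N.inv_cdf(u) - rho * N.inv_cdf(v)) / math.sqrt(1.0 - rho**2))


def h_inv(w, v, rho):
    """Inverse of h in its first argument: solves h(u, v, rho) = w for u."""
    return N.cdf(N.inv_cdf(w) * math.sqrt(1.0 - rho**2) + rho * N.inv_cdf(v))


def sample_last_given_first_two(u1, u2, rho12, rho23, rho13_2, rng):
    """Draw u3 given u1 and u2 for the pair copulas c12, c23 and c13|2."""
    # Uniform draw, kept strictly inside (0, 1) so inv_cdf is defined.
    w = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    u1_given_2 = h(u1, u2, rho12)               # condition variable 1 on 2
    u3_given_2 = h_inv(w, u1_given_2, rho13_2)  # invert the second tree
    return h_inv(u3_given_2, u2, rho23)         # invert the first tree


rng = random.Random(1)
# Variable order: 1 = age, 2 = weight, 3 = height (drawn last).
u_age, u_weight = 0.95, 0.95  # given values on the copula scale
samples = [
    sample_last_given_first_two(u_age, u_weight, 0.5, 0.6, 0.3, rng)
    for _ in range(1000)
]
```

Drawing a different variable conditionally would require a vine whose sampling order places that variable last.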
VineCopulas is a Python package that can fit and simulate both bivariate and vine
copulas. The package accepts both discrete and continuous input data, and can draw
conditional samples for any variable of interest by using different vine structures.
Files
Name | Size | Checksum |
---|---|---|
10.21105.joss.06728.pdf | 395.4 kB | md5:ec54861ab93b55a57a579e2b6f4e5777 |
Additional details
Software
- Repository URL: https://zenodo.org/records/13121560
- Programming language: Python