Sampling and datasets

Data is generated by forward sampling from the joint distribution of the Bayesian network. This requires a DAG with its CPDs (conditional probability distributions) attached. You can create a DAG manually (see the create DAG section) or load an existing one, as shown below.
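To illustrate the idea of forward sampling, here is a minimal sketch in plain Python. It is not bnlearn's implementation: the CPD values are illustrative assumptions for a sprinkler-style network, and the names `PARENTS`, `P_ONE`, `ORDER`, `forward_sample`, and `forward_sampling` are hypothetical. Each variable is drawn after its parents, using the conditional probability that matches the parent states already sampled.

```python
import random

# Illustrative CPDs for a sprinkler-style network (assumed values, not
# necessarily those shipped with bnlearn). Each entry maps a tuple of
# parent states to P(variable = 1 | parents).
PARENTS = {
    "Cloudy": (),
    "Sprinkler": ("Cloudy",),
    "Rain": ("Cloudy",),
    "Wet_Grass": ("Sprinkler", "Rain"),
}
P_ONE = {
    "Cloudy": {(): 0.5},
    "Sprinkler": {(0,): 0.5, (1,): 0.1},
    "Rain": {(0,): 0.2, (1,): 0.8},
    "Wet_Grass": {(0, 0): 0.0, (0, 1): 0.9, (1, 0): 0.9, (1, 1): 0.99},
}
# Topological order: every variable appears after all of its parents.
ORDER = ["Cloudy", "Sprinkler", "Rain", "Wet_Grass"]


def forward_sample(rng):
    """Draw one joint sample by visiting the nodes in topological order."""
    sample = {}
    for var in ORDER:
        # Look up the states already drawn for this variable's parents.
        parent_states = tuple(sample[p] for p in PARENTS[var])
        p_one = P_ONE[var][parent_states]
        sample[var] = 1 if rng.random() < p_one else 0
    return sample


def forward_sampling(n, seed=None):
    """Draw n independent joint samples as a list of dicts."""
    rng = random.Random(seed)
    return [forward_sample(rng) for _ in range(n)]


samples = forward_sampling(1000, seed=0)
```

Because every node is sampled conditionally on its parents, the resulting rows follow the joint distribution encoded by the DAG and its CPDs. This is also why the CPDs must be attached to the DAG before sampling.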

Datasets

Several example DAGs are available and can be loaded by name:

import bnlearn as bn

# Any one of the following model names can be loaded (the last assignment wins):
loadfile = 'sprinkler'
loadfile = 'alarm'
loadfile = 'andes'
loadfile = 'asia'
loadfile = 'pathfinder'
loadfile = 'sachs'
loadfile = 'miserables'

DAG = bn.import_DAG(loadfile)

Models are usually stored in BIF (.bif) files, which can also be imported:

filepath = 'directory/to/model.bif'

DAG = bn.import_DAG(filepath)

Example Sampling

# Import library
import bnlearn as bn

# Load example DAG with CPD
model = bn.import_DAG('sprinkler', CPD=True)

# Draw 1000 samples from the joint distribution defined by the CPDs
df = bn.sampling(model, n=1000)

df.head()

Cloudy  Sprinkler  Rain  Wet_Grass
0       1          0     1
1       1          1     1
1       0          1     1
0       0          0     0
1       0          0     0
1       0          1     1
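Once samples are available, a quick way to inspect them is to compute the fraction of ones per variable. The sketch below uses plain Python on the six rows shown above so that it runs without bnlearn. The helper name `marginal_frequencies` is hypothetical. With only six rows the fractions are very rough; on the full 1000-row sample they would be expected to approach the marginal probabilities of the network.

```python
# The six rows shown in the output above, as
# (Cloudy, Sprinkler, Rain, Wet_Grass).
columns = ["Cloudy", "Sprinkler", "Rain", "Wet_Grass"]
rows = [
    (0, 1, 0, 1),
    (1, 1, 1, 1),
    (1, 0, 1, 1),
    (0, 0, 0, 0),
    (1, 0, 0, 0),
    (1, 0, 1, 1),
]


def marginal_frequencies(columns, rows):
    """Return the fraction of rows in which each binary variable equals 1."""
    n = len(rows)
    return {
        name: sum(row[i] for row in rows) / n
        for i, name in enumerate(columns)
    }


freq = marginal_frequencies(columns, rows)
for name in columns:
    print(f"{name}: {freq[name]:.2f}")
```

For the pandas DataFrame returned by the sampling call, `df.mean()` should give the equivalent per-column fractions, assuming the columns hold 0/1 values as in the output above.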