Published February 25, 2023 | Version v1
Dataset Open

bnlearn datasets

Description

A collection of structure learning datasets from the Bayesian Network Repository, with accompanying description files.

Size: 5 simple datasets 

Number of features: 3–56

Ground truth: Yes

Type of Graph: Directed Graph

The alarm dataset contains the following 37 variables:
CVP (central venous pressure): a three-level factor with levels LOW, NORMAL and HIGH.

PCWP (pulmonary capillary wedge pressure): a three-level factor with levels LOW, NORMAL and HIGH.

HIST (history): a two-level factor with levels TRUE and FALSE.

TPR (total peripheral resistance): a three-level factor with levels LOW, NORMAL and HIGH.

... (33 more variables, see the corresponding .html file)

The asia dataset contains the following 8 variables:
D (dyspnoea): a two-level factor with levels yes and no.

T (tuberculosis): a two-level factor with levels yes and no.

L (lung cancer): a two-level factor with levels yes and no.

B (bronchitis): a two-level factor with levels yes and no.

A (visit to Asia): a two-level factor with levels yes and no.

S (smoking): a two-level factor with levels yes and no.

X (chest X-ray): a two-level factor with levels yes and no.

E (tuberculosis versus lung cancer/bronchitis): a two-level factor with levels yes and no.
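
The variable list above can double as a simple schema check before structure learning. The following is a minimal sketch, assuming each record has been loaded as a plain Python dict keyed by variable name; the names `ASIA_LEVELS` and `invalid_fields` are illustrative and are not part of bnlearn or of the archive.

```python
# Levels for each asia variable, copied from the list above.
ASIA_LEVELS = {
    "D": ("yes", "no"),  # dyspnoea
    "T": ("yes", "no"),  # tuberculosis
    "L": ("yes", "no"),  # lung cancer
    "B": ("yes", "no"),  # bronchitis
    "A": ("yes", "no"),  # visit to Asia
    "S": ("yes", "no"),  # smoking
    "X": ("yes", "no"),  # chest X-ray
    "E": ("yes", "no"),  # tuberculosis versus lung cancer/bronchitis
}


def invalid_fields(record):
    """Return sorted names of fields that are missing, unknown, or hold an unlisted level."""
    problems = set()
    for name, levels in ASIA_LEVELS.items():
        # A missing key yields None, which is never a listed level.
        if record.get(name) not in levels:
            problems.add(name)
    # Keys that are not asia variables at all are also reported.
    problems.update(set(record) - set(ASIA_LEVELS))
    return sorted(problems)


# Hypothetical record used only to exercise the check.
example = {name: "no" for name in ASIA_LEVELS}
print(invalid_fields(example))
```

An empty list means every variable is present and holds one of its two listed levels.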

The coronary dataset contains the following 6 variables:
Smoking (smoking): a two-level factor with levels no and yes.

M. Work (strenuous mental work): a two-level factor with levels no and yes.

P. Work (strenuous physical work): a two-level factor with levels no and yes.

Pressure (systolic blood pressure): a two-level factor with levels <140 and >140.

Proteins (ratio of beta and alpha lipoproteins): a two-level factor with levels <3 and >3.

Family (family anamnesis of coronary heart disease): a two-level factor with levels neg and pos.

The hailfinder dataset contains the following 56 variables:

N07muVerMo (10.7mu vertical motion): a four-level factor with levels StrongUp, WeakUp, Neutral and Down.

SubjVertMo (subjective judgment of vertical motion): a four-level factor with levels StrongUp, WeakUp, Neutral and Down.

QGVertMotion (quasigeostrophic vertical motion): a four-level factor with levels StrongUp, WeakUp, Neutral and Down.

CombVerMo (combined vertical motion): a four-level factor with levels StrongUp, WeakUp, Neutral and Down.

AreaMesoALS (area of meso-alpha): a four-level factor with levels StrongUp, WeakUp, Neutral and Down.

SatContMoist (satellite contribution to moisture): a four-level factor with levels VeryWet, Wet, Neutral and Dry.

... (50 more variables, see the corresponding .html file)

The lizards dataset contains the following 3 variables:

Species (the species of the lizard): a two-level factor with levels Sagrei and Distichus.

Height (perch height): a two-level factor with levels high (greater than 4.75 feet) and low (less than or equal to 4.75 feet).

Diameter (perch diameter): a two-level factor with levels narrow (greater than 4 inches) and wide (less than or equal to 4 inches).
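
Because all three lizards variables are two-level factors, the full contingency table has 2 × 2 × 2 = 8 cells. As a rough sketch of how those cells could be enumerated when tabulating counts (the names `LIZARDS_LEVELS` and `joint_configurations` are illustrative assumptions, not part of the archive):

```python
from itertools import product

# Levels for each lizards variable, copied from the list above.
LIZARDS_LEVELS = {
    "Species": ("Sagrei", "Distichus"),
    "Height": ("high", "low"),
    "Diameter": ("narrow", "wide"),
}


def joint_configurations(levels):
    """Return every combination of levels as a list of dicts, one per table cell."""
    names = list(levels)
    return [dict(zip(names, combo)) for combo in product(*levels.values())]


configs = joint_configurations(LIZARDS_LEVELS)
print(len(configs))  # 8
```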

More information about the datasets is contained in the dataset_description.html files.

 

Files

Files (2.1 MB)

Name: bnlearn_data.zip
Size: 2.1 MB
md5:f123ea701227cfd8a43996183b7c5279
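
The md5 checksum listed for the archive can be used to confirm that a download is intact. A minimal sketch using Python's standard `hashlib` module follows; it assumes the archive has been saved locally, and the helper names `md5_of_file` and `matches_record` are illustrative.

```python
import hashlib

# Checksum published in this record for bnlearn_data.zip.
EXPECTED_MD5 = "f123ea701227cfd8a43996183b7c5279"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True if the file's md5 digest equals the published checksum."""
    return md5_of_file(path) == expected
```

For example, `matches_record("bnlearn_data.zip")` would return `True` for an uncorrupted copy saved in the current working directory (the local path is an assumption).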

Additional details

Related works

Is documented by
Book chapter: 10.1007/978-1-4757-3502-4_6 (DOI)

References

  • Elidan G (2001). "Bayesian Network Repository". https://www.cs.huji.ac.il/w~galel/Repository/
  • Beinlich I, Suermondt HJ, Chavez RM, Cooper GF (1989). "The ALARM Monitoring System: A Case Study with Two Probabilistic Inference Techniques for Belief Networks". Proceedings of the 2nd European Conference on Artificial Intelligence in Medicine, 247–256.
  • Scutari M (2010). "Learning Bayesian Networks with the bnlearn R Package." Journal of Statistical Software, 35(3), 1–22. doi:10.18637/jss.v035.i03