Published January 1988
| Version v2
Dataset
Open
Asia Lung Diseases
Authors/Creators
Description
This synthetic dataset concerns lung diseases and visits to Asia. It was introduced in Lauritzen and Spiegelhalter (1988).
Task: The dataset can be used to study causal discovery algorithms.
Summary:
- Size of dataset: 5,000 x 8
- Task: Causal Discovery Problem
- Data Type: Binary Data
- Dataset Scope: Standalone Dataset
- Ground Truth: Known Graph
- Temporal Structure: Static Data
- License: CC0 (generated for bnlearn)
- Missing Values: No Missing Values
Missingness Statement: There are no missing values.
Features:
- D: Dyspnoea (yes / no)
- T: Tuberculosis (yes / no)
- L: Lung cancer (yes / no)
- B: Bronchitis (yes / no)
- A: Visit to Asia (yes / no)
- S: Smoking (yes / no)
- X: Chest X-ray (yes / no)
- E: Either tuberculosis or lung cancer (yes / no)
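The properties stated above (eight binary features, no missing values) can be checked before running any algorithm. The following is a minimal sketch using only the Python standard library. It assumes that the CSV header uses the single-letter feature names and that values are stored as the strings "yes" and "no"; neither detail is confirmed here, so adjust `EXPECTED_COLUMNS` and `ALLOWED_VALUES` if the file is encoded differently (for example as 0/1). The helper name `check_binary_table` and the two-row sample are hypothetical.

```python
import csv
import io

# Column names taken from the feature list; assumed to match the CSV header.
EXPECTED_COLUMNS = ["D", "T", "L", "B", "A", "S", "X", "E"]
# Assumed encoding of the binary values; the real file may differ.
ALLOWED_VALUES = {"yes", "no"}


def check_binary_table(handle):
    """Return (n_rows, problems) for a CSV file-like object."""
    reader = csv.DictReader(handle)
    problems = []
    header = reader.fieldnames or []
    absent = [c for c in EXPECTED_COLUMNS if c not in header]
    if absent:
        problems.append(f"missing columns: {absent}")
    n_rows = 0
    for n_rows, row in enumerate(reader, start=1):
        for col in EXPECTED_COLUMNS:
            # Empty or absent cells are reported as unexpected values.
            if col in row and row[col] not in ALLOWED_VALUES:
                problems.append(
                    f"row {n_rows}, column {col}: unexpected value {row[col]!r}"
                )
    return n_rows, problems


# Tiny hypothetical sample standing in for asia.csv.
sample = io.StringIO(
    "D,T,L,B,A,S,X,E\n"
    "yes,no,no,yes,no,yes,no,no\n"
    "no,no,no,no,no,no,no,no\n"
)
n_rows, problems = check_binary_table(sample)
print(n_rows, problems)
```

To run the same check on the real file, pass an open handle instead of the sample, for example `with open("asia.csv", newline="") as f: check_binary_table(f)`.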
Files:
- asia.csv: dataset
- ground_truth.csv: DAG used for data generation (Lauritzen and Spiegelhalter, 1988).
- asia.bif: Bayesian network from Scutari (2010), license CC BY-SA 3.0. The network, from Lauritzen and Spiegelhalter (1988), was used for data generation.
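Because the ground-truth graph is known, the output of a causal discovery algorithm can be scored edge by edge. The sketch below is one simple way to do that, not a prescribed evaluation protocol. The layout of ground_truth.csv is not described here, so the edges are hard-coded from the Asia network structure as published by Lauritzen and Spiegelhalter (1988); the file itself remains the authoritative source. The function name `compare_edges`, the `estimated` example graph, and the choice to count each reversed edge once in the distance are assumptions made for illustration.

```python
# Directed edges (parent, child) of the Asia network, using the feature codes.
TRUE_EDGES = {
    ("A", "T"), ("S", "L"), ("S", "B"),
    ("T", "E"), ("L", "E"),
    ("E", "X"), ("E", "D"), ("B", "D"),
}


def compare_edges(true_edges, estimated_edges):
    """Count correct, reversed, missing, and extra directed edges."""
    true_edges = set(true_edges)
    estimated_edges = set(estimated_edges)
    correct = true_edges & estimated_edges
    # True edges that appear in the estimate only with flipped direction.
    reversed_ = {
        (a, b) for (a, b) in true_edges - correct if (b, a) in estimated_edges
    }
    missing = true_edges - correct - reversed_
    # Estimated edges that are neither correct nor the flipped form of a
    # reversed true edge.
    extra = estimated_edges - correct - {(b, a) for (a, b) in reversed_}
    counts = {
        "correct": len(correct),
        "reversed": len(reversed_),
        "missing": len(missing),
        "extra": len(extra),
    }
    # Structural-Hamming-style distance; each reversal counts once here.
    counts["distance"] = counts["reversed"] + counts["missing"] + counts["extra"]
    return counts


# Hypothetical estimate: A->T reversed, B->D missing, S->X added.
estimated = (TRUE_EDGES - {("A", "T"), ("B", "D")}) | {("T", "A"), ("S", "X")}
result = compare_edges(TRUE_EDGES, estimated)
print(result)
```

Other conventions exist, for instance counting a reversal as two errors or comparing equivalence classes rather than DAGs, so reported scores should state which one is used.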
Additional details
Related works
- Is documented by
- 10.1007/978-1-4757-3502-4_6 (DOI)
References
- Lauritzen S, Spiegelhalter D (1988). "Local Computation with Probabilities on Graphical Structures and their Application to Expert Systems (with discussion)." Journal of the Royal Statistical Society: Series B, 50(2):157–224.
- Scutari M (2010). "Learning Bayesian Networks with the bnlearn R Package." Journal of Statistical Software, 35(3):1–22. doi:10.18637/jss.v035.i03