Multi-Domain Outlier Detection Dataset
Creators
- 1. University of Maryland College Park
- 2. Jet Propulsion Laboratory, California Institute of Technology
Description
The Multi-Domain Outlier Detection Dataset contains datasets for conducting outlier detection experiments for four different application domains:
- Astrophysics - detecting anomalous observations in the Dark Energy Survey (DES) catalog (data type: feature vectors)
- Planetary science - selecting novel geologic targets for follow-up observation onboard the Mars Science Laboratory (MSL) rover (data type: grayscale images)
- Earth science: detecting anomalous samples in satellite time series corresponding to ground-truth observations of maize crops (data type: time series/feature vectors)
- Fashion-MNIST/MNIST: benchmark task to detect anomalous MNIST images among Fashion-MNIST images (data type: grayscale images)
Each dataset contains a "fit" dataset (used for fitting or training outlier detection models), a "score" dataset (used for scoring samples used to evaluate model performance, analogous to test set), and a label dataset (indicates whether samples in the score dataset are considered outliers or not in the domain of each dataset).
To read more about the datasets and how they are used for outlier detection, or to cite this dataset in your own work, please see the following citation:
Kerner, H. R., Rebbapragada, U., Wagstaff, K. L., Lu, S., Dubayah, B., Huff, E., Lee, J., Raman, V., and Kulshrestha, S. (2022). Domain-agnostic Outlier Ranking Algorithms (DORA)-A Configurable Pipeline for Facilitating Outlier Detection in Scientific Datasets. Under review for Frontiers in Astronomy and Space Sciences.
Files
all_datasets_dora.zip
Files
(56.5 MB)
Name | Size | Download all |
---|---|---|
md5:173ce0631377e902a1755c823e5367a3
|
56.5 MB | Preview Download |