Dataset Open Access

Synthetic Dataset for Outlier Detection

Koncar, Philipp

This synthetically generated dataset can be used to evaluate outlier detection algorithms. It has 10 attributes and 1000 observations, of which 100 are labeled as outliers. Consecutive pairs of attributes form differently shaped two-dimensional clusters:

  • Attribute 0 & Attribute 1: Two circular clusters
  • Attribute 2 & Attribute 3: Two banana-shaped clusters
  • Attribute 4 & Attribute 5: Three point clouds
  • Attribute 6 & Attribute 7: Two point clouds with different variances
  • Attribute 8 & Attribute 9: Three anisotropically shaped clusters
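
The pairing listed above can be written as a small lookup table for slicing each observation into its five two-dimensional views. This is a minimal sketch that assumes the ten attributes are stored in positional order 0 to 9. The names `VIEWS` and `project` are illustrative and are not part of the dataset, and the example row contains made-up values.

```python
# Attribute index pairs and the cluster shapes described for each pair.
VIEWS = {
    (0, 1): "two circular clusters",
    (2, 3): "two banana-shaped clusters",
    (4, 5): "three point clouds",
    (6, 7): "two point clouds with variances",
    (8, 9): "three anisotropic clusters",
}


def project(row, pair):
    """Return the 2-D point formed by the two attributes in `pair`.

    Assumes `row` holds the ten attribute values in positional order.
    """
    i, j = pair
    return (row[i], row[j])


# Illustrative observation (made-up values, not taken from the dataset).
example_row = [0.1 * k for k in range(10)]
points = {pair: project(example_row, pair) for pair in VIEWS}
```

Plotting each view separately is one way to inspect the cluster shapes before running a detector on all ten attributes.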

The "outlier" column states whether an observation is an outlier or not. Additionally, the .zip file contains 10 stratified, randomized train/test splits (70% train, 30% test).
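
To illustrate what a stratified 70/30 split means for this dataset, the sketch below splits stand-in labels so that the outlier ratio is preserved in both parts. It does not reproduce the splits shipped in the archive, whose generation procedure is not described here. The 0/1 label encoding, the seeds, and the function name `stratified_split` are assumptions made for illustration.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split indices into train/test while keeping the class ratio in both parts."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        # Collect and shuffle the indices belonging to this class only.
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


# Stand-in labels matching the stated counts: 900 inliers (0), 100 outliers (1).
# The 0/1 encoding is an assumption; the real "outlier" column may differ.
labels = [0] * 900 + [1] * 100

# Ten randomized splits, one per seed.
splits = [stratified_split(labels, seed=s) for s in range(10)]
train_idx, test_idx = splits[0]
n_test_outliers = sum(labels[i] for i in test_idx)
```

With these counts, each split holds 700 training and 300 test observations, and 30 of the test observations are outliers. In practice, scikit-learn's `StratifiedShuffleSplit` provides the same kind of split.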

Files (1.0 MB)
Name: synthetic.zip (md5:c49cec605364d14f323addd11d649733)
Size: 1.0 MB
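
The MD5 checksum listed for synthetic.zip can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The local path passed to `verify_download` is whatever location the archive was saved to, and the helper names are illustrative.

```python
import hashlib

# Checksum listed for synthetic.zip on this page.
EXPECTED_MD5 = "c49cec605364d14f323addd11d649733"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected
```

For example, `verify_download("synthetic.zip")` returns `True` only if the saved file matches the listed checksum.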