Conference paper Open Access

Unsupervised Anomaly Detection in Data Quality Control

Poon, Lex; Farshidi, Siamak; Li, Na; Zhao, Zhiming


JSON-LD (schema.org) Export

{
  "description": "<p>Data is one of the most valuable assets of an organization and has a tremendous impact on its long-term success and decision-making processes. Typically, organizational data error and outlier detection processes are performed manually and reactively, making them time-consuming and prone to human errors. Additionally, rich data types, unlabeled data, and increased volume have made such data more complex. Accordingly, an automated anomaly detection approach is required to improve data management and quality control processes. This study introduces an unsupervised anomaly detection approach based on models comparison, consensus learning, and a combination of rules of thumb with iterative hyper-parameter tuning to increase data quality. Furthermore, a domain expert is considered a human in the loop to evaluate and check the data quality and to judge the output of the unsupervised model. An experiment has been conducted to assess the proposed approach in the context of a case study. The experiment results confirm that the proposed approach can improve the quality of</p>", 
  "license": "https://creativecommons.org/licenses/by/4.0/legalcode", 
  "creator": [
    {
      "affiliation": "University of Amsterdam", 
      "@type": "Person", 
      "name": "Poon, Lex"
    }, 
    {
      "affiliation": "University of Amsterdam", 
      "@type": "Person", 
      "name": "Farshidi, Siamak"
    }, 
    {
      "affiliation": "University of Amsterdam", 
      "@type": "Person", 
      "name": "Li, Na"
    }, 
    {
      "affiliation": "University of Amsterdam", 
      "@id": "https://orcid.org/0000-0002-6717-9418", 
      "@type": "Person", 
      "name": "Zhao, Zhiming"
    }
  ], 
  "headline": "Unsupervised Anomaly Detection in Data Quality Control", 
  "image": "https://zenodo.org/static/img/logos/zenodo-gradient-round.svg", 
  "datePublished": "2021-12-15", 
  "url": "https://zenodo.org/record/5872438", 
  "version": "camera ready", 
  "@type": "ScholarlyArticle", 
  "keywords": [
    "data quality", 
    "unsupervised learning", 
    "data quality control", 
    "data quality assessment", 
    "anomaly detection", 
    "automated data quality control"
  ], 
  "@context": "https://schema.org/", 
  "identifier": "https://doi.org/10.1109/BigData52589.2021.9671672", 
  "@id": "https://doi.org/10.1109/BigData52589.2021.9671672", 
  "workFeatured": {
    "url": "http://www.midp-info.org/", 
    "alternateName": "MIDP-2021", 
    "location": "Virtual", 
    "@type": "Event", 
    "name": "7th International Workshop on Methods to Improve Big Data Science Projects (MIDP-2021), in IEEE BigData 2021"
  }, 
  "name": "Unsupervised Anomaly Detection in Data Quality Control"
}
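
The abstract above mentions consensus learning across several unsupervised models. The following is a minimal sketch of that general idea only, not the authors' implementation: three simple statistical detectors (z-score, IQR, and median absolute deviation) stand in for the models, and a point is flagged when at least `min_votes` of them agree. All function names, thresholds, and the sample data are assumptions made for illustration.

```python
from statistics import mean, median, pstdev, quantiles


def zscore_flags(values, threshold=3.0):
    """Flag points whose population z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]


def iqr_flags(values, k=1.5):
    """Flag points outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v < low or v > high for v in values]


def mad_flags(values, threshold=3.5):
    """Flag points by the modified z-score based on the median absolute deviation."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def consensus_flags(values, detectors, min_votes=2):
    """Flag a point when at least min_votes detectors mark it as anomalous."""
    votes = [detector(values) for detector in detectors]
    return [sum(column) >= min_votes for column in zip(*votes)]


if __name__ == "__main__":
    # Hypothetical sample: one value far from the rest.
    sample = [10, 11, 9, 10, 12, 10, 11, 9, 10, 100]
    detectors = [zscore_flags, iqr_flags, mad_flags]
    flags = consensus_flags(sample, detectors, min_votes=2)
    print([v for v, flagged in zip(sample, flags) if flagged])
```

Requiring agreement between detectors reduces the influence of any single detector's blind spots. In a human-in-the-loop setting such as the one the abstract describes, the flagged points would presumably be passed to a domain expert for review rather than removed automatically.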
Views 35
Downloads 46
Data volume 133.4 MB
Unique views 28
Unique downloads 45
