Conference paper Open Access

Unsupervised Anomaly Detection in Data Quality Control

Poon, Lex; Farshidi, Siamak; Li, Na; Zhao, Zhiming

Citation Style Language JSON Export

  "DOI": "10.1109/BigData52589.2021.9671672", 
  "title": "Unsupervised Anomaly Detection in Data Quality Control", 
  "issued": {
    "date-parts": [
  "abstract": "<p>Data is one of the most valuable assets of an organization and has a tremendous impact on its long-term success and decision-making processes. Typically, organizational data error and outlier detection processes perform manually and reactively, making them time-consuming and prone to human errors. Additionally, rich data types, unlabeled data, and increased volume have made such data more complex. Accordingly, an automated anomaly detection approach is required to improve data management and quality control processes. This study introduces an unsupervised anomaly detection approach based on models comparison, consensus learning, and a combination of rules of thumb with iterative hyper-parameter tuning to increase data quality. Furthermore, a domain expert is considered a human in the loop to evaluate and check the data quality and to judge the output of the unsupervised model. An experiment has been conducted to assess the proposed approach in the context of a case study. The experiment results confirm that the proposed approach can improve the quality of</p>", 
  "author": [
      "family": "Poon, Lex"
      "family": "Farshidi, Siamak"
      "family": "Li, Na"
      "family": "Zhao, Zhiming"
  "id": "5872438", 
  "event-place": "Virtual", 
  "version": "camera ready", 
  "type": "paper-conference", 
  "event": "7th International Workshop on Methods to Improve Big Data Science Projects (MIDP-2021), in IEEE BigData 2021 (MIDP-2021)"

