Unsupervised Anomaly Detection in Data Quality Control
Description
Data is one of the most valuable assets of an
organization and has a tremendous impact on its long-term
success and decision-making processes. Typically, organizational
data error and outlier detection processes are performed manually and
reactively, making them time-consuming and prone to human error.
Additionally, rich data types, unlabeled data, and increased
volume have made such data more complex. Accordingly, an
automated anomaly detection approach is required to improve
data management and quality control processes. This study
introduces an unsupervised anomaly detection approach based
on model comparison, consensus learning, and a combination of
rules of thumb with iterative hyper-parameter tuning to increase
data quality. Furthermore, a domain expert serves as a
human in the loop to evaluate and check the data quality and to
judge the output of the unsupervised model. An experiment has
been conducted to assess the proposed approach in the context of
a case study. The experimental results confirm that the proposed
approach can improve the quality of
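
The consensus idea described above can be illustrated with a minimal sketch. It is not the method from the paper: the three detectors (z-score, interquartile range, median absolute deviation), their thresholds, and the names `consensus_anomalies` and `min_votes` are all assumptions chosen for illustration. Each detector is a common rule of thumb for outliers, and a value is reported only when enough detectors agree, so that a domain expert reviews a shorter, higher-confidence list.

```python
from statistics import mean, median, pstdev, quantiles


def zscore_flags(values, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]


def iqr_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v < low or v > high for v in values]


def mad_flags(values, threshold=3.5):
    """Flag values whose modified z-score (based on the MAD) exceeds `threshold`."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def consensus_anomalies(values, min_votes=2):
    """Return indices flagged by at least `min_votes` of the detectors.

    `min_votes` is a hypothetical tuning knob: raising it trades recall
    for precision before the results reach a human reviewer.
    """
    votes = [zscore_flags(values), iqr_flags(values), mad_flags(values)]
    return [
        i for i in range(len(values))
        if sum(flags[i] for flags in votes) >= min_votes
    ]


if __name__ == "__main__":
    readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 100]
    print(consensus_anomalies(readings))  # [9]
```

In this example the z-score detector alone misses the extreme reading, because a single large value inflates the standard deviation, while the two robust detectors both flag it. Requiring agreement between detectors is one simple way to compare models without labeled data.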
Files
2021.workshop.bigdata.midp21.camera.pdf (2.9 MB)
md5:51cca856e3286b38b7b738f7d43e9a86
Additional details
Funding
- CLARIFY – CLoud ARtificial Intelligence For pathologY (860627), European Commission
- Blue Cloud – Blue-Cloud: Piloting innovative services for Marine Research & the Blue Economy (862409), European Commission
- ARTICONF – smART socIal media eCOsytstem in a blockchaiN Federated environment (825134), European Commission
- ENVRI-FAIR – ENVironmental Research Infrastructures building Fair services Accessible for society, Innovation and Research (824068), European Commission