Published December 14, 2024 | Version 1.0
Physical object
Open
In-Graph-Database Implementation of a General, Reusable Graph Schema and a Modular Data Preprocessing Pipeline For Eye-Tracking Data
Description
Data from eye-tracking experiments is commonly delivered as CSV files. To store, manipulate, and update this data in a single data model, we propose a general graph schema for all kinds of eye-tracking data. This not only allows the data to be extended and updated but also makes cooperative work possible. Because eye-tracking data is highly interconnected, a graph database is a good fit.
- We propose a general, reusable graph schema to adequately handle eye-tracking data for any use case. The schema consists of two levels: a metadata level holding additional data about test persons (age, profession, ...) and a time series level holding the eye-tracking data.
- To prepare the data for ML-based analysis, we implemented an in-graph-database data preprocessing pipeline with a human-in-the-loop approach. For each preprocessing step, at least two operators are available that can be chosen depending on the data and the use case.
All code snippets are implemented with Neo4j's query language Cypher.
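To make the two ideas above concrete without a running Neo4j instance, here is a minimal, illustrative sketch in plain Python rather than Cypher. It is not the authors' implementation. The class names, field names (`pupil_mm`, `t_ms`), the step names, and the specific operators (mean imputation, forward fill, min-max and z-score normalization) are all assumptions chosen for illustration; the record does not list which operators the pipeline actually provides.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev
from typing import Callable, Optional

Series = list[Optional[float]]          # None marks a missing measurement
Operator = Callable[[Series], Series]


@dataclass
class Sample:
    """Time series level: one eye-tracking sample (hypothetical fields)."""
    t_ms: int
    pupil_mm: Optional[float]


@dataclass
class Participant:
    """Metadata level: one test person, linked to their samples."""
    pid: str
    age: int
    profession: str
    samples: list[Sample] = field(default_factory=list)


def impute_mean(values: Series) -> Series:
    """Replace missing values with the mean of the observed values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def impute_forward_fill(values: Series) -> Series:
    """Replace missing values with the last observed value.

    Leading gaps stay None because there is nothing to carry forward.
    """
    out: Series = []
    last: Optional[float] = None
    for v in values:
        last = v if v is not None else last
        out.append(last)
    return out


def normalize_min_max(values: Series) -> Series:
    """Scale observed values to [0, 1]; missing values pass through."""
    known = [v for v in values if v is not None]
    lo, hi = min(known), max(known)
    span = hi - lo
    return [None if v is None else (0.0 if span == 0 else (v - lo) / span)
            for v in values]


def normalize_z_score(values: Series) -> Series:
    """Center on the mean and scale by the population standard deviation."""
    known = [v for v in values if v is not None]
    mu, sd = mean(known), pstdev(known)
    return [None if v is None else (0.0 if sd == 0 else (v - mu) / sd)
            for v in values]


# Each preprocessing step offers at least two interchangeable operators.
OPERATORS: dict[str, dict[str, Operator]] = {
    "impute": {"mean": impute_mean, "forward_fill": impute_forward_fill},
    "normalize": {"min_max": normalize_min_max, "z_score": normalize_z_score},
}


def run_pipeline(values: Series, choices: dict[str, str]) -> Series:
    """Apply the operator the analyst chose for each step, in the given order."""
    for step, name in choices.items():
        values = OPERATORS[step][name](values)
    return values


# Usage: the analyst (the human in the loop) picks one operator per step.
person = Participant("P01", 31, "engineer",
                     [Sample(0, 3.0), Sample(10, None), Sample(20, 5.0)])
raw = [s.pupil_mm for s in person.samples]
cleaned = run_pipeline(raw, {"impute": "mean", "normalize": "min_max"})
print(cleaned)  # [0.0, 0.5, 1.0]
```

In a graph database, `Participant` and `Sample` would correspond to nodes on the two schema levels connected by relationships, and each operator would be a query that rewrites properties in place. The registry-of-operators design keeps the steps modular, so a different operator can be swapped in per dataset without touching the rest of the pipeline.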
Files
| Name | Size | Checksum |
|---|---|---|
| In-Graph-Database_Implementation_General_Graph_Schema_and_Modular_Data_Preprocessing_Pipeline_For_Eye-Tracking_Data.pdf | 174.4 kB | md5:6460d33711317ad71705e26a84c2c42e |