Published September 30, 2020 | Version v1
Video/Audio
Open
Data Lakes for Digital Humanities
- 1. Université de Lyon, Lyon 2, ERIC UR 3083
- 2. Université de Lyon, Lyon 2, Laboratoire Cogitamus
Description
Traditional data in Digital Humanities projects come in various formats (structured, semi-structured, textual) and need substantial transformations (encoding and tagging, stemming, lemmatization, etc.) before they can be managed and analyzed. To fully master this process, we propose the use of data lakes as a solution to data siloing and big data variety problems. We describe data lake projects we currently run in close collaboration with researchers in humanities and social sciences and discuss the lessons learned from running these projects.
Files
Name | Size |
---|---|
ddh20-darmont-favre-loudcher-nous.mp4 (md5:c8cdb963c76b8f80787eed9c11f4ab5e) | 129.3 MB |
Additional details
Related works
- Is derived from
- Conference paper: 10.1145/3423603.3424004 (DOI)
References
- P. Liu, S. Loudcher, J. Darmont, E. Perrin, J.P. Girard, M.O. Rousset, "Metadata model for an archeological data lake", Digital Humanities (DH 2020), Ottawa, Canada, July 2020 (https://dh2020.adho.org/).
- P.N. Sawadogo, E. Scholly, C. Favre, E. Ferey, S. Loudcher, J. Darmont, "Metadata Systems for Data Lakes: Models and Features", 1st International Workshop on BI and Big Data Applications (BBIGAP@ADBIS 2019), Bled, Slovenia, September 2019; Communications in Computer and Information Science, Vol. 1064, Springer, Heidelberg, Germany, 440-451.
- P.N. Sawadogo, T. Kibata, J. Darmont, "Metadata Management for Textual Documents in Data Lakes", 21st International Conference on Enterprise Information Systems (ICEIS 2019), Heraklion, Crete-Greece, May 2019, 72-83; INSTICC, Setúbal, Portugal (Vol. 1).
- P.N. Sawadogo, J. Darmont, "On Data Lake Architectures and Metadata Management", Journal of Intelligent Information Systems, 2020.