Provenance-enhanced Root Cause Analysis for Jupyter Notebooks
- 1. University of Amsterdam
Description
As Jupyter notebooks become more common in scientific research, more notebook-based use cases are becoming distributed. This trend makes it harder to analyze anomalies and debug notebooks. Provenance data can supply context around an anomaly and make its root cause easier to find. However, provenance is rarely investigated in the context of distributed Jupyter notebooks. In this paper, we propose a framework that integrates two data types: provenance, and performance anomalies detected from performance data. We use the combined information to show the end user, visually, the provenance at the time of the anomaly and the root cause of the anomaly. We build and evaluate the framework with a notebook extended with anomaly-generating functions. The generated anomalies were detected automatically, and combining provenance with anomaly information yields a valuable subset of the provenance data around the time an anomaly occurred. Our experiments show that this subset gives the anomaly a clear and confined context and enables the framework to find the root cause of performance anomalies in Jupyter notebooks.
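The core idea in the description, combining detected anomalies with provenance to obtain the subset of provenance around the time of an anomaly, can be illustrated with a minimal sketch. This is not the paper's implementation: the `ProvenanceRecord` fields, the z-score detector, the threshold, and the window sizes are all assumptions chosen for illustration, since the description does not specify the detection method or the provenance schema.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class ProvenanceRecord:
    """Hypothetical provenance entry; the real schema is not given in the description."""
    timestamp: float  # seconds since the notebook run started (assumed unit)
    cell_id: str      # assumed identifier of the executed notebook cell
    activity: str     # short label of what the cell did


def detect_anomalies(samples, threshold=2.0):
    """Return timestamps of samples whose value lies more than `threshold`
    population standard deviations from the mean (a simple z-score detector,
    used here only as a stand-in for the framework's anomaly detection)."""
    values = [value for _, value in samples]
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # constant series: nothing stands out
    return [t for t, value in samples if abs(value - mu) / sigma > threshold]


def provenance_window(records, anomaly_time, before=2.5, after=0.5):
    """Select provenance records in [anomaly_time - before, anomaly_time + after]."""
    start, end = anomaly_time - before, anomaly_time + after
    return [r for r in records if start <= r.timestamp <= end]


# Illustrative data only: (timestamp, CPU usage in percent) and matching provenance.
cpu_samples = [(0, 10), (1, 11), (2, 10), (3, 9), (4, 10), (5, 95), (6, 10), (7, 11)]
provenance = [
    ProvenanceRecord(0.5, "c1", "load data"),
    ProvenanceRecord(3.0, "c2", "transform"),
    ProvenanceRecord(4.8, "c3", "train model"),
    ProvenanceRecord(6.5, "c4", "plot results"),
]

for t in detect_anomalies(cpu_samples):
    context = provenance_window(provenance, t)
    print(t, [r.cell_id for r in context])
```

Restricting provenance to a window around each anomaly is what keeps the context small enough to inspect; how wide that window should be is a tuning choice that this sketch simply fixes by hand.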
Files
Name | Size |
---|---|
2002.conference.ucc.intel4.camera.pdf (md5:435cc0f4e707f1e9897b9d19fb7e8337) | 524.5 kB |
Additional details
Funding
- Blue Cloud – Blue-Cloud: Piloting innovative services for Marine Research & the Blue Economy (grant 862409), European Commission
- ARTICONF – smART socIal media eCOsytstem in a blockchaiN Federated environment (grant 825134), European Commission
- ENVRI-FAIR – ENVironmental Research Infrastructures building Fair services Accessible for society, Innovation and Research (grant 824068), European Commission