Conference paper Open Access
When executing scientific workflows, anomalies in workflow
behavior are often caused by issues such as resource
failures in the underlying infrastructure. The provenance
information collected by workflow management systems
only captures the transformation of data at the workflow level.
Analyzing provenance information and relevant system metrics
requires expertise and manual effort. Moreover, it is often time-consuming
to aggregate this information and correlate events
occurring at different levels of the infrastructure. In this paper,
we propose an architecture to automate the integration of
workflow provenance information with performance information
from the infrastructure level. Our architecture enables workflow
developers or domain scientists to effectively browse workflow
execution information together with the system metrics, and
analyze contextual information for possible anomalies.
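
As an illustration of the kind of correlation described above, the following minimal Python sketch attaches infrastructure metric samples to workflow jobs by matching host and execution time window. It is not the architecture proposed in the paper; the record types (JobRecord, MetricSample), their fields, and the correlate function are hypothetical names chosen for this example only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class JobRecord:
    """Hypothetical workflow-level provenance record for one job."""
    job_id: str
    host: str
    start: float  # start time, seconds since some common epoch
    end: float    # end time, same clock as `start`


@dataclass
class MetricSample:
    """Hypothetical infrastructure-level metric sample."""
    host: str
    timestamp: float
    name: str
    value: float


def correlate(jobs: List[JobRecord],
              samples: List[MetricSample]) -> Dict[str, List[MetricSample]]:
    """Group metric samples under each job that ran on the same host
    while the sample was taken (inclusive time window)."""
    result: Dict[str, List[MetricSample]] = {job.job_id: [] for job in jobs}
    for job in jobs:
        for sample in samples:
            same_host = sample.host == job.host
            in_window = job.start <= sample.timestamp <= job.end
            if same_host and in_window:
                result[job.job_id].append(sample)
    return result


if __name__ == "__main__":
    jobs = [JobRecord("j1", "node1", 0.0, 10.0)]
    samples = [MetricSample("node1", 5.0, "cpu_util", 0.9)]
    for job_id, matched in correlate(jobs, samples).items():
        print(job_id, [(s.name, s.value) for s in matched])
```

The sketch assumes that both data sources share a synchronized clock and a common host identifier; in practice, aligning these is part of the aggregation effort the abstract refers to.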