Published August 29, 2019 | Version 1.0
Conference paper | Open access

Optimum Interval for Application-level Checkpoints

  • 1. Imperial College London, London, United Kingdom
  • 2. Institute of Theoretical & Applied Informatics, Polish Academy of Sciences, Gliwice, Poland

Description

Checkpointing is commonly adopted to enhance the performance of software applications that operate in the presence of failures. Among the existing checkpointing strategies, Application-level Checkpoint and Restart (ALCR) is considered the most efficient, since it leaves a smaller memory footprint, but it requires significant development effort. Although existing ALCR tools and libraries reduce the effort required to implement checkpoints, they do not provide recommendations regarding the inter-checkpoint interval. To this end, in this paper we develop a mathematical model to estimate the optimum checkpoint interval, i.e., the interval between two successive checkpoints that minimises the average execution time of the application. The case of programs with loops and nested loops is also discussed. The results are illustrated with several numerical examples.
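
To give a feel for what "optimum checkpoint interval" means, the following is a minimal sketch that does not reproduce the paper's model (which is not stated in this description). It assumes exponentially distributed failures, a fixed checkpoint cost, and a failure-free restart, and it compares Young's well-known first-order approximation, sqrt(2 * C * MTBF), with a brute-force search over candidate intervals. All function names and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def young_interval(checkpoint_cost: float, mtbf: float) -> float:
    """Young's first-order approximation of the optimum interval.

    This is a classical textbook formula, swapped in here for illustration;
    it is NOT the model developed in the paper.
    """
    if checkpoint_cost <= 0 or mtbf <= 0:
        raise ValueError("checkpoint_cost and mtbf must be positive")
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


def expected_segment_time(interval: float, checkpoint_cost: float,
                          mtbf: float, restart_cost: float = 0.0) -> float:
    """Expected wall-clock time to finish `interval` units of useful work
    followed by one checkpoint.

    Assumptions (illustrative): failures arrive as a Poisson process with
    rate 1/mtbf, a failure rolls execution back to the last checkpoint,
    and the restart itself cannot fail.
    """
    rate = 1.0 / mtbf
    return (mtbf + restart_cost) * math.expm1(rate * (interval + checkpoint_cost))


def overhead_ratio(interval: float, checkpoint_cost: float,
                   mtbf: float, restart_cost: float = 0.0) -> float:
    """Expected wall-clock time per unit of useful work (lower is better)."""
    return expected_segment_time(interval, checkpoint_cost, mtbf, restart_cost) / interval


def brute_force_interval(checkpoint_cost: float, mtbf: float,
                         restart_cost: float = 0.0,
                         max_interval: int = 20000) -> int:
    """Scan whole-second intervals and return the one with the lowest overhead."""
    return min(range(1, max_interval + 1),
               key=lambda t: overhead_ratio(t, checkpoint_cost, mtbf, restart_cost))


if __name__ == "__main__":
    # Hypothetical values: 60 s per checkpoint, one failure per day on average.
    cost, mtbf = 60.0, 86400.0
    print("Young's approximation (s):", round(young_interval(cost, mtbf), 1))
    print("Brute-force optimum (s):  ", brute_force_interval(cost, mtbf))
```

When the checkpoint cost is small relative to the mean time between failures, the two estimates should be close; the brute-force scan is included only to show that the interval is the minimiser of an expected-time expression, which is the general idea the description refers to.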

Files

PCSCloud__Optimum_Interval.pdf (454.6 kB)
md5:fe22161de1e453f4c0fbaff9c94ed201

Additional details

Funding

European Commission: SDK4ED – Software Development toolKit for Energy optimization and technical Debt elimination (grant 780572)