Costa, Pedro A. R. S.
Ramos, Fernando M. V.
Correia, Miguel
2017-05-14
<p>MapReduce is a framework for processing large data sets that is widely used in cloud computing. MapReduce implementations like Hadoop can tolerate crashes and file corruptions, but not arbitrary faults. Unfortunately, there is evidence that arbitrary faults do occur and can affect the correctness of MapReduce job executions. Furthermore, many outages of major cloud offerings have been reported, raising concerns about the dependence on a single cloud. In this paper we propose a novel execution system that allows MapReduce computations to scale out to a cloud-of-clouds and tolerate arbitrary faults, malicious faults, and cloud outages. Our system, Chrysaor, is based on a fine-grained replication scheme that tolerates faults at the task level. Our solution has three important properties: it tolerates the above-mentioned classes of faults at reasonable cost; it requires minimal modifications to the users’ applications; and it does not involve changes to the Hadoop source code. We performed an extensive evaluation of our system in Amazon EC2, showing that our fine-grained solution is efficient in terms of computation by recovering only faulty tasks. This is achieved without incurring a significant penalty for the baseline case (i.e., without faults) in most workloads.</p>
https://doi.org/10.5281/zenodo.814856
oai:zenodo.org:814856
Zenodo
https://doi.org/10.5281/zenodo.897490
https://zenodo.org/communities/supercloud
https://zenodo.org/communities/eu
https://doi.org/10.5281/zenodo.814855
info:eu-repo/semantics/openAccess
Creative Commons Attribution Share Alike 4.0 International
https://creativecommons.org/licenses/by-sa/4.0/legalcode
CCGrid, IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Madrid, Spain, 14-17 May 2017
Chrysaor: Fine-Grained, Fault-Tolerant Cloud-of-Clouds MapReduce
info:eu-repo/semantics/conferencePaper