Planned intervention: On Wednesday April 3rd 05:30 UTC Zenodo will be unavailable for up to 2-10 minutes to perform a storage cluster upgrade.
Published May 14, 2017 | Version v1
Conference paper Open

Chrysaor: Fine-Grained, Fault-Tolerant Cloud-of-Clouds MapReduce

  • 1. Universidade de Lisboa

Description

MapReduce is a framework for processing large data sets much used in the context of cloud computing. MapReduce implementations like Hadoop can tolerate crashes and file corruptions, but not arbitrary faults. Unfortunately, there is evidence that arbitrary faults do occur and can affect the correctness of MapReduce job executions. Furthermore, many outages of major cloud offerings have been reported, raising concerns about the dependence on a single cloud. In this paper we propose a novel execution system that allows to scale out MapReduce computations to a cloud-of-clouds and tolerate arbitrary faults, malicious faults, and cloud outages. Our system, Chrysaor, is based on a fine-grained replication scheme that tolerates faults at the task level. Our solution has three important properties: it tolerates the above-mentioned classes of faults at reasonable cost; it requires minimal modifications to the users’ applications; and it does not involve changes to the Hadoop source code.We performed an extensive evaluation of our system in Amazon EC2, showing that our fine-grained solution is efficient in terms of computation by recovering only faulty tasks. This is achieved without incurring a significant penalty for the baseline case (i.e., without faults) in most workloads.

Files

Document_for_Publication-2017_Costa.pdf

Files (446.0 kB)

Name Size Download all
md5:6501ce97c2b943ff6d7511214cd2496f
446.0 kB Preview Download

Additional details

Related works

Is supplemented by
10.5281/zenodo.897490 (DOI)

Funding

SUPERCLOUD – USER-CENTRIC MANAGEMENT OF SECURITY AND DEPENDABILITY IN CLOUDS OF CLOUDS 643964
European Commission