Published October 12, 2017 | Version v1
Journal article Open

On the Design of Resilient Multicloud MapReduce

  • 1. LaSIGE, Faculdade de Ciências, Universidade de Lisboa – Portugal
  • 2. INESC-ID, Instituto Superior Técnico, Universidade de Lisboa – Portugal

Description

MapReduce is a popular distributed data-processing system for analyzing big data in cloud environments. This platform is often used for critical data processing, e.g., in the context of scientific or financial simulation. Unfortunately, there is accumulating evidence of severe problems, including arbitrary faults and cloud outages, affecting the services that run atop cloud services. Faced with this challenge, we have recently explored multicloud solutions to increase the resilience and availability of MapReduce. Based on this experience, we present system design guidelines that allow MapReduce computation to scale out to multiple clouds in order to tolerate arbitrary and malicious faults, as well as cloud outages. Crucially, the techniques we introduce have reasonable cost and do not require changes to MapReduce or to the users’ code, enabling immediate deployment.

Files

ResilientMapReduce.pdf (272.0 kB)
md5:1fe953b5402d1b8c1b841785746e5a6c

Additional details

Related works

Is supplemented by
10.5281/zenodo.1049591 (DOI)

Funding

SUPERCLOUD – User-Centric Management of Security and Dependability in Clouds of Clouds (grant 643964)
European Commission