Journal article Open Access

On the Design of Resilient Multicloud MapReduce

Costa, Pedro A.R.S.; Ramos, Fernando M.V.; Correia, Miguel

MapReduce is a popular distributed data-processing system for analyzing big data in cloud environments. This platform is often used for critical data processing, e.g., in the context of scientific or financial simulation. Unfortunately, there is accumulating evidence of severe problems, including arbitrary faults and cloud outages, affecting the services that run atop clouds. Faced with this challenge, we have recently explored multicloud solutions to increase the resilience and availability of MapReduce. Based on this experience, we present system design guidelines for scaling out MapReduce computation to multiple clouds in order to tolerate arbitrary and malicious faults, as well as cloud outages. Crucially, the techniques we introduce have reasonable cost and do not require changes to MapReduce or to the users’ code, enabling immediate deployment.

Files (272.0 kB)
Name: ResilientMapReduce.pdf
Size: 272.0 kB
md5:1fe953b5402d1b8c1b841785746e5a6c