Published August 1, 2013 | Version 16836
Journal article Open

Optimizing Hadoop Block Placement Policy and Cluster Blocks Distribution

Description

The current Hadoop block placement policy does not fairly and evenly distribute replicas of blocks written to datanodes in a Hadoop cluster.

This paper presents a new solution that helps keep the cluster in a balanced state while an HDFS client is writing data to a file in a Hadoop cluster. The solution has been implemented, and tests have been conducted to evaluate its contribution to the Hadoop distributed file system.

The solution was found to lower the global execution time taken by the Hadoop balancer to 22 percent. It was also found that the Hadoop balancer over-replicates 1.75 percent and 3.3 percent of all redistributed blocks in the modified and original Hadoop clusters, respectively.

The feature that keeps the cluster in a balanced state works as a core part of the Hadoop system and not just as a utility like the traditional balancer. This is one of the significant achievements and unique aspects of the solution developed during the course of this research work.
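
The description above does not spell out the placement mechanism, so the following is only an illustrative sketch of what balancing at write time could look like, not the paper's implementation. It assumes a simplified model in which each replica of a new block goes to the least-utilized datanodes that still have room. The names `DataNode`, `choose_targets`, and `utilization_spread` are hypothetical, and rack awareness, which the real HDFS placement policy takes into account, is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class DataNode:
    """Hypothetical stand-in for an HDFS datanode (not a Hadoop class)."""
    name: str
    capacity: int   # total bytes available on the node
    used: int = 0   # bytes currently stored

    @property
    def utilization(self) -> float:
        return self.used / self.capacity


def choose_targets(nodes, block_size, replication=3):
    """Pick `replication` distinct nodes with the lowest utilization.

    Assumption: placement is driven by utilization alone; rack topology
    and network distance are ignored in this sketch.
    """
    candidates = [n for n in nodes if n.capacity - n.used >= block_size]
    if len(candidates) < replication:
        raise ValueError("not enough datanodes with free space")
    # Sort by utilization; the node name is a deterministic tie-breaker.
    targets = sorted(candidates, key=lambda n: (n.utilization, n.name))[:replication]
    for node in targets:
        node.used += block_size  # account for the replica just written
    return targets


def utilization_spread(nodes):
    """Gap between the most and least utilized node (0.0 means balanced)."""
    values = [n.utilization for n in nodes]
    return max(values) - min(values)


if __name__ == "__main__":
    cluster = [
        DataNode("a", 1000),
        DataNode("b", 1000),
        DataNode("c", 1000),
        DataNode("d", 1000, used=500),  # one node starts half full
    ]
    for _ in range(5):
        choose_targets(cluster, block_size=100)
    print(utilization_spread(cluster))
```

In this toy cluster the writes are steered away from the half-full node until the others catch up, so the utilization gap narrows as data is written instead of being corrected afterwards by a separate balancer run.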

Files

16836.pdf

