Report Open Access
Erasure coding, a new feature in HDFS, can reduce storage overhead by approximately 50%
compared to replication while maintaining the same durability guarantees. This would save a
substantial amount of the disk capacity needed by projects hosted in the CERN IT Hadoop service.
The goal of the project is to evaluate the new features of Hadoop 3 and assess its readiness
for production systems. This includes installing and configuring a test Hadoop 3 cluster,
copying production data to it, and conducting multiple performance tests on the data.
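
The figure of approximately 50% can be checked with simple arithmetic. The sketch below is illustrative only and rests on assumptions not stated in this report: 3-way replication as the baseline and a Reed-Solomon layout with 6 data units and 3 parity units. The function names are hypothetical, and the calculation ignores effects such as padding of small files.

```python
def raw_bytes_replication(logical_bytes, replicas=3):
    """Raw disk usage when every block is stored `replicas` times."""
    return logical_bytes * replicas


def raw_bytes_erasure_coded(logical_bytes, data_units=6, parity_units=3):
    """Raw disk usage when each group of `data_units` data units
    is stored together with `parity_units` parity units."""
    return logical_bytes * (data_units + parity_units) / data_units


def saving_fraction(replicas=3, data_units=6, parity_units=3):
    """Fraction of raw capacity saved by erasure coding relative to replication."""
    rep = raw_bytes_replication(1.0, replicas)
    ec = raw_bytes_erasure_coded(1.0, data_units, parity_units)
    return 1.0 - ec / rep


if __name__ == "__main__":
    logical_tb = 100.0  # assumed dataset size, for illustration only
    print("replication  :", raw_bytes_replication(logical_tb), "TB raw")
    print("erasure coded:", raw_bytes_erasure_coded(logical_tb), "TB raw")
    print("saving       :", saving_fraction())
```

Under these assumptions, replication needs 3 units of raw storage per unit of data and the erasure-coded layout needs 1.5, so the raw footprint is halved. Both layouts tolerate the loss of some stored units (two replicas in one case, any three units of a group in the other), which is the sense in which durability is preserved.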