PROCESSING IMAGE FILES USING SEQUENCE FILE IN HADOOP
Authors/Creators
Description
This paper presents MapReduce as a distributed data-processing model built on the open source Hadoop framework for handling huge volumes of data. The vast amount of data in the digital world, especially multimedia data, creates new requirements for processing and storage. As an open source distributed computing framework, Hadoop makes it possible to process large numbers of images on an unbounded set of computing nodes by providing the essential infrastructure. We have a very large number of small image files and need to remove duplicate files from the available data. Most binary formats, particularly those that are compressed or encrypted, cannot be split and must be read as a single linear stream of data. Using such files as input to a MapReduce job means that a single mapper must process the entire file, causing a potentially large performance hit. The paper proposes a splittable format, SequenceFile, and uses the MD5 algorithm to improve the performance of image processing.
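The duplicate-removal idea can be sketched independently of Hadoop: hash each file's contents with MD5 and keep only the first file seen for each digest. The snippet below is a minimal local illustration under assumed names (`deduplicate`, the in-memory `images` dict, and the file names are hypothetical, not from the paper); in the paper's setting the same digest comparison would run over image records packed into a SequenceFile.

```python
import hashlib

def md5_digest(data: bytes) -> str:
    """Return the hex MD5 digest of a byte string (one image's contents)."""
    return hashlib.md5(data).hexdigest()

def deduplicate(images: dict) -> dict:
    """Map each MD5 digest to the first file name that produced it.

    Files whose digest has already been seen are duplicates and are skipped,
    so the returned dict holds exactly one entry per distinct content.
    """
    unique = {}
    for name, data in images.items():
        digest = md5_digest(data)
        if digest not in unique:
            unique[digest] = name
    return unique

# Hypothetical in-memory "image" bytes; b.jpg and c.jpg have identical content.
images = {"a.jpg": b"\x01\x02", "b.jpg": b"\x03\x04", "c.jpg": b"\x03\x04"}
survivors = deduplicate(images)
```

Because two files with identical bytes always produce the same MD5 digest, comparing digests avoids pairwise byte-by-byte comparison of every file.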
Files (916.4 kB)
| Name | Size | Download all |
|---|---|---|
| md5:765bae98bb1b7ced859d9541443ab21e | 916.4 kB | Download |