Published July 31, 2022 | Version v1
Journal article Open

DEEP LEARNING BASED MODEL FOR TABLE DETECTION CONTENT AND LAYOUT ANALYSIS IN COMPRESSED DOCUMENT IMAGES - A COMPREHENSIVE APPROACH

  • 1. Assistant Professor, Department of ISE, Global College of Engineering and Technology, Bangalore, India
  • 2. Professor, Department of ISE, SDM College of Engineering and Technology, Dharwad, India
  • 3. Principal Consultant, Infosys, Brookfield WI, USA

Description

Digital data is now generated at a rapid pace in the form of digital documents. The resulting volumes give rise to big data problems that the research community needs to address. The exponential growth of ‘Big-data’ comprising images, text, audio, and video content poses many processing challenges because the content is unstructured. Because of the large number of document images, indexing and analyzing them is challenging. Since many compression techniques are available, these document images may be compressed before storage or transmission to save space and bandwidth. Compressing a document image produces a compressed document image (CDI), which is harder to process because vital information in it is lost. Moreover, recognizing the layout of these documents is an important stage in many applications, so document layout analysis and recognition is considered a promising solution for various computer vision based applications. Deep learning schemes are now widely adopted, and comparative analyses have demonstrated their accuracy. However, the accuracy of these systems degrades when the data is unstructured. To overcome this issue, we present a novel scheme for layout and content equivalence analysis in the compressed domain. The proposed approach uses a deep learning technique for table detection and a Faster R-CNN based model for identifying regions of interest (ROIs). Moreover, the model incorporates contextual information to improve the detection accuracy for each label in the ROI. The proposed approach is tested on the publicly available PubLayNet dataset. The average precision on PubLayNet is 97.50%; on DocBank, the F1-score is 97.09% and the mAP is 96.55.
The comparative analysis shows that the proposed novel method attains better performance than existing schemes.

Files

14Vol100No14.pdf

md5:fda69219d29aec0b21842ef2c4291262