Published June 27, 2024 | Version v1
Dataset Open

"Adaptive Radial Projection on Fourier Magnitude Spectrum for Document Image Skew Estimation"

Creators

Description

We build the DISE2021 datasets from 95 images from the DISEC2013 dataset [12], 70 images from the RDCL dataset [22], and 324 images from the RVL-CDIP dataset [14]. The composed dataset contains various document types, multiple languages, and varied typographic features. First, all images are verified to be upright (zero skew). Second, we use the generation algorithm from [12] to produce skewed images with skew angles in the range −15° to +15°. The dataset is split into development and test sets at a ratio of 0.7/0.3, resulting in 3399 development images and 1491 test images. When generating the skewed dataset in the range −44.9° to +44.9°, we double the number of augmented images, resulting in 6980 development images and 2800 test images.
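The counts above can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check on the stated numbers, not part of the dataset tooling; the helper name `summarize` is hypothetical, and the number of skewed images per source image is inferred from the totals rather than stated in the description.

```python
# Source image counts as stated in the description.
sources = {"DISEC2013": 95, "RDCL": 70, "RVL-CDIP": 324}
n_sources = sum(sources.values())

# (development, test) sizes as stated, keyed by maximum absolute skew angle.
splits = {15.0: (3399, 1491), 44.9: (6980, 2800)}


def summarize(dev, test, n_sources):
    """Return the total size, skewed images per source image, and dev share."""
    total = dev + test
    return {
        "total": total,
        "per_source": total / n_sources,  # inferred, not stated in the text
        "dev_ratio": round(dev / total, 3),
    }


for max_angle, (dev, test) in splits.items():
    print(max_angle, summarize(dev, test, n_sources))
```

Under these figures, the 489 source images correspond to 10 skewed images each in the ±15° set and 20 each in the ±44.9° set, which is consistent with the "double" wording; the realized development shares are close to, but not exactly, 0.7.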

Note: These datasets are built upon three other datasets: DISEC 2013, RVL-CDIP, and RDCL 2017. I urge you to respect their licenses.

Files

dise2021_15.zip

Files (2.5 GB)

Checksum Size
md5:8afbea61c067aa1fc05e5e7b8324eabe 1.7 GB
md5:b0520e2a953cf8cbf35c3609d7e43099 738.6 MB
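The md5 values listed for the files can be used to verify a download. The following is a minimal sketch using Python's standard `hashlib`; the function names and the local file path in the usage note are hypothetical, and which checksum belongs to which archive should be confirmed on the record page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's md5 against a value such as 'md5:8afb...'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("downloads/archive.zip", "md5:b0520e2a953cf8cbf35c3609d7e43099")` returns `True` only if the local file (path assumed here) is byte-identical to the published one.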

Additional details

Software

Repository URL
https://github.com/phamquiluan/jdeskew
Development Status
Active

References

  • Pham, Luan, et al. "Adaptive Radial Projection on Fourier Magnitude Spectrum for Document Image Skew Estimation." 2022 IEEE International Conference on Image Processing (ICIP). IEEE, 2022.