"Adaptive Radial Projection on Fourier Magnitude Spectrum for Document Image Skew Estimation"
Description
We build the DISE2021 datasets from 95 images from the DISEC2013 dataset [12], 70 images from the RDCL dataset [22], and 324 images from the RVL-CDIP dataset [14]. The composed datasets contain various types of documents, multiple languages, and typography features. First, all images are verified to be in a straight (unskewed) position. Second, we use the generating algorithm from [12] to generate skewed images in the range of −15 to +15 degrees. The dataset is split into development and test sets at a ratio of 0.7/0.3, which results in 3399 development images and 1491 testing images. When generating the skew dataset in the range of −44.9 to +44.9 degrees, we double the number of augmented images, which results in 6980 development images and 2800 testing images.
Note: These datasets are built upon three other datasets: DISEC 2013, RVL-CDIP, and RDCL 2017, so I urge you to respect their licenses.
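The generation and split steps described above can be illustrated with a minimal sketch. It is not the generating algorithm of [12]: uniform angle sampling, the shuffle-based split, the `per_image` count, and the function names `make_skew_samples` and `split_dev_test` are all assumptions made for illustration, and the actual image rotation is left out.

```python
import random


def make_skew_samples(image_names, per_image, max_angle, seed=0):
    """Pair each straight source image with `per_image` random skew angles.

    Assumption: angles are drawn uniformly from [-max_angle, +max_angle].
    The rotation itself (done with an image library) is not shown here.
    """
    rng = random.Random(seed)
    return [
        (name, rng.uniform(-max_angle, max_angle))
        for name in image_names
        for _ in range(per_image)
    ]


def split_dev_test(samples, dev_ratio=0.7, seed=0):
    """Shuffle the generated samples and split them into development/test lists."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * dev_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names; the real datasets start from 95 + 70 + 324 images.
sources = [f"image_{i:03d}.png" for i in range(489)]
samples = make_skew_samples(sources, per_image=10, max_angle=15.0)
dev, test = split_dev_test(samples)
```

Because the split rule here is assumed, the resulting set sizes are close to, but not necessarily equal to, the 3399/1491 counts reported above.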
Files (2.5 GB)
dise2021_15.zip
MD5 checksum | Size |
---|---|
md5:8afbea61c067aa1fc05e5e7b8324eabe | 1.7 GB |
md5:b0520e2a953cf8cbf35c3609d7e43099 | 738.6 MB |
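The MD5 checksums listed above can be used to check a downloaded archive for corruption. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listed_md5` are hypothetical, and the record does not state which checksum belongs to which archive, so the pairing must be confirmed on the download page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:8afb...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as `matches_listed_md5("dise2021_15.zip", "md5:...")` returns `True` only when the local file matches the listed digest. Reading in chunks keeps memory use flat for multi-gigabyte archives.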
Additional details
Software
- Repository URL: https://github.com/phamquiluan/jdeskew
- Development Status: Active
References
- Pham, Luan, et al. "Adaptive Radial Projection on Fourier Magnitude Spectrum for Document Image Skew Estimation." 2022 IEEE International Conference on Image Processing (ICIP). IEEE, 2022.