Advancements in OCR: A Deep Learning Algorithm for Enhanced Text Recognition
- 1. Department of Mathematics, Birla Institute of Technology and Science, Pilani (Rajasthan), India.
Abstract: Optical Character Recognition (OCR) has evolved significantly with the rise of deep learning. In this paper, we present an OCR algorithm that uses deep learning to improve text recognition accuracy. Traditional OCR methods struggle with complex layouts, noisy images, and diverse fonts, which limits their overall performance. Our proposed algorithm addresses these challenges with a deep neural network that combines convolutional and recurrent layers. The model is trained on large-scale datasets, which enables it to learn intricate patterns and features and yields robust recognition. We also introduce an attention mechanism that helps the model focus on critical text regions, improving both accuracy and efficiency. Through extensive experiments on benchmark datasets, we demonstrate that our deep learning-based OCR algorithm outperforms conventional approaches. It achieves state-of-the-art performance on various OCR tasks, including multilingual text recognition and document digitization. We further analyse the algorithm's behaviour under challenging scenarios such as low-resolution inputs and difficult environmental conditions. The findings contribute to the field of OCR and open avenues for real-world applications in document analysis, text extraction, and content digitization. Overall, integrating deep learning into OCR shows its potential to revolutionise text recognition, pushing the boundaries of accuracy and efficiency in this domain.
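The abstract does not specify what form the attention mechanism takes. As a rough illustration only, the sketch below shows generic scaled dot-product attention over a sequence of feature vectors, which is one common way a recognition model can weight some image regions more than others. The function names (`softmax`, `attend`), the feature values, and the query vector are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, features):
    """Scaled dot-product attention (an assumed form, not the paper's).

    Scores each feature vector by its similarity to the query, then
    returns the attention weights and the weighted sum (context vector).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * f for q, f in zip(query, vec)) / scale for vec in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(w * vec[i] for w, vec in zip(weights, features)) for i in range(dim)]
    return weights, context


# Hypothetical feature columns, e.g. from a convolutional encoder (made-up values).
features = [[0.1, 0.0], [2.0, 1.5], [0.2, 0.1]]
# Hypothetical decoder state used as the query.
query = [1.0, 1.0]

weights, context = attend(query, features)
```

In this toy input the second feature column is most similar to the query, so it receives the largest weight and dominates the context vector. In a full model the context vector would typically feed a recurrent decoder step, but that wiring is beyond what the abstract describes.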
- ISSN: 2319-9598 (Online)
- Retrieval Number: 100.1/ijies.F42630812623
- Journal Website: www.ijies.org
- Publisher: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP)