Intelligent Optical Character Recognition through CNN-LSTM Fusion with Dictionary Validation
| International Journal of Electronics and Communication Engineering |
| © 2026 by SSRG - IJECE Journal |
| Volume 13 Issue 2 |
| Year of Publication : 2026 |
| Authors : Naresh Kumar, S. Aparna |
How to Cite?
Naresh Kumar, S. Aparna, "Intelligent Optical Character Recognition through CNN-LSTM Fusion with Dictionary Validation," SSRG International Journal of Electronics and Communication Engineering, vol. 13, no. 2, pp. 1-10, 2026. Crossref, https://doi.org/10.14445/23488549/IJECE-V13I2P101
Abstract:
Optical Character Recognition (OCR) has transformed text extraction and digitization and plays a key role in industries such as document processing, healthcare, and finance. Despite this progress, conventional OCR systems often struggle with mixed, low-quality, and noisy data. To overcome these limitations, a hybrid model is proposed in which a Convolutional Neural Network (CNN) extracts spatial features and a Long Short-Term Memory (LSTM) network learns the sequential dependencies. Standard data preprocessing techniques, including normalization, augmentation, and Isolation Forest-based outlier detection, are applied to streamline the input data. A finite automata model represents the flow of data and gives a structured view of the model transitions. In addition, a new confidence validation algorithm compares predictions against a medical dictionary and corrects low-confidence predictions, thereby reducing false predictions. The complete preprocessing pipeline yields a 6.3% increase in accuracy compared with simple normalization methods. The proposed methodology improves text recognition accuracy and reliability, supporting more efficient OCR systems that can be tailored to demanding real-world environments and to domain-specific applications requiring high accuracy.
Keywords:
Optical Character Recognition, Hybrid CNN-LSTM Model, Feature extraction, Finite automata, Isolation forest.
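The confidence validation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given on this page. It assumes each recognized word arrives with a confidence score and that low-confidence words are replaced by the closest dictionary entry. The function name `validate_prediction`, the sample term list, and both thresholds are hypothetical, and the string similarity measure from Python's standard `difflib` module stands in for whatever matching the paper actually uses.

```python
from difflib import get_close_matches

# Hypothetical mini-dictionary; the paper's medical dictionary is not listed here.
MEDICAL_TERMS = ["amoxicillin", "ibuprofen", "metformin", "paracetamol"]


def validate_prediction(word, confidence, dictionary=MEDICAL_TERMS,
                        threshold=0.80, cutoff=0.75):
    """Correct a low-confidence OCR word against a dictionary.

    threshold: assumed confidence below which a prediction is re-checked.
    cutoff:    assumed minimum similarity for accepting a dictionary match.
    """
    candidate = word.lower()
    # Trust confident predictions and words that are already valid terms.
    if confidence >= threshold or candidate in dictionary:
        return word
    # Otherwise look for the single closest dictionary entry.
    matches = get_close_matches(candidate, dictionary, n=1, cutoff=cutoff)
    # Keep the original word when nothing in the dictionary is close enough.
    return matches[0] if matches else word


if __name__ == "__main__":
    for word, conf in [("amoxicilin", 0.55), ("metformln", 0.95), ("xyzzy", 0.20)]:
        print(word, conf, "->", validate_prediction(word, conf))
```

In this sketch a confident prediction is never altered, even if it is misspelled, so the threshold controls the trade-off between correcting genuine errors and overwriting valid out-of-dictionary words.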
