Hybrid Approach for Word Recognition in Bilingual Natural Scene Images

International Journal of Electronics and Communication Engineering
© 2025 by SSRG - IJECE Journal
Volume 12 Issue 9
Year of Publication : 2025
Authors : Venkata B Hangarage, Gururaj Mukarambi
How to Cite?

Venkata B Hangarage, Gururaj Mukarambi, "Hybrid Approach for Word Recognition in Bilingual Natural Scene Images," SSRG International Journal of Electronics and Communication Engineering, vol. 12, no. 9, pp. 98-107, 2025. Crossref, https://doi.org/10.14445/23488549/IJECE-V12I9P108

Abstract:

India is a multilingual country in which languages such as Kannada and English are written in scripts derived from Brahmi and Latin, respectively. In the Indian context, Bilingual, Trilingual, and Multilingual word recognition in natural scene images is needed to meet the requirements of a multilingual OCR system. Hence, a novel hybrid approach is proposed in which features are extracted with ResNet50, a 50-layer deep learning architecture that uses residual learning through skip connections to enable efficient training of very deep networks. The resulting feature set of size 2048 is reduced to 60 potential features by applying PCA (Principal Component Analysis). A dataset of 12,082 real-world sample images is collected, covering diverse scenarios where bilingual text appears in various orientations, fonts, and complex backgrounds. Two experimental setups are carried out in this paper: the hybrid approach without PCA and the hybrid approach with PCA. Using an SVM (Support Vector Machine) classifier, the recognition accuracy was 97.48% without PCA and 97.56% with PCA. To test the performance of the ResNet50 model, it is compared with other pre-trained models, namely VGG16, GoogLeNet, MobileNet, EfficientNet, and Vision Transformer, and an optimum kernel, RBF, is then selected for the SVM classifier to demonstrate the efficiency of the models. The novelty of this paper is the dimensionality reduction of weak features. The time complexity of the SVM classifier for training and testing is reduced from 22% to 10%. The work also demonstrates the capability of deep learning models to handle the complexities of bilingual word recognition and provides an effective solution in a multilingual environment.
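
The two experimental setups described in the abstract (SVM on the full 2048 features, and SVM on 60 PCA components) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn and NumPy are available, replaces the ResNet50 features with synthetic random vectors of the same size, and uses two hypothetical classes. The feature standardization step, the `gamma="scale"` setting, and the 75/25 split are also assumptions, so the printed accuracies say nothing about the reported 97.48% and 97.56%.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Assumption: synthetic stand-in for 2048-dimensional ResNet50 features.
# Two hypothetical classes, each a noisy cloud around its own center.
n_per_class, n_features = 200, 2048
centers = rng.normal(0.0, 1.0, size=(2, n_features))
X = np.vstack(
    [c + rng.normal(0.0, 3.0, size=(n_per_class, n_features)) for c in centers]
)
y = np.repeat([0, 1], n_per_class)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def build_model(use_pca):
    """RBF-kernel SVM, optionally preceded by PCA down to 60 components."""
    steps = [StandardScaler()]  # assumption: features are standardized first
    if use_pca:
        steps.append(PCA(n_components=60, random_state=0))
    steps.append(SVC(kernel="rbf", gamma="scale"))
    return make_pipeline(*steps)


results = {}  # use_pca -> test accuracy
dims = {}     # use_pca -> number of features seen by the SVM
for use_pca in (False, True):
    model = build_model(use_pca).fit(X_train, y_train)
    results[use_pca] = model.score(X_test, y_test)
    dims[use_pca] = model[-1].n_features_in_
    label = "with PCA" if use_pca else "without PCA"
    print(f"{label}: {dims[use_pca]} features, accuracy {results[use_pca]:.4f}")
```

Putting PCA inside the pipeline means the projection is fitted on the training split only, so the test images do not influence the choice of the 60 components.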

Keywords:

Word Recognition, Bilingual, ResNet50, PCA.
