A Faster RCNN Based Image Text Detection and Text to Speech Conversion

International Journal of Electronics and Communication Engineering
© 2018 by SSRG - IJECE Journal
Volume 5 Issue 5
Year of Publication : 2018
Authors : Abitha A and Lincy K
How to Cite?

Abitha A and Lincy K, "A Faster RCNN Based Image Text Detection and Text to Speech Conversion," SSRG International Journal of Electronics and Communication Engineering, vol. 5, no. 5, pp. 11-14, 2018. Crossref, https://doi.org/10.14445/23488549/IJECE-V5I5P103


Reading the text contained in images plays an important role in understanding their contents. Text found in images contains important information for indexing and retrieval, structuring, and automatic annotation of images. Text detection is therefore a crucial stage of image analysis and a well-known problem in computer vision research. It is a challenging task because of variations in text size, font, style, orientation, and alignment, as well as complex backgrounds. The goal of this system is to detect the text regions in images accurately and convert the detected text to speech. Text-to-speech conversion is performed after the text is recognized from the detected text regions. In this system, a technique based on the Faster Region-based Convolutional Neural Network (Faster R-CNN) is proposed for image text detection. The recognized text is then converted to speech using MATLAB.
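The three stages described above (detection, recognition, speech) can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions, not the authors' MATLAB implementation: the `Detection` type, the `reading_order` helper, the `min_score` threshold, and the `detect`, `recognize`, and `speak` callables are all hypothetical stand-ins for a Faster R-CNN detector, a text recognizer, and a speech engine.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


@dataclass
class Detection:
    """One candidate text region (hypothetical detector output format)."""
    box: Box
    score: float  # detector confidence in [0, 1]


def reading_order(detections: List[Detection], line_tolerance: int = 10) -> List[Detection]:
    """Sort regions top-to-bottom, then left-to-right within a line.

    Boxes whose top edges fall in the same `line_tolerance`-pixel band
    are treated as one text line (a simplifying assumption).
    """
    return sorted(detections, key=lambda d: (d.box[1] // line_tolerance, d.box[0]))


def image_to_speech(
    image: Any,
    detect: Callable[[Any], List[Detection]],   # stand-in for the text detector
    recognize: Callable[[Any, Box], str],       # stand-in for the text recognizer
    speak: Callable[[str], None],               # stand-in for the speech engine
    min_score: float = 0.5,                     # assumed confidence threshold
) -> str:
    """Run detection, recognition, and speech in sequence; return the spoken text."""
    # Stage 1: keep only confident text regions.
    detections = [d for d in detect(image) if d.score >= min_score]
    # Stage 2: recognize the text in each region, in reading order.
    words = [recognize(image, d.box) for d in reading_order(detections)]
    sentence = " ".join(w for w in words if w)
    # Stage 3: hand the recognized text to the speech engine.
    if sentence:
        speak(sentence)
    return sentence
```

Passing the three stages in as callables keeps the sketch independent of any particular detector or speech library, so each stage can be replaced or tested separately.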



Faster RCNN (Region based Convolutional Neural Network); text recognition; text to speech conversion.

