Automatic Subtitle Generation for English Language Videos

Akshay Jakhotiya, Ketan Kulkarni, Chinmay Inamdar, Bhushan Mahajan, Alka Londhe

Citation:

Akshay Jakhotiya, Ketan Kulkarni, Chinmay Inamdar, Bhushan Mahajan, Alka Londhe, "Automatic Subtitle Generation for English Language Videos," International Journal of Computer Science and Engineering, vol. 2, no. 10, pp. 5-7, 2015. Crossref, https://doi.org/10.14445/23488387/IJCSE-V2I10P102

Abstract

The use of videos for communication has grown phenomenally in the past few years. However, non-native speakers and people with hearing disabilities are unable to take full advantage of this powerful medium. To overcome the problems caused by hearing disabilities or language barriers, subtitles are provided for videos. The subtitles are provided in the form of a subtitle file, most commonly with a .srt extension. Several software tools have been developed for manually creating subtitle files; however, software for automatically generating subtitles is scarce. In this paper, we introduce a system that we envision will generate subtitles automatically through a three-stage process: audio extraction, speech recognition, and subtitle synchronization.
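
To make the target of the final stage concrete, the following is a minimal sketch of how timed text could be written out in the .srt layout (a running index, a start and end timestamp separated by `-->`, the subtitle text, and a blank line). It is not the system described in the paper: the `Cue` structure and the functions `format_timestamp` and `to_srt` are hypothetical names introduced here, and the sketch assumes that the speech recognition stage has already produced text segments with start and end times in milliseconds.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Cue:
    """One recognized utterance (hypothetical structure, not from the paper).

    Times are assumed to be milliseconds from the start of the video.
    """
    start_ms: int
    end_ms: int
    text: str


def format_timestamp(ms: int) -> str:
    """Convert milliseconds to the SRT timestamp form HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(cues: Iterable[Cue]) -> str:
    """Render cues as SRT text: index, time range, subtitle text, blank line."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(cue.start_ms)} --> {format_timestamp(cue.end_ms)}\n"
            f"{cue.text}\n"
        )
    # Joining with a newline leaves one blank line between consecutive entries.
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = [Cue(0, 1500, "Hello"), Cue(1500, 4000, "World")]
    print(to_srt(sample))
```

In a full pipeline, the string returned by `to_srt` would be saved next to the video under the same base name with a .srt extension, so that a media player can load it alongside the video.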

Keywords

Subtitles, .srt file, Audio Extraction, Speech Recognition, ffmpeg, JAVE, CMU Sphinx.
