Discrete Wavelet Transform with Thresholding: An Effective Speech De-Noising Algorithm

International Journal of Electrical and Electronics Engineering
© 2023 by SSRG - IJEEE Journal
Volume 10 Issue 1
Year of Publication : 2023
Authors : Jagadish S.Jakati, Veereshkumar Mathad, Kiran S Nandi, Anandraddi Naduvinmani
How to Cite?

Jagadish S.Jakati, Veereshkumar Mathad, Kiran S Nandi, Anandraddi Naduvinmani, "Discrete Wavelet Transform with Thresholding: An Effective Speech De-Noising Algorithm," SSRG International Journal of Electrical and Electronics Engineering, vol. 10, no. 1, pp. 138-147, 2023. Crossref, https://doi.org/10.14445/23488379/IJEEE-V10I1P113

Abstract:

Speech enhancement has become an increasingly active research topic in speech processing. It focuses on removing additive background noise from the speech signal, since such noise significantly degrades intelligibility. The objective is to suppress the additive noise and recover the original signal. Existing enhancement methods exploit perceptual, auditory, or statistical constraints on the speech and noise signals. In practical environments, however, the characteristics of the speech signal and of the background noise are difficult to predict. Speech processing is challenging because there is no precise model of the speech signal and no perceptually meaningful distortion measure. Speech signals are also inherently non-stationary. Consequently, adaptive estimation methods that do not rely on an explicit model of the underlying signal statistics typically fail to track these changes. Speech enhancement techniques can therefore reduce noise only to a limited extent, and there is a trade-off between the amount of noise suppressed and the distortion introduced into the speech signal. This study aims to provide an efficient method for evaluating speech enhancement techniques. A further concern is how easily noise-suppression algorithms can be implemented in mobile phones and digital hearing aids. Given these limitations, new strategies are needed to improve the effectiveness of speech enhancement. Because of their efficiency, this study uses transform-domain filters for speech enhancement.
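To make the idea in the title concrete, the following is a minimal sketch of wavelet de-noising by thresholding. The abstract does not state the wavelet family, the number of decomposition levels, or the threshold rule, so everything here is an assumption chosen for illustration: a Haar wavelet, soft thresholding, and the universal threshold with a median-based noise estimate. The function names (`haar_dwt`, `denoise`, `snr_db`) and the synthetic sine test signal are hypothetical and do not come from the paper.

```python
import math
import random
import statistics


def haar_dwt(x):
    """One level of the orthonormal Haar DWT; len(x) must be even."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert one Haar level, interleaving the reconstructed samples."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(x, levels=3):
    """Multi-level Haar DWT, soft-threshold the details, then reconstruct.

    len(x) must be divisible by 2**levels. The threshold rule below
    (universal threshold) is an assumption, not the paper's stated choice.
    """
    approx = list(x)
    details = []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    # Estimate the noise standard deviation from the finest detail band.
    sigma = statistics.median(abs(c) for c in details[0]) / 0.6745
    t = sigma * math.sqrt(2.0 * math.log(len(x)))
    details = [soft_threshold(d, t) for d in details]
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx


def snr_db(clean, estimate):
    """Signal-to-noise ratio in dB of an estimate against the clean signal."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    return 10.0 * math.log10(signal_power / noise_power)


# Synthetic stand-in for speech: a slow sine wave plus white Gaussian noise.
random.seed(0)
n = 1024
clean = [math.sin(2 * math.pi * 4 * i / n) for i in range(n)]
noisy = [c + random.gauss(0.0, 0.3) for c in clean]
restored = denoise(noisy, levels=4)
print("SNR before:", snr_db(clean, noisy))
print("SNR after: ", snr_db(clean, restored))
```

A real speech signal is far less smooth than this test tone, so the wavelet, level count, and threshold would need tuning. The sketch also illustrates the trade-off the abstract mentions: a larger threshold removes more noise but also discards more of the signal's own detail coefficients.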

Keywords:

Speech processing, DCT, Discrete Wavelet Transform, Thresholding, Signal-to-noise ratio, MOS, LLR, ISD.
