An Analysis of Machine Learning Classifiers for Stress Detection Using Audio Features from TESS and RAVDESS Datasets

International Journal of Electrical and Electronics Engineering
© 2025 by SSRG - IJEEE Journal
Volume 12 Issue 7
Year of Publication: 2025
Authors: Smita Sagar Patil, Meena Chavan
How to Cite?
Smita Sagar Patil, Meena Chavan, "An Analysis of Machine Learning Classifiers for Stress Detection Using Audio Features from TESS and RAVDESS Datasets," SSRG International Journal of Electrical and Electronics Engineering, vol. 12, no. 7, pp. 123-146, 2025. Crossref, https://doi.org/10.14445/23488379/IJEEE-V12I7P109
Abstract:
This research addresses the need for accurate stress detection from speech signals by applying Machine Learning (ML) approaches to two distinct datasets: RAVDESS and TESS. Stress detection is pivotal in mental health monitoring and human-computer interaction; however, existing solutions often fail to generalize across datasets because the emotional complexity of the recordings varies. The research gap lies in developing robust ML frameworks capable of handling nuanced emotional features, especially from datasets like RAVDESS, which exhibit significant overlap in stress-related signals. Comprehensive audio features, including Zero Crossing Rate (ZCR), Root Mean Square Energy (RMSE), Spectral Centroid, Spectral Bandwidth, Spectral Contrast, Spectral Rolloff, and Chroma features, are extracted to capture frequency and energy patterns. The study employs a suite of ML classifiers: Random Forest (RF), Logistic Regression (LoR), Gradient Boosting (GB), K-Nearest Neighbors (KNN), Naïve Bayes (NB), and Support Vector Machines (SVM) with various kernels, along with an ensemble Voting Classifier. Among the models, SVM (linear) and the Voting Classifier performed best, achieving 100% accuracy on TESS and up to 88.97% on RAVDESS. In contrast, NB showed lower performance, particularly on RAVDESS, with an accuracy of 72.06%. These findings reflect the sensitivity of model performance to dataset complexity and class separability. The significance of this study lies in highlighting the impact of dataset characteristics on ML performance and in providing a framework for feature extraction and model selection. The results support the need for tailored approaches to stress detection, paving the way for more sophisticated, dataset-aware methodologies that improve accuracy and reliability in real-world applications.
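As a rough illustration of three of the features named in the abstract (ZCR, RMS energy, and spectral centroid), the following is a minimal sketch computed on a single frame in plain Python. It is not the authors' implementation: the abstract does not state which tooling was used, and the function names, the synthetic 1 kHz test tone, the 8 kHz sample rate, and the 256-sample frame length are all assumptions made for the example. The naive DFT is chosen for readability only; a real pipeline would likely rely on an FFT-based audio library.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Root mean square amplitude of the frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz, using a naive O(n^2) DFT."""
    n = len(frame)
    weighted_sum = 0.0
    magnitude_sum = 0.0
    # Only non-negative frequency bins (0 .. n/2) are needed for a real signal.
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        magnitude = math.hypot(re, im)
        weighted_sum += (k * sample_rate / n) * magnitude
        magnitude_sum += magnitude
    return weighted_sum / magnitude_sum if magnitude_sum > 0 else 0.0


def sine_frame(freq_hz, sample_rate, n):
    """Hypothetical helper: synthetic test tone standing in for a speech frame."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


SAMPLE_RATE = 8000  # assumed sample rate, for illustration only
FRAME = sine_frame(1000.0, SAMPLE_RATE, 256)

FEATURES = {
    "zcr": zero_crossing_rate(FRAME),
    "rms": rms_energy(FRAME),
    "spectral_centroid_hz": spectral_centroid(FRAME, SAMPLE_RATE),
}
print(FEATURES)
```

For a pure tone, the centroid should sit near the tone frequency and the ZCR near twice the frequency divided by the sample rate. In a full pipeline, per-frame values like these would typically be aggregated over each recording (for example by their mean) before being passed to the classifiers.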
Keywords:
Machine Learning, Stress detection, RAVDESS, TESS, Voting Classifier, K-Nearest Neighbors, Gradient Boosting, Support Vector Machines.