Performance Analysis of Naïve Bayes and Stochastic Gradient Descent-based SVM for Sentiment Analysis

International Journal of Computer Science and Engineering
© 2025 by SSRG - IJCSE Journal
Volume 12 Issue 5
Year of Publication : 2025
Authors : Shrushti Jeetendra Vadher

How to Cite?

Shrushti Jeetendra Vadher, "Performance Analysis of Naïve Bayes and Stochastic Gradient Descent-based SVM for Sentiment Analysis," SSRG International Journal of Computer Science and Engineering, vol. 12, no. 5, pp. 10-18, 2025. Crossref, https://doi.org/10.14445/23488387/IJCSE-V12I5P102

Abstract:

This paper evaluates sentiment classification on a food review dataset, focusing on Naïve Bayes and a Stochastic Gradient Descent-based Support Vector Machine (SGD-based SVM). Sentiment analysis is an important task in natural language processing and in business customer service, and the findings highlight both the performance of the two models and the impact of data preprocessing. The dataset, a collection of Amazon food reviews, was preprocessed using Term Frequency-Inverse Document Frequency (TF-IDF) vectorization to transform the text into numerical representations, and the Synthetic Minority Oversampling Technique (SMOTE) to correct class imbalance and support a fair and robust evaluation. The SGD-based SVM outperformed Naïve Bayes, with accuracies of 84% and 79.9%, respectively.
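
As a rough illustration of two steps named in the abstract, the sketch below computes TF-IDF vectors for a few toy reviews and trains a linear SVM by stochastic gradient descent on the regularized hinge loss. It is not the paper's implementation: the function names, the four example reviews, the smoothed IDF variant, and the hyperparameters (`epochs`, `lr`, `lam`) are all assumptions chosen for illustration, and the SMOTE and Naïve Bayes steps are omitted.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    index = {tok: i for i, tok in enumerate(vocab)}
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF; an assumed variant, the abstract does not specify one.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in docs:
        vec = [0.0] * len(vocab)
        for tok, count in Counter(doc).items():
            # Term frequency (relative count) times inverse document frequency.
            vec[index[tok]] = (count / len(doc)) * idf[tok]
        vectors.append(vec)
    return vocab, vectors


def train_sgd_svm(X, y, epochs=50, lr=0.1, lam=0.01):
    """Linear SVM fitted by SGD on the L2-regularized hinge loss; y in {-1, +1}."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # The L2 penalty shrinks the weights on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # The hinge loss contributes a gradient only when the margin is below 1.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Classify by the sign of the linear decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical toy reviews (+1 = positive, -1 = negative); not the paper's data.
reviews = [
    "great taste love it",
    "terrible stale awful",
    "love this great snack",
    "awful taste terrible",
]
labels = [1, -1, 1, -1]

vocab, X = tfidf_vectors([r.split() for r in reviews])
w, b = train_sgd_svm(X, labels)
preds = [predict(w, b, x) for x in X]
print(preds)
```

In practice, a study like this would more likely rely on library implementations of the vectorizer, the oversampler, and the classifiers. The hand-written version is only meant to show what TF-IDF weighting and a hinge-loss SGD update compute.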

Keywords:

Class imbalance, Naïve Bayes, Natural language processing, Sentiment analysis, Support Vector Machine.
