Automated Detection of Lower Back Pain Using Machine Learning and SMOTE-Based Data Augmentation

International Journal of Electronics and Communication Engineering
© 2025 by SSRG - IJECE Journal
Volume 12 Issue 5
Year of Publication : 2025
Authors : P. Praveen, Mallikarjunaswamy M.S, Mahabaleshwar Mamadapur
How to Cite?

P. Praveen, Mallikarjunaswamy M.S, Mahabaleshwar Mamadapur, "Automated Detection of Lower Back Pain Using Machine Learning and SMOTE-Based Data Augmentation," SSRG International Journal of Electronics and Communication Engineering, vol. 12, no. 5, pp. 350-362, 2025. https://doi.org/10.14445/23488549/IJECE-V12I5P129

Abstract:

Low Back Pain (LBP) is a leading global health concern, affecting up to 80% of individuals at some point in their lives and ranking among the most common causes of chronic disability and work absenteeism. Despite advances in treatment, accurate and scalable diagnostic tools remain limited. Traditional diagnosis relies heavily on clinical expertise and imaging, which are often time-consuming, subjective, and inaccessible in resource-limited settings. Recent literature underscores the potential of Machine Learning (ML) for automating LBP detection, but challenges such as imbalanced datasets and insufficient model generalizability persist. This study introduces an ML pipeline for automatic LBP classification using a publicly available dataset hosted on Kaggle. The workflow comprises data type normalization, outlier elimination, and feature distribution analysis, followed by class rebalancing with the Synthetic Minority Oversampling Technique (SMOTE). Three ML classifiers, Decision Tree (DT), Support Vector Machine (SVM), and Artificial Neural Network (ANN), are trained and evaluated on both the imbalanced and the SMOTE-balanced datasets. Experimental results show a clear improvement in classification performance after balancing, with the ANN model achieving the highest accuracy (96.43%) and F1-score (96.47%). These results indicate that combining effective preprocessing with careful model selection can deliver accurate, scalable, and automated LBP detection, offering a meaningful step toward smarter musculoskeletal diagnostics.
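
To illustrate the class-rebalancing step and the F1 metric named in the abstract, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes numeric feature vectors and Euclidean distance; the function names `smote` and `f1_score`, the parameter defaults, and the toy data are hypothetical. As a simplification, the base point for each synthetic sample is picked at random rather than by iterating over every minority sample as in the original SMOTE description.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between a minority sample
    and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority points, sorted by distance to the base
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up two-feature minority samples, for illustration only
    minority_class = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 1.8]]
    extra = smote(minority_class, n_synthetic=6, k=3, seed=1)
    print(len(minority_class) + len(extra), "minority samples after oversampling")
    print("F1:", f1_score([1, 1, 0, 0], [1, 0, 0, 1]))
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the data used to report accuracy and F1.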

Keywords:

Artificial Neural Network, Imbalanced data, Lower Back Pain, Machine Learning, SMOTE, Outliers.
