Susceptibility Analysis Using Adversarial Attacks: A Deep Learning Perspective on Contextual Clinical Text

International Journal of Electrical and Electronics Engineering
© 2025 by SSRG - IJEEE Journal
Volume 12, Issue 8
Year of Publication: 2025
Authors: Jaya A. Zalte, Harshal Shah
How to Cite?
Jaya A. Zalte, Harshal Shah, "Susceptibility Analysis Using Adversarial Attacks: A Deep Learning Perspective on Contextual Clinical Text," SSRG International Journal of Electrical and Electronics Engineering, vol. 12, no. 8, pp. 112-119, 2025. Crossref, https://doi.org/10.14445/23488379/IJEEE-V12I8P111
Abstract:
Deep learning models have demonstrated strong performance across a wide range of classification tasks and data volumes. However, these models are vulnerable to attack: even very small changes to the input can induce errors and lead to misclassification. Adversarial attacks therefore pose a serious threat, as they can significantly degrade model performance. In this work, the vulnerability of deep learning models for clinical contextual text classification is demonstrated using adversarial perturbations. The Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) are applied to evaluate model robustness and data sensitivity, and the resulting attacks produce an accuracy drop of 23%. Under a white-box threat model, a DistilBERT model is trained and then optimized to sustain these attacks. Our results demonstrate significant prediction shifts from minor input perturbations and motivate a new metric, a susceptibility score, that quantifies how susceptible the underlying text is. Further, the adversarially trained model withstands FGSM and PGD attacks significantly better.
Keywords:
Adversarial attacks, Deep learning, FGSM, PGD, Text classification.
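
To make the attack setup described in the abstract concrete, the following is a minimal illustrative sketch (not the authors' implementation) of an FGSM-style perturbation applied at the embedding level of a DistilBERT text classifier. The model checkpoint, epsilon value, example sentence, and label are placeholders chosen purely for demonstration.

```python
# Hedged sketch: FGSM on DistilBERT input embeddings for a binary text classifier.
# All names (checkpoint, epsilon, example text, label) are illustrative assumptions.
import torch
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)
model.eval()

text = "Patient reports chest pain and shortness of breath."  # placeholder clinical sentence
inputs = tokenizer(text, return_tensors="pt")
label = torch.tensor([1])  # hypothetical ground-truth class

# Token ids are discrete, so the perturbation is applied to the continuous embeddings.
embeddings = model.get_input_embeddings()(inputs["input_ids"]).detach()
embeddings.requires_grad_(True)

outputs = model(
    inputs_embeds=embeddings,
    attention_mask=inputs["attention_mask"],
    labels=label,
)
outputs.loss.backward()

# FGSM: a single signed-gradient step of size epsilon in the loss-increasing direction.
epsilon = 0.01
adv_embeddings = embeddings + epsilon * embeddings.grad.sign()

with torch.no_grad():
    clean_pred = model(
        inputs_embeds=embeddings, attention_mask=inputs["attention_mask"]
    ).logits.argmax(-1)
    adv_pred = model(
        inputs_embeds=adv_embeddings, attention_mask=inputs["attention_mask"]
    ).logits.argmax(-1)
print("clean:", clean_pred.item(), "adversarial:", adv_pred.item())
```

PGD can be viewed as an iterated version of this step: the signed-gradient update is repeated for several iterations, with the perturbed embeddings projected back into an epsilon-ball around the originals after each step.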