Enhancing Image Classification Performance through Hybrid Self-Supervised Learning Strategies

International Journal of Electronics and Communication Engineering
© 2025 by SSRG - IJECE Journal
Volume 12, Issue 7
Year of Publication: 2025
Authors: Deepa S, Sheetal, Alli A, Rashmi Siddalingappa
How to Cite?
Deepa S, Sheetal, Alli A, Rashmi Siddalingappa, "Enhancing Image Classification Performance through Hybrid Self-Supervised Learning Strategies," SSRG International Journal of Electronics and Communication Engineering, vol. 12, no. 7, pp. 90-101, 2025. Crossref, https://doi.org/10.14445/23488549/IJECE-V12I7P108
Abstract:
Image classification is a cornerstone of computer vision, with applications spanning healthcare, autonomous driving, and security. However, the dependence of supervised learning on large labeled datasets poses significant challenges, particularly in specialized fields where labeled data is scarce and expensive to obtain. Self-supervised learning (SSL) has emerged as a promising paradigm, enabling models to learn useful representations from unlabeled data through pretext tasks that generate pseudo-labels. Yet SSL still faces limitations in handling complex data distributions and achieving robust generalization. This paper explores hybrid self-supervised learning strategies that combine multiple SSL techniques, such as contrastive learning, masked image modeling, and clustering, to enhance image classification performance and reduce dependence on labeled data. The study proposes a comprehensive framework integrating data augmentation, feature extraction, and hybrid learning mechanisms, evaluated on the CIFAR-100 dataset. Experimental results show that hybrid SSL approaches deliver significant performance gains: combining SimCLR with masked image modeling (MAE) achieves a Top-1 accuracy of 77.8% on the clean test set and 71.4% on the domain-shifted set, while self-distillation with contrastive learning (DINO) achieves the highest Top-1 accuracy of 78.4% on the clean test set and 72.1% on the domain-shifted set. Advanced data augmentation techniques, such as CutMix and RandAugment, further enhance model robustness, with SwAV (contrastive clustering) reaching 76.5% Top-1 accuracy on the clean test set and 70.1% on the domain-shifted set. These findings highlight the effectiveness of hybrid SSL methods in addressing the challenge of limited labeled data, offering valuable insights for future research and applications in image classification.
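The hybrid objective sketched in the abstract — pairing a SimCLR-style contrastive loss with an MAE-style masked-reconstruction loss — can be illustrated roughly as follows. This is a minimal NumPy sketch for intuition only: the function names, the `alpha` weighting parameter, and the loss formulations are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """SimCLR-style NT-Xent contrastive loss.
    z1, z2: (N, D) embeddings of two augmented views of the same N images."""
    z = np.concatenate([z1, z2], axis=0)                 # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)     # unit-normalize
    sim = z @ z.T / temperature                          # scaled cosine similarity
    np.fill_diagonal(sim, -np.inf)                       # exclude self-similarity
    n = z1.shape[0]
    # the positive partner of sample i is i+n (and vice versa)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return (-(sim[np.arange(2 * n), pos] - logsumexp)).mean()

def masked_reconstruction_loss(patches, recon, mask):
    """MAE-style loss: mean squared error computed only on masked patches.
    patches, recon: (N, P, D) patch values; mask: (N, P) bool, True = masked."""
    err = ((recon - patches) ** 2).mean(axis=-1)
    return (err * mask).sum() / mask.sum()

def hybrid_ssl_loss(z1, z2, patches, recon, mask, alpha=0.5):
    """Weighted sum of the two pretext objectives (alpha is a hypothetical weight)."""
    return (alpha * nt_xent_loss(z1, z2)
            + (1 - alpha) * masked_reconstruction_loss(patches, recon, mask))
```

In an actual training pipeline the embeddings and reconstructions would come from a shared encoder with separate projection and decoder heads; the weighting between the two objectives is a tunable hyperparameter.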
Keywords:
Image classification, Self-supervised learning, Hybrid SSL, Computer vision, Contrastive clustering, Multi-modal.