Image Retrieval Based on Tree Structures from Hierarchical Data

International Journal of Computer Science and Engineering
© 2024 by SSRG - IJCSE Journal
Volume 11 Issue 3
Year of Publication: 2024
Authors: Shihong Lu, Zhen Wang, Limeng Gao, Kaiyang Gong

How to Cite?

Shihong Lu, Zhen Wang, Limeng Gao, Kaiyang Gong, "Image Retrieval Based on Tree Structures from Hierarchical Data," SSRG International Journal of Computer Science and Engineering, vol. 11, no. 3, pp. 1-8, 2024. Crossref, https://doi.org/10.14445/23488387/IJCSE-V11I3P101

Abstract:

With the advent of the mobile network era, the number of images has grown explosively, and image retrieval has become an essential part of the mobile internet. As deep learning algorithms have matured, researchers have applied them to image retrieval to generate image hash codes. However, most image hashing algorithms consider only a sample category loss and treat the distance between every pair of different labels as equal, thereby ignoring the distance information between categories. To address this issue, this paper proposes an image retrieval algorithm based on the path distance between categories in the hierarchical structure of sample categories. A Swin Transformer network extracts image features, and a similarity distance matrix is generated from the tree structure of the image categories. The distances between the hash codes generated in the hash layer are kept consistent with the similarity distance matrix. In Hamming space, similar images are therefore relatively close, while completely dissimilar images have the greatest difference in hash codes. The distances between the hash centers of the categories provide a quantization effect. Experimental results on public datasets show that introducing the sample category hierarchy and the similarity distance loss significantly improves image retrieval accuracy.
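
As an illustration of how a similarity distance matrix can be derived from a category tree, the following minimal sketch computes pairwise path distances between categories and rescales them to target Hamming distances. It is not the authors' implementation: the toy tree in `PARENT`, the function names, and the linear scaling in `target_hamming` (largest path distance mapped to the full code length) are all assumptions made for this example.

```python
from itertools import combinations

# Hypothetical category tree, stored as child -> parent (the root maps to None).
PARENT = {
    "root": None,
    "animal": "root",
    "vehicle": "root",
    "cat": "animal",
    "dog": "animal",
    "car": "vehicle",
}

def path_to_root(node, parent):
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path

def path_distance(a, b, parent):
    """Number of edges on the unique tree path between nodes a and b."""
    depth_in_a = {n: i for i, n in enumerate(path_to_root(a, parent))}
    # Walk up from b until reaching a node that is also an ancestor of a.
    for steps_b, n in enumerate(path_to_root(b, parent)):
        if n in depth_in_a:
            return depth_in_a[n] + steps_b
    raise ValueError("nodes are not in the same tree")

def distance_matrix(labels, parent):
    """Symmetric matrix of pairwise path distances between the labels."""
    n = len(labels)
    dist = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = path_distance(labels[i], labels[j], parent)
    return dist

def target_hamming(dist, num_bits):
    """Assumed scaling: the largest path distance maps to `num_bits`.

    Requires at least two distinct labels so the maximum is non-zero.
    """
    max_d = max(max(row) for row in dist)
    return [[num_bits * d / max_d for d in row] for row in dist]

def hamming(code_a, code_b):
    """Hamming distance between two equal-length codes."""
    return sum(x != y for x, y in zip(code_a, code_b))

labels = ["cat", "dog", "car"]
D = distance_matrix(labels, PARENT)   # path distances in the tree
T = target_hamming(D, num_bits=8)     # target Hamming distances for 8-bit codes
```

In this toy tree, "cat" and "dog" share the parent "animal" and so receive a smaller target distance than "cat" and "car", which meet only at the root. A training loss could then penalize the gap between `hamming` distances of learned codes and the entries of `T`; the exact loss used in the paper is not stated in the abstract.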

Keywords:

Image retrieval, Swin Transformer, Similarity distance, Image hashing, Hamming space.
