Scaling Graph Partitioning Techniques for Information Retrieval: A Methodological Exploration

International Journal of Electronics and Communication Engineering |
© 2025 by SSRG - IJECE Journal |
Volume 12 Issue 4 |
Year of Publication : 2025 |
Authors : Ibrahim Atoum, Tarek Kanan, Maraw Fayiz Hamza |
How to Cite?
Ibrahim Atoum, Tarek Kanan, Maraw Fayiz Hamza, "Scaling Graph Partitioning Techniques for Information Retrieval: A Methodological Exploration," SSRG International Journal of Electronics and Communication Engineering, vol. 12, no. 4, pp. 119-131, 2025. Crossref, https://doi.org/10.14445/23488549/IJECE-V12I4P111
Abstract:
This article examines scaling graph partitioning techniques through traditional and Machine Learning (ML) approaches while discussing challenges and solutions. Effective partitioning of large Information Retrieval (IR) datasets requires examining critical factors, including k selection choices alongside eigenvector selection and initial strategies while managing data uncertainty. The research strongly emphasizes integrating ML to allow systems to dynamically adapt and improve indexing, query processing, and clustering within large document collections and knowledge graphs. The article examines multiple methods based on their performance with large graphs, their community detection capabilities, parallelization convenience, and the flexibility derived from ML. Through Graph Neural Networks (GNNs) and Reinforcement Learning (RL), ML optimizes partitions by learning from evolving relationships and retrieving performance feedback. The research addresses conventional issues, including computational complexity and workload prediction, while examining ML limitations, which depend on labeled data and face interpretability concerns. The analysis covers several mitigation strategies that boost scalability, adaptive learning systems, and online learning approaches. The examination highlights the essential function of Graph Partitioning (GP) for IR system improvement and the growing influence of ML in this area. The discussion examines applications like social network analysis and fraud detection while exploring the potential advancements in dynamic GP for IR and the expected expansion of ML-based solutions. Through the discussions, it becomes clear that hybrid techniques combining ML methods with traditional approaches lead to better partitioning performance, which creates more resilient and scalable IR systems.
Keywords:
Scaling, Graph partitioning, Machine learning, Information retrieval, Dynamic graph partitioning.
References:
[1] Venkat N. Gudivada, Dhana L. Rao, and Amogh R. Gudivada, Information Retrieval: Concepts, Models, and Systems, Handbook of Statistics, vol. 38, pp. 331-401, 2018.
[CrossRef] [Google Scholar] [Publisher Link]
[2] V. Gupta, D.K. Sharma, and A. Dixit, “Review of Information Retrieval: Models, Performance Evaluation Techniques and Applications,” International Journal of Sensors Wireless Communications and Control, vol. 11, no. 9, pp. 896-909, 2021.
[CrossRef] [Google Scholar] [Publisher Link]
[3] José Devezas, and Sérgio Nunes, “A Review of Graph-Based Models for Entity-Oriented Search,” SN Computer Science, vol. 2, 2021.
[CrossRef] [Google Scholar] [Publisher Link]
[4] Jiakang Li et al., “A Comprehensive Review of Community Detection in Graphs,” Neurocomputing, vol. 600, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[5] Siddhartha Sahu et al., “The Ubiquity of Large Graphs and Surprising Challenges of Graph Processing: Extended Survey,” The VLDB Journal, vol. 29, pp. 595-618, 2020.
[CrossRef] [Google Scholar] [Publisher Link]
[6] Bo-young Lim et al., “Multi-Level Graph Representation Learning Through Predictive Community-Based Partitioning,” Proceedings of the ACM on Management of Data, vol. 3, no. 1, pp. 1-27, 2025.
[CrossRef] [Google Scholar] [Publisher Link]
[7] Qi Wang, Kenneth H. Lai, and Chunlei Tang, “Solving Combinatorial Optimization Problems Over Graphs with BERT-Based Deep Reinforcement Learning,” Information Sciences, vol. 619, pp. 930-946, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[8] Rong Chen et al., “Powerlyra: Differentiated Graph Computation and Partitioning on Skewed Graphs,” ACM Transactions on Parallel Computing, vol. 5, no. 3, pp. 1-39, 2019.
[CrossRef] [Google Scholar] [Publisher Link]
[9] Zhanzhe Li et al., “MDL: Maximum Density Label-Cut Graph Partitioning,” 10th International Conference on Big Data and Information Analytics (BigDIA), Chiang Mai, Thailand, pp. 125-132, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[10] Rafael M.S. Siqueira et al., “Graph Partitioning Algorithms: A Comparative Study,” International Conference on Information Technology-New Generations, pp. 513-520, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[11] Sivakumar Ponnusamy, and Pankaj Gupta, “Scalable Data Partitioning Techniques for Distributed Data Processing in Cloud Environments: A Review,” IEEE Access, vol. 12, pp. 26735-26746, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[12] Tewodros Alemu Ayall et al., “Graph Computing Systems and Partitioning Techniques: A Survey,” IEEE Access, vol. 10, pp. 118523-118550, 2022.
[CrossRef] [Google Scholar] [Publisher Link]
[13] Christian Mayer, “Scalable Graph Partitioning for Distributed Graph Processing,” Institute for Parallel and Distributed Systems (IPVS), University of Stuttgart, pp. 1-161, 2019.
[Google Scholar] [Publisher Link]
[14] Anil Pacaci, and M. Tamer Özsu, “Experimental Analysis of Streaming Algorithms for Graph Partitioning,” Proceedings of the 2019 International Conference on Management of Data, pp. 1375-1392, 2019.
[CrossRef] [Google Scholar] [Publisher Link]
[15] Charles-Edmond Bichot, and Patrick Siarry, Graph Partitioning, John Wiley & Sons, 2013.
[Google Scholar] [Publisher Link]
[16] Reinhard Diestel, Graph Theory, Springer Berlin, Heidelberg, pp. 1-455, 2025.
[CrossRef] [Publisher Link]
[17] Ravikant Diwakar et al., “Optimizing Load Distribution in Big Data Ecosystems: A Comprehensive Survey,” AI and the Revival of Big Data, pp. 177-200, 2025.
[CrossRef] [Google Scholar] [Publisher Link]
[18] Ümit Çatalyürek et al., “More Recent Advances in (hyper) Graph Partitioning,” ACM Computing Surveys, vol. 55, no. 12, pp. 1-38, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[19] J.T. Yan, “Fuzzy-Based Balanced Partitioning Under Capacity and Size-Tolerance Constraints in Distributed Quantum Circuits,” IEEE Transactions on Quantum Engineering, vol. 4, pp. 1-15, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[20] Diego Recalde et al., “An Exact Approach for the Balanced K-Way Partitioning Problem with Weight Constraints and its Application to Sports Team Realignment,” Journal of Combinatorial Optimization, vol. 36, pp. 916-936, 2018.
[CrossRef] [Google Scholar] [Publisher Link]
[21] Lingkai Meng et al., “A Survey of Distributed Graph Algorithms on Massive Graphs,” ACM Computing Surveys, vol. 57, no. 2, pp. 1-39, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[22] Hamid Hadian, and Mohsen Sharifi, “GT-Scheduler: A Hybrid Graph-Partitioning and Tabu-Search Based Task Scheduler for Distributed Data Stream Processing Systems,” Cluster Computing, vol. 27, pp. 5815-5832, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[23] Kezhao Huang et al., “WiseGraph: Optimizing GNN with Joint Workload Partition of Graph and Operations,” Proceedings of the Nineteenth European Conference on Computer Systems, Athens, Greece pp. 1-17, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[24] Christy Alex Panicker, and M. Geetha, “Exploring Graph Partitioning Techniques for GNN Processing on Large Graphs: A Survey,” 4th International Conference on Communication, Computing and Industry 6.0 (C216), Bangalore, India, pp. 1-7, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[25] Daniela Dapena, Daniel L. Lau, and Gonzalo R. Arce, “Parallel Graph Signal Processing: Sampling and Reconstruction,” IEEE Transactions on Signal and Information Processing Over Networks, vol. 9, pp. 190-206, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[26] T. Heuer, “Scalable High-Quality Graph and Hypergraph Partitioning,” Doctoral Thesis, Department of Informatics of the Karlsruhe Institute of Technology, pp. 1-254, 2022.
[Google Scholar] [Publisher Link]
[27] George Karypis et al., “Recent Trends in Graph Decomposition,” Dagstuhl Reports, vol. 13, no. 8, pp. 1-45, 2024.
[Google Scholar] [Publisher Link]
[28] Lars Gottesbüren et al., “Deep Multilevel Graph Partitioning,” Arxiv, pp. 1-19, 2021.
[CrossRef] [Google Scholar] [Publisher Link]
[29] Alan Valejo et al., “A Critical Survey of the Multilevel Method in Complex Networks,” ACM Computing Surveys (CSUR), vol. 53, no. 2, pp. 1-35, 2020.
[CrossRef] [Google Scholar] [Publisher Link]
[30] Aman Garg et al., “A Review on Artificial Intelligence-Enabled Mechanical Analysis of 3D Printed and FEM-Modelled Auxetic Metamaterials,” Virtual and Physical Prototyping, vol. 20, no. 1, pp. 1-50, 2025.
[CrossRef] [Google Scholar] [Publisher Link]
[31] Wan Luan Lee et al., “HyperG: Multilevel GPU-Accelerated K-Way Hypergraph Partitioner,” Proceedings of the 30th Asia and South Pacific Design Automation Conference, Tokyo, Japan, pp. 1031-1040, 2025.
[CrossRef] [Google Scholar] [Publisher Link]
[32] Sheng Xiang et al., “Scalable Learning-Based Community-Preserving Graph Generation,” IEEE Transactions on Big Data, pp. 1-14, 2025.
[CrossRef] [Google Scholar] [Publisher Link]
[33] Mohammad Amiriebrahimabadi, Zhina Rouhi, and Najme Mansouri, “A Comprehensive Survey of Multi-Level Thresholding Segmentation Methods for Image Processing,” Archives of Computational Methods in Engineering, vol. 31, pp. 3647-3697, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[34] Ling Dingcet al., “Survey of Spectral Clustering Based on Graph Theory,” Pattern Recognition, vol. 51, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[35] Ji Wang et al., “Scalable Spectral Clustering with Group Fairness Constraints,” Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, pp. 6613-6629, 2023.
[Google Scholar] [Publisher Link]
[36] Brahim Laassem et al., “A Spectral Method to Detect Community Structure Based on Coulomb’s Matrix,” Social Network Analysis and Mining, vol. 13, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[37] Swati A. Bhavsar, Varsha H. Patil, and Aboli H. Patil, “Graph Partitioning and Visualization in Graph Mining: A Survey,” Multimedia Tools and Applications, vol. 81, pp. 43315-43356, 2022.
[CrossRef] [Google Scholar] [Publisher Link]
[38] Joel Mackenzie, Matthias Petri, and Alistair Moffat, “Tradeoff Options for Bipartite Graph Partitioning,” IEEE Transactions on Knowledge and Data Engineering, vol. 35, no. 8, pp. 8644-8657, 2022.
[CrossRef] [Google Scholar] [Publisher Link]
[39] Chao-Wei Ou, and S. Ranka, “Parallel Incremental Graph Partitioning,” IEEE Transactions on Parallel and Distributed Systems, vol. 8, no. 8, pp. 884-896, 1997.
[CrossRef] [Google Scholar] [Publisher Link]
[40] Douglas O. Cardosoet al., “Greedy Recursive Spectral Bisection for Modularity-Bound Hierarchical Divisive Community Detection,” Statistics and Computing, vol. 34, pp. 1-18, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[41] Teague Tomesh et al., “Quantum Divide and Conquer for Combinatorial Optimization and Distributed Computing,” arXiv, pp. 1-12, 2021.
[CrossRef] [Google Scholar] [Publisher Link]
[42] Sung-eui Yoon et al., Real-Time Massive Model Rendering, Springer International Publishing, pp. 1-112, 2022.
[Google Scholar] [Publisher Link]
[43] D. Slavchev, S. Margenov, and I.G. Georgiev “On the Application of Recursive Bisection and Nested Dissection Reorderings for Solving Fractional Diffusion Problems Using HSS Compression,” AIP Conference Proceedings, vol. 2302, no. 1, 2020.
[CrossRef] [Google Scholar] [Publisher Link]
[44] Rajit Nair, and Amit Bhagat, An Introduction to Clustering Algorithms in Big Data, Encyclopedia of Information Science and Technology, 5th ed., pp. 559-576, 2021.
[CrossRef] [Google Scholar] [Publisher Link]
[45] Haixia Wu et al., “Link Prediction on Complex Networks: An Experimental Survey,” Data Science and Engineering, vol. 7, pp. 253-278, 2022.
[CrossRef] [Google Scholar] [Publisher Link]
[46] Pierre Brémaud, Probability Theory and Stochastic Processes, Springer Cham, pp. 1-713, 2020.
[CrossRef] [Google Scholar] [Publisher Link]
[47] Abdul Samad, “Enhancing Community Detection and Data Clustering in Weighted Graphs using Gumbel Softmax-Based Approaches,” 2024.
[Google Scholar]
[48] Sifeng Bi et al., “Stochastic Model Updating with Uncertainty Quantification: An Overview and Tutorial,” Mechanical Systems and Signal Processing, vol. 204, 2023.
[CrossRef] [Google Scholar] [Publisher Link]
[49] James Alexander Scott, “MCMC Methods: Graph Samplers, Invariance Tests and Epidemic Models,” Doctoral Thesis, Imperial College London, pp. 1-186, 2023.
[Google Scholar] [Publisher Link]
[50] Zehuan Hu et al., “Self-Learning Dynamic Graph Neural Network with Self-Attention Based on Historical Data and Future Data for Multi-Task Multivariate Residential Air Conditioning Forecasting,” Applied Energy, vol. 364, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[51] Dilong Li et al., “Graph Neural Networks in Point Clouds: A Survey,” Remote Sensing, vol. 16, no. 14, pp. 1-44, 2024.
[CrossRef] [Google Scholar] [Publisher Link]
[52] Mohamed Massaoudi et al., “Advancing Coherent Power Grid Partitioning: A Review Embracing Machine and Deep Learning,” IEEE Open Access Journal of Power and Energy, vol. 12, pp. 59-75, 2025.
[CrossRef] [Google Scholar] [Publisher Link]
[53] Jianwu Long, and Luping Liu, “K*-Means: An Efficient Clustering Algorithm with Adaptive Decision Boundaries,” International Journal of Parallel Programming, vol. 53, 2025.
[CrossRef] [Google Scholar] [Publisher Link]
[54] Harley Wiltzer et al., “Foundations of Multivariate Distributional Reinforcement Learning,” Advances in Neural Information Processing Systems, vol. 37, pp. 101297-101336, 2025.
[Google Scholar] [Publisher Link]
[55] Mahmoud E. Farfoura et al., “A Novel Lightweight Machine Learning Framework for IoT Malware Classification Based on Matrix Block Mean Downsampling,” Ain Shams Engineering Journal, vol. 16, no. 1, pp. 1-11, 2025.
[CrossRef] [Google Scholar] [Publisher Link]
[56] Mohammad Abdallah et al., “An E-Learning Portal Quality Model: From Al-Zaytoonah University Students’ Perspective,” International Conference on Information Technology, Amman, Jordan, pp. 553-557, 2021.
[CrossRef] [Google Scholar] [Publisher Link]
[57] Ahmad A.A. Alkhatib, and Khalid Mohammad Jaber, “FDPA Internet of Things System for Forest Fire Detection, Prediction, and Behavior Analysis,” IET Wireless Sensor Systems, vol. 14, no. 3, pp. 47-83, 2024.
[CrossRef] [Google Scholar] [Publisher Link]