Hadoop Mapreduce-Based Multi-Modal Data Clustering Using CGWQRLO With Spatio-Temporal Analysis Using RBKDE
International Journal of Electronics and Communication Engineering
© 2025 by SSRG - IJECE Journal
Volume 12 Issue 12
Year of Publication: 2025
Authors: Shailendra Singh Yadav, Kamal Sutaria
How to Cite?
Shailendra Singh Yadav, Kamal Sutaria, "Hadoop Mapreduce-Based Multi-Modal Data Clustering Using CGWQRLO With Spatio-Temporal Analysis Using RBKDE," SSRG International Journal of Electronics and Communication Engineering, vol. 12, no. 12, pp. 98-113, 2025. Crossref, https://doi.org/10.14445/23488549/IJECE-V12I12P109
Abstract:
The exponential growth of digital information on the internet has significantly affected data analysis and driven the development of Data Clustering (DC) techniques. Yet none of the prevailing DC approaches captures the temporal and spatial dynamics of the data to produce temporally and spatially consistent clusters. Therefore, this work proposes a Hadoop MapReduce-based multi-modal DC methodology that uses the Circle Grey Wolf Quasi-Reflexive Learning Optimizer (CGWQ-RLO) for clustering and Radial Basis Kernel Density Estimation (RBKDE) for spatio-temporal analysis. First, data containing numeric values and images is collected. The numeric values are preprocessed with Missing Value Imputation (MVI) and normalization, and the images are preprocessed with noise removal, contrast enhancement, and edge preservation. The dimensionality of the preprocessed numeric and image data is then reduced using Principal Fuzzy Correlated Component Analysis (PF-CCA), followed by feature extraction. Next, the features are aggregated by computing a Weighted Average (WA). The Round Robin Heap (RRH) is then employed to perform Hadoop MapReduce, and RBKDE is applied for spatio-temporal analysis. The data is then clustered using CGWQ-RLO. Lastly, SSA is performed on the clustered data to optimize the number of clusters. Compared with the prevailing techniques, the proposed framework clustered the multi-modal data more efficiently, attaining a higher accuracy (97.79%).
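As a rough illustration of the generic building blocks named in the abstract, the sketch below shows mean-based missing value imputation, min-max normalization, weighted-average feature aggregation, and a Gaussian (radial basis) kernel density estimate. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify which imputation rule, normalization, weights, or bandwidth the framework uses, so the function names, the mean-imputation choice, the min-max scaling, and the (x, y, t) point layout are all hypothetical. PF-CCA, RRH, CGWQ-RLO, and SSA are not reproduced here.

```python
import math


def impute_mean(column):
    """Replace None entries with the mean of the observed values (assumed MVI rule)."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


def min_max_normalize(column):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def weighted_average(feature_vectors, weights):
    """Combine equal-length feature vectors into one vector by a weighted average."""
    total = sum(weights)
    length = len(feature_vectors[0])
    return [
        sum(w * vec[i] for vec, w in zip(feature_vectors, weights)) / total
        for i in range(length)
    ]


def rbf_kde(query, samples, bandwidth):
    """Gaussian (radial basis) kernel density estimate at `query` in d dimensions."""
    d = len(query)
    norm = (2.0 * math.pi * bandwidth ** 2) ** (d / 2.0)
    total = 0.0
    for s in samples:
        sq_dist = sum((q - x) ** 2 for q, x in zip(query, s))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / (len(samples) * norm)


# Numeric preprocessing: impute the gap, then rescale.
numeric = min_max_normalize(impute_mean([2.0, None, 4.0, 6.0]))

# Aggregate a numeric-feature vector and an image-feature vector (weights are illustrative).
fused = weighted_average([[1.0, 2.0], [3.0, 4.0]], [1.0, 3.0])

# Density of hypothetical (x, y, t) observations around a query point.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.2), (2.0, 2.0, 5.0)]
density = rbf_kde((0.0, 0.0, 0.1), points, bandwidth=0.5)
```

In this sketch the density is high where observations are close in both space and time, which is the kind of signal a spatio-temporal analysis step could pass to a clustering stage. The bandwidth controls how quickly the influence of each observation decays with distance.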
Keywords:
Multi-Modal Data Clustering, Spatio-Temporal Analysis, Big Data, Radial Basis Kernel Density Estimation (RBKDE), Circle Grey Wolf Quasi-Reflexive Learning Optimizer (CGWQ-RLO), Round Robin Heap (RRH), Weighted Covariance K-Nearest Neighbors (WCKNN), Principal Fuzzy Correlated Component Analysis (PF-CCA).
