Performance and Generalization Analysis of Machine Learning Models for Potential Fishing Zone Classification
| International Journal of Electrical and Electronics Engineering |
| © 2025 by SSRG - IJEEE Journal |
| Volume 12 Issue 12 |
| Year of Publication : 2025 |
| Authors : Nik Nur Shaadah Nik Dzulkefli, Norsuzila Ya’acob, Mohd Azri Abdul Aziz, Azita Laily Yusof, Syila Izawana Ismail |
How to Cite?
Nik Nur Shaadah Nik Dzulkefli, Norsuzila Ya’acob, Mohd Azri Abdul Aziz, Azita Laily Yusof, Syila Izawana Ismail, "Performance and Generalization Analysis of Machine Learning Models for Potential Fishing Zone Classification," SSRG International Journal of Electrical and Electronics Engineering, vol. 12, no. 12, pp. 28-40, 2025. Crossref, https://doi.org/10.14445/23488379/IJEEE-V12I12P103
Abstract:
Classifying Potential Fishing Zones (PFZ) from satellite and environmental data remains challenging because of the complexity of marine environments. Accurate PFZ classification is needed to ensure sustainability, reduce search time, and optimize the use of fishing resources. This study investigates the effectiveness of Machine Learning (ML) models in predicting PFZ using a dataset of satellite-derived oceanographic and climate features. An initial set of seven features was reduced to six key features using Random Forest (RF) and Recursive Feature Elimination (RFE), so that the models focus on the most important features. Five classification models, namely Random Forest (RF), K-Nearest Neighbors (KNN), eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Long Short-Term Memory (LSTM), were evaluated using accuracy, precision, recall, and F1-score across the training, validation, and testing phases. RF generalized well, achieving an F1-score of 100% in training and 93% in testing. The key contribution of this study is showing that RF, when combined with feature selection, can outperform more complex models while achieving a better balance between predictive performance and interpretability. These findings have practical implications for fisheries management: RF-based frameworks can serve as reliable decision-support tools, and future PFZ research can integrate feature selection to improve robustness.
Keywords:
Feature Selection, Machine Learning (ML), Potential Fishing Zone (PFZ), Random Forest (RF), Remote Sensing.
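The evaluation described in the abstract rests on four metrics (accuracy, precision, recall, and F1-score) and on the difference between training and testing scores. The following plain-Python sketch shows one way these quantities could be computed. It assumes a binary labelling (1 = PFZ, 0 = non-PFZ), which the abstract does not state, and the function names `binary_metrics` and `generalization_gap` as well as the example labels are hypothetical. It is not the authors' code.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def generalization_gap(train_score: float, test_score: float) -> float:
    """Difference between training and testing scores; smaller means better generalization."""
    return train_score - test_score


# Hypothetical test-set labels, for illustration only (not data from the study).
example_true = [1, 1, 1, 0, 0, 0, 1, 0]
example_pred = [1, 1, 0, 0, 0, 1, 1, 0]
example_scores = binary_metrics(example_true, example_pred)

# F1-scores reported in the abstract for RF: 100% in training, 93% in testing.
rf_gap = generalization_gap(1.00, 0.93)
```

With the F1-scores reported for RF, the gap between training and testing is about 7 percentage points. Applying the same comparison to each of the five models would be one simple way to contrast how well they generalize.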
