Hyperparameter Tuning of Random Forest using Social Group Optimization Algorithm for Credit Card Fraud Detection in Banking Data

Authors

  • Sudhirvarma Sagiraju, KIIT Deemed to be University
  • Jnyana Ranjan Mohanty, KIIT Deemed to be University
  • Anima Naik, Raghu Engineering College

DOI:

https://doi.org/10.22399/ijcesen.777

Keywords:

SGO, Random Forest, accuracy, hyperparameters, Credit Card Fraud Detection

Abstract

As credit card adoption expands alongside advances in e-commerce, fraudulent activity has grown in both frequency and complexity, posing significant challenges for the financial sector. Detecting fraudulent transactions within highly imbalanced datasets remains a critical issue for secure banking operations. This study explores RF_SGO, an approach to credit card fraud detection that combines pre-processing techniques, namely the Synthetic Minority Oversampling Technique (SMOTE) and class weight adjustment, with Random Forest (RF) models whose hyperparameters are tuned by the Social Group Optimization (SGO) algorithm. The study also uses Random Forest's feature importance mechanism to identify the features that contribute most to fraud detection, which improves interpretability and supports decision-making. Our methodology evaluates RF_SGO across three datasets: the original imbalanced dataset of European cardholders, a class-weight-adjusted dataset, and a SMOTE-enhanced dataset. Model performance is measured using Accuracy, Precision, Recall, F1-Score, and ROC-AUC. The RF_SGO model demonstrated superior performance, with the SMOTE-enhanced variant achieving the highest ROC-AUC (0.98) and Recall (0.88), effectively balancing sensitivity and specificity. The class-weighted RF_SGO achieved the highest Precision (0.96), making it well suited to minimizing false positives. The feature importance analysis identified key predictors of fraudulent behavior, providing actionable insights for financial institutions. Comparisons with traditional machine learning algorithms (e.g., Logistic Regression, Decision Trees, and SVM) and advanced models (e.g., XGBoost, CatBoost, and deep learning) indicate that RF_SGO outperforms them in precision-recall trade-offs and overall classification effectiveness.
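To illustrate why Precision, Recall, and F1-Score are reported alongside Accuracy on imbalanced data, the following minimal sketch computes these metrics from confusion-matrix counts. The function name `classification_metrics` and the counts used in the example are hypothetical and are not taken from the study's results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives, with "fraud" treated as the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical example: 10,000 transactions, 17 of them fraudulent.
# A model that never flags fraud still looks accurate but has zero recall.
never_flags = classification_metrics(tp=0, fp=0, fn=17, tn=9983)

# A hypothetical model that catches 15 of the 17 frauds with 1 false alarm.
catches_most = classification_metrics(tp=15, fp=1, fn=2, tn=9982)

print(never_flags)
print(catches_most)
```

In the first case accuracy stays above 0.99 even though no fraud is detected, which is why accuracy alone says little about a fraud detector on data this skewed.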
This study underscores the significance of incorporating hyperparameter tuning, feature importance analysis, and data balancing strategies to improve fraud detection. The proposed RF_SGO framework offers a scalable and efficient solution for financial institutions to mitigate fraud, ensuring more reliable and secure transaction systems.
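As a rough illustration of SGO-based hyperparameter tuning, the sketch below implements a basic SGO loop with the improving and acquiring phases as they are commonly described for the algorithm. It is not the authors' implementation. The abstract does not state which hyperparameters were tuned or what fitness function was used, so the two tuned values (`n_trees`, `max_depth`), their bounds, and the toy `surrogate_error` function are all assumptions; in practice the objective would be something like the cross-validated error of a Random Forest.

```python
import random


def sgo_minimize(objective, bounds, pop_size=20, iterations=50, c=0.2, seed=0):
    """Basic Social Group Optimization loop for box-constrained minimization.

    c is the self-introspection parameter (0 < c < 1); pop_size must be >= 2.
    """
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(x):
        # Keep every trait inside its allowed range.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    def best_index():
        return min(range(pop_size), key=lambda i: fit[i])

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [objective(p) for p in pop]

    for _ in range(iterations):
        # Improving phase: each person moves relative to the best person.
        gbest = list(pop[best_index()])
        for i in range(pop_size):
            cand = clip([c * pop[i][j] + rng.random() * (gbest[j] - pop[i][j])
                         for j in range(dim)])
            cand_fit = objective(cand)
            if cand_fit < fit[i]:  # greedy acceptance: keep only improvements
                pop[i], fit[i] = cand, cand_fit

        # Acquiring phase: each person interacts with a random other person
        # and with the best person.
        gbest = list(pop[best_index()])
        for i in range(pop_size):
            k = rng.choice([p for p in range(pop_size) if p != i])
            # Move away from a worse partner, toward a better one.
            direction = 1.0 if fit[i] < fit[k] else -1.0
            cand = clip([pop[i][j]
                         + direction * rng.random() * (pop[i][j] - pop[k][j])
                         + rng.random() * (gbest[j] - pop[i][j])
                         for j in range(dim)])
            cand_fit = objective(cand)
            if cand_fit < fit[i]:
                pop[i], fit[i] = cand, cand_fit

    b = best_index()
    return pop[b], fit[b]


def surrogate_error(params):
    """Hypothetical stand-in for a cross-validated Random Forest error."""
    n_trees, max_depth = params
    return ((n_trees - 300.0) / 500.0) ** 2 + ((max_depth - 12.0) / 30.0) ** 2


# Hypothetical search ranges for the number of trees and the maximum depth.
BOUNDS = [(50.0, 500.0), (2.0, 30.0)]
best_params, best_err = sgo_minimize(surrogate_error, BOUNDS)
print(best_params, best_err)
```

Because a candidate replaces a person only when it improves that person's fitness, the best error in the population never gets worse from one iteration to the next. Integer-valued hyperparameters such as the number of trees would need rounding before being passed to a real model.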

Published

2025-01-29

How to Cite

Sagiraju, S., Mohanty, J. R., & Naik, A. (2025). Hyperparameter Tuning of Random Forest using Social Group Optimization Algorithm for Credit Card Fraud Detection in Banking Data. International Journal of Computational and Experimental Science and Engineering, 11(1). https://doi.org/10.22399/ijcesen.777

Section

Research Article