Enhanced Diagnostic Precision for Cardiovascular Diseases through the Synergistic Application of GDE_Lasso Feature Selection and Random Forest Classification Techniques
DOI:
https://doi.org/10.22399/ijcesen.736
Keywords:
Gaussian based Differential Entropy, Information gain, BIC, Modified LASSO, Random Forest
Abstract
Cardiovascular diseases (CVD) pose a significant global health challenge and contribute substantially to mortality worldwide. Early detection and diagnosis of CVD are critical, and machine learning techniques offer promising avenues for analyzing risk factors and implementing preventive measures. Feature selection can also help reduce diagnostic costs. This work therefore proposes GDE_Lasso, a feature selection model that combines Gaussian-based differential entropy for information gain with the Lasso. The goal is to optimize diagnostics by streamlining processes, minimizing the number of tests, and enabling targeted interventions. The proposed model is evaluated on two Cleveland datasets (Dataset 1 and Dataset 2). The work compares the performance of Logistic Regression, Naïve Bayes, SVM, KNN, Decision Tree, XGBoost, and Random Forest on these datasets after applying the Z-score method. Random Forest was found to perform well among the considered classifiers. The study therefore evaluates the performance of Random Forest with and without the GDE_Lasso feature selection algorithm.
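The abstract names Gaussian-based differential entropy for information gain but does not give its formula. As a rough illustration only, the sketch below assumes each feature is modelled by a univariate Gaussian, so that its differential entropy is 0.5 * ln(2 * pi * e * variance), and it assumes that information gain is the drop in that entropy once the class label is known. The function names and the toy data are hypothetical, and this is not the authors' GDE_Lasso procedure, which also involves BIC and a modified Lasso step.

```python
import math
from statistics import pvariance


def gaussian_entropy(values):
    """Differential entropy (in nats) of a Gaussian fitted to `values`.

    Uses h = 0.5 * ln(2 * pi * e * variance); the variance must be > 0.
    """
    var = pvariance(values)
    return 0.5 * math.log(2 * math.pi * math.e * var)


def gaussian_information_gain(values, labels):
    """Assumed definition: h(X) minus the class-weighted average of h(X | class)."""
    n = len(values)
    conditional = 0.0
    for c in set(labels):
        group = [v for v, y in zip(values, labels) if y == c]
        conditional += len(group) / n * gaussian_entropy(group)
    return gaussian_entropy(values) - conditional


# Hypothetical toy data: one feature that separates the classes, one that does not.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
informative = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.8, 5.1]
noisy = [1.0, 5.0, 1.2, 5.2, 0.8, 4.8, 1.1, 5.1]

gain_informative = gaussian_information_gain(informative, labels)
gain_noisy = gaussian_information_gain(noisy, labels)
print(f"informative feature gain: {gain_informative:.3f}")
print(f"noisy feature gain:       {gain_noisy:.3f}")
```

Under these assumptions, a feature whose within-class spread is much smaller than its overall spread receives a larger gain, which is the kind of ranking signal a filter step could pass on to a Lasso-based selector.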
License
Copyright (c) 2025 International Journal of Computational and Experimental Science and Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.