Enhanced hybrid classification model algorithm for medical dataset analysis

N. Kumar; T. Christopher

doi:10.22399/ijcesen.611

Authors

N. Kumar, Dr. N.G.P. Arts and Science College
T. Christopher, Associate Professor, PG and Research Dept. of Computer Science, Government Arts College, Coimbatore, Tamil Nadu, India

DOI:

https://doi.org/10.22399/ijcesen.611

Keywords:

Medical Data Classification (MDC), Genetic Algorithm (GA), Convolutional Neural Networks (CNN), Autoencoders (AE), Multi-Objective Evolutionary Algorithm (MOEA), Outlier Detection (OD).

Abstract

The medical industry generates a large volume of data, and effective machine learning models are needed to make accurate predictions from it for public healthcare. Current Machine Learning (ML) techniques are limited in feature extraction and classifier accuracy. To address these issues, this paper proposes a novel algorithm, evaluated on diabetes dataset classification, that enhances the Hybrid Classification Model approach by integrating advanced methods tailored to high-dimensional medical data. First, the datasets are preprocessed with a hybrid imputation approach that combines K-Nearest Neighbor (KNN) and Multivariate Imputation by Chained Equations (MICE) to handle Missing Values (MV) and outliers. Feature Extraction (FE) is then performed using Deep Feature Extraction techniques, including Convolutional Neural Networks (CNNs) and Autoencoders, followed by Feature Fusion to create a comprehensive feature set. For Feature Selection (FS), the paper introduces an Advanced Ensemble Feature Selection method that employs Genetic Algorithm-Based Feature Selection (GAFS), a Multi-Objective Evolutionary Algorithm (MOEA), and Relief-Based Methods to identify the most relevant features. Finally, classification is performed by a Hybrid Classification Model that incorporates an Ensemble of Classifiers with Stacked Generalization (Stacking), Boosting, and Bagging, together with Neural Network (NN) Enhancements using attention mechanisms (AM) and Transfer Learning (TL). This integrated approach enhances the robustness and accuracy of medical data classification. Compared with current methods, the experimental results show a considerable improvement in accuracy (A), sensitivity (S), and specificity (SP), along with reduced execution time (ET).
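The abstract reports results in terms of accuracy, sensitivity, and specificity without defining them. As a minimal sketch, assuming a binary task (for example, diabetic vs. non-diabetic) and the standard confusion-matrix definitions, the three metrics could be computed as follows. The function name `classification_metrics` and the sample labels are hypothetical and are not taken from the paper.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, and specificity for binary labels.

    Uses the standard confusion-matrix definitions; this is an
    illustration, not the authors' evaluation code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    total = tp + tn + fp + fn
    return {
        # Fraction of all cases classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # True positive rate: positives correctly identified.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # True negative rate: negatives correctly identified.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Made-up labels for illustration only (1 = positive class).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In medical screening, sensitivity and specificity are usually reported alongside accuracy because a classifier can score well on accuracy alone while missing most positive cases in an imbalanced dataset.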

