KWHO-CNN: A Hybrid Metaheuristic Algorithm Based Optimized Attention-Driven CNN for Automatic Clinical Depression Recognition
DOI:
https://doi.org/10.22399/ijcesen.359
Keywords:
Deep learning, Biomedical signal processing, Clinical depression diagnosis, Speech depression recognition
Abstract
Depression is a widespread mental disorder whose inconsistent symptoms make diagnosis challenging in both clinical practice and research. Poor identification rates may be partly explained by the fact that present approaches consider only speech perception aspects and ignore patients' vocal tract modifications. This study proposes a novel framework, KWHO-CNN, which integrates a hybrid metaheuristic algorithm with an Attention-Driven Convolutional Neural Network (CNN) to enhance depression detection from speech data. It addresses challenges such as variability in speech patterns and small sample sizes by optimizing feature selection and classification. Pre-processing involves noise reduction, data normalization, and segmentation, followed by feature extraction based primarily on Mel-frequency cepstral coefficients (MFCCs). The Krill Wolf Hybrid Optimization (KWHO) algorithm then optimizes these features, mitigating over-fitting and improving model performance. The Attention-Driven CNN architecture further refines classification, leveraging dense computations and architectural homogeneity. The proposed model achieves over 90% accuracy, precision, recall, and F1 score in depression diagnosis, demonstrating its potential to greatly impact clinical practice and mental health research.
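As an illustration of the pre-processing stage named in the abstract, the following minimal Python sketch shows one plausible form of the normalization and segmentation steps. It is not the authors' implementation: the function names `peak_normalize` and `segment`, the choice of peak normalization, and the frame length and hop values are all assumptions made for this example, and noise reduction and MFCC extraction are omitted.

```python
from typing import List


def peak_normalize(signal: List[float]) -> List[float]:
    """Scale samples so the largest absolute value is 1.0.

    Peak normalization is an assumed choice; the abstract only says
    "data normalization". An all-zero (silent) signal is returned unchanged.
    """
    peak = max((abs(s) for s in signal), default=0.0)
    if peak == 0.0:
        return list(signal)
    return [s / peak for s in signal]


def segment(signal: List[float], frame_len: int, hop: int) -> List[List[float]]:
    """Split a signal into overlapping fixed-length frames.

    frame_len and hop are hypothetical parameters (in samples).
    A trailing partial frame is dropped.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    last_start = len(signal) - frame_len
    return [signal[start:start + frame_len]
            for start in range(0, last_start + 1, hop)]


if __name__ == "__main__":
    # Toy waveform standing in for a speech recording.
    toy = [0.0, 0.5, -2.0, 1.0, 0.25, -0.5, 1.5, 0.0]
    frames = segment(peak_normalize(toy), frame_len=4, hop=2)
    print(len(frames), frames[0])
```

In a pipeline of the kind the abstract describes, each frame would then be passed to a feature extractor such as an MFCC routine before feature optimization and classification.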
License
Copyright (c) 2024 International Journal of Computational and Experimental Science and Engineering
This work is licensed under a Creative Commons Attribution 4.0 International License.