Utilising Machine Learning Algorithms to Address Computational Challenges in Big Data Analytics
DOI:
https://doi.org/10.22399/ijcesen.3165
Keywords:
Big Data Analytics, Machine Learning Algorithms, Supervised Learning, Unsupervised Learning, Deep Learning, Scalability
Abstract
The rapid growth of data across industries including finance, healthcare, and e-commerce has created significant computational hurdles in big data analytics. Challenges include dimensionality reduction, scalability, real-time processing, and the handling of large data volumes. Because traditional data processing methods are insufficient, sophisticated machine learning (ML) algorithms are needed to tackle these complexities. This paper examines how effectively supervised, unsupervised, and deep learning algorithms improve the efficiency, scalability, and accuracy of data processing in the face of these computational problems. Random Forest, XGBoost, K-means clustering, Convolutional Neural Networks (CNNs), and Long Short-Term Memory Networks (LSTMs) are the approaches used to analyze extensive datasets. The findings show that CNNs surpass the alternative models on image-based datasets, achieving an accuracy of 93.8%. XGBoost balances computational efficiency and accuracy (91.7%) in consumer segmentation and fraud detection. The scalability of K-means clustering makes it a suitable technique for analyzing customer behavior. The study incorporates distributed platforms such as Apache Spark and TensorFlow to address critical difficulties, including high memory consumption, real-time data processing, and model interpretability. The findings align with existing studies and highlight the importance of scalable and resource-efficient machine learning methods. Moreover, the research provides insights into the capacity of stream processing frameworks and hybrid methodologies to improve real-time analytics. This study advances the growing domain of big data analytics by offering pragmatic machine learning techniques that support the effective management of large volumes of data and produce actionable insights.
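To make the scalability claim for K-means concrete, the following is a minimal sketch of a sequential (online) K-means update applied batch by batch, so that only one batch needs to be held in memory at a time. It is an illustration under stated assumptions, not the study's implementation: the abstract does not describe the authors' code, and the function names, the synthetic two-cluster data, and the initial centers are all hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def online_kmeans(batches, init_centers):
    """Sequential k-means: consume data batch by batch, never the full dataset.

    Each point is assigned to its nearest center, which then moves toward the
    point with a per-center learning rate of 1 / (points seen by that center).
    With this rate, each center equals the running mean of the points assigned
    to it so far.
    """
    centers = [list(c) for c in init_centers]
    counts = [0] * len(centers)
    for batch in batches:
        for point in batch:
            # Index of the nearest current center.
            j = min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))
            counts[j] += 1
            lr = 1.0 / counts[j]
            centers[j] = [c + lr * (p - c) for c, p in zip(centers[j], point)]
    return centers


def synthetic_batches(n_batches, batch_size, seed=0):
    """Hypothetical data stream: two Gaussian blobs around (0, 0) and (10, 10)."""
    rng = random.Random(seed)
    true_centers = [(0.0, 0.0), (10.0, 10.0)]
    for _ in range(n_batches):
        batch = []
        for _ in range(batch_size):
            cx, cy = rng.choice(true_centers)
            batch.append((cx + rng.gauss(0, 1), cy + rng.gauss(0, 1)))
        yield batch


# Assumed starting guesses; in practice these would come from an
# initialization scheme such as sampling points from the first batch.
centers = online_kmeans(synthetic_batches(50, 100),
                        init_centers=[(1.0, 1.0), (9.0, 9.0)])
print(centers)
```

Because the generator yields one batch at a time, memory use depends on the batch size rather than on the total number of points, which is the property that makes this family of methods attractive for large or streaming datasets. Distributed platforms typically parallelize the assignment step across workers, which this single-process sketch does not attempt.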
Copyright (c) 2025 International Journal of Computational and Experimental Science and Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.