Optimizing data processing in big data systems using hybrid machine learning techniques
DOI:
https://doi.org/10.22399/ijcesen.936
Keywords:
Hybrid Machine Learning, Big Data Processing, Real-Time Analytics, Distributed Systems, Data Streaming
Abstract
Big data systems struggle to process large-scale data efficiently, both in execution time and in the resources required. This analysis shows that hybrid arrangements of machine learning algorithms, in which multiple models are integrated, have the potential to improve data processing in these systems. The paper presents key ideas for combining supervised and unsupervised learning techniques to handle varied types of data and to increase throughput. The proposed methodology exploits parallel processing so that real-time results can be obtained without a significant amount of computation. Experimental results indicate that the proposed hybrid model outperforms the other machine learning models considered in terms of processing time and model accuracy. The approach is also flexible in handling different types of data sources and can therefore be applied in areas such as healthcare, finance, and e-commerce. Finally, the paper argues that next-generation big data systems are likely to achieve high performance and scalability, particularly where hybrid machine learning models are implemented.
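The abstract does not name the specific algorithms used, so the following is only a generic sketch of what combining unsupervised and supervised learning with parallel processing can look like, not the paper's method. It assumes k-means for the unsupervised step, a majority-label vote per cluster for the supervised step, and a thread pool standing in for distributed workers. All function names and the toy data are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, iters=10):
    """Unsupervised step (assumed): plain k-means seeded with the first k points."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def fit_cluster_labels(points, labels, centroids):
    """Supervised step (assumed): majority label of the training points in each cluster."""
    votes = [Counter() for _ in centroids]
    for p, y in zip(points, labels):
        votes[nearest(p, centroids)][y] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]


def predict_chunk(chunk, centroids, cluster_labels):
    """Label every point in one chunk by the label of its nearest cluster."""
    return [cluster_labels[nearest(p, centroids)] for p in chunk]


def predict_parallel(points, centroids, cluster_labels, chunk_size=2, workers=2):
    """Split the input into chunks and score them concurrently, preserving order."""
    chunks = [points[i:i + chunk_size] for i in range(0, len(points), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda c: predict_chunk(c, centroids, cluster_labels), chunks)
        return [y for part in parts for y in part]


# Toy, made-up data: two well-separated groups with known labels.
train = [(0.0, 0.1), (5.0, 5.2), (0.2, 0.0), (5.1, 4.9), (0.1, 0.3), (4.8, 5.0)]
labels = ["low", "high", "low", "high", "low", "high"]

centroids = kmeans(train, k=2)
cluster_labels = fit_cluster_labels(train, labels, centroids)
predictions = predict_parallel([(0.3, 0.2), (4.9, 5.1), (0.0, 0.0)], centroids, cluster_labels)
print(predictions)
```

In a real deployment the chunks would more plausibly be partitions handled by a distributed engine; the thread pool here only illustrates that scoring is independent per chunk once the cluster model is fitted.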
License
Copyright (c) 2024 International Journal of Computational and Experimental Science and Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.