The effect of some normalization methods on neural networks and robust methods with the presence of outliers
DOI:
https://doi.org/10.22399/ijcesen.1716
Keywords:
Neural Networks, Recurrent Neural Network, Min-Max, Z-score, Robust Methods
Abstract
This research examines how two statistical transformations, Z-score and Min-Max normalization, affect the performance of robust methods and neural networks when the data contain outliers. It compares three robust methods (LTS, MCD, and MM) with three neural networks (RNN, GRU, and LSTM) under different activation functions (ReLU, ELU, and SELU). The research sample included 3000 private sector electricity generators in Iraq for the year 2021, taken from the Central Statistical Organization. Models were compared by mean square error (MSE), with computations carried out in Python, R, and Excel. The results showed that both normalization techniques significantly improved model performance, with Min-Max normalization performing best for both the robust methods and the neural networks. Among the neural networks, the RNN was superior, especially with deeper structures, and performed well across the different activation functions. Among the robust methods, the MM method was consistently the best, giving the lowest mean square error under all normalization techniques.
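For readers unfamiliar with the two transformations and the comparison criterion named in the abstract, the following is a minimal sketch of their standard textbook definitions. It is not the authors' code: the function names and the sample values are hypothetical, and the Z-score here assumes the population standard deviation.

```python
from statistics import mean, pstdev

def min_max(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Center on the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def mse(actual, predicted):
    """Mean square error between observed and predicted values."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))

# Hypothetical readings (not from the study); the last value acts as an outlier.
sample = [10.0, 12.0, 11.0, 13.0, 100.0]
scaled = min_max(sample)
standardized = z_score(sample)
```

In this toy sample the single large value sets the Min-Max upper bound, so the four ordinary readings are compressed near 0. This illustrates why the behaviour of normalization in the presence of outliers is worth studying.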
License
Copyright (c) 2025 International Journal of Computational and Experimental Science and Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.