A Benchmark-Driven Approach to Detecting and Fixing Performance Regressions
DOI:
https://doi.org/10.22399/ijcesen.4125

Keywords:
benchmark-driven, performance regression, software optimization, predictive modelling

Abstract
Performance regressions are significant obstacles in software systems: they reduce performance, degrade the user experience, and lower quality of service or reliability in subsequent versions of an application. A benchmark-based approach offers a systematic and objective way to identify and fix these regressions, relying on a defined benchmarking process built on standard tests, results, and metrics to measure and report software performance. This article reviews benchmark-based models, their applications, and their effectiveness in the contexts of software optimization, machine learning, industrial automation systems, and financial risk management. It describes how benchmarks can uncover hidden performance regressions, enable more accurate predictive modeling, and reveal targeted optimization opportunities. The discussion presents qualitative and quantitative frameworks for implementing and adapting benchmark-based regression approaches across multiple software domains, illustrating their flexibility and scalability for early identification and mitigation of performance regressions. The article concludes with reflections on the challenges of benchmark processes, including benchmark design, evaluation with multiple metrics, and variability of implementation, and summarizes how software systems can prevent regressions to establish and maintain performance and reliability as complex software evolves.
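The core idea of a benchmark-driven check can be sketched in a few lines: run the same benchmark on a baseline and a candidate version, compare a robust summary statistic, and flag a regression when the relative slowdown exceeds a tolerance. The function name, sample timings, and 10% threshold below are illustrative assumptions, not part of the article's method:

```python
import statistics

def detect_regression(baseline_runs, candidate_runs, threshold=0.10):
    """Flag a regression when the candidate's median runtime exceeds
    the baseline's median by more than `threshold` (relative slowdown).

    Medians are used instead of means to reduce sensitivity to
    run-to-run timing noise (outlier runs).
    """
    base = statistics.median(baseline_runs)
    cand = statistics.median(candidate_runs)
    slowdown = (cand - base) / base
    return slowdown > threshold, slowdown

# Hypothetical timings (seconds) for one benchmark on two versions.
baseline = [1.02, 0.98, 1.00, 1.01, 0.99]
candidate = [1.20, 1.18, 1.22, 1.19, 1.21]

regressed, slowdown = detect_regression(baseline, candidate)
```

In practice such a check runs in continuous integration for every benchmark in the suite, so that a regression is localized to the commit and the workload that introduced it.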
Copyright (c) 2025 International Journal of Computational and Experimental Science and Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.