Adaptive Knowledge-Guided Pruning Algorithm AKGP with Dynamic Weight Allocation for Model Compression
DOI:
https://doi.org/10.22399/ijcesen.944
Keywords:
Deep learning, model compression, dynamic weight allocation, knowledge distillation, network pruning, quantization
Abstract
In this paper, we propose the Adaptive Knowledge-Guided Pruning Algorithm (AKGP), a novel approach to model compression that enhances traditional pruning by incorporating a dynamic, data-driven weight allocation strategy during knowledge distillation. Unlike existing methods, such as the previously proposed Geometric Median-based pruning approach combined with knowledge distillation and quantization, AKGP dynamically balances the influence of teacher networks and real labels based on dataset characteristics. This adaptive strategy enables pruned models to retain high accuracy even at high compression rates, while significantly reducing model size and computational complexity. Experimental results on the CIFAR-10 dataset demonstrate that AKGP achieves 94% accuracy for ResNet-32 under a 50% pruning ratio, surpassing the baseline and previous methods. This improvement opens new possibilities for deploying deep learning models on resource-constrained devices such as mobile and embedded platforms.
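The core idea of the abstract, combining a teacher-mimicking soft-label loss with a ground-truth hard-label loss under a dynamic weight, can be sketched as follows. This is a minimal PyTorch illustration, not the paper's implementation: the function name `akgp_distillation_loss`, the temperature `T`, and how `alpha` is chosen from dataset characteristics are all assumptions, since the paper's exact weighting schedule is not given here.

```python
import torch
import torch.nn.functional as F

def akgp_distillation_loss(student_logits, teacher_logits, labels, alpha, T=4.0):
    """Weighted distillation loss: alpha * soft (teacher) + (1 - alpha) * hard (labels).

    In AKGP-style training, alpha would be set dynamically from dataset
    characteristics; here it is simply passed in by the caller (assumption).
    """
    # Soft-label term: KL divergence between temperature-scaled student and
    # teacher distributions, rescaled by T^2 as in standard distillation.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-label term: ordinary cross-entropy against the real labels.
    hard = F.cross_entropy(student_logits, labels)
    # Dynamic weight alpha trades off teacher guidance against real labels.
    return alpha * soft + (1.0 - alpha) * hard
```

With `alpha` close to 1 the pruned student leans on the teacher's soft targets; with `alpha` close to 0 it trains mostly on ground truth, which is the trade-off the abstract says AKGP adapts per dataset.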
References
M. Zhao et al. (2022). A Novel Deep Learning Model Compression Algorithm. In: Electronics 11;1066. DOI: 10.3390/electronics11071066. URL: https://www.mdpi.com/1999-5903/11/7/1066.
Kaiming He et al. (2016). Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 770-778.
Gao Huang et al. (2017). Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 4700-4708.
Ze Liu et al. (2021). Swin Transformer V2: Scaling Up Capacity and Resolution. In: arXiv 2111.09883.
Yann LeCun, John S. Denker, and Sara A. Solla. (1990). Optimal brain damage. In: Advances in Neural Information Processing Systems 2;598-605.
Sajid Anwar, Kyuyeon Hwang, and Wonyong Sung. (2017). Structured pruning of deep convolutional neural networks. In: ACM Journal on Emerging Technologies in Computing Systems 13(3);1-18. DOI: 10.1145/3065386.
Yang He et al. (2019). Filter pruning via geometric median for deep convolutional neural networks acceleration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 1498-1507.
Benoit Jacob et al. (2018). Quantization and training of neural networks for efficient integer-arithmetic-only inference. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 2704-2713.
Geoffrey E. Hinton and Ruslan R. Salakhutdinov. (2006). Reducing the dimensionality of data with neural networks. Science 313;504-507. DOI: 10.1126/science.1127647.
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. (2012). Imagenet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems. 25;1097-1105.
Kang Wang et al. (2019). HAQ: Hardware-aware automated quantization with mixed precision. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 8612-8620.
Yinan Liu, Wei Zhang, and Jian Wang. (2020). Adaptive multi-teacher multi-level knowledge distillation. Neurocomputing 415;106-113. DOI: 10.1016/j.neucom.2020.07.048.
Ming Lin et al. (2020). HRank: Filter pruning using high-rank feature map. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 1529-1538.
Feng Zhu et al. (2020). Towards unified int8 training for convolutional neural network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1969-1979.
Song Han, Huizi Mao, and William J. Dally. (2015). Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. In: arXiv eprint: 1510.00149.
Liangchen Zhang et al. (2019). Be your own teacher: Improve the performance of convolutional neural networks via self distillation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 3713-3722.
License
Copyright (c) 2024 International Journal of Computational and Experimental Science and Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.