Arabic text classification using graphs and deep learning
DOI: https://doi.org/10.22399/ijcesen.4402

Keywords: Arabic NLP, Text classification, Graph Neural Networks, AraBERT, Deep learning

Abstract
This paper proposes a novel approach to Arabic text classification that integrates Graph Convolutional Networks (GCNs) with AraBERT embeddings. Unlike traditional sequence-based methods, our framework constructs document-level graphs in which words are represented as nodes and edges encode semantic and co-occurrence relations. AraBERT provides rich contextual embeddings for each node, enabling the GCN to capture both local and global dependencies. Experiments on the SANAD–Khaleej dataset (45,500 news articles across seven balanced categories) show that our model achieves 97.25% accuracy, 97.26% macro-F1, and 97.27% recall, significantly outperforming baseline models such as CNNs (95.89% accuracy) and LSTMs (95.23% accuracy). The results confirm the effectiveness of combining graph-based architectures with pre-trained language models for morphologically rich languages such as Arabic, and demonstrate the approach's scalability for large-scale text processing.
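The pipeline the abstract describes — word nodes, co-occurrence edges, contextual node features, GCN propagation — can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the sliding-window edge construction, the window size, and the random features standing in for AraBERT embeddings are all assumptions; only the GCN propagation rule (symmetrically normalized adjacency with self-loops) is standard.

```python
import numpy as np

def build_cooccurrence_graph(tokens, window=2):
    """Build a symmetric word co-occurrence adjacency matrix.

    Nodes are the unique tokens; an edge links two words appearing
    within `window` positions of each other. This is one common
    construction; the paper's exact edge weighting is an assumption.
    """
    vocab = sorted(set(tokens))
    idx = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(vocab)))
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            u, v = idx[w], idx[tokens[j]]
            if u != v:
                A[u, v] += 1.0
                A[v, u] += 1.0
    return vocab, A

def gcn_layer(A, X, W):
    """One GCN step (Kipf & Welling): H = ReLU(D^-1/2 (A+I) D^-1/2 X W)."""
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)       # ReLU activation

# Toy document: the token list stands in for a tokenized news article,
# and X stands in for per-node AraBERT embeddings (random, 4-dim here).
tokens = ["sport", "match", "goal", "match", "team", "goal"]
vocab, A = build_cooccurrence_graph(tokens, window=2)
rng = np.random.default_rng(0)
X = rng.normal(size=(len(vocab), 4))
W = rng.normal(size=(4, 3))
H = gcn_layer(A, X, W)       # one row of hidden features per word node
```

In the full model, `H` would be pooled over nodes into a document vector and passed to a softmax classifier over the seven SANAD–Khaleej categories.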
License
Copyright (c) 2025 International Journal of Computational and Experimental Science and Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.