From Lexical to Semantic: A Systematic Review of Off-Topic Detection Evolution in Automated Assessment

Authors

  • Hanadi Hamed AbdelRahman, Computer Science Department, College of Computer Science and Information Technology, Basrah, Iraq
  • Salma A. Mahmood

DOI:

https://doi.org/10.22399/ijcesen.3382

Keywords:

Lexical to Semantic, Off-Topic Detection Evolution, Automated Assessment

Abstract

Off-topic detection is a vital challenge in natural language processing, especially as online learning, automated assessment, and AI-powered content platforms continue to expand: it ensures that learner responses or user-generated content align with the intended prompts or topics. Its applications extend well beyond essay scoring to include spoken response assessment, educational dialogue systems, business writing analysis, and harmful content moderation on social platforms. This research aims to develop and enhance an off-topic detection model using AraBERT embeddings for Arabic student responses, focusing on improving automatic assessment accuracy and the overall robustness of educational AI systems. The findings demonstrate that our model reliably identifies relevance in short-answer tasks, achieving precision and recall comparable to or better than existing methods, and suggest that future work integrating deeper semantic, structural, and discourse-level features may further enhance assessment capabilities. [1], [2], [3], [4]

This review aims to survey a range of studies in this domain, identifying the methods employed, assessing their significance, exploring the limitations they face, and highlighting emerging directions in the evolution of off-topic detection techniques within automated assessment systems.

References

[1] A. Shahzad and A. Wali, (2022). Computerization of Off-Topic Essay Detection: A possibility?, Educ. Inf. Technol., vol. 27(4), 5737–5747, doi: 10.1007/s10639-021-10863-y.

[2] V. Raina, M. J. Gales, and K. Knill, (2020). Complementary systems for off-topic spoken response detection, Association for Computational Linguistics. https://www.repository.cam.ac.uk/items/c3174b46-6fb1-4b11-a10f-caeaac8d4e91

[3] Y. Zhu, (2021). Off-Topic Detection of Business English Essay Based on Deep Learning Model, Mob. Inf. Syst., vol. 2021, 1–9, doi: 10.1155/2021/5051667.

[4] V. U. Gongane, M. V. Munot, and A. D. Anuse, (2022). Detection and moderation of detrimental content on social media platforms: current status and future directions, Soc. Netw. Anal. Min., vol. 12(1), 129, doi: 10.1007/s13278-022-00951-3.

[5] C. Fan, S. Guo, A. Wumaier, and J. Liu, (2023). A cross-attention and Siamese network based model for off-topic detection, in 2023 IEEE 35th International Conference on Tools with Artificial Intelligence (ICTAI), IEEE, 770–777. https://ieeexplore.ieee.org/abstract/document/10356596/

[6] G. Y. Zhao, (2024). An NLP-Based Knowledge Extraction Approach for IT Tech-Support/Helpdesk Transcripts, https://scholar.dsu.edu/theses/444/

[7] X. Wang, S.-Y. Yoon, K. Evanini, K. Zechner, and Y. Qian, (2019). Automatic Detection of Off-Topic Spoken Responses Using Very Deep Convolutional Neural Networks, in INTERSPEECH, 4200–4204. https://www.isca-archive.org/interspeech_2019/wang19p_interspeech.pdf

[8] A. Shahzad and A. Wali, (2022). Computerization of Off-Topic Essay Detection: A possibility?, Educ. Inf. Technol., vol. 27(4), 5737–5747, doi: 10.1007/s10639-021-10863-y.

[9] N. Goharian and A. Platt, (2007). DOTS: Detection of Off-Topic Search via Result Clustering, in 2007 IEEE Intelligence and Security Informatics, IEEE, 145–151. https://ieeexplore.ieee.org/abstract/document/4258688/

[10] S. D. Das, Y. Vadi, and K. Yadav, (2024). Transformer-based Joint Modelling for Automatic Essay Scoring and Off-Topic Detection, arXiv:2404.08655. doi: 10.48550/arXiv.2404.08655.

[11] X. Li, Q. Wen, and K. Pan, (2017). Unsupervised off-topic essay detection based on target and reference prompts, in 2017 13th International Conference on Computational Intelligence and Security (CIS), IEEE, 465–468. https://ieeexplore.ieee.org/abstract/document/8288530/

Published

2025-07-18

How to Cite

AbdelRahman, H. H., & Mahmood, S. A. (2025). From Lexical to Semantic: A Systematic Review of Off-Topic Detection Evolution in Automated Assessment. International Journal of Computational and Experimental Science and Engineering, 11(3). https://doi.org/10.22399/ijcesen.3382

Section

Research Article