Leveraging Large Language Models for Search Relevance Measurement in E-Commerce
DOI:
https://doi.org/10.22399/ijcesen.4818
Keywords:
Search Relevance Measurement, Large Language Models, E-Commerce Search Quality, Chain-of-Thought Prompting, Automated Relevance Assessment
Abstract
Search relevance measurement represents a critical challenge in e-commerce systems, where the accuracy of query-item matching directly impacts user experience, conversion rates, and platform trust. Traditional approaches to measuring relevance have relied heavily on human annotation and behavioral signals, both of which present significant limitations in scalability, cost, and accuracy. Human evaluation suffers from inter-rater variability, limited coverage of long-tail queries, and prohibitive resource requirements when applied to large product catalogs. Implicit feedback signals such as clicks and conversions introduce noise through position bias and popularity effects, often failing to reflect true relevance. Recent advances in large language models offer a transformative alternative by leveraging semantic understanding, contextual reasoning, and world knowledge to assess query-item relationships at scale. Through careful prompt engineering, chain-of-thought reasoning, and validation against human-labeled datasets, these models can generate reliable relevance judgments that approximate or exceed human performance while covering vastly larger evaluation spaces. Implementation strategies, including teacher-student architectures, active learning for edge cases, and periodic human audits, enable organizations to balance accuracy with operational efficiency. This article addresses fundamental trade-offs between coverage, quality, and cost that have long constrained traditional relevance measurement methodologies, enabling more sophisticated search systems that genuinely understand and respond to user intent rather than merely optimizing for engagement metrics.
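As an illustration of the validation step mentioned in the abstract, the following is a minimal sketch of how agreement between LLM-generated relevance labels and human labels might be measured with Cohen's kappa. The function name `cohens_kappa`, the binary label scheme, and the four sample judgments are assumptions made for this example; the abstract does not specify which agreement metric or label scale the article uses.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items where both raters give the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical judgments for four query-item pairs (not data from the article).
human = ["relevant", "relevant", "irrelevant", "irrelevant"]
llm = ["relevant", "irrelevant", "irrelevant", "irrelevant"]

kappa = cohens_kappa(human, llm)
print(f"kappa = {kappa:.2f}")  # prints "kappa = 0.50"
```

In this sketch, a kappa near 1 would suggest the model's labels track the human labels closely, while a value near 0 would suggest agreement no better than chance. Any threshold for accepting the model's judgments is a design decision that the abstract leaves open.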
License
Copyright (c) 2025 International Journal of Computational and Experimental Science and Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.