Query Performance Prediction for Information Retrieval Based on Covering Topic Score
-
Abstract
We present a statistical method called Covering Topic Score (CTS) to predict query performance for information retrieval. Estimation is based on how well the topic of a user's query is covered by the documents retrieved by a given retrieval system. Our approach is conceptually simple and intuitive, and can be easily extended to incorporate features beyond bag-of-words, such as phrases and proximity of terms. Experiments demonstrate that CTS correlates significantly with query performance on a variety of TREC test collections, and in particular that CTS gains further predictive power when phrase and term-proximity features are included. We compare CTS with previous state-of-the-art methods for query performance prediction, including clarity score and robustness score. Our experimental results show that CTS consistently performs better than, or at least as well as, these other methods. In addition to its high effectiveness, CTS is also shown to have very low computational complexity, which makes it practical for real applications.
-
-