Gabriel H. Tolosa, Pablo Lavallén, Esteban A. Ríssola. Caching Document Identifiers to Speedup Query Processing in Search Servers[J]. Journal of Computer Science and Technology.

Caching Document Identifiers to Speedup Query Processing in Search Servers

  • Modern search systems have become a fundamental tool for accessing the massive amount of information stored in different repositories. These systems use sophisticated techniques to process a high volume of queries efficiently, which also helps to optimise energy consumption. One of these techniques is caching, which is implemented at different levels of a search architecture. In this work, we propose a novel caching strategy that speeds up dynamic pruning techniques (such as Maxscore) by exploiting the lowest (Min) and highest (Max) document identifiers that appear in the result of a previously submitted query. We call this technique Min/Max caching. The idea is to use the Min/Max information to prune the posting lists of the query terms before executing the ranking algorithm in a document-at-a-time (DAAT) approach. The proposed technique requires little memory, returns safe results, and complements other levels of caching (if present). We also combine the approach with different access policies. Extensive experimentation on real-world data shows that the proposed method speeds up query processing by up to 1.23x and can also reduce high-percentile tail latency (up to 2.0x speedup), an essential requirement for operational scenarios. We evaluate different cache access and eviction policies based on different decision criteria. Our findings confirm that taking the cost of the cached items into account (cost-aware policies) yields greater computation savings.
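The Min/Max idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `daat_top_k`, `trim`, and `MinMaxCache` are hypothetical, the ranking is an exhaustive disjunctive DAAT merge rather than Maxscore, the index is assumed to be static between repeated queries, postings are assumed to be `(doc_id, weight)` tuples sorted by document identifier, and no access or eviction policy is modelled.

```python
import heapq
from bisect import bisect_left, bisect_right
from itertools import groupby


def daat_top_k(lists, k):
    """Score documents one at a time by merging doc-ID-sorted posting lists."""
    merged = heapq.merge(*lists)  # (doc_id, weight) tuples in doc-ID order
    scored = [
        (sum(w for _, w in group), doc)
        for doc, group in groupby(merged, key=lambda p: p[0])
    ]
    # Highest score first; ties are broken by the lowest document identifier.
    return sorted(scored, key=lambda s: (-s[0], s[1]))[:k]


def trim(postings, lo, hi):
    """Keep only the postings whose document identifier lies in [lo, hi]."""
    start = bisect_left(postings, (lo,))
    end = bisect_right(postings, (hi, float("inf")))
    return postings[start:end]


class MinMaxCache:
    """Hypothetical cache: (query terms, k) -> (min doc ID, max doc ID)."""

    def __init__(self):
        self.ranges = {}  # unbounded here; a real cache would evict entries

    def search(self, index, terms, k):
        key = (tuple(sorted(terms)), k)  # assumption: k is part of the key
        lists = [index.get(t, []) for t in terms]
        if key in self.ranges:
            # Cache hit: prune every posting list to the cached ID range
            # before running the ranking algorithm.
            lo, hi = self.ranges[key]
            lists = [trim(pl, lo, hi) for pl in lists]
        top = daat_top_k(lists, k)
        if top and key not in self.ranges:
            ids = [doc for _, doc in top]
            self.ranges[key] = (min(ids), max(ids))
        return top


# Toy index (illustrative data only).
index = {
    "a": [(1, 1), (4, 5), (7, 2), (9, 1)],
    "b": [(2, 1), (4, 3), (6, 4), (9, 1)],
}
cache = MinMaxCache()
first = cache.search(index, ["a", "b"], 2)   # full posting lists are scanned
second = cache.search(index, ["a", "b"], 2)  # lists are trimmed to the range
```

Under the static-index assumption the trimmed run is safe in the sense used above: every document of the original top-k lies inside the cached range, so ranking only that range with the same tie-breaking rule returns the same top-k. Each cache entry stores just two identifiers per query, which is consistent with the low memory footprint the abstract reports.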
