We use cookies to improve your experience with our site.

基于反Top-k地理社交关键字查询的Non-Answer问题研究

Answering Non-Answer Questions on Reverse Top-k Geo-Social Keyword Queries

  • 摘要: 1.研究背景(Context):
    反Top-k地理社交网络关键字查询(RkGSK)旨在寻找与给定兴趣点(Point-of-Interest)空间距离相近、文本内容相似以及社交关系相关的用户,近年来受到了工业界和学术界的广泛关注。然而,用户在构造查询时的查询条件可能过分严苛导致最终查询结果为空,且用户想知道如何修改原始查询以获取期望数目的查询结果。因此,本文针对反Top-k地理社交关键字查询的Non-Answer问题(NARGSK)展开研究,给定一个结果为空的RkGSK查询,研究如何在修改代价最小的情况下返回满足用户期望结果的精炼查询。
    2.目的(Objective):
    反Top-k地理社交网络关键字查询(RkGSK)具有广泛的研究和应用价值。但在实际应用中,查询返回结果可能为空而不符合用户的预期结果。因此,针对RkGSK中存在的Non-Answer问题本文提出基于查询松弛的解决方案。此外,为了快速修改查询以最小化修改代价,本文提出了基于剪枝策略的优化方法以进一步提升算法效率。
    3.方法(Method):
    本文通过扩展现有方法提出了基准算法和基于剪枝策略的优化算法,以解决反Top-k地理社交关键字查询(RkGSK)上的Non-Answer问题。首先,利用Top-k地理社交关键字查询(TkGSK)生成一个候选用户集,基于该用户集修改原始查询参数和数据以更好地描述用户的查询意图。其次,通过扩大查询范围和修改关键词来获得所有的候选查询,之后调用最先进的RkGSK查询算法在候选查询中找到修改代价最小的最佳精炼查询。最后,为了提高基准算法的效率,提出了两种剪枝策略以避免算法中不必要的计算。值得注意的是,优化算法基于过滤和精炼框架并结合剪枝技术来提升算法性能。
    4.结果(Result & Findings):
    针对论文提出的基准算法和优化算法开展实验,实验数据来自真实地理空间数据集,即Brightkite和Gowalla。大量实验结果表明,本文提出的方法可有效解决反Top-k地理社交关键字查询上的Non-Answer问题,能够以最小的修改代价返回满足用户期望结果数目的精炼查询。特别值得说明的是,与基准算法相比,本文提出的优化算法性能平均提升1~2倍,时间代价显著降低。这进一步证明本文提出的解决方法的有效性和高效性。
    5.结论(Conclusions):
    本文探讨了反Top-k地理社交关键字查询(RkGSK)上的Non-Answer问题,研究如何以最小的修改代价为用户提供更准确描述其查询意图的精炼查询(NARGSK)。基于此,提出了基准算法和结合剪枝策略的优化算法。大量的实验结果表明本文提出的方法均可有效解决NARGSK问题。此外,优化算法可进一步提高算法性能,降低时间代价。今后,作者将进一步深入探索RkGSK查询中的Why-not问题。

     

    Abstract: Due to the wide-spread use of geo-positioning technologies and geo-social networks, the reversetop-k geo-socialkeyword query has attracted considerable attention from both industry and research communities. A reverse top-k geo- social keyword(RkGSK) query findsthe users who are spatially near, textually similar,and socially relevantto a specified point of interest. RkGSK queriesare useful in many real-life applications. For example,they can helpthe query issuer identify potential customers in marketing decisions. However, thequery constraints couldbe too strict sometimes, making it hard to find any result for the RkGSK query. The query issuersmay wonder how to modify their originalqueries to get a certainnumber of query results. In this paper,we study non-answer questionson reverse top-k geo-social keywordqueries (NARGSK). Given an RkGSK queryand the requirednumber M of queryresults, NARGSK aim to find the refined RkGSK queryhaving M users in its resultset. To efficiently answerNARGSK, we propose two algorithms (ERQ and NRG) based on query relaxation. As this is the firstwork to addressNARGSK to the best of our knowledge, ERQ is the baselineextended from the state-of-the-art method,while NRG furtherimproves the efficiency of ERQ. Extensive experiments usingreal-life datasets demonstrate the efficiency of our proposedalgorithms, and theperformance of NRGis improved by a factorof 1–2 on average compared with ERQ.

     

/

返回文章
返回