Continuous Probabilistic Subspace Skyline Query Processing Using Grid Projections

    Abstract: As an important type of multidimensional preference query, the skyline query finds a superset of the optimal results when no linear function is given for combining the values of all attributes of interest. Skyline query processing has been investigated extensively. Most existing algorithms assume that the query is evaluated over all attributes of a static dataset with deterministic attribute values. Recent work has relaxed parts of this strong assumption to bring skyline queries closer to real-life applications: handling data with multi-valued attributes (known as data uncertainty), supporting skyline queries in a subspace, that is, a subset of attributes selected by the user, and supporting continuous queries on streaming data. Naturally, in many application scenarios these three issues must be considered together. In this paper, we tackle the problem of probabilistic subspace skyline (PSS) query processing over sliding windows on uncertain data streams: retrieve, from the most recent window of streaming data, all objects whose skyline probability in a user-selected subspace is no smaller than a given threshold. Based on the subtle relationship between the full space and an arbitrary subspace, we develop a novel approach that uses a regular grid indexing structure. An extensive empirical study under various settings shows the effectiveness and efficiency of our PSS algorithm.
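
To make the query in the abstract concrete, the following is a minimal brute-force sketch of a probabilistic subspace skyline query over the contents of one window. It is not the grid-based method of the paper. It assumes the commonly used instance-level model, in which each uncertain object is a set of weighted instances and an instance survives only if no instance of any other object dominates it, and it assumes that smaller attribute values are preferred. The names `dominates`, `skyline_probability`, and `pss_query` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Assumption: an uncertain object is a list of (point, probability) instances
# whose probabilities sum to at most 1.
Instance = Tuple[Tuple[float, ...], float]
Window = Dict[str, List[Instance]]


def dominates(a: Sequence[float], b: Sequence[float], subspace: Sequence[int]) -> bool:
    """True if a is no worse than b on every subspace dimension and strictly
    better on at least one (assumption: smaller values are better)."""
    no_worse = all(a[d] <= b[d] for d in subspace)
    better = any(a[d] < b[d] for d in subspace)
    return no_worse and better


def skyline_probability(name: str, window: Window, subspace: Sequence[int]) -> float:
    """Probability that object `name` is in the skyline of the given subspace."""
    total = 0.0
    for point, prob in window[name]:
        survive = 1.0
        for other, instances in window.items():
            if other == name:
                continue
            # Probability mass of `other` that dominates this instance.
            dominated = sum(q for p, q in instances if dominates(p, point, subspace))
            survive *= 1.0 - dominated
        total += prob * survive
    return total


def pss_query(window: Window, subspace: Sequence[int], threshold: float) -> List[str]:
    """Objects in the window whose subspace skyline probability >= threshold."""
    return [n for n in window if skyline_probability(n, window, subspace) >= threshold]


if __name__ == "__main__":
    # Hypothetical window holding three uncertain objects with two attributes.
    window = {
        "a": [((1.0, 5.0), 0.5), ((4.0, 2.0), 0.5)],
        "b": [((2.0, 3.0), 1.0)],
        "c": [((3.0, 4.0), 0.5), ((5.0, 1.0), 0.5)],
    }
    print(pss_query(window, subspace=(0,), threshold=0.5))    # first attribute only
    print(pss_query(window, subspace=(0, 1), threshold=0.5))  # full space
```

This sketch compares every pair of instances on each evaluation and recomputes everything from scratch, which is the cost that an index over a sliding window is meant to avoid.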
