We use cookies to improve your experience with our site.

Indexed in:

SCIE, EI, Scopus, INSPEC, DBLP, CSCD, etc.

Submission System
(Author / Reviewer / Editor)
Ji-Zhou Luo, Sheng-Fei Shi, Guang Yang, Hong-Zhi Wang, Jian-Zhong Li. O2iJoin: An Efficient Index-Based Algorithm for Overlap Interval Join[J]. Journal of Computer Science and Technology, 2018, 33(5): 1023-1038. DOI: 10.1007/s11390-018-1872-x
Citation: Ji-Zhou Luo, Sheng-Fei Shi, Guang Yang, Hong-Zhi Wang, Jian-Zhong Li. O2iJoin: An Efficient Index-Based Algorithm for Overlap Interval Join[J]. Journal of Computer Science and Technology, 2018, 33(5): 1023-1038. DOI: 10.1007/s11390-018-1872-x

O2iJoin: An Efficient Index-Based Algorithm for Overlap Interval Join

Funds: This work was partially supported by the National Key Research and Development Program of China under Grant No. 2016YFB1000700.
More Information
  • Received Date: June 20, 2017
  • Revised Date: June 17, 2018
  • Published Date: September 16, 2018
  • Time intervals are often associated with tuples to represent their valid time in temporal relations, where overlap join is crucial for various kinds of queries. Many existing overlap join algorithms use indices based on tree structures such as quad-tree, B+-tree and interval tree. These algorithms usually have high CPU cost since deep path traversals are unavoidable, which makes them not so competitive as data-partition or plane-sweep based algorithms. This paper proposes an efficient overlap join algorithm based on a new two-layer flat index named as Overlap Interval Inverted Index (i.e., O2i Index). It uses an array to record the end points of intervals and approximates the nesting structures of intervals via two functions in the first layer, and the second layer uses inverted lists to trace all intervals satisfying the approximated nesting structures. With the help of the new index, the join algorithm only visits the must-be-scanned lists and skips all others. Analyses and experiments on both real and synthetic datasets show that the proposed algorithm is as competitive as the state-of-the-art algorithms.
  • [1]
    Cafagna F, Böhlen M H. Disjoint interval partitioning. The VLDB Journal, 2017, 26(3):447-466.
    [2]
    Bouros P, Mamoulis N. A forward scan based plane sweep algorithm for parallel interval joins. Proceedings of the VLDB Endowment, 2017, 10(11):1346-1357.
    [3]
    Piatov D, Helmer S, Dignös A. An interval join optimized for modern hardware. In Proc. the 32nd IEEE International Conference on Data Engineering, May 2016, pp.1098-1109.
    [4]
    Dignös A, Böhlen M H, Gamper J. Overlap interval partition join. In Proc. the 2014 ACM SIGMOD International Conference on Management of Data, June 2014, pp.1459-1470.
    [5]
    Gao D, Jensen C S, Snodgrass R T, Soo M. Join operations in temporal databases. The VLDB Journal, 2005, 14(1):2-29.
    [6]
    Elmasri R, Wuu G T J, Kim Y J. The time index:An access structure for temporal data. In Proc. the 16th International Conference on Very Large Databases, Aug. 1990, pp.1-12.
    [7]
    Son D, Elmasri R. Efficient temporal join processing using time index. In Proc. the 8th International Conference on Scientific and Statistical Data Base Management, Jun. 1996, pp.252-261.
    [8]
    Soo M D, Snodgrass R T, Jensen C S. Efficient evaluation of the valid-time natural join. In Proc. the 10th International Conference on Data Engineering, Feb. 1994, pp.282-292.
    [9]
    Sitzmann I, Stuckey P J. Improving temporal joins using histograms. In Proc. the 11th International Conference on Database and Expert Systems Applications, Sept. 2000, pp.488-498.
    [10]
    Gunadhi H, Segev A. Query processing algorithms for temporal intersection joins. In Proc. the 7th International Conference on Data Engineering, April 1991, pp.336-344.
    [11]
    Rana S P, Fotouhi F. Efficient processing of time-joins in temporal data bases. In Proc. the 3rd International Conference on Database Systems for Advanced Applications, April 1993, pp.427-432.
    [12]
    Pfser D, Jensen S J. Incremental join of time-oriented data. In Proc. the 11th International Conference on Scientific and Statistical Data Base Management, Jul. 1999, pp.232-243.
    [13]
    Zhang C, Naughton J, DeWitt D, Luo Q, Lohman G. On supporting containment queries in relational database management systems. In Proc. the 2001 ACM SIGMOD International Conference on Management of Data, May 2001, pp.425-436.
    [14]
    Zhang D, Tsotras V J, Seeger B. Efficient temporal join processing using indices. In Proc. the 18th International Conference on Data Engineering, Feb. 2002, pp.103-113.
    [15]
    Kriegel H P, Pötke M, Seidl T. Managing intervals efficiently in object-relational databases. In Proc. the 26th International Conference on Very Large Data Bases, Sept. 2000, pp.407-418.
    [16]
    Tsotras V J, Kangelaris N. The snapshot index:An I/Ooptimal access method for timeslice queries. Information Systems, 1995, 20(3):237-260.
    [17]
    Beckmann N, Seeger B. A revised R-tree in comparison with related index structures. In Proc. the 2009 ACM SIGMOD International Conference on Management of Data, June 2009, pp.799-812.
    [18]
    Ozsoyoglu G, Snodgrass R T. Temporal and real-time databases:A survey. IEEE Transactions on Knowledge and Data Engineering, 1995, 7(4):513-532.
    [19]
    Beckmann N, Kriegel H P, Schneider R, Seeger B. The R-tree:An efficient and robust access method for points and rectangles. In Proc. the 1990 ACM SIGMOD International Conference on Management of Data, May 1990, pp.322-331.
    [20]
    Koudas N, Sevcik K C. Size separation spatial join. In Proc. the 1997 ACM SIGMOD International Conference on Management of Data, May 1997, pp.324-335.
    [21]
    Nobari S, Tauheed F, Heinis T, Karras P, Bressan S, Ailamaki A. TOUCH:In-memory spatial join by hierarchical data-oriented partitioning. In Proc. the 2013 ACM SIGMOD International Conference on Management of Data, Jun. 2013, pp.701-712.
    [22]
    Leung T Y C, Muntz R. Stream processing:Temporal query processing and optimization. In Temporal Databases:Theory, Design, and Implementation, Tassel A U, Clifford J, Gadia S et al. (eds.), Benjamin-Cummings, 1993, pp.329-355.
    [23]
    Lu H, Ooi B C, Tan K L. On spatially partitioned temporal join. In Proc. the 20th International Conference on Very Large Data Bases, Sept. 1994, pp.546-557.
    [24]
    Shen H, Ooi B C, Lu H. The TP-index:A dynamic and efficient indexing mechanism for temporal databases. In Proc. the 10th Int. Conf. Data Engineering, Feb. 1994, pp.274-281.
    [25]
    Arge L, Procopiuc O, Ramaswamy S et al. Scalable sweeping-based spatial join. In Proc. the 24th International Conference on Very Large Data Bases, Aug. 1998, pp.570-581.
    [26]
    Lo M L, Ravishankar C V. Spatial hash-joins. In Proc. the 1996 ACM SIGMOD International Conference on Management of Data, June 1996, pp.247-258.
    [27]
    Patel J M, Dewitt D J. Partition based spatial-merge join. In Proc. the 1996 ACM SIGMOD International Conference on Management of Data, Jun. 1996, pp.259-270.
    [28]
    Samet H. Foundation of Multidimensional and Metric Data Structures (1st edition). Morgan Kaufmann, 2006.
    [29]
    Enderle J, Hampel M, Seidl T. Joining interval data in relational databases. In Proc. the 2004 ACM SIGMOD International Conference on Management of Data, Jun. 2004, pp.683-694.
    [30]
    Zhang D, Tsotras V J. Index based processing of semirestrictive temporal joins. In Proc. the 9th International Symposium on Temporal Representation and Reasoning, July 2002, pp.70-77.
    [31]
    Luo J Z, Shi S F, Wang H Z, Li J Z. FrepJoin:An efficient partition-based algorithm for edit similarity join. Frontiers of Information Technology & Electronic Engineering, 2017, 18(10):1499-1510.
    [32]
    Wang C P, Wang C K, Wang H, Chen J, Ye X J. Preference join on heterogeneous data. In Proc. the 18th Asia Pacific Web Conference, Sept. 2016, pp.317-329.
    [33]
    Khayyat Z, Lucia W, Singh M, Ouzzani M, Papotti P, Quiané-Ruiz J A, Tang N, Kalnis P. Fast and scalable inequality joins. The VLDB Journal, 2017, 26(1):125-150.
  • Related Articles

    [1]Xue-Qi Li, Student, Guang-Ming Tan, Ning-Hui Sun. PIM-Align: A Processing-in-Memory Architecture for FM-Index Search Algorithm[J]. Journal of Computer Science and Technology, 2021, 36(1): 56-70. DOI: 10.1007/s11390-020-0825-3
    [2]Chang-Xuan Wan, Xi-Ping Liu. Structural Join and Staircase Join Algorithms of Sibling Relationship[J]. Journal of Computer Science and Technology, 2007, 22(2): 171-181.
    [3]XING Changyu, SUN Jizhou. An Accelerated Incremental Radiosity Algorithm[J]. Journal of Computer Science and Technology, 2000, 15(1): 47-55.
    [4]Zhi Lihong. Optimal Algorithm for Algebraic Factoring[J]. Journal of Computer Science and Technology, 1997, 12(1): 1-9.
    [5]Shi Ronghua. A Redundant Binary Algorithm for RSA[J]. Journal of Computer Science and Technology, 1996, 11(4): 416-420.
    [6]Kian-Lee Tan. Optimization of Multi-Join Queries in Shared-Nothing Systems[J]. Journal of Computer Science and Technology, 1995, 10(2): 149-162.
    [7]Hu Yunfa. An Ordering Linear Unification Algorithm[J]. Journal of Computer Science and Technology, 1989, 4(1): 23-28.
    [8]Jin Hongping, Gu Junzhong. The Optimization of Distributed Join in C-POREL System[J]. Journal of Computer Science and Technology, 1987, 2(4): 276-286.
    [9]Qiao Xiangzhen. An Efficient Parallel Algorithm for FFT[J]. Journal of Computer Science and Technology, 1987, 2(3): 174-190.
    [10]Zheng Zhijie. The Duodirun Merging Algorithm[J]. Journal of Computer Science and Technology, 1987, 2(2): 157-162.

Catalog

    Article views (37) PDF downloads (363) Cited by()
    Related

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return