›› 2013,Vol. 28 ›› Issue (6): 962-972.doi: 10.1007/s11390-013-1391-8

所属专题: Data Management and Data Mining

• Special Section on Selected Paper from NPC 2011 • 上一篇    下一篇

一种基于Chord拓扑结构支持树形索引的框架

Ming-Dong Zhu (朱命冬), Student Member, CCF, Member, ACM De-Rong Shen (申德荣), Senior Member, CCF, Member, ACM, IEEE, Kou Yue (寇月), Member, CCF, ACM, Tie-Zheng Nie (聂铁铮), Member, CCF, ACM, and Ge Yu (于戈), Senior Member, CCF, Member, ACM, IEEE   

  • 收稿日期:2012-12-03 修回日期:2013-04-28 出版日期:2013-11-05 发布日期:2013-11-05
  • 作者简介:Ming-Dong Zhu received his M.S. degree in computer architecture from Northeastern University of China, Shenyang, in 2009. He is currently a Ph.D. candidate in computer software and theory, Northeastern University. His research interests include similarity query on large data and data analysis in distributed database.

A Framework for Supporting Tree-Like Indexes on the Chord Overlay

Ming-Dong Zhu (朱命冬), Student Member, CCF, Member, ACM De-Rong Shen (申德荣), Senior Member, CCF, Member, ACM, IEEE, Kou Yue (寇月), Member, CCF, ACM, Tie-Zheng Nie (聂铁铮), Member, CCF, ACM, and Ge Yu (于戈), Senior Member, CCF, Member, ACM, IEEE   

  1. College of Information Science and Engineering, Northeastern University, Shenyang 110004, China
  • Received:2012-12-03 Revised:2013-04-28 Online:2013-11-05 Published:2013-11-05
  • Supported by:

    This research was supported by the National Basic Research 973 Program of China under Grant No. 2012CB316201, the National Natural Science Foundation of China under Grant Nos. 60973021, 61033007, 61003060, and the Fundamental Research Funds for the Central Universities of China under Grant No. N100704001.

在数据量迅速增长的情况下,为了对各种各样的数据类型提供高效的数据查询和管理,分布式数据库需要支持R-tree, B-tree, M-tree等树形索引。 在分布式环境下,索引一般会被分散到各个计算机节点上从而提高可靠性和可扩展性。但是,分布式索引在提高查询速度的同时也增加了索引更新的代价。每一个计算机节点维护一部分索引,必然导致通信代价比较高。所以降低通信代价成为分布式数据管理的关键,否则大量的带宽会被占用,数据库的可扩展性和可用性没有办法保证。一般情况下,为了提高查询通量,多个索引的副本会被创建,但是当数据更新时保持这些副本的一致性变得非常复杂。
本文提出了一种基于Chord拓扑结构支持树形索引的框架。这种框架根据树形索引的特点可以动态的调整索引的副本数量从而使查询代价和更新代价达到平衡。本文也提出了一个基于查询和更新的代价模型。本文设计了多种优化技术从而在不影响查询效率的情况下提高更新的性能,因为更新性能是各种分布式数据管理的瓶颈。我们在框架中实现了M-tree和R-tree,通过真实数据和合成数据的实验证明了着这种框架的有效性。

Abstract: With the explosive growth of data, to support efficient data management including queries and updates, the database system is expected to provide tree-like indexes, such as R-tree, M-tree, B+-tree, according to different types of data. In the distributed environment, the indexes have to be scattered across the compute nodes to improve reliability and scalability. Indexes can speed up queries, but they incur maintenance cost when updates occur. In the distributed environment, each compute node maintains a subset of an index tree, so keeping the communication cost small is more crucial, or else it occupies lots of network bandwidth and the scalability and availability of the database system are affected. Further, to achieve the reliability and scalability for queries, several replicas of the index are needed, but keeping the replicas consistent is not straightforward. In this paper, we propose a framework supporting tree-like indexes, based on Chord overlay, which is a popular P2P structure. The framework dynamically tunes the number of replicas of index to balance the query cost and the update cost. Several techniques are designed to improve the efficiency of updates without the cost of performance of the queries. We implement M-tree and R-tree in our framework, and extensive experiments on reallife and synthetic datasets verify the efficiency and scalability of our framework.

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] 刘明业; 洪恩宇;. Some Covering Problems and Their Solutions in Automatic Logic Synthesis Systems[J]. , 1986, 1(2): 83 -92 .
[2] 陈世华;. On the Structure of (Weak) Inverses of an (Weakly) Invertible Finite Automaton[J]. , 1986, 1(3): 92 -100 .
[3] 高庆狮; 张祥; 杨树范; 陈树清;. Vector Computer 757[J]. , 1986, 1(3): 1 -14 .
[4] 陈肇雄; 高庆狮;. A Substitution Based Model for the Implementation of PROLOG——The Design and Implementation of LPROLOG[J]. , 1986, 1(4): 17 -26 .
[5] 黄河燕;. A Parallel Implementation Model of HPARLOG[J]. , 1986, 1(4): 27 -38 .
[6] 闵应骅; 韩智德;. A Built-in Test Pattern Generator[J]. , 1986, 1(4): 62 -74 .
[7] 唐同诰; 招兆铿;. Stack Method in Program Semantics[J]. , 1987, 2(1): 51 -63 .
[8] 闵应骅;. Easy Test Generation PLAs[J]. , 1987, 2(1): 72 -80 .
[9] 朱鸿;. Some Mathematical Properties of the Functional Programming Language FP[J]. , 1987, 2(3): 202 -216 .
[10] 李明慧;. CAD System of Microprogrammed Digital Systems[J]. , 1987, 2(3): 226 -235 .
版权所有 © 《计算机科学技术学报》编辑部
本系统由北京玛格泰克科技发展有限公司设计开发 技术支持:support@magtech.com.cn
总访问量: