Performance Evaluation of a Parallel Cascade Semijoin Algorithm for Computing Path Expressions in Object Database Systems
-
Abstract
With the emergence of new applications, especially on the Web, such as e-commerce, digital libraries, and DNA banks, object database systems offer stronger functionality than other kinds of database systems because of their powerful ability to represent complex semantics and relationships. One distinguishing feature of object database systems is the path expression. Most queries on an object database are based on path expressions because they are the most natural and convenient way to access the database, for example, to navigate the hyperlinks in a Web-based database. Executing a path expression is usually extremely expensive on a very large database, so improving the efficiency of path expression execution is critical to the performance of object databases. As an important approach to high-performance query processing, the parallel processing of path expressions on distributed object databases is explored in this paper. Several algorithms for computing path expressions and for optimizing their processing have been proposed for centralized environments, but few approaches have been presented for computing path expressions in parallel. In this paper, a new parallel algorithm for computing path expressions, named Parallel Cascade Semijoin (PCSJ), is proposed. Moreover, a new scheduling strategy called the right-deep zigzag tree is designed to further improve the performance of the PCSJ algorithm. The experiments were carried out in a NOW (network of workstations) distributed and parallel environment. The results show that the PCSJ algorithm outperforms two other parallel algorithms, the parallel version of the forward pointer chasing algorithm (PFPC) and the index splitting parallel algorithm (IndexSplit), when computing path expressions with restrictive predicates, and that the right-deep zigzag tree scheduling strategy performs better than the right-deep tree scheduling strategy.