Journal of Computer Science and Technology, 2019, Vol. 34, Issue (4): 762-774. doi: 10.1007/s11390-019-1941-9

Special topic: Data Management and Data Mining

A United Framework for Large-Scale Resource Description Framework Stream Processing

Hong Fang1, Bo Zhao2,3, Xiao-Wang Zhang2,3,*, Member, CCF, Xuan-Xing Yang2,3   

  1 College of Arts and Sciences, Shanghai Polytechnic University, Shanghai 201209, China;
  2 College of Intelligence and Computing, Tianjin University, Tianjin 300350, China;
  3 Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300350, China
  • Received: 2019-01-15 Revised: 2019-05-09 Online: 2019-07-11 Published: 2019-07-11
  • Contact: Xiao-Wang Zhang E-mail: xiaowangzhang@tju.edu.cn
  • About the author: Hong Fang is an associate professor in the Department of Applied Mathematics, College of Arts and Sciences, Shanghai Polytechnic University, Shanghai. Before that, she received her Ph.D. degree in computer science and technology from Anhui University, Hefei, in 2008. Her research interests include data management and data query, and image processing.
  • Supported by:
    This paper is supported by the National Key Research and Development Program of China under Grant No. 2017YFC0908401, and the National Natural Science Foundation of China under Grant No. 61672377. Xiao-Wang Zhang is supported by the program of Peiyang Young Scholars of China under Grant No. 2019XRX-0032.

Key words: RDF stream, continuous query, C-SPARQL, CQELS, EP-SPARQL

Abstract: Resource Description Framework (RDF) streams are useful for modeling spatio-temporal data. In this paper, we propose LRSP, a framework for large-scale RDF stream processing, to process general continuous queries over large-scale RDF streams. First, we propose a formalization, named CT-SPARQL, that represents general continuous queries in a unified, unambiguous way. Second, based on this formalization, we design LRSP to process continuous queries in a common white-box way by separating RDF stream processing, query parsing, and query execution. Finally, we implement LRSP and evaluate it against popular continuous query engines on benchmark and real-world datasets. Owing to the architecture of LRSP, many efficient RDF query engines (both centralized and distributed) can be employed directly to process continuous queries. The experimental results show that LRSP achieves higher performance, especially when processing large-scale real-world data.

Key words: resource description framework (RDF) stream, continuous query, united framework, stream processing, large-scale RDF stream processing (LRSP)
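
The separation of stream processing, query parsing, and query execution described in the abstract can be illustrated with a minimal Python sketch. This is not the LRSP implementation and the query text is not CT-SPARQL: the toy syntax `RANGE r STEP s MATCH s p o`, the window semantics (a window ending at time `end` holds triples with `end - range < timestamp <= end`), and every name below (`parse_query`, `windows`, `scan_engine`, `run_continuous`) are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]
TimedTriple = Tuple[int, Triple]  # (timestamp, (subject, predicate, object))
Pattern = Tuple[Optional[str], ...]  # None marks a variable position
Engine = Callable[[List[Triple], Pattern], List[Triple]]


@dataclass(frozen=True)
class ContinuousQuery:
    range_: int  # window width, in time units
    step: int    # how far the window slides each evaluation
    pattern: Pattern


def parse_query(text: str) -> ContinuousQuery:
    """Query parsing: split the toy query into window parameters and a pattern."""
    tok = text.split()
    if len(tok) != 8 or (tok[0], tok[2], tok[4]) != ("RANGE", "STEP", "MATCH"):
        raise ValueError("expected: RANGE <r> STEP <s> MATCH <s> <p> <o>")
    range_, step = int(tok[1]), int(tok[3])
    if range_ <= 0 or step <= 0:
        raise ValueError("RANGE and STEP must be positive")
    pattern = tuple(None if t.startswith("?") else t for t in tok[5:8])
    return ContinuousQuery(range_, step, pattern)


def windows(stream: Iterable[TimedTriple], range_: int, step: int
            ) -> Iterator[Tuple[int, List[Triple]]]:
    """Stream processing: yield (window_end, snapshot) for each slide.

    The stream is assumed finite with positive integer timestamps.
    """
    items = sorted(stream)
    if not items:
        return
    last = items[-1][0]
    end = step
    while True:
        yield end, [t for ts, t in items if end - range_ < ts <= end]
        if end >= last:
            break
        end += step


def scan_engine(snapshot: List[Triple], pattern: Pattern) -> List[Triple]:
    """Query execution: a naive scan standing in for a real RDF engine."""
    return [t for t in snapshot
            if all(p is None or p == v for p, v in zip(pattern, t))]


def run_continuous(text: str, stream: Iterable[TimedTriple],
                   engine: Engine = scan_engine
                   ) -> List[Tuple[int, List[Triple]]]:
    """Glue code: the engine sees only static snapshots, never the stream."""
    q = parse_query(text)
    return [(end, engine(snap, q.pattern))
            for end, snap in windows(stream, q.range_, q.step)]
```

Because `run_continuous` passes each window snapshot to `engine` as an ordinary list of triples, any function with the same signature could be substituted for `scan_engine`. This mirrors, in a very reduced form, the abstract's point that existing centralized or distributed RDF engines can be reused for continuous queries.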
