Journal of Computer Science and Technology ›› 2021, Vol. 36 ›› Issue (2): 445-462.doi: 10.1007/s11390-020-9901-y

Special Issue: Artificial Intelligence and Pattern Recognition; Data Management and Data Mining

• Regular Paper • Previous Articles    

An Effective Discrete Artificial Bee Colony Based SPARQL Query Path Optimization by Reordering Triples

Zeynep Banu Ozger1 and Nurgul Yuzbasioglu Uslu2        

  1. 1 Department of Computer Engineering, Sutcu Imam University, Kahramanmaras 46040, Turkey;
    2 Department of Computer Engineering, Yildiz Technical University, Istanbul 34220, Turkey
  • Received:2019-08-01 Revised:2020-09-10 Online:2021-03-05 Published:2021-04-01
  • About author:Zeynep Banu Ozger received her Bachelor's degree in computer engineering from Yildiz Technical University, Istanbul, in 2011. She also received her Master's degree and Ph.D. degree in computer engineering from Yildiz Technical University, Istanbul, in 2014 and in 2019 respectively. She is currently an assistant professor in the Department of Computer Engineering at Sutcu Imam University, Kahramanmaras. Her research interests include artificial intelligence, machine learning and natural language processing.

Semantic Web has emerged to make web content machine-readable, and with the rapid increase in the number of web pages, its importance has increased. Resource description framework (RDF) is a special data graph format where Semantic Web data are stored and it can be queried by SPARQL query language. The challenge is to find the optimal query order that results in the shortest period of time. In this paper, the discrete Artificial Bee Colony (dABCSPARQL) algorithm is proposed, based on a novel heuristic approach, namely reordering SPARQL queries. The processing time of queries with different shapes and sizes is minimized using the dABCSPARQL algorithm. The performance of the proposed method is evaluated on chain, star, cyclic, and chain-star queries of different sizes from the Lehigh University Benchmark (LUBM) dataset. The results obtained by the proposed method are compared with those of ARQ (a SPARQL processor for Jena) query engine, the Ant System, the Elitist Ant System, and MAX-MIN Ant System algorithms. The experiments demonstrate that the proposed method significantly reduces the processing time, and in most queries, the reduction rate is higher compared with other optimization methods.

Key words: artificial bee colony; resource description framework (RDF); query optimization; reordering triple pattern; SPARQL;

[1] Berners-Lee T, Hendler J, Lassila O. The semantic web. Scientific American, 2001, 284(5):34-43. DOI:10.1038/scientificamerican0501-34.
[2] Ioannidis Y E. Query optimization. ACM Comput. Surv., 1996, 28(1):121-123. DOI:10.1145/234313.234367.
[3] Karaboga D, Akay B. A survey:Algorithms simulating bee swarm intelligence. Artificial Intelligence Review, 2009, 31(1/2/3/4):61-85. DOI:10.1007/s10462-009-9127-4.
[4] Kalayci E G, Kalayci T E, Birant D. An ant colony optimization approach for optimising SPARQL queries by reordering triple patterns. Information Systems, 2015, 50:51-68. DOI:10.1016/
[5] Li N, Li Y, Dong Y, Gu J. Application of ant colony optimization algorithm to multi-join query optimization. In Proc. the 3rd International Symposium on Computation and Intelligence, December 2008, pp.189-197. DOI:10.1007/978-3-540-92137-021.
[6] Dokeroglu T, Cosar A. Dynamic programming with ant colony optimization metaheuristic for optimization of distributed database queries. In Proc. the 26th International Symposium on Computer and Information Systems, September 2011, pp.107-113. DOI:10.1007/978-1-4471-2155-813.
[7] Shao C, Hu L M, Li J Z, Wang Z C, Chung T, Xia J B. RiMOM-IM:A novel iterative framework for instance matching. Journal of Computer Science and Technology, 2016, 31(1):185-197. DOI:10.1007/s11390-016-1620-z.
[8] Belhadi H, Akli-Astouati K, Djenouri Y, Lin J C W, Wu J M T. GFSOM:Genetic feature selection for ontology matching. In Proc. the 12th International Conference on Genetic and Evolutionary Computing, December 2018, pp.655-660. DOI:10.1007/978-981-13-5841-868.
[9] Mohammadi M, Hofman W, Tan Y H. Simulated annealingbased ontology matching. ACM Transactions on Management Information Systems, 2019, 10(1):Article No. 3. DOI:10.1145/3314948.
[10] Saharan S, Lather J S, Radhakrishnan R. Optimisation using rearrangement of order of BGP and TLBO approach. Procedia Computer Science, 2018, 125:282-289. DOI:10.1016/j.procs.2017.12.038.
[11] Hogenboom A, Frasincar F, Kaymak U. Ant colony optimization for RDF chain queries for decision support. Expert Systems with Applications, 2013, 40(5):1555-1563. DOI:10.1016/j.eswa.2012.08.074.
[12] Neumann T, Weikum G. The RDF-3X engine for scalable management of RDF data. The VLDB Journal, 2010, 19(1):91-113. DOI:10.1007/s00778-009-0165-y.
[13] Ruckhaus E, Ruiz E, Vidal M E. Query evaluation and optimization in the semantic web. Theory and Practice of Logic Programming, 2008, 8(3):393-409. DOI:10.1017/s1471068407003225.
[14] Kalayci E G, Kalayci T E. Ant colony method for SPARQL query optimization. In Proc. the 2012 Innovations in Intelligent Systems and Applications Conference, July 2012, pp.106-110.
[15] Stocker M, Seaborne A, Bernstein A, Kiefer C, Reynolds D. SPARQL basic graph pattern optimization using selectivity estimation. In Proc. the 17th International Conference on World Wide Web, April 2008, pp.595-604. DOI:10.1145/1367497.1367578.
[16] Hogenboom A, Milea V, Frasincar F, Kaymak U. RCQ-GA:RDF chain query optimization using genetic algorithms. In Proc. the 10th International Conference on Electronic Commerce and Web Technologies, September 2009, pp.181-192. DOI:10.1007/978-3-642-03964-518.
[17] Tsialiamanis P, Sidirourgos L, Fundulaki I, Christophides V, Boncz P. Heuristics-based query optimization for SPARQL. In Proc. the 15th International Conference on Extending Database Technology, March 2012, pp.324-335. DOI:10.1145/2247596.2247635.
[18] Meimaris M, Papastefanatos G. Distance-based triple reordering for SPARQL query optimization. In Proc. the 33rd International Conference on Data Engineering, April 2017, pp.1559-1562. DOI:10.1109/icde.2017.227.
[19] Wang X, Tiropanis T, Davis H C. Evaluating graph traversal algorithms for distributed SPARQL query optimization. In Proc. the 2011 Joint International Semantic Technology Conference, December 2011, pp.210-225. DOI:10.1007/978-3-642-29923-014.
[20] Chawla T, Singh G, Pilli E S. A shortest path approach to SPARQL chain query optimization. In Proc. the 2017 Advances in Computing, Communications and Informatics, September 2017, pp.1778-1778. DOI:10.1109/icacci.2017.8126102.
[21] Ramalingam G, Dhandapani S. A novel adaptive Cuckoo search for optimal query plan generation. The Scientifi World Journal, 2014, 2014:Article No. 727658. DOI:10.1155/2014/727658.
[22] Ramalingam G, Dhandapani S. A hybrid batcs algorithm to generate optimal query plan. Int. Arab J. Inf. Technol., 2018, 15(3):353-359.
[23] Hogenboom A, Niewenhuijse E, Jansen M, Frasincar F, Vandic D. RDF chain query optimization in a distributed environment. In Proc. the 30th Annual ACM Symposium on Applied Computing, April 2015, pp.353-359. DOI:10.1145/2695664.2695711.
[24] Gubichev A, Neumann T. Exploiting the query structure for efficient join ordering in SPARQL queries. In Proc. the 17th International Conference on Extending Database Technology, March 2014, pp.439-450.
[25] Vandervalk B P, McCarthy E L, Wilkinson M D. Optimization of distributed SPARQL queries using Edmonds' algorithm and Prim's algorithm. In Proc. the 2009 International Conference on Computational Science and Engineering, August 2009, pp.330-337. DOI:10.1109/cse.2009.144.
[26] Liu C, Wang H, Yu Y, Xu L. Towards efficient SPARQL query processing on RDF data. Tsinghua Science and Technology, 2010, 15(6):613-622. DOI:10.1016/s1007-0214(10)70108-5.
[27] Saharan S, Lather J S, Radhakrishnan R. Optimization of different queries using optimization algorithm (DE). International Journal of Computer Network and Information Security, 2018, 10(3):52-59. DOI:10.5815/ijcnis.2018.03.06.
[28] Saharan S, Lather J S, Radhakrishnan R. Specific queries optimization using Jaya approach. International Journal of Modern Education and Computer Science, 2018, 10(3):38-46. DOI:10.5815/ijmecs.2018.03.05.
[29] Rao R. Jaya:A simple and new optimization algorithm for solving constrained and unconstrained optimization problems. International Journal of Industrial Engineering Computations, 2016, 7(1):19-34. DOI:10.5267/j.ijiec.2015.8.004.
[30] Ramalingam G, Dhandapani S. A hybrid nature inspired algorithm for generating optimal query plan. World Academy of Science, Engineering and Technology, International Journal of Computer and Information Engineering, 2014, 8(8):1519-1524.
[31] Zhang W E, Sheng Q Z, Qin Y, Taylor K, Yao L. Learningbased SPARQL query performance modeling and prediction.World Wide Web, 2018, 21(4):1015-1035. DOI:10.1007/s11280-017-0498-1.
[32] Li J, König A C, Narasayya V, Chaudhuri S. Robust estimation of resource consumption for SQL queries using statistical techniques. Proceedings of the VLDB Endowment, 2012, 5(11):1555-1566. DOI:10.14778/2350229.2350269.
[33] Wu W, Chi Y, Zhu S, Tatemura J, Hacigümüs H, Naughton J F. Predicting query execution time:Are optimizer cost models really unusable? In Proc. the 29th IEEE International Conference on Data Engineering, April 2013, pp.1081-1092. DOI:10.1109/icde.2013.6544899.
[34] Neumann T, Weikum G. RDF-3X:A RISC-style engine for RDF. Proceedings of the VLDB Endowment, 2008, 1(1):647-659. DOI:10.14778/1453856.1453927.
[35] Akdere M, Çetintemel U, Riondato M, Upfal E, Zdonik S B. Learning-based query performance modeling and prediction. In Proc. the 28th International Conference on Data Engineering, April 2012, pp.390-401. DOI:10.1109/icde.2012.64.
[36] Tozer S, Brecht T, Aboulnaga A. Q-Cop:Avoiding bad query mixes to minimize client timeouts under heavy loads. In Proc. the 26th International Conference on Data Engineering, March 2010, pp.397-408. DOI:10.1109/icde.2010.5447850.
[37] Frosini R, Cali A, Poulovassilis A, Wood P T. Flexible query processing for SPARQL. Semantic Web, 2017, 8(4):533-563.
[38] Fletcher G H L. An algebra for basic graph patterns. In Proc. the Workshop on Logic in Databases, May 2008.
[39] Carroll J J, Dickinson I, Dollin C, Reynolds D, Seaborne A, Wilkinson K. Jena:Implementing the semantic web recommendations. In Proc. the 13th International Conference on World Wide Web, May 2004, pp.74-83. DOI:10.1145/1013367.1013381.
[40] Steinbrunn M, Moerkotte G, Kemper A. Heuristic and randomized optimization for the join ordering problem. The International Journal on Very Large Data Bases, 1997, 6(3):191-208. DOI:10.1007/s007780050040.
[41] Guo Y, Pan Z, Heflin J. LUBM:A benchmark for OWL knowledge base systems. Journal of Web Semantics, 2005, 3(2/3):158-182. DOI:10.1016/j.websem.2005.06.005.
[42] Kulkarni P. Distributed SPARQL query engine using MapReduce[Master Thesis]. University of Edinburgh School of Informatics, 2010.
[43] Dorigo M, Stutzle T. Ant Colony Optimization. MIT Press, 2004.
[44] Dorigo M, Maniezzo V, Colorni A. Positive feedback as a search strategy. Technical Report, Politecnicodi Milano, 1991., March 2020.
[45] Dorigo M, Maniezzo V, Colorni A. Ant system:Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics, 1996, 26(1):29-41. DOI:10.1109/3477.484436.
[1] Xue-Li Liu, Hong-Zhi Wang, Jian-Zhong Li, Hong Gao. EntityManager: Managing Dirty Data Based on Entity Resolution [J]. , 2017, 32(3): 644-661.
[2] Hua-Ming Liao and Guo-Shun Pei. Cache-Based Aggregate Query Shipping: An Efficient Scheme of Distributed OLAP Query Processing [J]. , 2008, 23(6 ): 905-915 .
[3] Zhen-Hua Huang, Jian-Kui Guo, Sheng-Li Sun, and Wei Wang. Efficient Optimization of Multiple Subspace Skyline Queries [J]. , 2008, 23(1): 103-111 .
[4] Dun-Ren Che. Accomplishing Deterministic XML Query Optimization [J]. , 2005, 20(3): 357-366 .
[5] Zhou Aoying; Shi Baile;. Query Optimization for Deductive Databases [J]. , 1995, 10(2): 134-148.
Full text



[1] Zhou Di;. A Recovery Technique for Distributed Communicating Process Systems[J]. , 1986, 1(2): 34 -43 .
[2] Chen Shihua;. On the Structure of Finite Automata of Which M Is an(Weak)Inverse with Delay τ[J]. , 1986, 1(2): 54 -59 .
[3] Qu Yanwen;. AGDL: A Definition Language for Attribute Grammars[J]. , 1986, 1(3): 80 -91 .
[4] Wang Jianchao; Wei Daozheng;. An Effective Test Generation Algorithm for Combinational Circuits[J]. , 1986, 1(4): 1 -16 .
[5] Chen Zhaoxiong; Gao Qingshi;. A Substitution Based Model for the Implementation of PROLOG——The Design and Implementation of LPROLOG[J]. , 1986, 1(4): 17 -26 .
[6] Huang Heyan;. A Parallel Implementation Model of HPARLOG[J]. , 1986, 1(4): 27 -38 .
[7] Zheng Guoliang; Li Hui;. The Design and Implementation of the Syntax-Directed Editor Generator(SEG)[J]. , 1986, 1(4): 39 -48 .
[8] Min Yinghua; Han Zhide;. A Built-in Test Pattern Generator[J]. , 1986, 1(4): 62 -74 .
[9] Huang Xuedong; Cai Lianhong; Fang Ditang; Chi Bianjin; Zhou Li; Jiang Li;. A Computer System for Chinese Character Speech Input[J]. , 1986, 1(4): 75 -83 .
[10] Xu Xiaoshu;. Simplification of Multivalued Sequential SULM Network by Using Cascade Decomposition[J]. , 1986, 1(4): 84 -95 .

ISSN 1000-9000(Print)

CN 11-2296/TP

Editorial Board
Author Guidelines
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
  Copyright ©2015 JCST, All Rights Reserved