|  Tanenbaum A S, Van Steen M. Distributed Systems: Principles and Paradigms. Upper Saddle River, NJ: Prentice Hall, 2007, pp.7-8. Fleischmann A. Distributed Systems: Software Design and Implementation. Springer-Verlag Berlin Heidelberg, 2012, pp.4-5. Dean J, Ghemawat S. MapReduce: Simplified data processing on large clusters. Communications of the ACM, 2008, 51(1): 107-113. Zaharia M, Chowdhury M, Franklin M J et al. Spark: Cluster computing with working sets. In Proc. the 2nd USENIX Conference on Hot Topics in Cloud Computing, Jun. 2010. White T. Hadoop: The Definitive Guide (1st edition). O'Reilly Media, Jun. 2009. Zaharia M, Chowdhury M, Das T et al. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In Proc. the 9th USENIX Conference on Networked Systems Design and Implementation, Apr. 2012, pp.15-28. Ojala T, Pietikainen M, Harwood D. Performance evaluation of texture measures with classification based on Kullback discrimination of distributions. In Proc. the 12th International Conference on Pattern Recognition (ICPR), Oct. 1994, Volume 1, pp.582-585. Ojala T, Pietikainen M, Mäenpää T. Multiresolution grayscale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(7): 971-987. Bay H, Tuytelaars T, Van Gool L. SURF: Speeded-up robust features. In Proc. the 9th ECCV, May 2006, pp.404- 417. Ng P C, Henikoff S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Research, 2003, 31(13): 3812-3814. Tola E, Lepetit V, Fua P. DAISY: An efficient dense descriptor applied to wide-baseline stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(5): 815-830. Juan L, Gwun O. A comparison of SIFT, PCA-SIFT and SURF. International Journal of Image Processing (IJIP), 2009, 3(4): 143-152. Lewis J, Alghamdi M, Assaf M A et al. An automatic prefetching and caching system. In Proc. the 29th IEEE International on Performance Computing and Communications Conference (IPCCC), Dec. 2010, pp.180-187. Shvachko K, Kuang H, Radia S et al. The Hadoop distributed file system. In Proc. the 26th IEEE Symposium on Mass Storage Systems and Technologies (MSST), May 2010. Lindholm E, Nickolls J, Oberman S et al. NVIDIA Tesla: A unified graphics and computing architecture. IEEE Micro, 2008, 28(2): 39-55. Hartley T D R, Catalyurek U V, Ruiz A et al. Author's retrospective for biomedical image analysis on a cooperative cluster of gpus and multicores. In Proc. the 25th ACM International Conference on Supercomputing Anniversary Volume, Jun. 2014, pp.82-84. McGaffin M G, Fessler J. Edge-preserving image denoising via group coordinate descent on the GPU. IEEE Transactions on Image Processing, 2015, 24(4): 1273-1281. Zhu L, Jin H, Zheng R et al. Effective naive Bayes nearest neighbor based image classification on GPU. Journal of Supercomputing, 2014, 68(2): 820-848. Cornelis N, van Gool L. Fast scale invariant feature detection and matching on programmable graphics hardware. In Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), June 2008, pp.1-8. Wu C. SiftGPU: A GPU implementation of scale invariant feature transform (SIFT). http://cs.unc.edu/ ccwu/siftgpu, Oct. 2015. Prisacariu V, Reid I. fastHOG — A real-time GPU implementation of HOG. Technical Report 2310/09, Department of Engineering Science, University of Oxford, January 2012. Jiang D, Chen G, Ooi B C et al. epiC: An extensible and scalable system for processing big data. Proceedings of the VLDB Endowment, 2014, 7(7): 541-552. Zhang X, Yang L T, Liu C et al. A scalable two-phase topdown specialization approach for data anonymization using MapReduce on cloud. IEEE Transactions on Parallel and Distributed Systems, 2014, 25(2): 363-373. Ranger C, Raghuraman R, Penmetsa A et al. Evaluating MapReduce for multi-core and multiprocessor systems. In Proc. the 13th IEEE International Symposium on High Performance Computer Architecture (HPCA), Feb. 2007, pp.13-24. Moise D, Shestakov D, Gudmundsson G et al. Terabytescale image similarity search: Experience and best practice. In Proc. IEEE International Conference on Big Data, Oct. 2013, pp.674-682. Mills S, Eyers D, Leung K C et al. Large-scale feature matching with distributed and heterogeneous computing. In Proc. the 28th IEEE International Conference of Image and Vision Computing New Zealand (IVCNZ), Nov. 2013, pp.208-213. Teodoro G, Kurç T M, Pan T et al. Accelerating large scale image analyses on parallel, CPU-GPU equipped systems. In Proc. the 26th IEEE International on Parallel and Distributed Processing Symposium (IPDPS), May 2012, pp.1093-1104. Teodoro G, Pan T F, Kurç T M et al. High-throughput analysis of large microscopy image datasets on CPU-GPU cluster platforms. In Proc. the 27th IEEE International on Parallel and Distributed Processing Symposium (IPDPS), May 2013, pp.103-114. Hua Y, Jiang H, Feng D. FAST: Near real-time searchable data analytics for the cloud. In Proc. the International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Nov. 2014, pp.754-765. Liu J, Huang Z, Cheng H et al. Presenting diverse location views with real-time near-duplicate photo elimination. In Proc. the 29th IEEE International Conference on Data Engineering (ICDE), Apr. 2013, pp.505-516. Fang W, He B, Luo Q et al. Mars: Accelerating MapReduce with graphics processors. IEEE Transactions on Parallel and Distributed Systems, 2011, 22(4): 608-620. Hong C, Chen D, Chen W et al. MapCG: Writing parallel program portable between CPU and GPU. In Proc. the 19th ACM International Conference on Parallel Architectures and Compilation Techniques (PACT), Sept. 2010, pp.217- 226. Zhai Y, Mbarushimana E, Li W et al. Lit: A high performance massive data computing framework based on CPU/GPU cluster. In Proc. IEEE International Conference on Cluster Computing (CLUSTER), Sept. 2013. Jiang H, Chen Y, Qiao Z et al. Accelerating MapReduce framework on multi-GPU systems. Cluster Computing, 2014, 17(2): 293-301. Jiang H, Chen Y, Qiao Z et al. Scaling up MapReducebased big data processing on multi-GPU systems. Cluster Computing, 2015, 18(1): 369-383. Wittek P, Darányi S N. Accelerating text mining workloads in a MapReduce-based distributed GPU environment. Journal of Parallel and Distributed Computing, 2013, 73(2): 198-206.