Special Issue: Data Management and Data Mining
Rui Ren1,2, Member, CCF, IEEE, Jiechao Cheng3, Xi-Wen He1, Lei Wang1, Member, CCF, Jian-Feng Zhan1,*, Member, CCF, ACM, IEEE, Wan-Ling Gao1, Member, CCF, ACM, IEEE, Chun-Jie Luo1,2, Member, CCF