Citation: Jia-Qing Dong, Ze-Hao He, Yuan-Yuan Gong, Pei-Wen Yu, Chen Tian, Wan-Chun Dou, Gui-Hai Chen, Nai Xia, Hao-Ran Guan. SMART: Speedup Job Completion Time by Scheduling Reduce Tasks[J]. Journal of Computer Science and Technology, 2022, 37(4): 763-778. DOI: 10.1007/s11390-022-2118-5

SMART: Speedup Job Completion Time by Scheduling Reduce Tasks

Funds: This work was supported by the National Key Research and Development Project of China under Grant No. 2020YFB1707600, the National Natural Science Foundation of China under Grant Nos. 62072228, 61972222 and 92067206, the Fundamental Research Funds for the Central Universities of China, the Collaborative Innovation Center of Novel Software Technology and Industrialization, and the Jiangsu Innovation and Entrepreneurship (Shuangchuang) Program.
More Information
  • Author Bio:

    Wan-Chun Dou received his Ph.D. degree in mechanical and electronic engineering from Nanjing University of Science and Technology, Nanjing, in 2001. He is currently a full professor with the State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing. He visited the Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, as a visiting scholar in 2005 and 2008. His research interests include workflow, cloud computing, and service computing.

  • Corresponding author:

    Wan-Chun Dou, E-mail: douwc@nju.edu.cn

  • Received Date: December 26, 2021
  • Revised Date: June 26, 2022
  • Accepted Date: June 28, 2022
  • Published Date: July 24, 2022
  • Abstract: Distributed computing systems have been widely used as the amount of data grows exponentially in the era of information explosion. Job completion time (JCT) is a major metric for assessing their effectiveness, and how to reduce the JCT of these systems through reasonable scheduling has become a hot issue in both industry and academia. Data skew is a common phenomenon that can compromise the performance of such distributed computing systems. This paper proposes SMART, which effectively reduces the JCT by handling data skew in the reduce phase. SMART predicts the sizes of reduce tasks based on the portion of map tasks that have already completed, and then enforces largest-first scheduling in the reduce phase according to the predicted sizes. SMART makes minimal modifications to the original Hadoop, with only 20 additional lines of code, and is readily deployable. The robustness and the effectiveness of SMART have been evaluated on a real-world cluster against a large number of datasets. Experiments show that SMART reduces the JCT by up to 6.47%, 9.26%, and 13.66% for Terasort, WordCount, and InvertedIndex, respectively, on the Purdue MapReduce benchmarks suite (PUMA) dataset.
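  • Illustrative sketch: The policy summarized in the abstract (extrapolate per-partition reduce-task sizes from the map tasks completed so far, then launch reduce tasks largest-first) can be illustrated with a minimal, self-contained Java sketch. This is only an illustration under simplified assumptions, not the authors' actual Hadoop modification; the class and method names (LargestFirstSketch, predictReduceSizes, largestFirstOrder) are invented for this example.

    // Hypothetical illustration of the idea described in the abstract,
    // not SMART's actual Hadoop patch.
    import java.util.*;

    public class LargestFirstSketch {

        // Predict the final size of each reduce partition by scaling the bytes
        // observed from the completed map tasks up to the total number of maps.
        static long[] predictReduceSizes(long[] observedBytesPerPartition,
                                         int completedMaps, int totalMaps) {
            long[] predicted = new long[observedBytesPerPartition.length];
            for (int p = 0; p < predicted.length; p++) {
                predicted[p] = observedBytesPerPartition[p] * totalMaps / completedMaps;
            }
            return predicted;
        }

        // Return reduce-task indices ordered by predicted size, largest first.
        static List<Integer> largestFirstOrder(long[] predictedSizes) {
            List<Integer> order = new ArrayList<>();
            for (int p = 0; p < predictedSizes.length; p++) {
                order.add(p);
            }
            order.sort((a, b) -> Long.compare(predictedSizes[b], predictedSizes[a]));
            return order;
        }

        public static void main(String[] args) {
            // Example: 4 reduce partitions, 3 of 10 map tasks completed so far.
            long[] observed = {120, 40, 300, 60};   // bytes seen per partition
            long[] predicted = predictReduceSizes(observed, 3, 10);
            System.out.println("Predicted sizes: " + Arrays.toString(predicted));
            System.out.println("Launch order (largest first): " + largestFirstOrder(predicted));
        }
    }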
  • References

    [1] Grandl R, Kandula S, Rao S, Akella A, Kulkarni J. Graphene: Packing and dependency-aware scheduling for data-parallel clusters. In Proc. the 12th USENIX Conference on Operating Systems Design and Implementation, Nov. 2016, pp.81-97.
    [2] Chang H, Kodialam M, Kompella R R, Lakshman T V, Lee M, Mukherjee S. Scheduling in MapReduce-like systems for fast completion time. In Proc. the 30th IEEE International Conference on Computer Communications, Joint Conference of the IEEE Computer and Communications Societies, Apr. 2011, pp.3074-3082. DOI: 10.1109/INFCOM.2011.5935152.
    [3] Peng Y, Chen K, Wang G, Bai W, Zhao Y, Wang H, Geng Y, Ma Z, Gu L. Towards comprehensive traffic forecasting in cloud computing: Design and application. IEEE/ACM Transactions on Networking, 2016, 24(4): 2210-2222. DOI: 10.1109/TNET.2015.2458892.
    [4] Ullah I, Khan M S, Amir M, Kim J, Kim S M. LSTPD: Least slack time-based preemptive deadline constraint scheduler for Hadoop clusters. IEEE Access, 2020, 8: 111751-111762. DOI: 10.1109/ACCESS.2020.3002565.
    [5] Gao Y, Zhou Y, Zhou B, Shi L, Zhang J. Handling data skew in MapReduce cluster by using partition tuning. Journal of Healthcare Engineering, 2017, 2017: Article No. 1425102. DOI: 10.1155/2017/1425102.
    [6] Hammoud M, Sakr M F. Locality-aware reduce task scheduling for MapReduce. In Proc. the 3rd IEEE International Conference on Cloud Computing Technology and Science, Nov. 29-Dec. 1, 2011, pp.570-576. DOI: 10.1109/CloudCom.2011.87.
    [7] Dean J, Ghemawat S. MapReduce: Simplified data processing on large clusters. Communications of the ACM, 2008, 51(1): 107-113. DOI: 10.1145/1327452.1327492.
    [8] Ahmad F, Lee S, Thottethodi M, Vijaykumar T. PUMA: Purdue MapReduce benchmarks suite. Technical Report, Purdue University, 2012. https://engineering.purdue.edu/~puma/puma.pdf, May 2022.
    [9] Graham R L. Bounds for certain multiprocessing anomalies. The Bell System Technical Journal, 1966, 45(9): 1563-1581. DOI: 10.1002/j.1538-7305.1966.tb01709.x.
    [10] Chowdhury M, Stoica I. Coflow: A networking abstraction for cluster applications. In Proc. the 11th ACM Workshop on Hot Topics in Networks, Oct. 2012, pp.31-36. DOI: 10.1145/2390231.2390237.
    [11] Kwon Y, Balazinska M, Howe B, Rolia J. SkewTune: Mitigating skew in MapReduce applications. In Proc. the 2012 ACM SIGMOD International Conference on Management of Data, May 2012, pp.25-36. DOI: 10.1145/2213836.2213840.
    [12] Chowdhury M, Zhong Y, Stoica I. Efficient coflow scheduling with Varys. ACM SIGCOMM Computer Communication Review, 2014, 44(4): 443-454. DOI: 10.1145/2740070.2626315.
    [13] Chowdhury M, Stoica I. Efficient coflow scheduling without prior knowledge. In Proc. the 2015 ACM Conference on Special Interest Group on Data Communication, Aug. 2015, pp.393-406. DOI: 10.1145/2785956.2787480.
    [14] Mao H, Schwarzkopf M, Venkatakrishnan S B, Meng Z, Alizadeh M. Learning scheduling algorithms for data processing clusters. In Proc. the ACM Special Interest Group on Data Communication, Aug. 2019, pp.270-288. DOI: 10.1145/3341302.3342080.
    [15] Nguyen K, Wang K, Bu Y, Fang L, Hu J, Xu G. FACADE: A compiler and runtime for (almost) object-bounded big data applications. In Proc. the 20th International Conference on Architectural Support for Programming Languages and Operating Systems, Mar. 2015, pp.675-690. DOI: 10.1145/2694344.2694345.
    [16] Nguyen K, Fang L, Xu G, Demsky B, Lu S, Alamian S, Mutlu O. Yak: A high-performance big-data-friendly garbage collector. In Proc. the 12th USENIX Symposium on Operating Systems Design and Implementation, Nov. 2016, pp.349-365.
    [17] Rasmussen A, Lam V T, Conley M, Porter G, Kapoor R, Vahdat A. Themis: An I/O-efficient MapReduce. In Proc. the 3rd ACM Symposium on Cloud Computing, Oct. 2012, Article No. 13. DOI: 10.1145/2391229.2391242.
    [18] Rao S, Ramakrishnan R, Silberstein A, Ovsiannikov M, Reeves D. Sailfish: A framework for large scale data processing. In Proc. the 3rd ACM Symposium on Cloud Computing, Oct. 2012, Article No. 4. DOI: 10.1145/2391229.2391233.
    [19] Zhang H, Cho B, Seyfe E, Ching A, Freedman M J. Riffle: Optimized shuffle service for large-scale data analytics. In Proc. the 13th EuroSys Conference, Apr. 2018, Article No. 43. DOI: 10.1145/3190508.3190534.
    [20] Zaharia M, Borthakur D, Sarma S J, Elmeleegy K, Shenker S, Stoica I. Delay scheduling: A simple technique for achieving locality and fairness in cluster scheduling. In Proc. the 5th European Conference on Computer Systems, Apr. 2010, pp.265-278. DOI: 10.1145/1755913.1755940.
    [21] Ibrahim S, Jin H, Lu L, He B, Antoniu G, Wu S. Maestro: Replica-aware map scheduling for MapReduce. In Proc. the 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 2012, pp.435-442. DOI: 10.1109/CCGrid.2012.122.
    [22] Tang Z, Jiang L, Zhou J, Li K, Li K. A self-adaptive scheduling algorithm for reduce start time. Future Generation Computer Systems, 2015, 43/44: 51-60. DOI: 10.1016/j.future.2014.08.011.
    [23] Ibrahim S, Jin H, Lu L, Wu S, He B, Qi L. LEEN: Locality/fairness-aware key partitioning for MapReduce in the cloud. In Proc. the 2nd IEEE International Conference on Cloud Computing Technology and Science, Nov. 30-Dec. 3, 2010, pp.17-24. DOI: 10.1109/CloudCom.2010.25.
    [24] Tan J, Meng X, Zhang L. Coupling task progress for MapReduce resource-aware scheduling. In Proc. the 2013 IEEE INFOCOM, Apr. 2013, pp.1618-1626. DOI: 10.1109/INFCOM.2013.6566958.
  • Related Articles

    [1] Chong Zhang, Zi-Tong Su, Min Li, Hui-Yuan Li, Wen-Jing Ma, Lei-Sheng Li. Optimization of Generalized Eigensolver for Dense Symmetric Matrices on AMD GPU[J]. Journal of Computer Science and Technology, 2025, 40(3): 855-869. DOI: 10.1007/s11390-024-3673-8
    [2] Jason Liu, Pedro Espina, Xian-He Sun. A Study on Modeling and Optimization of Memory Systems[J]. Journal of Computer Science and Technology, 2021, 36(1): 71-89. DOI: 10.1007/s11390-021-0771-8
    [3] Lan Huang, Da-Lin Li, Kang-Ping Wang, Teng Gao, Adriano Tavares. A Survey on Performance Optimization of High-Level Synthesis Tools[J]. Journal of Computer Science and Technology, 2020, 35(3): 697-720. DOI: 10.1007/s11390-020-9414-8
    [4] Yun-Cong Zhang, Xiao-Yang Wang, Cong Wang, Yao Xu, Jian-Wei Zhang, Xiao-Dong Lin, Guang-Yu Sun, Gong-Lin Zheng, Shan-Hui Yin, Xian-Jin Ye, Li Li, Zhan Song, Dong-Dong Miao. Bigflow: A General Optimization Layer for Distributed Computing Frameworks[J]. Journal of Computer Science and Technology, 2020, 35(2): 453-467. DOI: 10.1007/s11390-020-9702-3
    [5] Zheng-Hao Jin, Haiyang Shi, Ying-Xin Hu, Li Zha, Xiaoyi Lu. CirroData: Yet Another SQL-on-Hadoop Data Analytics Engine with High Performance[J]. Journal of Computer Science and Technology, 2020, 35(1): 194-208. DOI: 10.1007/s11390-020-9536-z
    [6] Qi Chen, Kang Chen, Zuo-Ning Chen, Wei Xue, Xu Ji, Bin Yang. Lessons Learned from Optimizing the Sunway Storage System for Higher Application I/O Performance[J]. Journal of Computer Science and Technology, 2020, 35(1): 47-60. DOI: 10.1007/s11390-020-9798-5
    [7] Min Li, Chao Yang, Qiao Sun, Wen-Jing Ma, Wen-Long Cao, Yu-Long Ao. Enabling Highly Efficient k-Means Computations on the SW26010 Many-Core Processor of Sunway TaihuLight[J]. Journal of Computer Science and Technology, 2019, 34(1): 77-93. DOI: 10.1007/s11390-019-1900-5
    [8] Yu-Geng Song, Hui-Min Cui, Xiao-Bing Feng. Parallel Incremental Frequent Itemset Mining for Large Data[J]. Journal of Computer Science and Technology, 2017, 32(2): 368-385. DOI: 10.1007/s11390-017-1726-y
    [9] Yu-Xiang Wang, Jun-Zhou Luo, Ai-Bo Song, Fang Dong. Partition-Based Online Aggregation with Shared Sampling in the Cloud[J]. Journal of Computer Science and Technology, 2013, 28(6): 989-1011. DOI: 10.1007/s11390-013-1393-6
    [10] Ying-Jie Shi, Xiao-Feng Meng, Fusheng Wang, Yan-Tao Gan. HEDC++: An Extended Histogram Estimator for Data in the Cloud[J]. Journal of Computer Science and Technology, 2013, 28(6): 973-988. DOI: 10.1007/s11390-013-1392-7

  • Cited by

    Periodical cited type (2)

    1. Wanchun Dou, Xiaolong Xu, Shui Yu. Intelligent Industrial Internet Systems. DOI: 10.1007/978-981-99-5732-3_2
    2. Qingyuan Hu, Tao Tao, Yu Liu, et al. Proceedings of the 5th International Conference on Big Data Analytics for Cyber-Physical System in Smart City—Volume 1. Lecture Notes on Data Engineering and Communications Technologies. DOI: 10.1007/978-981-96-0208-7_6

    Other cited types (0)
