›› 2010, Vol. 25 ›› Issue (4): 864-873.doi: 10.1007/s11390-010-1067-6

• Architecture and High Performance Computer Systems • Previous Articles     Next Articles

Queue Waiting Time Aware Dynamic Workflow Scheduling in Multicluster Environments

Zhi-Feng Yu(余志峰) and Wei-Song Shi(施巍松), Senior Member, IEEE   

  1. Department of Computer Science, Wayne State University, Detroit, MI 48202, U.S.A.
  • Received:2009-08-01 Revised:2010-03-17 Online:2010-07-09 Published:2010-07-09
  • About author:
    Zhi-Feng Yu received the B.S. degree in applied mathematics and M.S. degree in economics and management from Tongji University, Shanghai, China, in 1990 and 1993 respectively and the Ph.D. degree in computer science from Wayne State University, Detroit, USA in 2009. He is currently Manager of Application Development and Integration of TSystems North America, Rochester Hills, MI. His research interests include computer systems and enterprise computing.
    Wei-Song Shi is an associate professor of computer science at Wayne State University. He received his B.S. degree from Xidian University in 1995, and Ph.D. degree from the Chinese Academy of Sciences in 2000, both in computer engineering. His current research focuses on mobile computing, distributed systems and high performance computing. Dr. Shi has published more than 80 peer-reviewed journal and conference papers in these areas. He is the author of the book “Performance Optimization of Software Distributed Shared Memory Systems” (Higher Education Press, 2004). He has also served on technical program committees of several international conferences, including WWW, ICPP, MASS. He is a recipient of Microsoft Fellowship in 1999, the President Outstanding Award of the Chinese Academy of Sciences in 2000, one of 100 Outstanding Ph.D. Dissertations (China) in 2002, “Faculty Research Award” of Wayne State University in 2004 and 2005, the “Best Paper Award” of ICWE’04 and IPDPS’05. He is a recipient of the NSF CAREER award andWayne State University Career Development Chair award.
  • Supported by:

    This work is in part supported by the US National Science Foundation CAREER Grant No. CCF-0643521.

Workflows are prevailing in scientific computation. Multicluster environments emerge and provide more resources, benefiting workflows but also challenging the traditional workflow scheduling heuristics. In a multicluster environment, each cluster has its own independent workload management system. Jobs are queued up before getting executed, they experience different resource availability and wait time if dispatched to different clusters. However, existing scheduling heuristics neither consider the queue wait time nor balance the performance gain with data movement cost. The proposed algorithm leverages the advancement of queue wait time prediction techniques and empirically studies if the tunability of resource requirements helps scheduling. The extensive experiment with both real workload traces and test bench shows that the queue wait time aware algorithm improves workflow performance by 3 to 10 times in terms of average makespan with relatively very low cost of data movement.


[1] NSF teragrid. http://www.teragrid.org/.

[2] Li H, Groep D, Wolters L. Workload characteristics of a multi-cluster supercomputer. In Proc. the 10th International Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP,2004), New York, USA, June 13, 2004, pp.176-193.

[3] Nurmi D, Brevik J, Wolski R. QBETS: Queue bounds estimation from time series. In Proc. the 2007 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS,2007), San Diego, USA, June 12-16, 2007, pp.379-380.

[4] QBETS web service. http://spinner.cs.ucsb.edu/batchq/.

[5] Aida K, Casanova H. Scheduling mixed-parallel applications with advance reservations. In Proc. the 17th International Symposium on High-Performance Distributed Computing (HPDC\,2008), Boston, USA, June 23-27, 2008, pp.65-74.

[6] N'Takpe T, Suter F, Casanova H. A comparison of scheduling approaches for mixed-parallel applications on heterogeneous platforms. In Proc. the Sixth International Symposium on Parallel and Distributed Computing (ISPDC,2007), Hagenberg, Austria, July 5-8, 2007, p.35.

[7] Rauber T, Runger G. M-task-programming for heterogeneous systems and grid environments. In Proc. 19th International Parallel and Distributed Processing Symposium (IPDPS,2005), Los Alamitos, USA, April 4-8, 2005, p.178b.

[8] He L, Jarvis S, Spooner D, Nudd G. Performance evaluation of scheduling applications with DAG topologies on multiclusters with independent local schedulers. In Proc. the 20th International Parallel and Distributed Processing Symposium (IPDPS,2006), Rhodes Island, Greece, April 25-29, 2006, pp.8-15.

[9] Yu Z, Shi W. An adaptive rescheduling strategy for grid workflow applications. In Proc. the 21st International Parallel and Distributed Processing Symposium (IPDPS,2007), Long Beach, USA, March 26-30, 2007, p.115.

[10] Yu Z, Shi W. A planner-guided scheduling strategy for multiple workflow applications. In Proc. the Fourth International Workshop on Scheduling and Resource Management for Parallel and Distributed Systems, in conjunction with ICPP 2008, Portland, USA, Sept. 8-12, 2008, pp.1-8.

[11] Topcuouglu H, Hariri S, Wu M. Performance-effective and low-complexity task scheduling for heterogeneous computing. IEEE Transactions on Parallel and Distribution Systems, 2002, 13(3): 260-274.

[12] The Globus Alliance. http://www.globus.org/.

[13] Huang R, Casanova H, Chien A. Automatic resource specification generation for resource selection. In Proc. the 2007 ACM/IEEE Conference on Supercomputing (SC\,2007), Reno, USA, Nov. 10-16, 2007, pp.1-11.

[14] Hill M, Marty M. Amdahl's law in the multicore era. Computer, 2008, 41(7): 33-38.

[15] Hõnig U, Schiffmann W. A comprehensive test bench for the evaluation of scheduling heuristics. In Proc. the 16th International Conference on Parallel and Distributed Computing and Systems (PDCS,2004), Cambridge, USA, Nov. 9-11, 2004, pp.437-442.

[16] Sulistio A, Cibej U, Venugopal S, Robic B, Buyya R. A toolkit for modelling and simulating data grids: An extension to gridsim. Concurr. Comput.: Pract. Exper., 2008, 20(13): 1591-1609.

[17] Portable Batch System. http://www.openpbs.org, Dec. 2008.

[18] Sabin G, Kettimuthu R, Rajan A, Sadayappan P. Scheduling of parallel jobs in a heterogeneous multi-site environment. In the 9th International Workshop of Job Scheduling Strategies for Parallel Processing (JSSPP,2003), Seattle, USA, June 24, 2003, pp.87-104.

[19] Yu Z. Toward practical multi-workflow scheduling in cluster and grid environments
[Ph.D. Dissertation]. Wayne State University, 2008. %scheduling applications with dag topologies on multiclusters with%Symposium of Parallel and Distributed Processing Symposium (IPDPS,2006),

[20] Shivaratri N, Krueger P, Singhal M. Load distributing for locally distributed systems. Computer, 1992, 25(12): 33-44.

[21] Maheswaran M, Ali S, Siegel H, Hensgen D, Freund R. Dynamic mapping of a class of independent tasks onto heterogeneous computing systems. J. Parallel Distrib. Comput., 1999, 59(2): 107-131.

[22] Hunold S, Rauber T, Runger G. Dynamic scheduling of multi-processor tasks on clusters of clusters. In Proc. 2007 IEEE International Conference on Cluster Computing, Austin, USA, Sept. 17-21, 2007, pp.507-514.

[23] Hunold S, Rauber T, Suter F. Scheduling dynamic workflows onto clusters of clusters using postponing. In Proc. the 2008 Eighth IEEE International Symposium on Cluster Computing and the Grid (CCGRID,2008), Lyon, France, May 19-22, 2008, pp.669-674.

[24] Nurmi D, Mandal A, Brevik J, Koelbel C, Wolski R, Kennedy K. Evaluation of a workflow scheduler using integrated performance modelling and batch queue wait time prediction. In Proc. the 2006 ACM/IEEE Conference on Supercomputing (SC,2006), Tampa, USA, Nov. 11-17, 2006, Article No.119.

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] Liu Mingye; Hong Enyu;. Some Covering Problems and Their Solutions in Automatic Logic Synthesis Systems[J]. , 1986, 1(2): 83 -92 .
[2] Chen Shihua;. On the Structure of (Weak) Inverses of an (Weakly) Invertible Finite Automaton[J]. , 1986, 1(3): 92 -100 .
[3] Gao Qingshi; Zhang Xiang; Yang Shufan; Chen Shuqing;. Vector Computer 757[J]. , 1986, 1(3): 1 -14 .
[4] Chen Zhaoxiong; Gao Qingshi;. A Substitution Based Model for the Implementation of PROLOG——The Design and Implementation of LPROLOG[J]. , 1986, 1(4): 17 -26 .
[5] Huang Heyan;. A Parallel Implementation Model of HPARLOG[J]. , 1986, 1(4): 27 -38 .
[6] Min Yinghua; Han Zhide;. A Built-in Test Pattern Generator[J]. , 1986, 1(4): 62 -74 .
[7] Tang Tonggao; Zhao Zhaokeng;. Stack Method in Program Semantics[J]. , 1987, 2(1): 51 -63 .
[8] Min Yinghua;. Easy Test Generation PLAs[J]. , 1987, 2(1): 72 -80 .
[9] Zhu Hong;. Some Mathematical Properties of the Functional Programming Language FP[J]. , 1987, 2(3): 202 -216 .
[10] Li Minghui;. CAD System of Microprogrammed Digital Systems[J]. , 1987, 2(3): 226 -235 .

ISSN 1000-9000(Print)

         1860-4749(Online)
CN 11-2296/TP

Home
Editorial Board
Author Guidelines
Subscription
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
Tel.:86-10-62610746
E-mail: jcst@ict.ac.cn
 
  Copyright ©2015 JCST, All Rights Reserved