›› 2015,Vol. 30 ›› Issue (1): 42-56.doi: 10.1007/s11390-015-1503-8

所属专题: Computer Architecture and Systems

• Special Section on Selected Paper from NPC 2011 • 上一篇    下一篇

一个面向多线程程序的高效并且灵活的确定性软件框架

Kai Lu1,2(卢凯), Xu Zhou2(周旭), Xiao-Ping Wang1,2(王小平), Tom Bergan3, Chen Chen1,2(陈沉)   

  1. 1 Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology Changsha 410073, China;
    2 College of Computer, National University of Defense Technology, Changsha 410073, China;
    3 Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195-2350, U.S.A.
  • 收稿日期:2014-07-14 修回日期:2014-11-26 出版日期:2015-01-05 发布日期:2015-01-05
  • 作者简介:Kai Lu received his B.S. and Ph.D. degrees in computer science in 1995 and 1999, respectively, from School of Computer Science, National University of Defense Technology, Changsha. He is now a professor in School of Computer Science, National University of Defense Technology. His research interests include operating system, parallel computing, and security.
  • 基金资助:

    The work was supported by the National Natural Science Foundation of China under Grant Nos. 61272142, 61103082, 61402492, 61170261, 61103193, the National High Technology Research and Development 863 Program of China under Grant Nos. 2012AA01A301, 2012AA010901, and the Program for New Century Excellent Talents in University of China.

An Efficient and Flexible Deterministic Framework for Multithreaded Programs

Kai Lu1,2(卢凯), Xu Zhou2(周旭), Xiao-Ping Wang1,2(王小平), Tom Bergan3, Chen Chen1,2(陈沉)   

  1. 1 Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology Changsha 410073, China;
    2 College of Computer, National University of Defense Technology, Changsha 410073, China;
    3 Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195-2350, U.S.A.
  • Received:2014-07-14 Revised:2014-11-26 Online:2015-01-05 Published:2015-01-05
  • About author:Kai Lu received his B.S. and Ph.D. degrees in computer science in 1995 and 1999, respectively, from School of Computer Science, National University of Defense Technology, Changsha. He is now a professor in School of Computer Science, National University of Defense Technology. His research interests include operating system, parallel computing, and security.
  • Supported by:

    The work was supported by the National Natural Science Foundation of China under Grant Nos. 61272142, 61103082, 61402492, 61170261, 61103193, the National High Technology Research and Development 863 Program of China under Grant Nos. 2012AA01A301, 2012AA010901, and the Program for New Century Excellent Talents in University of China.

确定性对于多线程程序的调试、测试等方面非常有用.为了实现确定性,学术界提出了诸如确定性多线程和确定性回放等方法.然而,这些方法要么效率较低,要么只针对一种应用场景,导致不够灵活.本文针对多线程程序提出了一个高效且灵活的确定性框架.该框架分两步实现确定性:首先是放松确定性,然而是确定性.放松确定性通过设计一种合理的内存一致性来高效的解决数据竞争.在此基础上,我们以确定的顺序来解决同步竞争,从而最终实现强确定性.由于我们可以使用独立的算法来分别实现这两步,我们的框架可以提供一组不同程度的确定性策略,包括非确定性系统(最高效)、弱确定性系统(高效但不完全确定)、确定性多线程系统和确定性回放系统.实验显示当被配置成确定性多线程系统时,该框架的性能要由于目前经典的确定性算法.

Abstract: Determinism is very useful to multithreaded programs in debugging, testing, etc. Many deterministic approaches have been proposed, such as deterministic multithreading (DMT) and deterministic replay. However, these systems either are inefficient or target a single purpose, which is not flexible. In this paper, we propose an efficient and flexible deterministic framework for multithreaded programs. Our framework implements determinism in two steps: relaxed determinism and strong determinism. Relaxed determinism solves data races efficiently by using a proper weak memory consistency model. After that, we implement strong determinism by solving lock contentions deterministically. Since we can apply different approaches for these two steps independently, our framework provides a spectrum of deterministic choices, including nondeterministic system (fast), weak deterministic system (fast and conditionally deterministic), DMT system, and deterministic replay system. Our evaluation shows that the DMT configuration of this framework could even outperform a state-of-the-art DMT system.

[1] Lee E A. The problem with threads. Computer, 2006, 39(5):33-42.

[2] Olszewski M, Ansel J, Amarasinghe S. Kendo: Efficient deterministic multithreading in software. In Proc. the 14th ASPLOS, March 2009, pp.97-108.

[3] Devietti J, Lucia B, Ceze L, Oskin M. DMP: Deterministic shared memory multiprocessing. In Proc. the 14th ASPLOS, March 2009, pp.85-96.

[4] Aviram A, Weng S C, Hu S, Ford B. Efficient systemenforced deterministic parallelism. In Proc. the 9th OSDI, October 2010, pp.193-206.

[5] Veeraraghavan K, Lee D, Wester B, Ouyang J, Chen P M, Flinn J, Narayanasamy S. DoublePlay: Parallelizing sequential logging and replay. ACM Transactions on Computer Systems, 2012, 30(1):3:1-3:24.

[6] Bergan T, Anderson O, Devietti J, Ceze L, Grossman D. CoreDet: A compiler and runtime system for deterministic multithreaded execution. In Proc. the 15th ASPLOS, March 2010, pp.53-64.

[7] Bergan T, Hunt N, Ceze L, Gribble S D. Deterministic process groups in DOS. In Proc. the 9th OSDI, October 2010, pp.177-191.

[8] Hunt N, Bergan T, Ceze L, Gribble S D. DDOS: Taming nondeterminism in distributed systems. In Proc. the 18th ASPLOS, March 2013, pp.499-508.

[9] Bergan T, Devietti J, Hunt N, Ceze L. The deterministic execution hammer: How well does it actually pound nails? In Proc. the 2nd Workshop on Determinism and Correctness in Parallel Programming, March 2011, pp.56-63.

[10] Lu K, Zhou X, Bergan T, Wang X. Efficient deterministic multithreading without global barriers. In Proc. the 19th PPoPP, February 2014, pp.287-300.

[11] Cui H, Simsa J, Lin Y H, Li H, Blum B, Xu X, Yang J, Gibson G A, Bryant R E. PARROT: A practical runtime for deterministic, stable, and reliable threads. In Proc. the 24th SOSP, November 2013, pp.388-405.

[12] Chen Y, Chen H. Scalable deterministic replay in a parallel full-system emulator. SIGPLAN Not., 2013, 48(8):207-218.

[13] Cui H, Wu J, Tsai C C, Yang J. Stable deterministic multithreading through schedule memoization. In Proc. the 9th OSDI, October 2010, pp.207-221.

[14] Boehm H J, Adve S V. Foundations of the C++ concurrency memory model. In Proc. the 29th PLDI, June 2008, pp.68-78.

[15] Adve S V, Aggarwal J K. A unified formalization of four shared-memory models. IEEE Trans. Parallel Distrib. Syst., 1993, 10(4):613-624.

[16] Keleher P, Cox A L, Zwaenepoel W. Lazy release consistency for software distributed shared memory. In Proc. the 19th ISCA, May 1992, pp.13-21.

[17] Keleher P, Cox A L, Dwarkadas S, Zwaenepoel W. Treadmarks: Distributed shared memory on standard workstations and operating systems. In Proc. the 1994 USENIX Winter Technical Conf., November 1994, pp.10:1-10:17.

[18] Liu T, Curtsinger C, Berger E D. Dthreads: Efficient deterministic multithreading. In Proc. the 23rd SOSP, October 2011, pp.327-336.

[19] Fidge C J. Partial orders for parallel debugging. In Proc. the 1988 ACM SIGPLAN and SIGOPS Workshop on Parallel and Distributed Debugging, May 1988, pp.183-194.

[20] Berger E D, Zorn B G, McKinley K S. Composing highperformance memory allocators. In Proc. the 2001 PLDI, June 2001, pp.114-124.

[21] Xiong W, Park S, Zhang J, Zhou Y, Ma Z. Ad hoc synchronization considered harmful. In Proc. the 9th OSDI, October 2010, pp.163-176.

[22] Adve S V, Hill M D. Weak ordering—A new definition. SIGARCH Comput. Archit. News, 1990, 18(2):2-14.

[23] Weaver V, McKee S. Can hardware performance counters be trusted? In Proc. IEEE Int. Symp. Workload Characterization, September 2008, pp.141-150.

[24] Bienia C, Kumar S, Singh J P, Li K. The PARSEC benchmark suite: Characterization and architectural implications. In Proc. the 17th Int. Conf. Parallel Architectures and Compilation Techniques, October 2008, pp.72-81.

[25] Woo S C, Ohara M, Torrie E, Singh J P, Gupta A. The SPLASH-2 programs: Characterization and methodological considerations. In Proc. the 22nd ISCA, June 1995, pp.24-36.

[26] Ranger C, Raghuraman R, Penmetsa A, Bradski G, Kozyrakis C. Evaluating MapReduce for multi-core and multiprocessor systems. In Proc. the 13th HPCA, February 2007, pp.13-24.

[27] Devietti J, Nelson J, Bergan T, Ceze L, Grossman D. RCDC: A relaxed consistency deterministic computer. In Proc. the 16th ASPLOS, March 2011, pp.67-78.

[28] Hower D, Dudnik P, Hill M, Wood D. Calvin: Deterministic or not? Free will to choose. In Proc. the 17th HPCA, February 2011, pp.333-334.

[29] Merrifield T, Eriksson J. Conversion: Multi-version concurrency control for main memory segments. In Proc. the 8th ACM European Conf. Computer Systems, April 2013, pp.127-139.

[30] Zhou X, Lu K, Wang X, Li X. Exploiting parallelism in deterministic shared memory multiprocessing. Journal of Parallel and Distributed Computing, 2012, 72(5):716-727.

[31] Cui H, Wu J, Gallagher J, Guo H, Yang J. Efficient deterministic multithreading through schedule relaxation. In Proc. the 23rd SOSP, October 2011, pp.337-351.

[32] Bergan T, Ceze L, Grossman D. Input-covering schedules for multithreaded programs. In Proc. the 2013 ACM SIGPLAN Int. Conf. Object Oriented Programming Systems Languages and Applications, October 2013, pp.677-692.

[33] LeBlanc T J, Mellor-Crummey J M. Debugging parallel programs with instant replay. IEEE Trans. Comput., 1987, 36(4):471-482.

[34] Lee D, Chen P M, Flinn J, Narayanasamy S. Chimera: Hybrid program analysis for determinism. In Proc. the 33rd PLDI, June 2012, pp.463-474.

[35] Subhraveti D, Nieh J. Record and Transplay: Partial checkpointing for replay debugging across heterogeneous systems. In Proc. ACM Int. Conf. Measurement and Modeling of Computer Systems, June 2011, pp.109-120.

[36] Lu K, Zhou X, Wang X, Zhang W, Li G. RaceFree: An efficient multi-threading model for determinism. SIGPLAN Not., 2013, 48(8):297-298.
No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] 张钹; 张铃;. Statistical Heuristic Search[J]. , 1987, 2(1): 1 -11 .
[2] 孟力明; 徐晓飞; 常会友; 陈光熙; 胡铭曾; 李生;. A Tree-Structured Database Machine for Large Relational Database Systems[J]. , 1987, 2(4): 265 -275 .
[3] 林琦; 夏培肃;. The Design and Implementation of a Very Fast Experimental Pipelining Computer[J]. , 1988, 3(1): 1 -6 .
[4] 孙成政; 慈云桂;. A New Method for Describing the AND-OR-Parallel Execution of Logic Programs[J]. , 1988, 3(2): 102 -112 .
[5] 张钹; 张恬; 张建伟; 张铃;. Motion Planning for Robots with Topological Dimension Reduction Method[J]. , 1990, 5(1): 1 -16 .
[6] 王鼎兴; 郑纬民; 杜晓黎; 郭毅可;. On the Execution Mechanisms of Parallel Graph Reduction[J]. , 1990, 5(4): 333 -346 .
[7] 周权; 魏道政;. A Complete Critical Path Algorithm for Test Generation of Combinational Circuits[J]. , 1991, 6(1): 74 -82 .
[8] 赵靓海; 刘慎权;. An Environment for Rapid Prototyping of Interactive Systems[J]. , 1991, 6(2): 135 -144 .
[9] 商陆军; 许立辉;. Notes on the Design of an Integrated Object-Oriented DBMS Family[J]. , 1991, 6(4): 389 -394 .
[10] 许建国; 郭玉钗; 林宗楷;. HEPAPS:A PCB Automatic Placement System[J]. , 1992, 7(1): 39 -46 .
版权所有 © 《计算机科学技术学报》编辑部
本系统由北京玛格泰克科技发展有限公司设计开发 技术支持:support@magtech.com.cn
总访问量: