Abstract Thread-level speculation (TLS) provides not only a simple parallel programming model but also an effective mechanism for exploiting thread-level parallelism. The performance of software speculative parallel models is limited by high global overheads, whose sources differ from one type of loop to another. Different types of loops have different dependence characteristics and therefore call for different optimization strategies. In this paper, we propose three optimization techniques, each targeting a different source of global overhead and the requirements of a different type of loop. Inter-thread fetching reduces the high mis-speculation rate of loops with frequent dependencies; out-of-order committing reduces the control overhead of loops with infrequent dependencies; and enhanced dynamic task granularity resizing reduces the control overhead and optimizes the global overhead of loops whose dependence characteristics change. All three techniques have been implemented in HEUSPEC, a software TLS system. Experimental results indicate that each technique meets the demands of a different group of benchmarks, and that combining them improves the performance of all benchmarks and achieves a higher average speedup.
This work was supported by the National High Technology Research and Development 863 Program of China under Grant No. 2012AA010905 and the National Natural Science Foundation of China under Grant Nos. 61272143 and 61472431.
About the author: Li Shen received his B.S., M.S., and Ph.D. degrees in computer science and technology from the National University of Defense Technology (NUDT), Changsha, in 1997, 2000, and 2003, respectively. He is currently an associate professor in the School of Computer, NUDT. His research interests include programming models and compiler design, high-performance processor architecture, virtualization technologies, and performance evaluation and workload characterization. He is a member of CCF and ACM.
Cite this article:
Li Shen, Fan Xu, Zhi-Ying Wang. Optimization Strategies Oriented to Loop Characteristics in Software Thread Level Speculation Systems[J]. Journal of Computer Science and Technology, 2016, V31(1): 60-76