A Static Greedy and Dynamic Adaptive Thread Spawning Approach for Loop-Level Parallelism

李美蓉; 赵银亮; 陶悠; 王启明

doi:10.1007/s11390-014-1482-1


Abstract

Abstract: Thread-level speculation is becoming increasingly attractive for exploiting thread-level parallelism in irregular sequential applications. However, speculative threads often fail to reach the expected parallel performance. The reason is that speculative thread performance is hard to predict: it suffers from imprecise compile-time performance estimation caused by ambiguous control and data dependences, and it also depends on the underlying hardware configuration and program behavior. This paper therefore proposes a statically greedy and dynamically adaptive approach for loop-level speculation that determines the best loop level to parallelize at runtime. The compiler greedily selects and optimizes all loop candidates, then performs a cost-benefit analysis of the different loop nesting levels to determine the order of loop speculation. Guided by runtime prediction of loop execution performance, the order of loop speculation is dynamically scheduled and updated so that the best loop nesting level is always the one parallelized. Two different loop scheduling policies are also examined to maximize overall performance. Compared with traditional static loop selection techniques, the approach achieves comparable or better performance.
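The interplay between the static order and its runtime update can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the cost model or the two scheduling policies, so the `LoopLevel` class, the running-average predictor, and the "speedup above 1.0" threshold are all assumptions made here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class LoopLevel:
    """One nesting level of a loop nest (hypothetical structure)."""
    name: str
    static_speedup: float   # compile-time estimate, assumed to be given
    observed: float = 0.0   # running mean of speedups measured at runtime
    samples: int = 0

    def predicted(self) -> float:
        # Use the static estimate until runtime measurements exist.
        return self.observed if self.samples else self.static_speedup

    def record(self, speedup: float) -> None:
        # Incrementally update the running mean with a new measurement.
        self.samples += 1
        self.observed += (speedup - self.observed) / self.samples


def speculation_order(levels):
    """Return the levels sorted best-first by predicted speedup."""
    return sorted(levels, key=lambda lv: lv.predicted(), reverse=True)


def choose_level(levels):
    """Pick the level to parallelize, or None if no level is predicted
    to beat sequential execution (assumed threshold: speedup > 1.0)."""
    if not levels:
        return None
    best = speculation_order(levels)[0]
    return best if best.predicted() > 1.0 else None


# Static phase: all candidates are kept, ordered by the compile-time estimate.
nest = [LoopLevel("outer", 1.8), LoopLevel("inner", 1.3)]
first = choose_level(nest).name

# Dynamic phase: the outer loop performs worse than estimated,
# so the order is updated and the inner loop takes over.
nest[0].record(0.9)
nest[0].record(0.7)
second = choose_level(nest).name
```

In this sketch the static estimate only seeds the order; once measurements arrive they replace it entirely. A real system would likely blend the two and re-probe demoted levels from time to time, which the abstract leaves unspecified.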
