
Energy Efficient Block-Partitioned Multicore Processors for Parallel Applications

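  • Summary: As the power consumption of modern computer systems keeps rising, energy management has become a very important research area over the past few decades. With multicore processors now widespread, multicore systems offer a good platform for improving energy efficiency. However, as more and more processing cores are integrated on a single chip, managing energy effectively in multicore systems has become a major challenge. On the one hand, if all cores on a chip share a single power supply, they must run at the same voltage and frequency and cannot be freely switched into a low-power sleep state. This design severely limits the flexibility of energy management. On the other hand, giving every core its own power supply maximizes that flexibility but imposes considerable extra overhead on the chip. In this paper, we study the energy efficiency of block-partitioned multicore systems based on voltage islands and dynamic voltage and frequency scaling (DVFS). In a block-partitioned multicore system, cores are first assigned to several blocks; each block has its own power supply and can scale its voltage and frequency independently. Depending on the number of cores per block, we study both symmetric and asymmetric block configurations: in a symmetric configuration every block has the same number of cores, whereas in an asymmetric configuration blocks may have different numbers of cores. For the asymmetric case, we propose a special buddy-asymmetric block configuration in which the number of blocks is logarithmically related to the number of cores. In addition, for such block-partitioned multicore systems, we propose a system-level power model that accounts for and supports various power management techniques, such as DVFS and low-power operating states. We also derive the energy-efficient operating frequencies at both the block level and the system level. Based on the power model, we prove that, for embarrassingly parallel applications, a configuration with all cores on a single block saves the same amount of energy as a configuration in which each core occupies its own block. For applications with limited parallelism, however, we prove that the buddy-asymmetric block configuration saves more energy, matching the energy consumption of the one-core-per-block configuration. Furthermore, we carry out extensive simulations to study the energy efficiency of different block configurations, covering both symmetric and asymmetric ones. The contributions of this paper can be summarized as follows:
    (1) We propose block partitioning as a way to organize all the cores on a chip. We consider both symmetric and asymmetric block configurations and propose a special buddy-asymmetric block configuration in which the number of blocks is logarithmically related to the number of cores;
    (2) For block-partitioned multicore architectures, we propose a system-level power model, and we analyze and derive the energy-efficient operating frequencies;
    (3) For both embarrassingly parallel applications and applications with limited parallelism, we study the energy efficiency of block-partitioned multicore systems under symmetric and asymmetric block configurations. Through theoretical analysis and extensive simulations, we conclude that multicore systems with the buddy-asymmetric block configuration are highly energy efficient.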

     

    Abstract: Due to the increasing power consumption in modern computing systems, energy management has become an important research area in the last decade. Recently, multicore has emerged as an energy-efficient architecture that exploits the parallelism in modern applications. However, as the number of cores on a single chip continues to increase, effectively managing the energy efficiency of multicore-based systems has become a grand challenge. In this paper, based on the voltage island and dynamic voltage and frequency scaling (DVFS) techniques, we investigate the energy efficiency of block-partitioned multicore processors, where cores are grouped into blocks and the cores on one block share a DVFS-enabled power supply. Depending on the number of cores on each block, we study both symmetric and asymmetric block configurations. We develop a system-level power model (which can support various power management techniques) and derive both block- and system-wide energy-efficient frequencies for systems with block-partitioned multicore processors. Based on the power model, we prove that, for embarrassingly parallel applications, having all cores on a single block can achieve the same energy savings as the individual block configuration (where each core forms a single block and has its own power supply). However, for applications with limited degrees of parallelism, we show the superiority of the buddy-asymmetric block configuration, where the number of required blocks (and power supplies) is logarithmically related to the number of cores on the chip: it can achieve the same amount of energy savings as the individual block configuration. The energy efficiency of different block configurations is further evaluated through extensive simulations with both synthetic applications and a real-life application.
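
The buddy-asymmetric idea can be illustrated with a small sketch. The abstract states only that the number of blocks is logarithmic in the number of cores; the concrete block sizes used here (1, 1, 2, 4, ..., n/2 for a power-of-two core count n) are an assumption made for illustration, and the function names `buddy_block_sizes` and `blocks_to_power_on` are hypothetical. Under that assumption, any number of active cores from 0 to n can be served by powering on a subset of blocks whose sizes sum to exactly that number, which suggests why such a layout might match the flexibility of one block per core.

```python
def buddy_block_sizes(num_cores):
    """Return assumed buddy block sizes 1, 1, 2, 4, ..., num_cores / 2.

    The size pattern is an illustrative assumption, not taken from the paper.
    """
    if num_cores < 1 or num_cores & (num_cores - 1):
        raise ValueError("num_cores must be a positive power of two")
    sizes = [1]
    size = 1
    # Keep doubling the block size until every core belongs to a block.
    while sum(sizes) < num_cores:
        sizes.append(size)
        size *= 2
    return sizes


def blocks_to_power_on(sizes, active_cores):
    """Greedily pick block indices whose sizes sum to exactly active_cores."""
    if not 0 <= active_cores <= sum(sizes):
        raise ValueError("active_cores is out of range")
    chosen = []
    remaining = active_cores
    # Visit blocks from largest to smallest, taking each one that still fits.
    for index in sorted(range(len(sizes)), key=lambda i: sizes[i], reverse=True):
        if sizes[index] <= remaining:
            chosen.append(index)
            remaining -= sizes[index]
    return sorted(chosen)


if __name__ == "__main__":
    sizes = buddy_block_sizes(16)
    print("block sizes:", sizes)
    for active in (3, 10, 16):
        print(active, "active cores -> blocks", blocks_to_power_on(sizes, active))
```

With 16 cores this layout needs 5 blocks (log2(16) + 1) instead of 16 separate power supplies. Blocks that are not selected could be put into a low-power state, while each selected block runs at its own frequency.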

     
