Journal of Computer Science and Technology ›› 2023, Vol. 38 ›› Issue (1): 87-102. doi: 10.1007/s11390-023-2885-7
Special topics: Surveys; Computer Architecture and Systems
Rong Ge, Xizhou Feng, Pengfei Zou, and Tyler Allen
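Modern computer systems are increasingly bounded by the available or permissible power at multiple layers, from individual components to data centers. To cope with this reality, it is necessary to understand how power bounds affect performance, especially for systems built from high-end nodes, each of which contains multiple power-hungry components. Because placing an inappropriate power bound on a node or a component can cause severe performance loss, power allocation must be coordinated across nodes and components, given a total power budget, to achieve the desired performance. In this paper, the authors describe the paradigm of power-bounded high-performance computing, which treats coordinated power bound assignment as a key factor in computer system performance analysis and optimization. The authors apply this paradigm to the problem of power coordination across multiple layers for both CPU and GPU computing. Through several case studies, the authors show how the principles of power coordination can be applied, balanced, and adapted to the interplay of workloads, hardware technology, and available total power in order to improve performance.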
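The idea of coordinating power between components under a fixed budget can be illustrated with a toy sketch. This is not the authors' method: the linear rate model, the wattage figures, and the names `component_rate`, `node_rate`, and `best_split` are all hypothetical assumptions chosen only to show why an uncoordinated split can cost performance.

```python
from typing import Tuple


def component_rate(power_w: float, floor_w: float, peak_w: float,
                   peak_rate: float) -> float:
    """Hypothetical model: rate is 0 at or below floor_w, rises linearly,
    and saturates at peak_rate once power_w reaches peak_w."""
    if power_w <= floor_w:
        return 0.0
    if power_w >= peak_w:
        return peak_rate
    return peak_rate * (power_w - floor_w) / (peak_w - floor_w)


def node_rate(cpu_w: float, mem_w: float) -> float:
    """Assumed node model: the slower of CPU and memory limits progress.
    All constants below are illustrative, not measured values."""
    cpu = component_rate(cpu_w, floor_w=40.0, peak_w=120.0, peak_rate=100.0)
    mem = component_rate(mem_w, floor_w=10.0, peak_w=40.0, peak_rate=100.0)
    return min(cpu, mem)


def best_split(budget_w: float, step_w: float = 1.0) -> Tuple[float, float, float]:
    """Exhaustively sweep CPU/memory splits of a node budget and return
    (cpu_w, mem_w, rate) for the first split with the highest modeled rate."""
    best = (0.0, 0.0, 0.0)
    steps = int(budget_w / step_w)
    for i in range(steps + 1):
        cpu_w = i * step_w
        mem_w = budget_w - cpu_w
        rate = node_rate(cpu_w, mem_w)
        if rate > best[2]:
            best = (cpu_w, mem_w, rate)
    return best


if __name__ == "__main__":
    budget = 100.0
    print("even split rate:", node_rate(budget / 2, budget / 2))
    print("best split (cpu_w, mem_w, rate):", best_split(budget))
```

Under these assumed curves, an even 50 W / 50 W split starves the CPU while the memory has power it cannot use, and the sweep finds a split that shifts watts toward the CPU. Real systems would replace the toy model with measured or learned performance-power profiles.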