Journal of Computer Science and Technology, 2023, Vol. 38, Issue (1): 80-86. doi: 10.1007/s11390-022-2950-7
Special Topics: Surveys; Computer Architecture and Systems
Kirk W. Cameron, Fellow, IEEE, Distinguished Member, ACM
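Amdahl's law implies that parallelism is limited, and so is the speedup it can deliver. Amdahl's contribution was both seminal and important: it prompted others working in parallel processing to articulate more clearly why large-scale systems are critical to our future and how they fundamentally provide opportunities for speedup beyond what Amdahl predicted. In the early 2000s, much as Amdahl had done, we predicted dire consequences for large-scale systems due to power constraints. Although our early work was often overlooked, some researchers clearly recognized that power would ultimately limit performance. In this retrospective, we discuss how power-performance measurement and modeling at scale influenced server and supercomputer design for more than a decade. The impact of these techniques is now uncontested; we discuss their connections and limitations, as well as the additional research directions the industry needs in order to keep improving performance.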
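To make the bound in the abstract's opening sentence concrete, the following is a minimal sketch of the textbook form of Amdahl's law for a fixed-size workload. It is not taken from the paper itself: the function names `amdahl_speedup` and `amdahl_limit` and the 95% parallel fraction are illustrative assumptions.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Textbook Amdahl speedup: 1 / ((1 - p) + p / n) for a fixed-size problem."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def amdahl_limit(parallel_fraction: float) -> float:
    """Upper bound on speedup as the processor count grows without limit."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    p = 0.95  # assumed parallel fraction, chosen only for illustration
    for n in (1, 16, 256, 4096):
        print(f"n={n:5d}  speedup={amdahl_speedup(p, n):.2f}")
    print(f"limit as n grows: {amdahl_limit(p):.2f}")
```

Under these assumptions the speedup flattens toward the reciprocal of the serial fraction regardless of how many processors are added, which is the limit the abstract refers to.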