Journal of Computer Science and Technology
Journal of Computer Science and Technology, 2011, Vol. 26, Issue 3: 418-433. DOI: 10.1007/s11390-011-1144-5
Special Section on High-Performance Computing for Embedded Multi-Core Systems
Energy Efficient Block-Partitioned Multicore Processors for Parallel Applications
Xuan Qi (祁轩) and Da-Kai Zhu (朱大开), Member, IEEE
University of Texas at San Antonio, San Antonio, TX, 78249, U.S.A.

Abstract: Due to the increasing power consumption of modern computing systems, energy management has become an important research area in the last decade. Recently, multicore processors have emerged as an energy-efficient architecture that exploits the parallelism in modern applications. However, as the number of cores on a single chip continues to increase, effectively managing the energy efficiency of multicore-based systems has become a grand challenge. In this paper, based on the voltage island and dynamic voltage and frequency scaling (DVFS) techniques, we investigate the energy efficiency of block-partitioned multicore processors, where cores are grouped into blocks and the cores on one block share a DVFS-enabled power supply. Depending on the number of cores on each block, we study both symmetric and asymmetric block configurations. We develop a system-level power model (which can support various power management techniques) and derive both block-wide and system-wide energy-efficient frequencies for systems with block-partitioned multicore processors. Based on the power model, we prove that, for embarrassingly parallel applications, having all cores on a single block achieves the same energy savings as the individual block configuration (where each core forms its own block and has its own power supply). For applications with limited degrees of parallelism, however, we show the superiority of the buddy-asymmetric block configuration, in which the number of required blocks (and power supplies) is logarithmically related to the number of cores on the chip: it achieves the same energy savings as the individual block configuration. The energy efficiency of different block configurations is further evaluated through extensive simulations with both synthetic workloads and a real-life application.
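To make the logarithmic block count concrete, the following is a minimal sketch of one way a buddy-style asymmetric partition could look. The abstract does not state the actual block sizes, so the scheme used here (blocks of 1, 1, 2, 4, ..., n/2 cores on an n-core chip, n a power of two) is an assumption, and the function names buddy_block_sizes and blocks_for_active_cores are hypothetical. The sketch only checks that, under this assumed partition, any number of active cores can be covered by whole blocks, so the remaining blocks could be powered off independently.

```python
def buddy_block_sizes(n_cores):
    """Assumed buddy-style partition: block sizes 1, 1, 2, 4, ..., n/2.

    This layout is an illustrative assumption, not taken from the paper.
    """
    if n_cores < 1 or n_cores & (n_cores - 1):
        raise ValueError("n_cores must be a positive power of two")
    sizes = [1]
    size = 1
    while sum(sizes) < n_cores:
        sizes.append(size)
        size *= 2
    return sizes


def blocks_for_active_cores(sizes, k):
    """Pick whole blocks whose core counts add up to exactly k.

    Greedy, largest block first; returns the chosen block sizes.
    """
    chosen = []
    remaining = k
    for s in sorted(sizes, reverse=True):
        if s <= remaining:
            chosen.append(s)
            remaining -= s
    if remaining != 0:
        raise ValueError("cannot cover %d cores with whole blocks" % k)
    return chosen


if __name__ == "__main__":
    sizes = buddy_block_sizes(16)
    print("blocks:", sizes, "count:", len(sizes))
    for k in range(17):
        print(k, "active cores ->", blocks_for_active_cores(sizes, k))
```

Under this assumption a 16-core chip needs 5 blocks (log2(16) + 1) instead of the 16 power supplies required by the individual block configuration, while still allowing every active-core count from 0 to 16 to be served by whole blocks.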
Keywords: multicore processors, dynamic voltage and frequency scaling (DVFS), voltage islands, parallel applications
Received: 2010-03-29;
Fund:

This work was supported in part by NSF Awards of USA under Grant Nos. CNS-0855247, CNS-1016974, and NSF CAREER Award of USA under Grant No. CNS-0953005.

Cite this article:   
Xuan Qi, Da-Kai Zhu. Energy Efficient Block-Partitioned Multicore Processors for Parallel Applications [J]. Journal of Computer Science and Technology, 2011, 26(3): 418-433.
URL:  
http://jcst.ict.ac.cn:8080/jcst/EN/10.1007/s11390-011-1144-5 or http://jcst.ict.ac.cn:8080/jcst/EN/Y2011/V26/I3/418
 
Copyright 2010 by Journal of Computer Science and Technology