2016, Vol. 31, Issue (6): 1179-1193. doi: 10.1007/s11390-016-1691-x

Special Issue: Computer Architecture and Systems

• Regular Paper •

Reducing the Upper Bound Delay by Optimizing Bank-to-Core Mapping

Ji-Zan Zhang¹,², Zhi-Min Gu¹, Member, CCF, Ming-Quan Zhang¹, Member, CCF

  • 1. School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China;
    2. School of Mathematics and Information, Ludong University, Yantai 264025, China
  • Received: 2015-07-23 Revised: 2016-07-26 Online: 2016-11-05 Published: 2016-11-05
  • Supported by:

    This work is supported by the National Natural Science Foundation of China under Grant No. 61370062.

Nowadays, inter-task interference is the main difficulty in analyzing the timing behavior of multicores. The timing-predictable embedded multicore architecture MERASA, which allows safe worst-case execution time (WCET) estimation, has emerged as an attractive solution. In this architecture, WCET can be estimated from the upper bound delay (UBD), which is bounded by the interference-aware bus arbiter (IABA) and by dynamic cache partitioning such as columnization or bankization. However, the architecture faces a dilemma between decreasing UBD and using the shared cache efficiently. To obtain tighter WCET estimates, we propose a novel approach that reduces UBD by optimizing the bank-to-core mapping on a multicore system with IABA and a two-level partitioned cache. To this end, we first present a new UBD computation model based on an analysis of inter-task interference delay, and then put forward a core-sequence optimization method for bank-to-core mapping together with optimization algorithms that minimize UBD. Experimental results demonstrate that our approach reduces WCET by 4% to 37%.
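The role of UBD in a WCET estimate, and the idea of searching over core sequences, can be illustrated with a minimal Python sketch. It is not the UBD computation model or the optimizing algorithms of the paper, which the abstract does not spell out. It assumes the familiar round-robin bound, in which a request waits for at most one access from each of the other cores, and it adds UBD to every shared-bus request. The function names, the `bank_of` table, and the conflict penalty are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def round_robin_ubd(num_cores, bus_latency):
    # Assumed bound: every other core is served once before our request.
    return (num_cores - 1) * bus_latency


def wcet_estimate(isolated_cycles, num_requests, ubd):
    # Pessimistic estimate: each shared-bus request suffers the full UBD.
    return isolated_cycles + num_requests * ubd


def best_core_sequence(cores, ubd_of):
    # Exhaustive search over core orderings; ubd_of is a caller-supplied
    # (here hypothetical) model mapping a sequence to its UBD.
    return min(permutations(cores), key=ubd_of)


# Hypothetical toy model: cores that are adjacent in the (cyclic) service
# order and share a cache bank add a fixed penalty to the base bound.
bank_of = {0: "A", 1: "A", 2: "B", 3: "B"}


def toy_ubd(sequence, bus_latency=10, penalty=5):
    n = len(sequence)
    conflicts = sum(
        bank_of[sequence[i]] == bank_of[sequence[(i + 1) % n]]
        for i in range(n)
    )
    return round_robin_ubd(n, bus_latency) + penalty * conflicts


best = best_core_sequence(range(4), toy_ubd)
print(best, toy_ubd(best))
print(wcet_estimate(1000, 20, toy_ubd(best)))
```

In this toy setting, an ordering that interleaves the cores of bank A and bank B removes every adjacent conflict, so the estimate depends only on the base round-robin term. Exhaustive search is practical only for small core counts, since the number of orderings grows factorially.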


ISSN 1000-9000(Print)

CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
E-mail: jcst@ict.ac.cn
  Copyright ©2015 JCST, All Rights Reserved