We use cookies to improve your experience with our site.

Indexed in:

SCIE, EI, Scopus, INSPEC, DBLP, CSCD, etc.

Submission System
(Author / Reviewer / Editor)
Zi-Chao Xie, Dong Tong, Ming-Kai Huang, Qin-Qing Shi, Xu Cheng. SWIP Prediction: Complexity-Effective Indirect-Branch Prediction Using Pointers[J]. Journal of Computer Science and Technology, 2012, 27(4): 754-768. DOI: 10.1007/s11390-012-1262-8
Citation: Zi-Chao Xie, Dong Tong, Ming-Kai Huang, Qin-Qing Shi, Xu Cheng. SWIP Prediction: Complexity-Effective Indirect-Branch Prediction Using Pointers[J]. Journal of Computer Science and Technology, 2012, 27(4): 754-768. DOI: 10.1007/s11390-012-1262-8

SWIP Prediction: Complexity-Effective Indirect-Branch Prediction Using Pointers

Funds: This work is supported by the “HGJ” National Science and Technology Major Project of China under Grant No. 2009ZX01029-001-002.
More Information
  • Received Date: January 05, 2011
  • Revised Date: January 17, 2012
  • Published Date: July 04, 2012
  • Predicting indirect-branch targets has become a performance bottleneck for many applications. Previous highperformance indirect-branch predictors usually require significant hardware storage or additional compiler support, which increases the complexity of the processor front-end or the compilers. This paper proposes a complexity-effective indirectbranch prediction mechanism, called the Set-Way Index Pointing (SWIP) prediction. It stores multiple indirect-branch targets in different branch target buffer (BTB) entries, whose set indices and way locations are treated as set-way index pointers. These pointers are stored in the existing branch-direction predictor. SWIP prediction reuses the branch direction predictor to provide such pointers, and then accesses the pointed BTB entries for the predicted indirect-branch target. Our evaluation shows that SWIP prediction could achieve attractive performance improvement without requiring large dedicated storage or additional compiler support. It improves the indirect-branch prediction accuracy by 36.5% compared to that of a commonly-used BTB, resulting in average performance improvement of 18.56%. Its energy consumption is also reduced by 14.34% over that of the baseline.
  • [1]
    Jiménez D A, Lin C. Dynamic branch prediction with perceptrons.In Proc. the 7th HPCA, Jan. 2001, pp.197-206.
    [2]
    McFarling S. Combining branch predictors. Technical ReportTN-36, Western Research Laboratory, June 1993.
    [3]
    Kim H, Mutlu O, Stark J, Patt Y. Wish branches: Combiningconditional branching and predication for adaptive predicatedexecution. In Proc. the 38th MICRO, Nov. 2005, pp.43-54.
    [4]
    Seznec A. Analysis of the O-GEometric history length branchpredictor. In Proc. the 32nd ISCA, Jun. 2005, pp.394-405.
    [5]
    Yeh T Y, Patt Y. A comparison of dynamic branch predictorsthat use two levels of branch history. In Proc. the 20th ISCA,May 1993, pp.257-266.
    [6]
    Calder B, Grunwald D, Zorn B. Quantifying behavioral differencesbetween C and C++ programs. Journal of ProgrammingLanguages, 1994, 2(4): 323-351.
    [7]
    Chang P Y, Hao E, Patt Y. Target prediction for indirectjumps. In Proc. the 24th ISCA, June 1997, pp.274-283.
    [8]
    Driesen K, Hölzle U. The cascaded predictor: Economical andadaptive branch target prediction. In Proc. the 31st MICRO,Nov. 30-Dec. 2 1998, pp.249-258.
    [9]
    Kim H, Joao J A, Mutlu O, Lee C J, Patt Y, Cohn R.VPC prediction: Reducing the cost of indirect branches viahardware-based dynamic devirtualization. In Proc. the 34thISCA, June 2007, pp.424-435.
    [10]
    Driesen K, Hözle U. Accurate indirect branch prediction.Technical Report TRCS97-19, University of California, March1998.
    [11]
    Kalamatianos J, Kaeli D R. Predicting indirect branches viadata compression. In Proc. the 31st MICRO, Nov. 30-Dec. 2,1998, pp.272-281.
    [12]
    Driesen K, Hölzle U. Multi-stage cascaded prediction. InProc. the 5th Euro-Par Conference on Parallel Processing,Aug. 31-Sept. 3, 1999, pp.1312-1321.
    [13]
    Seznec A, Michaud P. A case for (partially) TAgged GEometrichistory length branch prediction. Journal of Instruction-Level Parallelism (JILP), 2006, 8(1): 1-23.
    [14]
    Joao J A, Mutlu O, Kim H, Agarwal R, Patt Y. Improvingthe performance of object-oriented languages with dynamicpredication of indirect jumps. In Proc. the 13th ASPLOS,March 2008, pp.80-90.
    [15]
    Farooq M, Chen L, John L K. Value based BTB indexingfor indirect jump prediction. In Proc. the 16th HPCA, Jan.2010.
    [16]
    Azizi O, Mahesri A, Lee B C, Patel S J, Horowitz M. Energyperformancetradeoffs in processor architecture and circuit design:A marginal cost analysis. In Proc. the 37th ISCA, Jun.2010, pp.26-36.
    [17]
    Hameed R, Qadeer W, Wachs M et al. Understanding sourcesof inefficiency in general-purpose chips. In Proc. the 37thISCA, June 2010, pp.37-47.
    [18]
    Lin Y L. Essential Issues in System-On-a-Chip Design,Springer-Verlag, 2006.
    [19]
    Lee J K F, Smith A J. Branch prediction strategies and branchtarget buffer design. IEEE Computer, 1984, 17(1): 6-22.
    [20]
    Yeh T Y, Patt Y. Two-level adaptive training branch prediction.In Proc. the 24th MICRO, Nov. 1991, pp.51-61.
    [21]
    Roth A, Moshovos A, Sohi G S. Improving virtual functioncall target prediction via dependence-based pre-computation.In Proc. the 13th ICS, June 1999, pp.356-364.
    [22]
    Gochman S, Ronen R, Anati I et al. The Intel Pentium M processor: Microarchitecture and performance. Intel TechnologyJournal, 2003, 7(2): 21-59.
    [23]
    IBM. IBM PowerPC 970FX RISC Microprocessor user’s manual.Version 2.3, March 2008.
    [24]
    SPEC. Standard Performance Evaluation Corporation.http://www.spec.org, July 2011.
    [25]
    Burger D, Austin T M. The simpleScalar tool set, version 2.0.SIGARCH Comput. Archit. News, 1997, 25(3): 13-25.
    [26]
    Brooks D, Tiwari V, Martonosi M. Wattch: A framework forarchitectural-level power analysis and optimizations. In Proc.the 27th ISCA, June 2000, pp.83-94.
    [27]
    Wolczko M. Benchmarking Java with the Richards benchmark.http://research.sun.com/people/mario/java benchmarking/richards/richards.html, July 2011.
    [28]
    Perelman E, Hamerly G, Van Biesbrouck M et al. Using Sim-Point for accurate and efficient simulation. In Proc. SIGMETRICS,June 2003, pp.318-319.
    [29]
    Yeh T Y, Marr D, Patt Y. Increasing the instruction fetch ratevia multiple branch prediction and branch address cache. InProc. the 7th ICS, July 1993, pp.67-76.
    [30]
    Thoziyoor S, Muralimanohar N, Ahn J N, Jouppi N P.CACTI 5.1. Technical Report HPL-2008-20, Hp Labs,2008, http://www.hpl.hp.com/techreports/2008/HPL-2008-20.html.

Catalog

    Article views (21) PDF downloads (1302) Cited by()
    Related

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return