Journal of Computer Science and Technology ›› 2020, Vol. 35 ›› Issue (1): 92-120. doi: 10.1007/s11390-020-9781-1
Special Issue: Computer Architecture and Systems
• Special Section on Selected I/O Technologies for High-Performance Computing and Data Analytics •
Anthony Kougkas, Member, ACM, IEEE, Hariharan Devarajan, Xian-He Sun, Fellow, IEEE
References
[1] Kitchin R. Big Data, new epistemologies and paradigm shifts. Big Data & Society, 2014, 1(1): Article No. 1.
[2] Reinsel D, Gantz J, Rydning J. Data age 2025: The evolution of data to life-critical. Don't focus on big data; focus on the data that's big. https://www.import.io/wpcontent/uploads/2017/04/Seagate-WP-DataAge2025-March-2017.pdf, May 2019.
[3] Hey T, Tansley S, Tolle K M. The Fourth Paradigm: Data-Intensive Scientific Discovery (1st edition). Microsoft Research, 2009.
[4] Thakur R, Gropp W, Lusk E. Data sieving and collective I/O in ROMIO. In Proc. the 7th Symposium on the Frontiers of Massively Parallel Computation, February 1999, pp.182-189.
[5] Folk M, Cheng A, Yates K. HDF5: A file format and I/O library for high performance computing applications. In Proc. Supercomputing, November 1999, pp.5-33.
[6] Braam P. The Lustre storage architecture. arXiv:1903.01955, 2019. https://arxiv.org/pdf/1903.01955, May 2019.
[7] Schmuck F B, Haskin R L. GPFS: A shared-disk file system for large computing clusters. In Proc. the Conference on File and Storage Technologies, January 2002, pp.231-244.
[8] Carns P H, Ligon III W B, Ross R B, Thakur R. PVFS: A parallel file system for Linux clusters. In Proc. the 4th Annual Linux Showcase and Conference, October 2000, pp.391-430.
[9] Khaleel M A. Scientific Grand Challenges: Crosscutting Technologies for Computing at the Exascale. Pacific Northwest National Laboratory, 2010. http://digital.library.unt.edu/ark:/67531/metadc841613/, Dec. 2019.
[10] Dongarra J, Beckman P, Moore T et al. The international exascale software project roadmap. International Journal of High Performance Computing Applications, 2011, 25(1): 3-60.
[11] Reed D A, Dongarra J. Exascale computing and big data. Communications of the ACM, 2015, 58(7): 56-68.
[12] Shalf J, Dosanjh S, Morrison J. Exascale computing technology challenges. In Proc. the 9th International Conference on High Performance Computing for Computational Science, June 2010, pp.1-25.
[13] Bent J, Grider G, Kettering B, Manzanares A, McClelland M, Torres A, Torrez A. Storage challenges at Los Alamos National Lab. In Proc. the 28th IEEE Symposium on Mass Storage Systems and Technologies, April 2012, Article No. 12.
[14] Caulfield A M, Grupp L M, Swanson S. Gordon: Using flash memory to build fast, power efficient clusters for data-intensive applications. In Proc. the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, March 2009, pp.217-228.
[15] Kannan S, Gavrilovska A, Schwan K, Milojicic D, Talwar V. Using active NVRAM for I/O staging. In Proc. the 2nd International Workshop on Petascale Data Analytics: Challenges and Opportunities, November 2011, pp.15-22.
[16] Caulfield A M, Coburn J, Mollov T, De A, Akel A, He J H, Jagatheesan A, Gupta R K, Snavely A, Swanson S. Understanding the impact of emerging non-volatile memories on high-performance, IO-intensive computing. In Proc. the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, November 2010.
[17] Lockwood G K, Hazen D, Koziol Q et al. Storage 2020: A vision for the future of HPC storage. Technical Report, Lawrence Berkeley National Laboratory, 2017. https://escholarship.org/uc/item/744479dp, May 2019.
[18] Li J W, Liao W K, Choudhary A, Ross R, Thakur R, Gropp W, Latham R, Siegel A, Gallagher B, Zingale M. Parallel netCDF: A high-performance scientific I/O interface. In Proc. the 2003 ACM/IEEE Conference on Supercomputing, Nov. 2003.
[19] Lofstead J F, Klasky S, Schwan K, Podhorszki N, Jin C. Flexible IO and integration for scientific codes through the adaptable IO system (ADIOS). In Proc. the 6th International Workshop on Challenges of Large Applications in Distributed Environments, June 2008, pp.15-24.
[20] Chang F, Gibson G A. Automatic I/O hint generation through speculative execution. In Proc. the 3rd USENIX Symposium on Operating Systems Design and Implementation, February 1999, pp.1-14.
[21] He J, Sun X H, Thakur R. KNOWAC: I/O prefetch via accumulated knowledge. In Proc. the 2012 IEEE International Conference on Cluster Computing, September 2012, pp.429-437.
[22] Dong B, Wang T, Tang H J, Koziol Q, Wu K S, Byna S. ARCHIE: Data analysis acceleration with array caching in hierarchical storage. In Proc. the 2018 IEEE International Conference on Big Data, December 2018, pp.211-220.
[23] Buyya R, Calheiros R N, Vahid Dastjerdi A. Big Data: Principles and Paradigms (1st edition). Morgan Kaufmann, 2016.
[24] Kune R, Konugurthi P, Agarwal A, Rao C R, Buyya R. The anatomy of big data computing. Software: Practice and Experience, 2016, 46(1): 79-105.
[25] Kougkas A, Devarajan H, Sun X H, Lofstead J F. Harmonia: An interference-aware dynamic I/O scheduler for shared non-volatile burst buffers. In Proc. the 2018 IEEE International Conference on Cluster Computing, September 2018, pp.290-301.
[26] Xie B, Huang Y Z, Chase J S, Choi J Y, Klasky S, Lofstead J, Oral S. Predicting output performance of a petascale supercomputer. In Proc. the 26th International Symposium on High-Performance Parallel and Distributed Computing, June 2017, pp.181-192.
[27] Kim Y, Gunasekaran R, Shipman G M, Dillow D A, Zhang Z, Settlemyer B W. Workload characterization of a leadership class storage cluster. In Proc. the 5th Petascale Data Storage Workshop, Nov. 2010, Article No. 4.
[28] Mi N F, Riska A, Zhang Q, Smirni E, Riedel E. Efficient management of idleness in storage systems. ACM Transactions on Storage, 2009, 5(2): Article No. 4.
[29] Ahern S, Alam S R, Fahey M R et al. Scientific application requirements for leadership computing at the exascale. Technical Report, Oak Ridge National Laboratory, 2007. https://www.olcf.ornl.gov/wpcontent/uploads/2010/03/Exascale_Reqms.pdf, May 2019.
[30] Carns P, Harms K, Allcock W, Bacon C, Lang S, Latham R, Ross R. Understanding and improving computational science storage access through continuous characterization. ACM Transactions on Storage, 2011, 7(3): Article No. 8.
[31] Dundas J, Mudge T. Improving data cache performance by pre-executing instructions under a cache miss. In Proc. the 11th International Conference on Supercomputing, July 1997, pp.68-75.
[32] Doweck J. Shared memory access. http://download.intel.com/technology/architecture/sma.pdf, May 2019.
[33] Mutlu O, Stark J, Wilkerson C, Patt Y N. Runahead execution: An alternative to very large instruction windows for out-of-order processors. In Proc. the 9th International Symposium on High-Performance Computer Architecture, February 2003, pp.129-140.
[34] Qadri M Y, Qadri N N, Fleury M, McDonald-Maier K D. Energy-efficient data prefetch buffering for low-end embedded processors. Microelectronics Journal, 2017, 62: 57-64.
[35] Sun X H, Byna S, Chen Y. Server-based data push architecture for multi-processor environments. Journal of Computer Science and Technology, 2007, 22(5): 641-652.
[36] Zhou H Y. Dual-core execution: Building a highly scalable single-thread instruction window. In Proc. the 14th International Conference on Parallel Architectures and Compilation Techniques, September 2005, pp.231-242.
[37] Cao P, Felten E W, Karlin A R, Li K. Implementation and performance of integrated application-controlled file caching, prefetching, and disk scheduling. ACM Transactions on Computer Systems, 1996, 14(4): 311-343.
[38] Ding X N, Jiang S, Chen F, Davis K, Zhang X D. DiskSeen: Exploiting disk layout and access history to enhance I/O prefetch. In Proc. the 2007 USENIX Annual Technical Conference, June 2007, pp.261-274.
[39] Klaiber A C, Levy H M. An architecture for software-controlled data prefetching. In Proc. the 18th Annual International Symposium on Computer Architecture, May 1991, pp.43-53.
[40] Mowry T, Gupta A. Tolerating latency through software-controlled prefetching in shared-memory multiprocessors. Journal of Parallel and Distributed Computing, 1991, 12(2): 87-106.
[41] Subedi P, Davis P, Duan S H, Klasky S, Kolla H, Parashar M. Stacker: An autonomic data movement engine for extreme-scale data staging-based in-situ workflows. In Proc. the International Conference for High Performance Computing, Networking, Storage and Analysis, November 2018, Article No. 73.
[42] Cherubini G, Kim Y, Lantz M, Venkatesan V. Data prefetching for large tiered storage systems. In Proc. the 2017 IEEE International Conference on Data Mining, November 2017, pp.823-828.
[43] Joo Y, Park S, Bahn H. Exploiting I/O reordering and I/O interleaving to improve application launch performance. ACM Transactions on Storage, 2017, 13(1): Article No. 8.
[44] Abbasi H, Wolf M, Eisenhauer G, Klasky S, Schwan K, Zheng F. DataStager: Scalable data staging services for petascale applications. Cluster Computing, 2010, 13(3): 277-290.
[45] Bengio Y. Markovian models for sequential data. Neural Computing Surveys, 1999, 2: 129-162.
[46] Thilaganga V, Karthika M, Lakshmi M M. A prefetching technique using HMM forward and backward chaining for the DFS in cloud. Asian Journal of Computer Science and Technology, 2017, 6(2): 23-26.
[47] Tran N, Reed D A. Automatic ARIMA time series modeling for adaptive I/O prefetching. IEEE Transactions on Parallel and Distributed Systems, 2004, 15(4): 362-377.
[48] Dorier M, Ibrahim S, Antoniu G, Ross R. Omnisc'IO: A grammar-based approach to spatial and temporal I/O patterns prediction. In Proc. the International Conference for High Performance Computing, Networking, Storage and Analysis, November 2014, pp.623-634.
[49] Luo Y F, Shi J, Zhou S G. JeCache: Just-enough data caching with just-in-time prefetching for big data applications. In Proc. the 37th IEEE International Conference on Distributed Computing Systems, June 2017, pp.2405-2410.
[50] Daniel G, Sunyé G, Cabot J. PrefetchML: A framework for prefetching and caching models. In Proc. the 19th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, October 2016, pp.318-328.
[51] Xu R, Jin X, Tao L F, Guo S Z, Xiang Z K, Tian T. An efficient resource-optimized learning prefetcher for solid state drives. In Proc. the 2018 Design, Automation & Test in Europe Conference & Exhibition, March 2018, pp.273-276.
[52] Wu K, Huang Y C, Li D. Unimem: Run-time data management on non-volatile memory-based heterogeneous main memory. In Proc. the 2017 International Conference for High Performance Computing, Networking, Storage and Analysis, November 2017, Article No. 58.
[53] Snyder B, Bosanac D, Davies R. Introduction to Apache ActiveMQ. In ActiveMQ in Action, Snyder B, Bosanac D, Davies R (eds.), Manning Publications, 2011, pp.6-16.
[54] Kreps J, Narkhede N, Rao J. Kafka: A distributed messaging system for log processing. In Proc. the 6th Workshop on Networking Meets Databases, June 2011, pp.1-7.
[55] Zawislak D, Toonen B, Allcock W, Rizzi S, Insley J, Vishwanath V, Papka M E. Early investigations into using a remote RAM pool with the vl3 visualization framework. In Proc. the 2nd Workshop on In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization, November 2016, pp.23-28.
[56] Carns P, Latham R, Ross R, Iskra K, Lang S, Riley K. 24/7 characterization of petascale I/O workloads. In Proc. the 2009 IEEE International Conference on Cluster Computing, August 2009, Article No. 73.
[57] Rao D S, Kumar S, Keshavamurthy A, Lantz P, Reddy D, Sankaran R, Jackson J. System software for persistent memory. In Proc. the 9th EuroSys Conference, April 2014, Article No. 15.
[58] Qureshi M K, Srinivasan V, Rivers J A. Scalable high performance main memory system using phase-change memory technology. ACM SIGARCH Computer Architecture News, 2009, 37(3): 24-33.
[59] Berriman G B, Good J C, Laity A C, Kong M. The Montage image mosaic service: Custom image mosaics on-demand. In Proc. the 2007 Conference on Astronomical Data Analysis Software and Systems, September 2007, pp.83-102.
[60] Strukov D B, Snider G S, Stewart D R, Williams R S. The missing memristor found. Nature, 2008, 453(7191): 80-83.
[61] Joo Y, Ryu J, Park S, Shin K G. FAST: Quick application launch on solid-state drives. In Proc. the 9th USENIX Conference on File and Storage Technologies, February 2011, pp.259-272.
[62] Maghraoui K E, Kandiraju G, Jann J, Pattnaik P. Modeling and simulating flash based solid-state disks for operating systems. In Proc. the 1st Joint WOSP/SIPEW International Conference on Performance Engineering, January 2010, pp.15-26.
[63] Andersen D G, Franklin J, Kaminsky M, Phanishayee A, Tan L, Vasudevan V. FAWN: A fast array of wimpy nodes. In Proc. the 22nd ACM SIGOPS Symposium on Operating Systems Principles, October 2009, pp.1-14.
[64] Chen S. FlashLogging: Exploiting flash devices for synchronous logging performance. In Proc. the 2009 ACM SIGMOD International Conference on Management of Data, June 2009, pp.73-86.
[65] Bhimji W, Bard D, Romanus M et al. Accelerating science with the NERSC burst buffer early user program. In Proc. the 2016 Cray User Group, May 2016.
[66] Kang S, Park S, Jung H, Shim H, Cha J. Performance trade-offs in using NVRAM write buffer for flash memory-based storage devices. IEEE Transactions on Computers, 2009, 58(6): 744-758.
[67] Caulfield A M, De A, Coburn J, Mollow T I, Gupta R K, Swanson S. Moneta: A high-performance storage array architecture for next-generation, non-volatile memories. In Proc. the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, December 2010, pp.385-395.
[68] Akel A, Caulfield A M, Mollov T I, Gupta R K, Swanson S. Onyx: A prototype phase change memory storage array. In Proc. the 3rd USENIX Workshop on Hot Topics in Storage and File Systems, June 2011, Article No. 8.
[69] Dong X Y, Muralimanohar N, Jouppi N, Kaufmann R, Xie Y. Leveraging 3D PCRAM technologies to reduce checkpoint overhead for future exascale systems. In Proc. the 2009 Conference on High Performance Computing Networking, Storage and Analysis, November 2009, Article No. 57.
[70] Wang T, Oral S, Wang Y D, Settlemyer B, Atchley S, Yu W K. BurstMem: A high-performance burst buffer system for scientific applications. In Proc. the 2014 IEEE International Conference on Big Data, October 2014, pp.71-79.
[71] Sato K, Mohror K, Moody A, Gamblin T, de Supinski B R, Maruyama N, Matsuoka S. A user-level InfiniBand-based file system and checkpoint strategy for burst buffers. In Proc. the 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 2014, pp.21-30.
[72] Ma X S, Winslett M, Lee J, Yu S K. Faster collective output through active buffering. In Proc. the 16th International Parallel and Distributed Processing Symposium, April 2002, Article No. 44.
[73] Ma X S, Winslett M, Lee J, Yu S K. Improving MPI-IO output performance with active buffering plus threads. In Proc. the 17th International Parallel and Distributed Processing Symposium, April 2003, Article No. 68.
[74] Pai V S, Druschel P, Zwaenepoel W. IO-Lite: A unified I/O buffering and caching system. In Proc. the 3rd USENIX Symposium on Operating Systems Design and Implementation, February 1999, pp.15-28.
[75] Nitzberg B, Lo V. Collective buffering: Improving parallel I/O performance. In Proc. the 6th IEEE International Symposium on High Performance Distributed Computing, August 1997, pp.148-157.
[76] Bent J, Gibson G, Grider G, McClelland B, Nowoczynski P, Nunez J, Polte M, Wingate M. PLFS: A checkpoint filesystem for parallel applications. In Proc. the 2009 ACM/IEEE Conference on High Performance Computing Networking, Storage and Analysis, November 2009, Article No. 21.
[77] Dong B, Byna S, Wu K, Johansen H, Johnson J N, Keen N. Data Elevator: Low-contention data movement in hierarchical storage system. In Proc. the 23rd International Conference on High Performance Computing, December 2016, pp.152-161.
[78] Wang T, Byna S, Dong B, Tang H. UniviStor: Integrated hierarchical and distributed storage for HPC. In Proc. the 2018 IEEE International Conference on Cluster Computing, September 2018, pp.134-144.
[79] Lee D, Choi J, Kim J H, Noh S H, Min S L, Cho Y, Kim C S. LRFU: A spectrum of policies that subsumes the least recently used and least frequently used policies. IEEE Transactions on Computers, 2001, 50(12): 1352-1361.