Robert B. Ross, George Amvrosiadis, Philip Carns, Charles D. Cranor, Matthieu Dorier, Kevin Harms, Greg Ganger, Garth Gibson, Samuel K. Gutierrez, Robert Latham, Bob Robey, Dana Robinson, Bradley Settlemyer, Galen Shipman, Shane Snyder, Jerome Soumagne, Qing Zheng. Mochi: Composing Data Services for High-Performance Computing Environments[J]. Journal of Computer Science and Technology, 2020, 35(1): 121-144. DOI: 10.1007/s11390-020-9802-0

Mochi: Composing Data Services for High-Performance Computing Environments

Funds: This work was supported in part by the Director, Office of Advanced Scientific Computing Research, Office of Science, of the U.S. Department of Energy under Contract No. DE-AC02-06CH11357; in part by the Exascale Computing Project under Grant No. 17-SC-20-SC, a joint project of the U.S. Department of Energy's Office of Science and National Nuclear Security Administration, responsible for delivering a capable exascale ecosystem, including software, applications, and hardware technology, to support the nation's exascale computing imperative; and in part by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program.
  • Author Bio:

    Robert B. Ross is a senior computer scientist at Argonne National Laboratory, Lemont, and a senior fellow at the Northwestern-Argonne Institute for Science and Engineering at Northwestern University, Evanston. Dr. Ross's research interests are in system software and architectures for high-performance computing and data analysis systems, in particular storage systems and software for I/O and message passing. Rob received his Ph.D. degree in computer engineering from Clemson University in 2000. Rob was a recipient of the 2004 Presidential Early Career Award for Scientists and Engineers.

  • Received Date: June 30, 2019
  • Revised Date: November 01, 2019
  • Published Date: January 04, 2020
  • Technology enhancements and the growing breadth of application workflows running on high-performance computing (HPC) platforms drive the development of new data services that provide high performance on these new platforms, provide capable and productive interfaces and abstractions for a variety of applications, and are readily adapted when new technologies are deployed. The Mochi framework enables composition of specialized distributed data services from a collection of connectable modules and subservices. Rather than forcing all applications to use a one-size-fits-all data staging and I/O software configuration, Mochi allows each application to use a data service specialized to its needs and access patterns. This paper introduces the Mochi framework and methodology. The Mochi core components and microservices are described. Examples of the application of the Mochi methodology to the development of four specialized services are detailed. Finally, a performance evaluation of a Mochi core component, a Mochi microservice, and a composed service providing an object model is performed. The paper concludes by positioning Mochi relative to related work in the HPC space and indicating directions for future work.
