Understanding the Impact of Data Staging for Coupled Scientific Workflows

The rate of data generated by cutting-edge experimental science facilities and large-scale simulations enabled by current high-performance computing (HPC) systems has continued to grow at a far greater pace than the development of the network and storage capabilities on which these systems rely. To cope with this challenge, scientists are moving toward the creation of autonomous experiments and HPC simulations using machine learning. However, efficiently moving, storing, and processing large amounts of data away from the point of origin presents a significant challenge. In-memory computing, in situ analysis, data staging, and data streaming are recognized as viable alternatives to traditional file-based methods for transferring data between coupled workflows. However, the performance trade-offs and limitations of these methods are not fully understood when used in HPC applications. This paper presents a comprehensive performance assessment of the current solutions for data staging when applied to applications that are not necessarily I/O intensive, which makes them less than ideal candidates for these methods. Our study is based on experiments running at scale on Oak Ridge National Laboratory's Summit supercomputer using applications and simulations that cover typical computational motifs and patterns. We investigate the usability and cost/benefit trade-offs of staging algorithms for HPC applications under different scenarios and highlight opportunities for optimizing the dataflow between coupled simulation workflows.

Keywords: staging coupled; impact data; coupled scientific; understanding impact; data staging; scientific workflows

Journal Title: IEEE Transactions on Parallel and Distributed Systems
Year Published: 2022
