When a cluster most needs high throughput and fabric utilization close to link capacity, Ethernet is a potential candidate, rivaling traditional HPC interconnects. Distributed real-time data acquisition at particle physics experiments presents an interesting use case. This paper evaluates possible Ethernet-based solutions for aggregating data from hundreds of data sources at a throughput of dozens of Tb/s. This leads to many-to-one data exchanges, for which we strive for a cost-optimized setup sustaining more than 80% of the theoretical link load. We investigate possible Ethernet-based traffic patterns for handling data acquisition on large multi-source apparatuses, allowing different numbers of producers and receivers and different link speeds in a large-scale network. Performance tests were conducted using customized benchmarks and evaluation test benches. The paper presents the tested scenarios and the problems encountered in practice. We describe how our findings influenced the design of a large production system at CERN, and we present general conclusions relevant to a broader range of Ethernet applications in HPC.
               