Published November 7, 2019 | Version v1
Poster Open

Feasibility tests of RoCE for the cluster-based event building in LHCb

Description

This paper evaluates the use of RDMA over Converged Ethernet (RoCE) for the Run 3 LHCb event building at CERN. The detector's acquisition system will collect partial data from approximately 1000 separate detector streams, with an estimated total throughput of 40 terabits per second. Full events will be assembled for subsequent processing and data selection in the filtering farm of the online trigger. As a result, high-throughput inter-node transmissions at a combination of 100 and 25 gigabits per second will be essential features of the system. The data exchange mechanism of the cluster must therefore use memory-lightweight data transmission protocols. In this work, RoCE, a high-throughput, kernel-bypass, Ethernet-based protocol, is benchmarked as a candidate technology for the event building network. CPU and memory bandwidth utilization for RoCE-based data transmissions is investigated and discussed. A comparison of RoCE with the InfiniBand protocol is presented. Preliminary performance results obtained with the selected network hardware supporting the protocol are discussed. Relevant utilization and interoperability issues are detailed, together with lessons learned along the way.
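
As a rough back-of-the-envelope illustration of the figures quoted above, the sketch below computes the average rate per detector stream and the minimum number of links needed to carry the aggregate. It assumes that the 40 Tb/s is spread evenly over exactly 1000 streams, that units are decimal, and that links run at full line rate with no protocol overhead. None of these assumptions comes from the poster, and the function names are hypothetical.

```python
import math

# Figures taken from the description (decimal units assumed).
TOTAL_THROUGHPUT_BPS = 40 * 10**12   # 40 terabits per second
N_STREAMS = 1000                     # "approximately 1000" streams, taken as exact here
LINK_RATES_BPS = {"100G": 100 * 10**9, "25G": 25 * 10**9}


def per_stream_rate_bps(total_bps, n_streams):
    """Average rate per stream, assuming an even split (an assumption)."""
    return total_bps / n_streams


def min_links(total_bps, link_bps):
    """Lower bound on link count, assuming full line rate and zero overhead."""
    return math.ceil(total_bps / link_bps)


if __name__ == "__main__":
    avg = per_stream_rate_bps(TOTAL_THROUGHPUT_BPS, N_STREAMS)
    print(f"average per stream: {avg / 1e9:.0f} Gb/s")
    for name, rate in LINK_RATES_BPS.items():
        print(f"{name}: at least {min_links(TOTAL_THROUGHPUT_BPS, rate)} links")
```

Under these assumptions the average stream rate exceeds what a single 25 Gb/s link can carry, which is consistent with the description's emphasis on combining 100 and 25 Gb/s transmissions.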

Files

CHEP2019_475.pdf (695.1 kB)
md5:f12bc7a9788fcf72b0a6212dbf5d8ca6