Ultra-low-latency FPGA-accelerated neural network inference at 40 MHz at CMS

Choudhury, Diptarko; Ardino, Rocco; Owen James, Thomas

doi:10.5281/zenodo.10375098

Published December 14, 2023 | Version v1

Report Open


In data processing and physics analysis at the Large Hadron Collider (LHC), deep-learning-based algorithms have in certain cases proven more advantageous than traditional physics-based algorithms [2]. This study explores cutting-edge methodologies for low-latency neural network inference on Field Programmable Gate Array (FPGA) devices. Specifically, it focuses on muon primitive recalibration and fake/real muon pair classification at a rate of 40 MHz within the CMS L1 trigger system. The primary objective is to develop a low-latency neural network model that combines techniques such as quantization-aware training, knowledge distillation, transfer learning, and pruning schedules to reduce the computational footprint relative to the preexisting baseline while simultaneously improving reconstruction performance. Using this strategy, the models were compressed by more than a factor of four while still achieving significantly lower error rates than the given baselines.
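Two of the techniques named in the abstract, pruning schedules and quantization, can be illustrated with a minimal sketch. This is not the report's implementation: the function names, the cubic polynomial ramp, the 75% final sparsity, and the 8-bit width are all assumptions chosen for illustration, and the report may define and measure compression differently.

```python
# Illustrative sketch only; all names and default values are hypothetical.

def polynomial_sparsity(step, begin_step, end_step,
                        initial=0.0, final=0.75, power=3):
    """Target fraction of zeroed weights at a given training step.

    Sparsity ramps from `initial` to `final` between `begin_step` and
    `end_step`, rising quickly at first and flattening near the end.
    """
    if step <= begin_step:
        return initial
    if step >= end_step:
        return final
    progress = (step - begin_step) / (end_step - begin_step)
    return final + (initial - final) * (1.0 - progress) ** power


def fake_quantize(x, bits=8, max_abs=1.0):
    """Clip x to [-max_abs, max_abs] and snap it to a signed uniform grid.

    During quantization-aware training the forward pass sees values like
    these, so the network learns weights that tolerate the rounding.
    """
    levels = 2 ** (bits - 1) - 1      # 127 positive levels for 8 bits
    scale = max_abs / levels          # spacing between grid points
    clipped = max(-max_abs, min(max_abs, x))
    return round(clipped / scale) * scale


def compression_ratio(dense_params, sparsity):
    """Dense parameter count divided by the count of remaining non-zeros."""
    return dense_params / (dense_params * (1.0 - sparsity))
```

Under these assumed settings, a final sparsity of 0.75 leaves one quarter of the weights, which corresponds to a fourfold reduction in non-zero parameters; reduced bit widths would shrink the footprint further.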

Files


Name	Size
40_MHZ_CMS_L1_SCOUTING_Diptarko_Choudhury.pdf (md5:44bb88762ffc42fccfbf9a02010cd73f)	9.0 MB

	All versions	This version
Views	149	149
Downloads	174	174
Data volume	2.0 GB	2.0 GB
