Accelerated Machine Learning as a Service for Particle Physics Computing

Javier Duarte; Burt Holzman; Sergo Jindariani; Thomas Klijnsma; Benjamin Kreis; Mia Liu; Kevin Pedro; Nhan Tran; Aristeidis Tsaris; Phil Harris; Dylan Rankin; Vladimir Loncar; Jennifer Ngadiuba; Maurizio Pierini; Suffian Khan; Brian Lee; Brandon Perez; Ted W. Way; Colin Versteeg; Scott Hauck; Shih-Chieh Hsu; Matthew Trahms; Dustin Werran; Zhenbin Wu

doi:10.5281/zenodo.3895029

Published December 14, 2019 | Version v1

Conference paper | Open access

1. University of California San Diego
2. Fermi National Accelerator Laboratory
3. MIT
4. CERN
5. Microsoft
6. University of Washington
7. University of Illinois at Chicago

Large-scale particle physics experiments face challenging demands for high-throughput computing resources both now and in the future. New heterogeneous computing paradigms on dedicated hardware with increased parallelization, such as Field Programmable Gate Arrays (FPGAs), offer exciting solutions with large potential gains. The growing applications of machine learning algorithms in particle physics for simulation, reconstruction, and analysis are naturally deployed on such platforms. We demonstrate that the acceleration of machine learning inference as a web service represents a heterogeneous computing solution for particle physics experiments that requires minimal modification to the current computing model. As an example, we retrain the ResNet-50 convolutional neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC. Using Microsoft Azure Machine Learning, which deploys Intel FPGAs to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) milliseconds with our experimental physics software framework deployed as a cloud (edge or on-premises) service, representing an improvement by a factor of approximately 30 (175) in model inference latency over traditional CPU inference in current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600-700 inferences per second using an image batch of one, comparable to large batch-size GPU throughput and significantly better than small batch-size GPU throughput. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.
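The figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: it multiplies each service latency by its quoted speedup to recover the implied CPU baseline, and uses Little's law (mean concurrency = throughput x latency) to estimate how many requests would be in flight at the quoted throughput. The function names are hypothetical, and the concurrency estimate assumes the quoted latency and throughput hold at the same time, which the abstract does not state.

```python
# Approximate averages quoted in the abstract.
LATENCY_MS = {"cloud": 60.0, "edge": 10.0}   # FPGA service inference latency
SPEEDUP = {"cloud": 30.0, "edge": 175.0}     # quoted improvement over CPU inference
THROUGHPUT_PER_S = (600.0, 700.0)            # batch-of-one throughput range


def implied_cpu_latency_ms(mode: str) -> float:
    """CPU latency implied by the service latency times the quoted speedup."""
    return LATENCY_MS[mode] * SPEEDUP[mode]


def requests_in_flight(throughput_per_s: float, latency_ms: float) -> float:
    """Little's law: mean concurrency = throughput * mean time per request.

    Assumption (not from the abstract): latency and throughput are
    measured under the same load.
    """
    return throughput_per_s * latency_ms / 1000.0


if __name__ == "__main__":
    for mode, latency in LATENCY_MS.items():
        cpu_ms = implied_cpu_latency_ms(mode)
        low, high = (requests_in_flight(t, latency) for t in THROUGHPUT_PER_S)
        print(f"{mode}: implied CPU latency ~{cpu_ms:.0f} ms, "
              f"~{low:.0f}-{high:.0f} requests in flight")
```

Both deployment modes imply a CPU baseline of roughly 1.75 to 1.8 seconds per inference, so the two quoted speedup factors are mutually consistent.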

Files

Name	Size
NeurIPS_ML4PS_2019_64.pdf (md5:530b873ddc85d6e60697e85c0b2091be)	301.4 kB

Additional details

Funding: European Commission
mPP - machine learning for Particle Physics (grant 772369)

