Published September 14, 2022 | Version v1
Presentation
Open
Basic Physics Analyses Implemented Using Apache Spark
Description
Apache Spark is a widely used open-source engine for data processing. This talk focuses on the use of Spark and its DataFrame API in the context of high-energy physics (HEP). We will go through a few demos of simple and outreach-style analyses implemented using Jupyter notebooks and the Spark Python API (PySpark). We will wrap up with a short discussion of the key features in Spark and its ecosystem that can be useful for physics analysis, and of what still needs improvement.
Files
Name | Size |
---|---|
PyHEP2022_LucaCanali.pdf (md5:36afe410efde174932b5ee6b0d6fce33) | 976.6 kB |