scikit-hep/uproot: 3.4.4
Authors/Creators
- 1. Southern Methodist University
- 2. University of Cincinnati
- 3. Max Planck Institute for Nuclear Physics
- 4. RWTH Aachen
Description
Faster TTree.pandas.df(flatten=True) provided by PR #223.
One capability was lost: reading branches with different jagged structure into the same DataFrame with flatten=True. For instance, a TTree containing different numbers of electrons and muons per event can no longer be flattened in a single call. The old code managed this with an outer join on DataFrames. The TTree.pandas.df code no longer does this; instead, it broadcasts JaggedArrays, which is not just faster but also more correct. Does it make sense to put the first electron and the first muon in the same row, then the second electron and the second muon in another row, when the two collections have different sizes in each event? (The shorter of the two then has to be padded with NaN.) This joint row membership doesn't correspond to any property that the second electron and the second muon share.
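To make the broadcasting rule concrete, here is a minimal pure-Python sketch. It is not uproot's implementation: the event contents, the column names, and the `flatten` helper are all hypothetical. It only illustrates why jagged columns with equal per-event counts can share rows, while columns with different counts cannot.

```python
# Hypothetical per-event contents: one inner list per event.
muon_pt = [[31.2, 18.5], [], [45.0]]
muon_eta = [[0.4, -1.1], [], [2.0]]
electron_pt = [[22.7], [12.3, 9.8], []]
met = [14.0, 52.5, 7.3]  # one scalar per event


def flatten(met, **jagged):
    """Return one row per object, keyed by (entry, subentry).

    Illustrative helper only, not part of uproot.
    """
    rows = {}
    for entry, scalar in enumerate(met):
        columns = {name: values[entry] for name, values in jagged.items()}
        counts = {len(v) for v in columns.values()}
        if len(counts) > 1:
            # Different cardinalities cannot be broadcast into shared rows.
            raise ValueError(f"event {entry}: branches have different counts")
        for subentry in range(counts.pop() if counts else 0):
            row = {name: v[subentry] for name, v in columns.items()}
            row["met"] = scalar  # broadcast the per-event value to each object
            rows[(entry, subentry)] = row
    return rows


# Muon columns share the same counts per event, so this works.
muons = flatten(met, pt=muon_pt, eta=muon_eta)

# Mixing muons and electrons fails, because event 0 has 2 muons and 1 electron.
try:
    flatten(met, muon_pt=muon_pt, electron_pt=electron_pt)
except ValueError as err:
    print("cannot flatten together:", err)
```

In this sketch the per-event scalar is repeated once per muon, and an event with no muons contributes no rows.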
Attempting this now raises a ValueError. You can run into this error rather easily by not specifying branches, which implicitly requests all branches from a TTree, some of which may be incompatible. Remember that you can use glob patterns to ask for all branches whose names match a pattern.
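As a rough illustration of what a glob pattern selects, the sketch below applies Python's standard `fnmatch` module to a hypothetical list of branch names. Both the names and the `select` helper are assumptions made for the example; this is not necessarily how uproot resolves patterns internally.

```python
from fnmatch import fnmatchcase

# Hypothetical branch names in a TTree.
branch_names = ["run", "event", "Muon_pt", "Muon_eta", "Electron_pt", "Electron_eta"]


def select(names, pattern):
    """Return the names that match a case-sensitive glob pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


muon_branches = select(branch_names, "Muon_*")
pt_branches = select(branch_names, "*_pt")
print(muon_branches)
print(pt_branches)
```

Restricting a request to a single collection in this way keeps every selected branch on the same jagged structure.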
If you really do want to mix different cardinalities in the same DataFrame, you can explicitly do an outer join in Pandas:
# Flatten each collection separately, then combine them with an outer join.
muons = tree.pandas.df("Muon_*", flatten=True)
electrons = tree.pandas.df("Electron_*", flatten=True)
combined = muons.join(electrons, how="outer")
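The effect of such an outer join can be sketched without Pandas. The example below assumes each flattened table is keyed by (entry, subentry), and it uses hypothetical values and a hypothetical `outer_join` helper. It keeps the union of the keys and pads missing columns with NaN.

```python
import math

# Hypothetical flattened tables, keyed by (entry, subentry).
muons = {(0, 0): {"Muon_pt": 31.2}, (0, 1): {"Muon_pt": 18.5}, (2, 0): {"Muon_pt": 45.0}}
electrons = {(0, 0): {"Electron_pt": 22.7}, (1, 0): {"Electron_pt": 12.3}, (1, 1): {"Electron_pt": 9.8}}


def outer_join(left, right):
    """Join two keyed tables on the union of their keys, padding with NaN."""
    left_cols = sorted({col for row in left.values() for col in row})
    right_cols = sorted({col for row in right.values() for col in row})
    nan = float("nan")
    joined = {}
    for key in sorted(set(left) | set(right)):
        row = {col: left.get(key, {}).get(col, nan) for col in left_cols}
        row.update({col: right.get(key, {}).get(col, nan) for col in right_cols})
        joined[key] = row
    return joined


combined = outer_join(muons, electrons)
for key, row in combined.items():
    print(key, row)
```

Row (0, 1) pairs the second muon of event 0 with a NaN electron, which is the padding described above.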
You can also choose to not flatten the DataFrame, which puts no constraints on the structure of the contents (but is less useful if you have a lot of jagged data).
Files
| Name | Checksum | Size |
|---|---|---|
| scikit-hep/uproot-3.4.4.zip | md5:87c6df1af23498734e3a9e4e182dbd96 | 55.9 MB |
Additional details
Related works
- Is supplement to
- https://github.com/scikit-hep/uproot/tree/3.4.4 (URL)