DaCe - Data Centric Parallel Programming
Creators
- Ben-Nun, Tal
- de Fine Licht, Johannes
- Nikolaos Ziogas, Alexandros
- Schneider, Timo
- Hoefler, Torsten
- De Matteis, Tiziano
- Hofer, Dominic
- Haag, Roman
- Niederer, Silvan
- Raje, Saurabh
- Schaad, Philipp
- Rausch, Oliver
- Gavrilas, Gabriel
- Ivanov, Andrei
- Lavarini, Luca
- Burger, Manuel
- Kleine, Jan
- Groner, Linus
- Backes, Thierry
- Anklin, Valentin
- Kuster, Andreas
- Baumann, Thomas
- Ates, Berke
- Scholbe, Stefan
- Johnsen, Carl-Johannes
- Widmer, Jannis
- Walo, Neville
- Trümper, Lukas
Description
The schedule type of a scope (e.g., a Map) is now also determined by the surrounding storage. If the surrounding storage is ambiguous, DaCe fails with a descriptive exception. This means that code such as the following:
@dace.program
def add(a: dace.float32[10, 10] @ dace.StorageType.GPU_Global,
        b: dace.float32[10, 10] @ dace.StorageType.GPU_Global):
    return a + b @ b
will now automatically run the + and @ operators on the GPU.
(#1262 by @tbennun)
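The inference rule described above can be illustrated with a small, self-contained sketch. It does not use DaCe itself: the `Storage` and `Schedule` enums, the `_SCHEDULE_FOR` table, and `infer_schedule` are hypothetical stand-ins that only model the idea of deriving a schedule from the surrounding storage and failing when the storage is ambiguous.

```python
from enum import Enum, auto


class Storage(Enum):
    # Stand-in storage types for this sketch (not the DaCe enums themselves).
    CPU_Heap = auto()
    GPU_Global = auto()


class Schedule(Enum):
    # Stand-in schedule types for this sketch.
    CPU_Multicore = auto()
    GPU_Device = auto()


# Hypothetical storage-to-schedule table, for illustration only.
_SCHEDULE_FOR = {
    Storage.CPU_Heap: Schedule.CPU_Multicore,
    Storage.GPU_Global: Schedule.GPU_Device,
}


def infer_schedule(surrounding_storages):
    """Pick a schedule for a scope from the storage of the data around it."""
    candidates = {_SCHEDULE_FOR[s] for s in surrounding_storages}
    if len(candidates) != 1:
        names = sorted(s.name for s in surrounding_storages)
        raise ValueError(f"Ambiguous or missing surrounding storage: {names}")
    return candidates.pop()


# Both operands live in GPU memory, so the scope is scheduled on the GPU.
print(infer_schedule([Storage.GPU_Global, Storage.GPU_Global]))
```

Mixing `Storage.GPU_Global` and `Storage.CPU_Heap` in this sketch raises a `ValueError`, mirroring the exception raised for ambiguous storage.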
DaCe Profiler
Easier interface for profiling applications: dace.profile and dace.instrument can now be used within Python with a simple API:
with dace.profile(repetitions=100) as profiler:
    some_program(...)
    # ...
    other_program(...)

# Print all execution times of the last called program (other_program)
print(profiler.times[-1])
Where instrumentation is applied can be controlled with filters in the form of strings and wildcards, or with a function:
with dace.instrument(dace.InstrumentationType.GPU_Events,
                     filter='*add??') as profiler:
    some_program(...)
    # ...
    other_program(...)

# Print instrumentation report for last call
print(profiler.reports[-1])
With dace.builtin_hooks.instrument_data, the same technique can be applied to instrument data containers.
(#1197 by @tbennun)
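To get a feel for what a wildcard filter such as '*add??' selects, the following sketch uses Python's standard fnmatch module. It assumes that DaCe's string filters behave like shell-style wildcards (`*` for any run of characters, `?` for exactly one); that assumption, the element names, and the helper `select_for_instrumentation` are illustrative and not taken from the DaCe API.

```python
from fnmatch import fnmatchcase


def select_for_instrumentation(names, flt):
    """Return the names accepted by a wildcard string or a predicate function."""
    if callable(flt):
        return [n for n in names if flt(n)]
    return [n for n in names if fnmatchcase(n, flt)]


# Hypothetical element names, for illustration only.
names = ["vector_add_0", "madd12", "add", "add123", "multiply"]

# '*add??' requires "add" followed by exactly two more characters at the end.
print(select_for_instrumentation(names, "*add??"))
# → ['vector_add_0', 'madd12']

# A function filter can express conditions that wildcards cannot.
print(select_for_instrumentation(names, lambda n: n.startswith("m")))
# → ['madd12', 'multiply']
```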
Improved Data Instrumentation
Data container instrumentation can now also be used conditionally, so that data container contents are saved and restored only if certain conditions are met. In addition, data instrumentation now saves the SDFG's symbol values at the time of dumping data, allowing an entire SDFG's state or context to be restored from data reports.
(#1202, #1208 by @phschaad)
Restricted SSA for Scalars and Symbols
Two new passes (ScalarFission and StrictSymbolSSA) allow fissioning of scalar data containers (or arrays of size 1) and symbols into separate containers and symbols respectively, based on the scope or reach of writes to them. This is a form of restricted SSA, which performs SSA wherever possible without introducing Phi-nodes. This change is made possible by a set of new analysis passes that provide the scope or reach of each write to scalars or symbols.
(#1198, #1214 by @phschaad)
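The idea of fissioning a reused scalar can be sketched on straight-line code, where every read is reached by exactly one write and no Phi-nodes are needed. The function below is a toy model under that assumption and is not the ScalarFission or StrictSymbolSSA implementation; the statement format and the `name_N` renaming scheme are hypothetical.

```python
def fission_scalars(statements):
    """Give each overwriting write of a scalar its own name.

    `statements` is a list of (target, sources) pairs in straight-line order.
    The first write keeps the original name; later writes get a numeric suffix.
    """
    writes_seen = {}  # scalar name -> number of writes encountered so far
    live_name = {}    # scalar name -> name of the version currently live
    result = []
    for target, sources in statements:
        # Reads refer to whichever version was written most recently.
        new_sources = [live_name.get(s, s) for s in sources]
        count = writes_seen.get(target, 0)
        new_target = target if count == 0 else f"{target}_{count}"
        writes_seen[target] = count + 1
        live_name[target] = new_target
        result.append((new_target, new_sources))
    return result


program = [
    ("tmp", ["a"]),    # tmp = f(a)
    ("x", ["tmp"]),    # x = g(tmp)
    ("tmp", ["b"]),    # tmp = f(b), unrelated to the first write
    ("y", ["tmp"]),    # y = g(tmp)
]

for statement in fission_scalars(program):
    print(statement)
```

After fission, the second write goes to `tmp_1` and `y` reads `tmp_1`, so the two uses of `tmp` no longer share a container and can be analyzed or moved independently. Writes that merge at a control-flow join would need a Phi-node; a restricted scheme leaves those untouched, which this toy does not model.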
Extending Cutout Capabilities
SDFG Cutouts can now be taken from more than one state.
Additionally, taking cutouts that only access a subset of a data container (e.g., A[2:5] from a data container A of size N) results in the cutout receiving an "Alibi Node" to represent only that subset of the data (A_cutout[0:3] -> A[2:5], where A_cutout is of size 4). This allows cutouts to be significantly smaller and have a smaller memory footprint, simplifying debugging and localized optimization.
Finally, cutouts now contain an exact description of their input and output configuration. The input configuration is anything that may influence a cutout's behavior and may contain data before the cutout is executed in the context of the original SDFG. Similarly, the output configuration is anything that a cutout writes to, that may be read externally or may influence the behavior of the remaining SDFG. This allows isolating all side effects of changes to a particular cutout, allowing transformations to be tested and verified in isolation and simplifying debugging.
(#1201 by @phschaad)
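The notion of input and output configuration can be sketched on a simplified program model: an ordered list of nodes, each with a set of containers it reads and a set it writes. This is only an approximation of the description above and not the DaCe cutout implementation; `cutout_configuration`, the node format, and the `external` parameter are hypothetical.

```python
def cutout_configuration(nodes, cutout, external=frozenset()):
    """Compute (inputs, outputs) for a non-empty set of node indices.

    nodes:    list of (reads, writes) pairs of container-name sets, in order.
    cutout:   indices of the nodes that form the cutout.
    external: containers that may be read from outside the program.
    """
    inputs, written_inside = set(), set()
    for i in sorted(cutout):
        reads, writes = nodes[i]
        # A read is an input if nothing inside the cutout produced it earlier.
        inputs |= reads - written_inside
        written_inside |= writes

    # Conservatively collect reads of every outside node after the cutout starts.
    first = min(cutout)
    read_outside = set()
    for i, (reads, _) in enumerate(nodes):
        if i not in cutout and i > first:
            read_outside |= reads

    outputs = written_inside & (read_outside | set(external))
    return inputs, outputs


nodes = [
    ({"A"}, {"B"}),         # 0: B = f(A)
    ({"B"}, {"tmp"}),       # 1: tmp = g(B)
    ({"tmp", "C"}, {"D"}),  # 2: D = h(tmp, C)
    ({"D"}, {"E"}),         # 3: E = k(D)
]

inputs, outputs = cutout_configuration(nodes, cutout={1, 2})
print(sorted(inputs), sorted(outputs))
# → ['B', 'C'] ['D']
```

In this example `tmp` is neither an input nor an output: it is produced and consumed entirely inside the cutout, so a transformation applied to the cutout only needs to preserve `D` given `B` and `C`.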
Bug Fixes, Compatibility Improvements, and Other Changes
- SymPy 1.12 Compatibility by @alexnick83 in https://github.com/spcl/dace/pull/1256
- GPU Grid-Strided Tiling by @C-TC in https://github.com/spcl/dace/pull/1249
- Fix MapInterchange for Maps with dynamic inputs by @alexnick83 in https://github.com/spcl/dace/pull/1244
- Assortment of fixes for dynamic Maps on GPU (dynamic thread blocks) by @alexnick83 in https://github.com/spcl/dace/pull/1246
- Tuning Compatibility Fixes by @lukastruemper in https://github.com/spcl/dace/pull/1234
- Inline preprocessor command by @tbennun in https://github.com/spcl/dace/pull/1242
- unsqueeze_memlet fixes by @alexnick83 in https://github.com/spcl/dace/pull/1203
- Fix-intermediate-nodes by @alexnick83 in https://github.com/spcl/dace/pull/1212
- Fix for LoopToMap when applied on multi-nested loops by @alexnick83 in https://github.com/spcl/dace/pull/1207
- Fix-nested-sdfg-deepcopy by @alexnick83 in https://github.com/spcl/dace/pull/1221
- Fix integer division in Python frontend by @tbennun in https://github.com/spcl/dace/pull/1196
- Fix augmented assignment on scalar in condition by @tbennun in https://github.com/spcl/dace/pull/1225
- Fix internal subscript access if already existed by @tbennun in https://github.com/spcl/dace/pull/1228
- Fix atomic operation detection for exactly-overlapping ranges by @tbennun in https://github.com/spcl/dace/pull/1230
- Fix-gpu-transform-copy-out by @alexnick83 in https://github.com/spcl/dace/pull/1231
- Fix-interstate-free-symbols by @alexnick83 in https://github.com/spcl/dace/pull/1238
- Fix nested access with nested symbol dependency by @alexnick83 in https://github.com/spcl/dace/pull/1239
- Fix import in the transformations tutorial. by @lamyiowce in https://github.com/spcl/dace/pull/1210
- LoopToMap detects shared transients by @alexnick83 in https://github.com/spcl/dace/pull/1200
- Faster CI and reachability checks for codecov.io by @tbennun in https://github.com/spcl/dace/pull/1213
- Map-fission-single-data-multi-connectors by @alexnick83 in https://github.com/spcl/dace/pull/1216
- Add library path to HIP CMake by @tbennun in https://github.com/spcl/dace/pull/1219
- BatchedMatMul: MKL gemm_batch support by @lukastruemper in https://github.com/spcl/dace/pull/1181
Full Changelog: https://github.com/spcl/dace/compare/v0.14.2...v0.14.3
Files (28.4 MB)
Name | Size |
---|---|
spcl/dace-v0.14.3.zip (md5:521f1639961072a39019ab774bccad9e) | 28.4 MB |
Additional details
Related works
- Is supplement to: https://github.com/spcl/dace/tree/v0.14.3 (URL)