DaCe - Data Centric Parallel Programming
Creators
- Ben-Nun, Tal
- de Fine Licht, Johannes
- Nikolaos Ziogas, Alexandros
- Schneider, Timo
- Hoefler, Torsten
- De Matteis, Tiziano
- Hofer, Dominic
- Haag, Roman
- Niederer, Silvan
- Raje, Saurabh
- Schaad, Philipp
- Rausch, Oliver
- Gavrilas, Gabriel
- Ivanov, Andrei
- Lavarini, Luca
- Burger, Manuel
- Kleine, Jan
- Groner, Linus
- Backes, Thierry
- Anklin, Valentin
- Kuster, Andreas
- Baumann, Thomas
- Ates, Berke
- Scholbe, Stefan
- Johnsen, Carl-Johannes
- Widmer, Jannis
- Walo, Neville
- Trümper, Lukas
Description
What's Changed
This release brings forth a major change to how SDFGs are simplified in DaCe, using the Simplify pass pipeline. This both improves the performance of DaCe's transformations and introduces new types of simplification, such as dead dataflow elimination.
Please let us know if there are any regressions with this new release.
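To give a feel for what dead dataflow elimination means, here is a toy sketch in plain Python. It is not DaCe's pass: the `Assign` record and the `eliminate_dead` function are hypothetical names, and a straight-line list of assignments stands in for an SDFG. The idea is a backward walk that keeps only the steps whose results eventually reach an output.

```python
from typing import List, NamedTuple, Set, Tuple


class Assign(NamedTuple):
    """Hypothetical stand-in for one dataflow step: target = f(inputs)."""
    target: str
    inputs: Tuple[str, ...]


def eliminate_dead(program: List[Assign], outputs: Set[str]) -> List[Assign]:
    """Drop assignments whose results never reach any name in `outputs`."""
    live = set(outputs)
    kept = []
    # Walk backwards so a value is known to be live before its producer is seen.
    for step in reversed(program):
        if step.target in live:
            kept.append(step)
            # This step defines the target, so earlier writes to it are dead
            # unless the step also reads its own target (re-added below).
            live.discard(step.target)
            live.update(step.inputs)
    kept.reverse()
    return kept


program = [
    Assign("tmp", ("a",)),     # read by "b" below, so it is kept
    Assign("unused", ("a",)),  # never read: dead dataflow
    Assign("b", ("tmp",)),
]
print(eliminate_dead(program, outputs={"b"}))
```

In this sketch the `unused` assignment is removed because nothing that contributes to `b` ever reads it.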
Features
- Breaking change: The experimental `dace.constant` type hint has now achieved stable status and was renamed to `dace.compiletime`
- Major change: Only modified configuration entries are now stored in `~/.dace.conf`. The SDFG build folders still include the full configuration file. Old `.dace.conf` files are detected and migrated automatically.
- Detailed, multi-platform performance counters are now available via native LIKWID instrumentation (by @lukastruemper in https://github.com/spcl/dace/pull/1063). To use, set `.instrument` to `dace.InstrumentationType.LIKWID_Counters`
- GPU Memory Pools are now supported through CUDA's `mallocAsync` API. To enable, set `desc.pool = True` on any GPU data descriptor.
- Map schedule and array storage types can now be annotated directly in Python code (by @orausch in https://github.com/spcl/dace/pull/1088). For example:
```python
import dace
from dace.dtypes import StorageType, ScheduleType

N = dace.symbol('N')

@dace
def add_on_gpu(a: dace.float64[N] @ StorageType.GPU_Global,
               b: dace.float64[N] @ StorageType.GPU_Global):
    # This map will become a GPU kernel
    for i in dace.map[0:N] @ ScheduleType.GPU_Device:
        b[i] = a[i] + 1.0
```
- Customizing GPU block dimension and OpenMP threading properties per map is now supported
- Optional arrays (i.e., arrays that can be None) can now be annotated in the code. The simplification pipeline also infers non-optional arrays from their use and can optimize code by eliminating branches. For example:
```python
from typing import Optional

import dace

@dace
def optional(maybe: Optional[dace.float64[20]], always: dace.float64[20]):
    always += 1  # "always" is always used, so it will not be optional
    if maybe is None:  # This condition will stay in the code
        return 1
    if always is None:  # This condition will be eliminated in simplify
        return 2
    return 3
```
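The configuration change listed above (only modified entries are stored in `~/.dace.conf`) can be pictured with a small plain-Python sketch. This is not DaCe's implementation: `modified_entries` is a hypothetical helper, the keys are chosen only for illustration, and nested dictionaries stand in for the configuration file.

```python
from typing import Any, Dict


def modified_entries(defaults: Dict[str, Any], current: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the entries of `current` that differ from `defaults`."""
    diff: Dict[str, Any] = {}
    for key, value in current.items():
        default = defaults.get(key)
        if isinstance(value, dict) and isinstance(default, dict):
            nested = modified_entries(default, value)
            if nested:  # keep a section only if something inside it changed
                diff[key] = nested
        elif key not in defaults or value != default:
            diff[key] = value
    return diff


# Illustrative keys and values only; not taken from the DaCe configuration schema.
defaults = {"compiler": {"build_type": "Release", "use_cache": False}, "debugprint": False}
current = {"compiler": {"build_type": "Debug", "use_cache": False}, "debugprint": False}

print(modified_entries(defaults, current))
```

Storing only the difference keeps the user file short and lets untouched entries follow whatever defaults a newer release ships with.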
Minor changes
- Miscellaneous fixes to transformations and passes
- Fixes for string literal (`"string"`) use in the Python frontend
- `einsum` is now a library node
- If CMake is already installed, it is now detected and will not be installed through `pip`
- Add kernel detection flag by @TizianoDeMatteis in https://github.com/spcl/dace/pull/1061
- Better support for `__array_interface__` objects by @gronerl in https://github.com/spcl/dace/pull/1071
- Replacements look up base classes by @tbennun in https://github.com/spcl/dace/pull/1080
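As a rough illustration of the CMake item above, an existing installation can be detected with the standard library alone. This is a sketch of the general idea, not the check DaCe's setup actually performs; `cmake_on_path` is a hypothetical helper name.

```python
import shutil


def cmake_on_path(executable: str = "cmake") -> bool:
    """Return True if `executable` can be found on the current PATH."""
    return shutil.which(executable) is not None


if cmake_on_path():
    print("CMake found; no pip installation needed")
else:
    print("CMake not found; it would be installed through pip")
```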
Full Changelog: https://github.com/spcl/dace/compare/v0.13.3...v0.14
Files
- spcl/dace-v0.14.zip (2.1 MB), md5:c0f809f945c3a6089d30204fc1a4bb80
Additional details
Related works
- Is supplement to: https://github.com/spcl/dace/tree/v0.14