There is a newer version of the record available.

Published June 13, 2026 | Version v1.7.0
Software Open

PhenoPhaseR: Reproducible processing workflow for interpolating phenological DWD observations

  • 1. Julius Kühn-Institut (JKI) – Federal Research Centre for Cultivated Plants
  • 2. Federal Agency for Nature Conservation (BfN)

Description

PhenoPhaseR is a reproducible R workflow for downloading, filtering, modelling, and spatially interpolating phenological observations provided by the German Weather Service (DWD). It implements the PHASE approach (Gerstmann et al. 2016), which combines growing degree day models with geostatistical interpolation to produce area-wide phenological predictions across Germany at 1 km spatial resolution.
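
As an illustration of the growing-degree-day component only, the following minimal Python sketch (not PhenoPhaseR's R code) accumulates degree days from daily mean temperatures and returns the day of year on which a threshold is first reached. The function name, the base temperature, the threshold, and the sample temperatures are illustrative assumptions and are not taken from Gerstmann et al. 2016 or from the workflow itself.

```python
from typing import Optional, Sequence


def doy_reaching_gdd(daily_mean_temp_c: Sequence[float],
                     base_temp_c: float,
                     gdd_threshold: float,
                     start_doy: int = 1) -> Optional[int]:
    """Return the first day of year on which accumulated GDD reaches the threshold.

    Hypothetical helper: each day contributes max(T_mean - T_base, 0) degree days.
    Returns None if the threshold is never reached in the supplied series.
    """
    accumulated = 0.0
    for offset, temp in enumerate(daily_mean_temp_c):
        accumulated += max(temp - base_temp_c, 0.0)
        if accumulated >= gdd_threshold:
            return start_doy + offset
    return None


# Made-up temperatures starting at day of year 60.
temps = [2.0, 6.0, 8.0, 4.0, 10.0]
print(doy_reaching_gdd(temps, base_temp_c=5.0, gdd_threshold=6.0, start_doy=60))
```

In the PHASE approach, a station-level quantity of this kind is then interpolated geostatistically; that step is not shown here.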

Notes

  • Since v1.2.0, two in-pipeline publish hooks package the intermediate filter variant results and the final PHASE entry-date COGs as self-contained RO-Crate 1.2 deposits with W3C-anchored provenance (PROV-O), dataset descriptors (DCAT 3 / Dublin Core Terms), and quality metadata (DQV with SKOS bridge to ISO 19157-1).

  • As of v1.3.0, Hook B aggregates the per-(phase, year) intermediate outputs into per-phase multi-band Cloud-Optimised GeoTIFFs (one band per year, named by year) and wide-format per-phase VAM CSVs, reducing the published artefact count from ~896 to 21 files while preserving the per-year ISO 19157-1 quality elements through DQV temporal tagging.

  • v1.4.0 adds two new published quality artefacts to Hook B (CAL, in-sample BAM model-fit diagnostics; GEM, spatial quantiles of the BSE uncertainty raster), auto-writes a README.md into each crate documenting its validation stance, and generates a self-contained ro-crate-preview.html for browser-based inspection without external tooling.

  • v1.5.0 restructures the DQV quality metadata: each per-year quality measurement is now a first-class top-level entity in the JSON-LD @graph with its own @id, and the per-phase Datasets reference them by @id only — satisfying JSON-LD's node-reference rule and resolving 33 REQUIRED-severity violations previously reported by roc-validator.

  • v1.6.0 turns the publish hooks into a multi-crop blueprint: a single shared _crop_specs.R file holds per-crop metadata (DWD Plant ID, binomial, AGROVOC concept URI, Wikidata QID) for seven crops (winter wheat, winter rye, winter barley, winter rapeseed, spring barley, oats, maize), a family-wide creators list with explicit DataCite roles, and a layered keyword scheme; both builders are parametric on this configuration. Domain semantics are added via AGROVOC subject terms (schema:about / dct:subject) with DefinedTerm and DefinedTermSet entities in the @graph, alongside the existing W3C / ISO 19157 vocabulary stack.

  • v1.6.1 is a patch release with three changes. It fixes a "promise already under evaluation: recursive default argument reference" crash in the v1.6.0 publish hooks by renaming the parameter crop_spec to crop, which breaks a self-shadowing default. It switches the Hook B (PHASE) deposit to a flat layout by default, so downstream pipelines can stream individual multi-band COGs from Zenodo via GDAL's /vsicurl/ without downloading the full deposit. It documents the intentional Schema.org http:// namespace choice in both builders. No numerical changes from v1.6.0; no changes to the multi-crop blueprint or the AGROVOC integration.

  • v1.6.2 adds resilient temporal-gap handling so a (phase, year) whose interpolation fails because too few stations survived filtering no longer aborts the run: the filter-variant selector logs the dropped cell and reason to opt_scores/GAPS_<plant>.csv, the interpolation writes a full-extent NA surface instead of calling stop(), and the Hook B publisher detects every all-NA band and reports it as an ISO 19157-1 DQ_CompletenessOmission measurement. Every COG remains a complete 32-band cube where band i maps to year i. v1.6.2 also adds sugar beet (DWD plant ID 253, Beta vulgaris ssp. vulgaris var. saccharifera) to the blueprint, bringing the deposit family to eight crops; introduces two optional visualization helpers (plot_phenology_raster_maps.R for a DOY/BSE raster pair over a named AOI, and plot_phenology_window_timeseries.R for an AOI-aggregated phase-window time-series across 1993–2024); corrects the RO-Crate profile claim from Workflow Run Crate to Process Run Crate (the WRROC base profile that the existing CreateAction already satisfies); and suppresses GDAL PAM sidecars (.tif.aux.json, .tif.aux.xml) from the Hook B deposit since their content is already in the COG's TIFF tags. No numerical changes to any output for the existing seven crops.

  • v1.6.3 removes the AGROVOC controlled-vocabulary subject anchors introduced in v1.6.0 after the hand-curated URIs in _crop_specs.R were found to point at unrelated concepts when resolved against agrovoc.fao.org — e.g. the URI labelled "winter rye" resolved to "sawlogs", the URI labelled "phenology" resolved to "local authorities", the URI labelled "Germany" was a different geographic concept. Only the winter-wheat URI verified correctly. Rather than swap to another vocabulary (Wikidata, GEMET) and inherit the same verification burden — controlled-vocabulary subject anchoring is supposed to prevent exactly the typo-points-at-wrong-concept bug we found — the entire subject-anchor layer is removed. The v1.6.3 deposits use only free-text keywords (indexed by Zenodo, OpenAIRE, BonaRes, and Google Dataset Search) and the GeoNames spatialCoverage URI for subject anchoring; the deposits remain valid RO-Crate 1.2 since schema:about is optional in the profile. v1.6.3 also makes the auto-written PHASE README's gap-handling section build-aware, separating the mechanism description (unchanged across deposits) from a conditional "What this deposit reports" subsection that states whether the gap chain actually fired for the build at hand. Already-published v1.6.1 and v1.6.2 crop deposits retain their original schema:about entries on Zenodo as historical record; the cleanup lands on their next coordinated re-release. No numerical changes to any output, no changes to the publish-hook machinery, the PROV-O lineage, or the DQV quality measurements.

  • v1.7.0 reinstates the AGROVOC subject layer — this time with every URI verified against agrovoc.fao.org and a build-time guard (verify_agrovoc_uris()) that resolves each URI against the live catalogue and reports any that no longer match the concept they claim. Of the original v1.6.0 set, only winter wheat (c_8412) and maize (c_12332) had been correct; the other seven URIs are corrected (e.g. phenology c_28793 → c_5774, growing degree days c_36099 → c_28c4c002, Germany c_3258 → c_3245). Winter rye and winter rapeseed anchor to the generic species concept because AGROVOC has no winter-specific entry — the season is carried by the DWD Plant ID and the free-text keyword. "Spatial interpolation" has no AGROVOC concept and remains a free-text keyword. PHASE deposits additionally anchor spatial data (c_379bbe9f) and statistical uncertainty (c_28975, for the BSE layer). v1.7.0 also makes all DQV quality measures anchor honestly: schema:propertyID carries the bare measure token rather than a fabricated iso19157:<measure> IRI (no such concepts exist in ISO 19157), with the ISO link made only at the genuine dimension-class level via dqv:inDimension. Finally, v1.7.0 adds a validated prediction-interval calibration of the BSE uncertainty layer (PIC: PICP and MPIW at nominal 90%), computed by a dedicated k-fold cross-validation over all stations, separate from the production all-data fit so the published rasters are never degraded — each station is held out exactly once, and the interval combines each fold's se.fit with the production residual variance. These are surfaced as DQV measurements under DQ_UsabilityElement tied to the BSE COG via dqv:computedOn. This distinguishes what the BSE layer is (a model-internal statistical uncertainty) from how well-calibrated it turned out (a validated coverage statement). No numerical changes to any output. Already-published deposits keep their original encoding as historical record; the existing records pick up the v1.7.0 metadata.
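
The v1.5.0 note describes moving per-year quality measurements to the top level of the JSON-LD @graph and referencing them by @id only. A minimal Python sketch of that restructuring is shown below; it is not the publish hook's R code. The property name dqv:hasQualityMeasurement is a DQV term, but its use here, the assumption that every nested measurement already carries an @id, and the example identifiers are all illustrative.

```python
from typing import Any, Dict, List


def hoist_measurements(graph: List[Dict[str, Any]],
                       prop: str = "dqv:hasQualityMeasurement") -> List[Dict[str, Any]]:
    """Return a new @graph in which nested measurements become top-level entities.

    Hypothetical helper: every full object found under `prop` is appended to the
    graph, and the parent keeps only a {"@id": ...} reference to it.
    """
    hoisted: List[Dict[str, Any]] = []
    result: List[Dict[str, Any]] = []
    for entity in graph:
        entity = dict(entity)  # shallow copy so the input is left untouched
        if prop in entity:
            refs = []
            for item in entity[prop]:
                if set(item.keys()) != {"@id"}:
                    hoisted.append(item)  # full object: move it to the top level
                refs.append({"@id": item["@id"]})
            entity[prop] = refs
        result.append(entity)
    return result + hoisted


# Made-up identifiers and value, for illustration only.
graph = [{
    "@id": "#phase-example",
    "@type": "Dataset",
    "dqv:hasQualityMeasurement": [
        {"@id": "#measurement-1993", "@type": "dqv:QualityMeasurement", "dqv:value": 0.9}
    ],
}]
flat = hoist_measurements(graph)
print([e["@id"] for e in flat])
```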
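
The gap handling in the v1.6.2 note relies on recognising bands that contain only NA values. The Python sketch below shows one way such a check could look on plain lists of numbers; it does not read COGs and is not the Hook B implementation. Mapping band 1 to 1993 is an assumption inferred from the 32 bands and the 1993–2024 range mentioned in the notes, and the function names are hypothetical.

```python
import math
from typing import List, Sequence

FIRST_YEAR = 1993  # assumption: band 1 corresponds to the first year of 1993-2024


def band_year(band_index: int, first_year: int = FIRST_YEAR) -> int:
    """Map a 1-based band index to its calendar year."""
    return first_year + band_index - 1


def find_gap_years(bands: Sequence[Sequence[float]],
                   first_year: int = FIRST_YEAR) -> List[int]:
    """Return the years whose band holds only NaN values (non-empty bands only)."""
    gaps = []
    for index, band in enumerate(bands, start=1):
        if len(band) > 0 and all(math.isnan(value) for value in band):
            gaps.append(band_year(index, first_year))
    return gaps


nan = float("nan")
# Three made-up bands of two pixels each; only the second is entirely NaN.
bands = [[101.0, nan], [nan, nan], [120.0, 121.0]]
print(find_gap_years(bands))
```

Each year returned by such a check would correspond to one completeness-omission measurement in the published metadata.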
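
The mismatches reported in the v1.6.3 note (a URI labelled "winter rye" resolving to "sawlogs") are the kind of error a label comparison can catch. The following Python sketch illustrates the idea with an injected lookup function instead of a live catalogue request; it is not the source of verify_agrovoc_uris(), and the function name, the example.org URIs, and the stub catalogue are hypothetical.

```python
from typing import Callable, Dict, List, Optional


def find_mismatched_uris(expected: Dict[str, str],
                         resolve_label: Callable[[str], Optional[str]]) -> List[str]:
    """Return one message per URI whose resolved label differs from the expected one.

    `resolve_label` is supplied by the caller; in a real build it would query the
    vocabulary service, here it is any function returning a label or None.
    """
    problems = []
    for uri, label in expected.items():
        resolved = resolve_label(uri)
        if resolved is None:
            problems.append(f"{uri}: could not be resolved")
        elif resolved.strip().lower() != label.strip().lower():
            problems.append(f"{uri}: expected '{label}', got '{resolved}'")
    return problems


# Stub catalogue with placeholder URIs (not real AGROVOC identifiers).
catalogue = {
    "https://example.org/concept/1": "winter wheat",
    "https://example.org/concept/2": "sawlogs",
}
expected = {
    "https://example.org/concept/1": "Winter wheat",
    "https://example.org/concept/2": "winter rye",
    "https://example.org/concept/3": "phenology",
}
for message in find_mismatched_uris(expected, catalogue.get):
    print(message)
```

Passing the resolver as an argument keeps the comparison testable offline; a build-time guard would swap in a function that queries the live catalogue.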
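
The two calibration measures named in the v1.7.0 note, PICP (prediction interval coverage probability) and MPIW (mean prediction interval width), can be computed from held-out observations as in the Python sketch below. This is not the workflow's R code. The note says only that the interval combines each fold's se.fit with the production residual variance; adding the two in quadrature and scaling by a normal quantile is an assumption made here, as are the function name and the sample numbers.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple


def picp_mpiw(observed: Sequence[float],
              predicted: Sequence[float],
              se_fit: Sequence[float],
              residual_variance: float,
              nominal: float = 0.90) -> Tuple[float, float]:
    """Return (PICP, MPIW) for symmetric intervals around held-out predictions.

    Assumed interval: predicted +/- z * sqrt(se_fit^2 + residual_variance),
    with z the two-sided normal quantile for the nominal level.
    """
    n = len(observed)
    if n == 0:
        raise ValueError("at least one held-out observation is required")
    z = NormalDist().inv_cdf(0.5 + nominal / 2.0)
    covered = 0
    total_width = 0.0
    for obs, pred, se in zip(observed, predicted, se_fit):
        half_width = z * math.sqrt(se ** 2 + residual_variance)
        if pred - half_width <= obs <= pred + half_width:
            covered += 1
        total_width += 2.0 * half_width
    return covered / n, total_width / n


# Made-up held-out entry dates (day of year), predictions, and standard errors.
picp, mpiw = picp_mpiw(observed=[100.0, 110.0, 130.0],
                       predicted=[101.0, 108.0, 115.0],
                       se_fit=[1.0, 1.0, 1.0],
                       residual_variance=8.0)
print(round(picp, 3), round(mpiw, 3))
```

A PICP close to the nominal level with a small MPIW would indicate intervals that are both well calibrated and narrow.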

Files

JKI-GDM/PhenoPhaseR-v1.7.0.zip

Files (174.1 kB)

Name: JKI-GDM/PhenoPhaseR-v1.7.0.zip
Size: 174.1 kB
md5:c15b02a924a4be3034f63e35a2ff1c78

Additional details

Related works

Is derived from
Dataset: 10.5281/zenodo.18594963 (DOI)
Is described by
Journal article: 10.1016/j.compag.2016.07.032 (DOI)
Is source of
Dataset: 10.5281/zenodo.19571847 (DOI)
Dataset: 10.5281/zenodo.19483111 (DOI)
Is supplement to
Software: https://github.com/JKI-GDM/PhenoPhaseR/tree/v1.7.0 (URL)

Funding

Deutsche Forschungsgemeinschaft
FAIRe Dateninfrastruktur für die Agrosystemforschung (grant no. 501899475)