Discover changes within a dataset that may affect downstream processes relying on consistent dataset structure and meaning. This is useful in workflow automation, where reporting such changes can expedite troubleshooting and manual intervention.

compare_eml(newest, previous, return.all = FALSE)

Arguments

newest

(xml_document, xml_node) EML of the newest version of a data package, as returned by api_read_metadata().

previous

(xml_document, xml_node) EML of the previous version of a data package, as returned by api_read_metadata().

return.all

(logical) Return all differences? Default is FALSE, i.e. only return meaningful differences. Meaningful differences do not include elements expected to change between versions (e.g. number of rows, file size, temporal coverage).

Value

(character) XPaths of nodes that differ between versions

Details

XPaths of checked nodes, and whether a difference at each one is considered "meaningful":

  • .//dataset/abstract (TRUE)

  • .//dataset/coverage/geographicCoverage (FALSE)

  • .//dataset/coverage/temporalCoverage (FALSE)

  • .//dataset/coverage/taxonomicCoverage (FALSE)

  • .//dataset/keywordSet (FALSE)

  • .//dataTable/physical/objectName (TRUE)

  • .//dataTable/physical/size (FALSE)

  • .//dataTable/physical/authentication (FALSE)

  • .//dataTable/physical/dataFormat/textFormat/numHeaderLines (TRUE)

  • .//dataTable/physical/dataFormat/textFormat/recordDelimiter (FALSE)

  • .//dataTable/physical/dataFormat/textFormat/attributeOrientation (TRUE)

  • .//dataTable/physical/dataFormat/textFormat/simpleDelimited/fieldDelimiter (TRUE)

  • .//dataTable/attributeList (TRUE)

  • .//dataTable/numberOfRecords (FALSE)

  • .//otherEntity/physical/objectName (TRUE)

  • .//otherEntity/physical/size (FALSE)

  • .//otherEntity/physical/authentication (FALSE)

  • .//otherEntity/physical/dataFormat/textFormat/numHeaderLines (TRUE)

  • .//otherEntity/physical/dataFormat/textFormat/recordDelimiter (TRUE)

  • .//otherEntity/physical/dataFormat/textFormat/attributeOrientation (TRUE)

  • .//otherEntity/physical/dataFormat/textFormat/simpleDelimited/fieldDelimiter (TRUE)

  • .//otherEntity/attributeList (TRUE)
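
To make the idea of checking a fixed set of XPaths concrete, here is a minimal sketch using the xml2 package. It is not the implementation of compare_eml(); the helper node_differs() and the two tiny XML documents are hypothetical, and comparing node text is only one plausible way to decide that a node has changed.

```r
library(xml2)

# Two tiny stand-ins for EML documents (hypothetical content, not real EML)
previous <- read_xml(
  "<eml><dataset><abstract>Old text</abstract>
   <dataTable><numberOfRecords>10</numberOfRecords></dataTable>
   </dataset></eml>")
newest <- read_xml(
  "<eml><dataset><abstract>New text</abstract>
   <dataTable><numberOfRecords>10</numberOfRecords></dataTable>
   </dataset></eml>")

# Hypothetical helper: TRUE if the text of the nodes matched by `xpath`
# is not identical in the two documents
node_differs <- function(newest, previous, xpath) {
  new_text <- xml_text(xml_find_all(newest, xpath))
  old_text <- xml_text(xml_find_all(previous, xpath))
  !identical(new_text, old_text)
}

# Check a couple of the XPaths listed above and keep those that differ
xpaths <- c(".//dataset/abstract", ".//dataTable/numberOfRecords")
differs <- vapply(
  xpaths,
  function(x) node_differs(newest, previous, x),
  logical(1))
changed <- xpaths[differs]
changed
```

In this toy case only the abstract differs, so only its XPath is kept.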

Examples

# Return only "meaningful" differences (default behavior)
compare_eml(
  newest = api_read_metadata("knb-lter-hfr.118.32"),
  previous = api_read_metadata("knb-lter-hfr.118.31"))
#> Retrieving EML for data package knb-lter-hfr.118.32
#> Retrieving EML for data package knb-lter-hfr.118.31
#> [1] ".//dataTable[1]/attributeList"
# Return all differences
compare_eml(
  newest = api_read_metadata("knb-lter-hfr.118.32"),
  previous = api_read_metadata("knb-lter-hfr.118.31"),
  return.all = TRUE)
#> Retrieving EML for data package knb-lter-hfr.118.32
#> Retrieving EML for data package knb-lter-hfr.118.31
#> [1] ".//dataTable[1]/physical/size"
#> [2] ".//dataTable[2]/physical/size"
#> [3] ".//dataTable[1]/physical/authentication"
#> [4] ".//dataTable[2]/physical/authentication"
#> [5] ".//dataTable[1]/attributeList"
#> [6] ".//dataTable[1]/numberOfRecords"
#> [7] ".//dataTable[2]/numberOfRecords"
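
In an automated workflow, the returned XPaths can be post-processed with base R. The sketch below starts from the return.all = TRUE result shown above, copied in as a literal vector, and keeps only the paths whose generic form is not flagged FALSE in Details. The variable names are illustrative, and stripping the positional index such as [1] with a regular expression is an assumption about how returned paths map back to the list in Details.

```r
# Differences copied from the return.all = TRUE example above
diffs <- c(
  ".//dataTable[1]/physical/size",
  ".//dataTable[2]/physical/size",
  ".//dataTable[1]/physical/authentication",
  ".//dataTable[2]/physical/authentication",
  ".//dataTable[1]/attributeList",
  ".//dataTable[1]/numberOfRecords",
  ".//dataTable[2]/numberOfRecords")

# Drop positional indices such as "[1]" to recover the generic XPath
generic <- gsub("\\[[0-9]+\\]", "", diffs)

# Generic XPaths marked FALSE (not meaningful) in Details that occur here
not_meaningful <- c(
  ".//dataTable/physical/size",
  ".//dataTable/physical/authentication",
  ".//dataTable/numberOfRecords")

# Keep only differences that may need attention
meaningful <- diffs[!generic %in% not_meaningful]

if (length(meaningful) > 0) {
  message("Meaningful changes detected: ",
          paste(meaningful, collapse = ", "))
}
```

For this input the filter leaves the same single path that the default call reports, the attributeList of the first data table.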