Software Open Access
Paul J. Morris; David Lowery
Release version 2.0.2 of the FilteredPush event_date_qc library for testing data quality of date data. This library provides low level tests of date data and higher level tests for DarwinCore Event terms with those tests described with RDF metadata following the standard biodiversity data quality test descriptions by the TDWG Biodiversity Data Quality Interest Group Task Group 2 on tests and assertions.
This release includes implementations of all the core Time tests from TDWG BDQ TG2 as of the end of August 2019, along with unit tests of both the low-level methods and the higher-level framework methods (those matching the test specifications). This release also adds support for a lightweight run of the tests from the command line.
Notes on this release:
Adding unit test coverage. Added test for / without range (plus some other empty values) to unit test for DateUtils.extractInterval().
Date range parsing fails for incorrectly formed ranges that include a / but no range. Adding handling for this case to code added to handle incorrect number of digits in ISO dates.
Including output of time for million line processing in Runner.
ISSUE: tdwg/bdq#76 Adding unit test for validation dateIdentified outofrange. Added unit test with notes concerning where the specification needs clarification (around overlapping ranges).
Adding remaining unit tests. Filling in remaining unit tests for DwCEventTG2 methods, covering all TG2 temporal tests.
ISSUE: tdwg/bdq#52 Unit test for AMENDMENT_EVENT_FROM_EVENTDATE.
ISSUE: tdwg/bdq#86 tdwg/bdq#67 Added unit test for VALIDATION_EVENT_INCONSISTENT, updated handling of empty eventDate to match the specification, added more debug logging.
Added unit test for AMENDMENTEVENTDATE_FROMVERBATIM covering multiple cases in current specification, primarily assessing logic of result type rather than ability to interpret dates.
ISSUE: tdwg/bdq#88 Unit tests for validationEventEmpty. DESCRIPTION: Adding unit test, under interpretation that 'field needed to determine the eventDate' means any field in set, e.g. day alone is compliant.
ISSUE: tdwg/bdq#36 Unit test and correction to validationEventdateOutofrange. Adding unit test for method, found error in handling of current date (invoking wrong DateUtils method to get interval for comparison), fixed.
Adding more unit tests for DwCEventTG2DQ validations.
ISSUE: tdwg/bdq#125 Adding unit test for VALIDATION_DAY_OUTOFRANGE and correcting errors found. Added unit test covering all outcomes in specification, found and fixed errors, particularly in inverted logic acting upon yearParsable.
Moving implementation of remaining non-eventDate method from DwCEventTG2DQ. Moved implementation of amendmentDateidentifiedStandardized to static method in DwCOtherDateDQ.
Finishing moving test from temporary location in DQTest to TG2DQTest.
Moving DwCEventTG2DQ methods to static. Changing methods to public static to follow kurator pattern, making follow-on changes to Runner and tests. Moving test from temporary location in DQTest to TG2DQTest.
ISSUE: tdwg/bdq#125 tdwg/bdq#127 tdwg/bdq#61 Multiple fixes and added unit tests resulting from Arthur Chapman's ongoing review of results of test runs. Set distinct guid for mechanism for DwCEventDateTG2DQ. Java to 1.8 and added commons-lang3 in pom for StringUtils. Moved implementation for dateIdentified into DwCOtherDateDQ. Fix for single digit days and months being recognized as valid ISO date, updates to unit tests in consequence. DateUtils.extractInterval() and extractDate() now return null when given single digit day or month values. Fix for handling of date ranges with end date before start date. DateUtils.eventDateValid() now returns false on these. Added unit tests for validationYearEmpty, amendmentDayStandardized, amendmentEventdateStandardized, validationDateidentifiedNotstandard.
ISSUE: tdwg/bdq#52 tdwg/bdq#131 Bug fixes to tests and test harness following Arthur Chapman's iterative reviews of the results of test runs. Fixed a bug in Runner where non-null Event terms were being overwritten with nulls on invocation of amendmentEventFromEventDate. Fixed bugs in validationDateidentifiedNotstandard resulting from copy/paste errors and changed logic. Fixed some other bugs, added a unit test, results now appear sane. Prevent null pointer exception on rare case of null in eventDateInterval.
Corrections from exchange of first results with Arthur Chapman, some pre/post amendment validation numbers are odd. isEmpty() not correctly implemented for current expectations of tests. Removing 'NULL' as valid empty value per the current TG2 definition, updating tests in consequence, reducing error reporting.
Harden Runner to work against real data. Altering Runner to run tests against tab delimited rather than comma delimited files (tab being usual for Darwin Core archive content), and allow skipping lines and handling of broken lines when reading data for test.
Adding lightweight run of tests from the command line. Adding a Runner class to carry out command line operations. Changing main class in pom from DateUtils to Runner. Adding invocations of tests in DwcEventTG2DQ to Runner, moving main method in DateUtils to a verbatim date extraction method, invoking this from Runner. Reducing log level of some Errors in DateUtils to reduce command line output verbosity.
ISSUE: tdwg/bdq#36 Updating VALIDATION_EVENTDATE_OUTOFRANGE to current specification. Implementation using begin and end date parameters, without optional begin, as per current (still under discussion) phrasing of the specification. Using Interval to test range contained within rather than integer year values, matching notes in issue specifying that day counts. Split off no-parameter test (taking the guid) from parameterized test.
Fix javadoc tag problems. Fixing problems with @return and @param values resulting from Java 8 evaluation of structure of javadoc annotations in remarks.
ISSUE: tdwg/bdq#147 tdwg/bdq#125 Updating validations for dwc:day based on latest changes in issues. DESCRIPTION: Implementation of validation day notstandard to match amendment_day_standardized testing only dwc:day and checking for integer in range 1-31, and validation_day_inrange to use day month year and return internal prerequisites not met on month or year only when those values are required to evaluate a day value (29-31).
Resolving more concerns for date tests and TG2 specifications.
ISSUE: tdwg/bdq#84 tdwg/bdq#26 tdwg/bdq#130 tdwg/bdq#61 Updating tests to reflect updates in definitions and latest discussion in tdwg/bdq issues. Removing confirmed unused isMonthInRange() method. Updating VALIDATION_STARTDAYOFYEAR_OUTOFRANGE to match new specification. Switching from #141 to #84 to test year for valid range. Correcting handling of ambiguity in AMENDMENT_EVENTDATE_STANDARDIZED and AMENDMENT_DATEIDENTIFIED_STANDARDIZED to conform with specifications.
Implement CORE TIME tests.
ISSUE: tdwg/bdq/#140 tdwg/bdq/#69 tdwg/bdq/#76 tdwg/bdq/#147 tdwg/bdq/#131 tdwg/bdq/#88 tdwg/bdq/#67 tdwg/bdq/#33 tdwg/bdq/#66 tdwg/bdq/#36 tdwg/bdq/#126 tdwg/bdq/#130 tdwg/bdq/#131 tdwg/bdq/#49 tdwg/bdq/#141 tdwg/bdq/#84 tdwg/bdq/#26 tdwg/bdq/#127 tdwg/bdq/#52 tdwg/bdq/#86 tdwg/bdq/#93 tdwg/bdq/#132 tdwg/bdq/#61 tdwg/bdq/#128 PURPOSE: Implementations of the currently (2019Aug12) defined core tests related to time. DESCRIPTION: Java class, method signatures, and annotations generated with kurator-ffdq from csv extracted from issues by bdq_issue_to_csv, then implementations added for each of the tests. TODO comments indicate remaining issues raised by implementation. The DwCEventTG2DQ class is expected to get merged into the DwCEventDQ and DwCOtherDateDQ classes, but is committed in parallel here to view all the tests in one place for comparison.
ISSUE: tdwg/bdq#151 Adding some example data. Adding a reference set of various cases of values in dwc date related terms and example interpretation of those values.
Adding support for some uncommon month abbreviations.
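Several behaviours described in the notes above can be illustrated with a minimal sketch: rejecting single digit days and months in ISO dates, rejecting ranges that contain a / but no second date or that end before they start, and asking for month or year only when a day value of 29-31 needs them. The sketch uses only java.time and is not the library's code. The class name DateChecksSketch, its method names, and the Result enum are hypothetical and do not correspond to DateUtils or DwCEventTG2DQ. It assumes full YYYY-MM-DD dates only (no year-only or year-month values), and treating a non-integer day as "prerequisites not met" is an assumption rather than something the notes specify.

```java
import java.time.LocalDate;
import java.time.Month;
import java.time.Year;
import java.time.format.DateTimeParseException;

// Hypothetical illustration only; not part of event_date_qc.
class DateChecksSketch {

    enum Result { COMPLIANT, NOT_COMPLIANT, PREREQUISITES_NOT_MET }

    // Strict ISO parse: "1880-5-8" and "" yield null rather than a date.
    static LocalDate parseStrict(String value) {
        if (value == null) {
            return null;
        }
        try {
            return LocalDate.parse(value); // ISO_LOCAL_DATE requires two-digit month and day
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    // A single date, or start/end where both parse and end is not before start.
    static boolean isValidEventDate(String eventDate) {
        if (eventDate == null) {
            return false;
        }
        String[] parts = eventDate.split("/", -1); // -1 keeps trailing empty parts
        if (parts.length == 1) {
            return parseStrict(parts[0]) != null;
        }
        if (parts.length != 2) {
            return false;
        }
        LocalDate start = parseStrict(parts[0]);
        LocalDate end = parseStrict(parts[1]);
        return start != null && end != null && !end.isBefore(start);
    }

    static Integer parseInteger(String value) {
        if (value == null) {
            return null;
        }
        try {
            return Integer.valueOf(value.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Month and year are consulted only when the day value (29-31) requires them.
    static Result validateDay(String day, String month, String year) {
        Integer d = parseInteger(day);
        if (d == null) {
            return Result.PREREQUISITES_NOT_MET; // assumption, see lead-in
        }
        if (d < 1 || d > 31) {
            return Result.NOT_COMPLIANT;
        }
        if (d <= 28) {
            return Result.COMPLIANT; // valid in every month of every year
        }
        Integer m = parseInteger(month);
        if (m == null || m < 1 || m > 12) {
            return Result.PREREQUISITES_NOT_MET;
        }
        if (m != 2) {
            return d <= Month.of(m).maxLength() ? Result.COMPLIANT : Result.NOT_COMPLIANT;
        }
        if (d > 29) {
            return Result.NOT_COMPLIANT; // February never has 30 or 31 days
        }
        Integer y = parseInteger(year);
        if (y == null) {
            return Result.PREREQUISITES_NOT_MET; // year needed only for 29 February
        }
        return Year.isLeap(y) ? Result.COMPLIANT : Result.NOT_COMPLIANT;
    }

    public static void main(String[] args) {
        System.out.println(isValidEventDate("1880-05-08/1880-05-10"));
        System.out.println(isValidEventDate("1880-05-08/"));
        System.out.println(validateDay("29", "2", null));
    }
}
```

Returning null from the parser and a three-valued result from the day check keeps "could not evaluate" distinct from "evaluated and failed", which is the distinction the notes draw between internal prerequisites not met and a non-compliant value.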