TaxonWizard

The TaxonWizard builds, updates and extends the species list. It also monitors the list for errors.

Administration lists

In order for the TaxonWizard to build the species list, some administration lists are needed. The final species list is generated on the basis of these lists.

The following shows the status of the TaxonWizard's admin lists.

taxonwizard_status

TaxBase

Structure: This list contains the base set of Aphia Ids for the final species list.
Mandatory fields: AphiaId
Meaning: It represents the minimum set of taxa for the final species list.

TaxAaid (expert knowledge)

Structure: In this list, Aphia Ids are mapped to divergent Accepted Aphia Ids.
Mandatory fields: aid (Aphia Id), aaid (Accepted Aphia Id).
Meaning: The WoRMS taxonomy is adjusted with respect to accepted taxa according to in-house expert opinion.

TaxPrivate (expert opinion)

Structure: In this list, alternative taxon names are mapped to Aphia Ids.
Mandatory fields: private_name (alternative name), aid (Aphia Id)
Purpose: Unofficial taxon names, e.g. those found in historical datasets, are mapped to Aphia Ids. This makes these names available to the front-end search of the species list: a name search looks not only in the species list itself but also in this list.
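
The two-stage name lookup described above could be sketched in Python as follows. This is only an illustration: the function name resolve_name, the dictionary layout and the Aphia Ids are assumptions and are not taken from CRITTERBASE.

```python
from typing import Optional

# Hypothetical in-memory stand-ins for the two lists (illustrative Ids only).
species_list = {"Abra alba": 1001}        # scientificname -> aid
tax_private = {"Abra alba juv.": 1001}    # private_name -> aid

def resolve_name(name: str) -> Optional[int]:
    """Look up a name in the species list first, then in TaxPrivate."""
    if name in species_list:
        return species_list[name]
    return tax_private.get(name)
```

With this layout, an official name and a mapped alternative name both resolve to the same Aphia Id, while an unknown name yields None.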

TaxColony (expert knowledge)

Structure/meaning: This list contains the Aphia Ids of taxa declared as colony-forming organisms.
Mandatory fields: aid (Aphia Id).

TaxChange (expert knowledge)

Structure: In this list, different Aphia Ids are assigned to a common group name.
Mandatory fields: aid (Aphia Id), treated (group name)
Background: Over time, taxa may be reclassified or classified more precisely. For example, a taxon with Aphia Id x was recorded at time 0 at location X. At time 1, a taxon with Aphia Id y is identified, and due to the reclassification it turns out that the taxon x recorded at location X is in fact a mixture of taxa x and y. Such a reclassification is handled by grouping the Aphia Ids x and y into a taxon complex.
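
A minimal Python sketch of this grouping, assuming TaxChange rows are available as (aid, treated) pairs. The Ids, the group name and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical TaxChange rows: (aid, treated). Ids and group name are made up.
tax_change = [(10, "complex_xy"), (11, "complex_xy")]

def build_complexes(rows):
    """Group Aphia Ids by their 'treated' group name."""
    groups = defaultdict(set)
    for aid, treated in rows:
        groups[treated].add(aid)
    return dict(groups)

def complex_of(aid, rows):
    """Return all Aphia Ids sharing a group with aid, or just {aid} if ungrouped."""
    for members in build_complexes(rows).values():
        if aid in members:
            return members
    return {aid}
```

A query for taxon x (here Id 10) would then cover the whole complex, so records of x and y are evaluated together.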

TaxOut

Structure: ScientificNames are mapped to external databases.
Mandatory fields: scientificname (taxon name)
Background: There are taxa in the database that are not in WoRMS but should remain in the database. So that they do not slip through the cracks, they need an artificial Aphia Id (e.g. Aphia Id: -2, Scientificname: Variable, or can also be found in the TaxOut list). They can thus be introduced into the system and linked to the corresponding external databases by a search in the TaxOut list. Alternatively, a WoRMS-compatible manual species list entry could be generated here so that these taxa remain usable in evaluations.

Species list

The species list generated from the administration lists contains, for each taxon, a subset of the attributes provided by WoRMS plus the following additional information. The Accepted Aphia Id has either been determined via WoRMS or been added using the admin list TaxAaid. It is added to the taxon list for simplicity. The same applies to the Accepted Scientificname. The entry in the is_colony column is taken from the administration table TaxColony.
A generated species list is shown below.

taxonwizard_final_list
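
How the additional columns might be derived from the admin lists can be sketched in Python as follows. The data structures, Ids, names and the function build_species_list are illustrative assumptions. In particular, it is assumed here that a TaxAaid entry takes precedence over the Accepted Aphia Id reported by WoRMS.

```python
# Hypothetical inputs; all Ids and names are illustrative.
worms = {  # aid -> (scientificname, accepted aid according to WoRMS)
    1: ("Taxon a", 1),
    2: ("Taxon b", 1),
    3: ("Taxon c", 3),
}
tax_base = [1, 2, 3]   # TaxBase: minimum set of Aphia Ids
tax_aaid = {2: 3}      # TaxAaid: expert override, aid -> aaid
tax_colony = {3}       # TaxColony: Aphia Ids of colony taxa

def build_species_list():
    """Assemble one row per TaxBase entry with the additional columns."""
    rows = []
    for aid in tax_base:
        name, worms_aaid = worms[aid]
        aaid = tax_aaid.get(aid, worms_aaid)  # assumption: TaxAaid wins
        rows.append({
            "aid": aid,
            "scientificname": name,
            "accepted_aid": aaid,
            "accepted_scientificname": worms[aaid][0],
            "is_colony": aid in tax_colony,
        })
    return rows
```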

Building Species List

In CRITTERBASE the species list is handled with great care. The user cannot modify or create the taxon list directly; only the TaxonWizard can do this. The user has just two ways to send taxa to the TaxonWizard to build up the species list:

  1. Direct import of the TaxonWizard's base list
  2. Import via the biota sheet during the ingest process

Below you can see the import dialog window for the TaxonWizard's administration lists.

taxonwizard_import_taxa_base

Below you can see a base list.

taxonwizard_base_list

Below you can see the dialog window for ingestion of North Sea data.

taxonwizard_import_taxa_ingest

Updating the species list and detecting problems

During the update or assembly process of a species list, the following quality tests are performed:

Ambiguous scientificname entries

If the same scientific name exists in admin list ... ... an error is reported.

This prevents a name-based taxon search from returning ambiguous results, either within the expert-knowledge lists themselves or in combination with the taxon list. For example, the alternative nomenclature of TaxPrivate must not contain scientific names that also exist in the species list; e.g. the name Abra alba must not exist as an alternative name in TaxPrivate if it also exists in the species list.
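
The TaxPrivate example can be expressed as a simple set intersection. The following Python sketch assumes both lists are available as plain collections of names; the function name find_ambiguous_names and the sample names are illustrative.

```python
def find_ambiguous_names(species_names, private_names):
    """Return names that occur both in the species list and in TaxPrivate."""
    return sorted(set(species_names) & set(private_names))

clashes = find_ambiguous_names(
    ["Abra alba", "Abra nitida"],             # names in the species list
    ["Abra alba", "some historical name"],    # alternative names in TaxPrivate
)
```

Any non-empty result would correspond to the error case described above.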

Missing ScientificName or Accepted ScientificName

If the ScientificName or Accepted ScientificName field of an entry t in the taxon list is empty (no value entered), a warning is issued. Such gaps can arise later through changes in the WoRMS dataset: for example, taxon t changes status from accepted to unaccepted and refers to a new accepted taxon, and WoRMS deletes the corresponding scientific name instead of leaving it in the now unaccepted t.
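
A gap check of this kind could look as follows in Python. This is a sketch under the assumption that each taxon is a dictionary with the keys aid, scientificname and accepted_scientificname; these key names and the function find_name_gaps are hypothetical.

```python
def find_name_gaps(taxa):
    """Return the Aphia Ids of taxa whose name fields are missing or blank."""
    fields = ("scientificname", "accepted_scientificname")
    gaps = []
    for taxon in taxa:
        # A field counts as a gap if it is absent, None, or only whitespace.
        if any(not (taxon.get(f) or "").strip() for f in fields):
            gaps.append(taxon["aid"])
    return gaps
```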

Such gaps cause the following problem:

At the time a record with such a taxon entry t was imported, its scientific name was in the species list. A backend search for this name in the taxon system returned the matching Aphia Id, which could then be looked up in the biota table. Now the search by name is blocked. Which taxon the user meant at the time of import can then only be determined from the explicit, optional entry of the scientific name in the biota table. The systematic search via the taxon system has thus been undermined after the fact.

Two possible solutions are discussed:

Either an additional admin table (e.g. TaxGapFix) could be created to solve this WoRMS-induced problem. Alternatively, after a name search in the species system fails, the entire dataset could be searched for optional taxon names (biota.given_taxon_name). At the moment we prefer the first option, because the system then continues to search for the species in the taxon system, independently of the actual measurement data (biota), until the corresponding Aphia Id is found. This Aphia Id is then used to search the measurement data.
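
The preferred first option might work roughly as in the following Python sketch. TaxGapFix is only a proposal in the text above, so its structure here (lost name mapped to Aphia Id) is purely an assumption, as are the sample data and the function names.

```python
from typing import Optional

# All data below is made up for illustration.
species_list = {"Taxon new": 7}     # scientificname -> aid
tax_gap_fix = {"Taxon old": 6}      # hypothetical TaxGapFix: lost name -> aid
biota = [
    {"aid": 6, "given_taxon_name": "Taxon old", "count": 12},
    {"aid": 7, "given_taxon_name": None, "count": 3},
]

def find_aid(name: str) -> Optional[int]:
    """Search the taxon system first, then the proposed gap-fix table."""
    aid = species_list.get(name)
    return aid if aid is not None else tax_gap_fix.get(name)

def find_records(name: str) -> list:
    """Resolve the name to an Aphia Id, then search the measurement data by Id."""
    aid = find_aid(name)
    if aid is None:
        return []
    return [row for row in biota if row["aid"] == aid]
```

The name is resolved entirely within the taxon system; the measurement data is only consulted once an Aphia Id is known, which matches the reasoning given for preferring this option.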

Duplicate names in the ScientificName field

Checks whether taxa with different Aphia Ids and the same scientific name exist. A search by ScientificName would then be ambiguous. This case can be quite legitimate, since there may be, for example, different non-accepted taxa in an Aphia Id cascade that have the same scientific name. An informational message is nevertheless generated.
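
A duplicate check along these lines could be sketched in Python as follows, assuming each taxon is a dictionary with the keys aid and scientificname. The key names and the function find_duplicate_names are illustrative.

```python
from collections import defaultdict

def find_duplicate_names(taxa, field="scientificname"):
    """Map each name shared by several Aphia Ids to the sorted list of those Ids."""
    by_name = defaultdict(set)
    for taxon in taxa:
        name = taxon.get(field)
        if name:  # entries without a name are not part of this check
            by_name[name].add(taxon["aid"])
    return {name: sorted(aids) for name, aids in by_name.items() if len(aids) > 1}
```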

Duplicate names in the Accepted ScientificName field

Checks whether taxa with different Aphia Ids and the same Accepted ScientificName exist. A search by Accepted ScientificName would then also be ambiguous. This case is in principle an error and should not occur. One apparent cause is an error in the WoRMS database. A corresponding warning is generated.

No solution: How this situation can be fixed has not yet been clarified.

Taxa with misleading Accepted AphiaIds

A check is made for totally unaccepted taxon entries in the species list, i.e. entries that do not lead to an accepted entry via the Aphia Id cascade.

For details see the misfit section in chapter MyWoRMS.
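
The cascade check can be illustrated with a short Python sketch. It assumes that the taxa are held in a dictionary keyed by Aphia Id, with each entry carrying a status and an accepted_aid; this layout and the function name resolves_to_accepted are assumptions, and the actual rules are those described in the MyWoRMS chapter.

```python
def resolves_to_accepted(aid, taxa):
    """Follow accepted_aid links from aid; True if an accepted taxon is reached."""
    seen = set()
    while aid in taxa and aid not in seen:
        seen.add(aid)
        taxon = taxa[aid]
        if taxon["status"] == "accepted":
            return True
        aid = taxon["accepted_aid"]
    return False  # dead end, missing entry, or cycle

def totally_unaccepted(taxa):
    """Return the Aphia Ids that never lead to an accepted entry."""
    return [aid for aid in taxa if not resolves_to_accepted(aid, taxa)]
```

The seen set guards against cascades that loop back on themselves, which would otherwise never terminate.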

Taxa with other configuration problems

All taxa are checked for correct configuration of AphiaIds, names and status.

For details see the misfit section in chapter MyWoRMS.

Taxa in Limbo

Furthermore, taxa that are in a so-called limbo status are reported.

For details see the misfit section in chapter MyWoRMS. The following is the result log of a taxon list update process.

taxonwizard_update