Journal article Open Access

Taxonomer: a relational data model for managing information relevant to taxonomic research

Pyle, Richard L.

Taxonomic research, as a field of biological sciences, is fundamentally an exercise in information management. Modern computer technology offers the potential for both streamlining the taxonomic process and increasing its accuracy. Effective use of computer technology to successfully manage taxonomic information is predicated upon the implementation of data models that accommodate the diverse forms of information important to taxonomic researchers. Although sophisticated data models have been developed to manage some information relevant to taxonomic research (e.g., natural history specimen information; descriptive data relating to morphological and molecular characters of specimens), similarly robust models for managing information about taxonomic names and how they are applied to taxonomic concepts, though they exist, have not attained widespread use and adoption.

Herein I describe portions of a relational data model developed to manage information relevant to taxonomic names and concepts. The core entities of the described portions of this model are Agents, References, and Assertions (along with their associated Protonyms). Agents (people and organizations) in this context refer primarily to taxonomic authorities. References are broadly defined as date-stamped information (usually, but not exclusively, in the form of a publication), as documented by the Agents who serve as the Reference authors. Assertions consist of basic elemental information about the treatment of taxonomic names by taxonomic authorities as documented in a particular Reference, and correspond to what many authors refer to as taxon “concepts”. Protonyms are a special subset (subtype) of Assertions, which constitute original descriptions of taxonomic names (serving to unite multiple assertions pertaining to the same taxonomic name), and include elements of botanical Protologues and Basionyms.
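The relationships among these core entities can be sketched in a few lines of Python. This is only an illustrative sketch, not the schema of the described model: every class, field, and function name below is an assumption, the choice to represent a Protonym as an Assertion with no link to another Protonym is one possible reading of "subtype", and the names and dates in the sample data are placeholders rather than real taxonomic records.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Agent:
    """A person or organization, here acting as a taxonomic authority."""
    agent_id: int
    name: str


@dataclass(frozen=True)
class Reference:
    """Date-stamped information (usually a publication) documented by Agents."""
    reference_id: int
    authors: Tuple[Agent, ...]
    year: int
    citation: str


@dataclass
class Assertion:
    """Treatment of a taxonomic name as documented in one Reference."""
    assertion_id: int
    reference: Reference
    name_as_used: str
    # Assumption: a Protonym is an Assertion whose `protonym` link is None,
    # i.e. it is the original description of the name.
    protonym: Optional["Assertion"] = None

    @property
    def is_protonym(self) -> bool:
        return self.protonym is None

    @property
    def root(self) -> "Assertion":
        """The Protonym that this Assertion ultimately pertains to."""
        return self if self.protonym is None else self.protonym


def assertions_for(protonym: Assertion, assertions: List[Assertion]) -> List[Assertion]:
    """Collect every Assertion that pertains to the given Protonym."""
    return [a for a in assertions if a.root is protonym]


# Placeholder data (hypothetical names, authors, and dates).
original = Reference(1, (Agent(1, "A. Author"),), 1900, "Hypothetical original description")
revision = Reference(2, (Agent(2, "B. Reviser"),), 1950, "Hypothetical later revision")

proto = Assertion(1, original, "Aus bus")
recombination = Assertion(2, revision, "Xus bus", protonym=proto)

linked = assertions_for(proto, [proto, recombination])
```

In this sketch the Protonym is what unites the two Assertions: the later treatment uses a different name combination but still resolves to the same original description.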

I also illustrate how these core entities can serve as a foundation for taxonomic names and concepts as integrated with other datasets, such as biological specimens and observations (and, by extension, geographic distributions and character matrices). The broadest data content source used to populate and test the data model is derived from a systematic revision of the reef-fish family Pomacanthidae (marine angelfishes). Additional datasets used to test the implementation of the data model include specimen data from the Department of Natural Sciences, Bishop Museum; nomenclatural data from The Catalog of Fishes; and nomenclatural and biogeographic data from two published taxonomic catalogs (insects and terrestrial mollusks in Hawai‘i).

An intuitive, feature-rich software application based on Microsoft Access® has also been developed in conjunction with this data model, and will be the topic of a future article.
