There is a newer version of this record available.

Dataset Open Access

Ooh Na Na

Charles Tapley Hoyt

This is a gzipped, four-column TSV file containing hashes, prefixes, identifiers, and names for a large number of biomedical entities. It draws from the OBO Foundry, ontologies in the Ontology Lookup Service, and many other nomenclature consortia that just haven't made it to the prime time of standardized goodness. Ultimately, this dataset helps answer the question: what's my name? The hash is the hex digest of the MD5 hash of the CURIE, written in the form <prefix>:<identifier>.
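As a minimal sketch of how the hash column relates to the other columns, the Python below computes the MD5 hex digest of a CURIE and streams rows from the gzipped file. Several things here are assumptions rather than facts stated in this record: the columns are assumed to appear in the order hash, prefix, identifier, name; the CURIE is assumed to be UTF-8 encoded before hashing; and the names `NameRow`, `curie_md5`, and `iter_names` are hypothetical helpers, not part of PyOBO. If the file has a header row, it would be yielded like any other four-column row.

```python
import gzip
import hashlib
from typing import Iterator, NamedTuple


class NameRow(NamedTuple):
    """One row of the file (column order is an assumption)."""

    md5: str
    prefix: str
    identifier: str
    name: str


def curie_md5(prefix: str, identifier: str) -> str:
    """Return the MD5 hex digest of the CURIE "<prefix>:<identifier>"."""
    curie = f"{prefix}:{identifier}"
    # Assumption: the CURIE is encoded as UTF-8 before hashing.
    return hashlib.md5(curie.encode("utf-8")).hexdigest()


def iter_names(path: str) -> Iterator[NameRow]:
    """Stream rows from the gzipped TSV without loading it into memory."""
    with gzip.open(path, mode="rt", encoding="utf-8") as file:
        for line in file:
            parts = line.rstrip("\n").split("\t")
            # Skip anything that is not a four-column row.
            if len(parts) == 4:
                yield NameRow(*parts)
```

Streaming line by line matters here because the compressed file alone is 4.7 GB. A lookup by CURIE could then compare `row.md5` against `curie_md5(prefix, identifier)` while iterating over `iter_names("names.tsv.gz")`.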

Gathering all of these names takes a lot of work, so I tried to make it easy. The file was generated with the following shell commands:

pip install git+https://github.com/pyobo/pyobo.git
obo database names

Version 1.2.0 adds PubChem, ChEMBL, and NPASS.
Files (4.7 GB)
Name              Size       Checksum
names.tsv.gz      4.7 GB     md5:e3c018182942f190c67627002b19b8b8
names_sample.tsv  623 Bytes  md5:e22f1bf8747ee576ca32083eb8910d85
names_summary.tsv 1.3 kB     md5:e4c0304b9252c23bc1259599c06219fb
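
The MD5 checksums listed for each file can be used to confirm that a download completed intact. The following is a minimal sketch, assuming the files were saved under their listed names in the current directory. The helper names `file_md5` and `verify` are hypothetical, and the digests in `EXPECTED` are copied from this version of the record, so they would not match files from another version.

```python
import hashlib

# Checksums as published for this version of the record.
EXPECTED = {
    "names.tsv.gz": "e3c018182942f190c67627002b19b8b8",
    "names_sample.tsv": "e22f1bf8747ee576ca32083eb8910d85",
    "names_summary.tsv": "e4c0304b9252c23bc1259599c06219fb",
}


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as file:
        # Read fixed-size chunks so a multi-gigabyte file never sits in memory.
        for chunk in iter(lambda: file.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected_md5: str) -> bool:
    """Return True if the file's MD5 digest matches the expected one."""
    return file_md5(path) == expected_md5.lower()
```

For example, `verify("names.tsv.gz", EXPECTED["names.tsv.gz"])` would return `True` only if the local copy matches the published checksum.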
