There is a newer version of this record available.

Dataset Open Access

# Ooh Na Na

Charles Tapley Hoyt

This is a gzipped four-column TSV file containing hashes, prefixes, identifiers, and names for a large number of biomedical entities, drawing from the OBO Foundry, ontologies in the Ontology Lookup Service, and many other nomenclature consortia that just haven't made it to the prime-time of standardized goodness. Ultimately, this dataset helps answer the question: what's my name? The hash is the hex digest of the MD5 hash of the CURIE, written in the form <prefix>:<identifier>.
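As a rough illustration of the hash column, the sketch below computes the MD5 hex digest of a CURIE using Python's standard library. The function name `curie_md5` is hypothetical, and encoding the CURIE as UTF-8 before hashing is an assumption; the record does not say which encoding was used.

```python
import hashlib


def curie_md5(prefix: str, identifier: str) -> str:
    """Return the MD5 hex digest of the CURIE ``<prefix>:<identifier>``.

    Assumption: the CURIE is encoded as UTF-8 before hashing.
    """
    curie = f"{prefix}:{identifier}"
    return hashlib.md5(curie.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Prints a 32-character hexadecimal string for an example CURIE.
    print(curie_md5("chebi", "15377"))
```

If the UTF-8 assumption holds, the returned string should match the first column of the row for that CURIE.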

It's really a lot of work to get this stuff, so I tried to make it easy. It was generated with the following code in the shell:

    pip install git+https://github.com/pyobo/pyobo.git
    obo database names

Version 1.2.0 adds PubChem, ChEMBL, and NPASS.

Files (4.7 GB)

| Name | MD5 | Size |
| --- | --- | --- |
| names.tsv.gz | e3c018182942f190c67627002b19b8b8 | 4.7 GB |
| names_sample.tsv | e22f1bf8747ee576ca32083eb8910d85 | 623 Bytes |
| names_summary.tsv | e4c0304b9252c23bc1259599c06219fb | 1.3 kB |
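Since names.tsv.gz is several gigabytes, one way to answer "what's my name?" without loading the whole file is to stream it row by row. The sketch below is a minimal example under stated assumptions: the columns are taken to appear in the order hash, prefix, identifier, name (the order given in the description), the CURIE is assumed to be hashed as UTF-8, and `lookup_name` is a hypothetical helper, not part of PyOBO.

```python
import csv
import gzip
import hashlib
from typing import Optional


def lookup_name(path: str, prefix: str, identifier: str) -> Optional[str]:
    """Scan a gzipped four-column TSV and return the name for a CURIE.

    Assumptions: column order is hash, prefix, identifier, name, and
    the hash is the MD5 hex digest of the UTF-8 encoded CURIE.
    """
    target = hashlib.md5(f"{prefix}:{identifier}".encode("utf-8")).hexdigest()
    # Open in text mode so the file is decompressed lazily, line by line.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            # Rows that are too short or do not match (including any
            # header row) are skipped.
            if len(row) >= 4 and row[0] == target:
                return row[3]
    return None
```

A linear scan like this is slow for repeated queries; for many lookups it would likely be better to load the rows into a database or key-value store indexed on the hash column.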