1134693
doi
10.5281/zenodo.1134693
oai:zenodo.org:1134693
Altman, Russ B.
Stanford University
A global network of biomedical relationships derived from text
Percha, Bethany
Icahn School of Medicine at Mount Sinai
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
natural language processing
Medline
text mining
relation extraction
unsupervised learning
<p>This repository contains labeled, weighted networks of chemical-gene, gene-gene, gene-disease, and chemical-disease relationships based on single sentences in PubMed abstracts. All raw dependency paths are provided in addition to the labeled relationships.</p>
<p>PART I: Connects dependency paths to labels, or "themes". Each record contains a dependency path followed by its score for each theme, and indicators of whether the path is part of the flagship path set for each theme (meaning that it was manually reviewed and determined to reflect that theme). The themes are listed below and described in our paper (see References below).</p>
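<p>As an illustration of the Part I record layout, here is a minimal Python sketch of a reader. It rests on assumptions that are not stated in this description and should be checked against the actual files: that the file is tab-separated, that the first line is a header naming the columns, and that each flagship indicator column is named after its theme with a ".ind" suffix. The names PathThemes and parse_part1 are hypothetical.</p>

```python
from typing import Dict, Iterable, List, NamedTuple


class PathThemes(NamedTuple):
    """One Part I record: a dependency path with its per-theme data."""
    path: str
    scores: Dict[str, float]    # theme code -> score
    flagship: Dict[str, bool]   # theme code -> is a flagship path


def parse_part1(lines: Iterable[str], ind_suffix: str = ".ind") -> List[PathThemes]:
    """Parse Part I lines (header first) into PathThemes records.

    ASSUMPTIONS: tab-separated fields, a header row, and indicator
    columns named "<theme><ind_suffix>". Verify against the real file.
    """
    it = iter(lines)
    header = next(it).rstrip("\n").split("\t")
    columns = header[1:]  # first column is assumed to hold the path
    records = []
    for line in it:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != len(header):
            continue  # skip blank or malformed lines
        scores: Dict[str, float] = {}
        flagship: Dict[str, bool] = {}
        for name, value in zip(columns, fields[1:]):
            if name.endswith(ind_suffix):
                flagship[name[: -len(ind_suffix)]] = float(value) != 0
            else:
                scores[name] = float(value)
        records.append(PathThemes(fields[0], scores, flagship))
    return records


def top_theme(record: PathThemes) -> str:
    """Return the theme code with the highest score for this path."""
    return max(record.scores, key=record.scores.get)
```

<p>Usage would look like parse_part1(open("part-i-chemical-disease-path-theme-distributions.txt", encoding="utf-8")). Since these files are several hundred megabytes, a streaming variant that yields records one at a time may be preferable in practice.</p>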
<p>PART II: Connects sentences to dependency paths. It consists of sentences and associated metadata, entity pairs found in the sentences, and dependency paths connecting those entity pairs. Each record contains the following information:</p>
<ul>
<li>PubMed ID</li>
<li>Sentence number (0 = title)</li>
<li>First entity name, formatted</li>
<li>First entity name, location (characters from start of abstract)</li>
<li>Second entity name, formatted</li>
<li>Second entity name, location</li>
<li>First entity name, raw string</li>
<li>Second entity name, raw string</li>
<li>First entity name, database ID(s)</li>
<li>Second entity name, database ID(s)</li>
<li>First entity type (Chemical, Gene, Disease)</li>
<li>Second entity type (Chemical, Gene, Disease)</li>
<li>Dependency path</li>
<li>Sentence, tokenized</li>
</ul>
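<p>The field list above maps naturally onto a small record type. The following Python sketch assumes that each Part II record is a single line with exactly these 14 fields, in this order, separated by tabs; the delimiter and the class and function names (SentenceRecord, parse_part2_line) are assumptions made for illustration. Entity locations and database IDs are kept as raw strings because their exact format is not specified here.</p>

```python
from dataclasses import dataclass

N_FIELDS = 14  # number of fields listed in the description above


@dataclass
class SentenceRecord:
    pubmed_id: str
    sentence_number: int      # 0 = title
    entity1_name: str         # formatted name
    entity1_location: str     # characters from start of abstract
    entity2_name: str
    entity2_location: str
    entity1_raw: str          # raw string as found in the text
    entity2_raw: str
    entity1_db_ids: str       # database ID(s)
    entity2_db_ids: str
    entity1_type: str         # Chemical, Gene, or Disease
    entity2_type: str
    dependency_path: str
    sentence: str             # tokenized sentence


def parse_part2_line(line: str) -> SentenceRecord:
    """Parse one Part II line. ASSUMPTION: fields are tab-separated."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != N_FIELDS:
        raise ValueError(f"expected {N_FIELDS} fields, got {len(fields)}")
    values = list(fields)
    values[1] = int(values[1])  # sentence number is the only numeric field here
    return SentenceRecord(*values)
```

<p>Because the plain ".txt" files run to several gigabytes, it is probably best to iterate over the file line by line and call parse_part2_line on each line rather than loading everything into memory.</p>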
<p>The "with-themes.txt" files contain only the dependency paths that have corresponding theme assignments in Part I. The plain ".txt" files contain all dependency paths.</p>
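<p>The relationship between the two kinds of Part II files can be sketched as a filter: keep only the records whose dependency path also appears in Part I. This Python sketch is not the procedure used to build the release. It assumes tab-separated files, a header row in Part I, and the dependency path in the 13th Part II field (following the field order listed above). It also lowercases paths on both sides before comparing, which is a precaution and may not be needed. The function names are hypothetical.</p>

```python
from typing import Iterable, Iterator, Set


def paths_with_themes(part1_lines: Iterable[str]) -> Set[str]:
    """Collect the dependency paths present in a Part I file.

    ASSUMPTIONS: tab-separated, header on the first line, path in the
    first column. Paths are lowercased for comparison.
    """
    it = iter(part1_lines)
    next(it, None)  # skip the assumed header row
    return {
        line.split("\t", 1)[0].strip().lower()
        for line in it
        if line.strip()
    }


def keep_themed(
    part2_lines: Iterable[str],
    themed_paths: Set[str],
    path_column: int = 12,  # ASSUMPTION: 13th field, zero-based index 12
) -> Iterator[str]:
    """Yield only the Part II lines whose path has a Part I entry."""
    for line in part2_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > path_column and fields[path_column].lower() in themed_paths:
            yield line
```

<p>Both functions accept any iterable of lines, so open file handles can be passed directly and the large Part II file is streamed rather than loaded at once.</p>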
<p>This release contains the annotated network for the <strong>November 13, 2017 version of PubTator</strong>. The version discussed in our paper (see References below) is an older one, from April 30, 2016; that network can be found in Version 1 of this repository. We will release updated networks periodically, as the PubTator community continues to publish new versions of its named entity annotations for Medline roughly every month.</p>
<p>------------------------------------------------------------------------------------<br>
REFERENCES</p>
<p>Percha B, Altman RB (2017) A global network of biomedical relationships derived from text. (Submitted to <em>Bioinformatics</em>; currently in revision.)<br>
Percha B, Altman RB (2015) Learning the structure of biomedical relationships from unstructured text. <em>PLoS Computational Biology</em>, 11(7): e1004216.</p>
<p>This project depends on named entity annotations from the PubTator project:<br>
https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/PubTator/</p>
<p>Reference:<br>
Wei CH et al. (2013) PubTator: a Web-based text mining tool for assisting biocuration. <em>Nucleic Acids Research,</em> 41(W1): W518-W522. doi: 10.1093/nar/gkt441</p>
<p>Dependency parsing was provided by the Stanford CoreNLP toolkit:<br>
https://stanfordnlp.github.io/CoreNLP/index.html</p>
<p>Reference:<br>
Manning, Christopher D., Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP Natural Language Processing Toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55-60.</p>
<p>------------------------------------------------------------------------------------<br>
THEMES</p>
<p><strong>chemical-gene</strong><br>
(A+) agonism, activation<br>
(A-) antagonism, blocking<br>
(B) binding, ligand (esp. receptors)<br>
(E+) increases expression/production<br>
(E-) decreases expression/production<br>
(E) affects expression/production (neutral)<br>
(N) inhibits</p>
<p><strong>gene-chemical</strong><br>
(O) transport, channels<br>
(K) metabolism, pharmacokinetics<br>
(Z) enzyme activity</p>
<p><strong>chemical-disease</strong><br>
(T) treatment/therapy (including investigatory)<br>
(C) inhibits cell growth (esp. cancers)<br>
(Sa) side effect/adverse event<br>
(Pr) prevents, suppresses<br>
(Pa) alleviates, reduces<br>
(J) role in disease pathogenesis</p>
<p><strong>disease-chemical</strong><br>
(Mp) biomarkers (of disease progression)</p>
<p><strong>gene-disease</strong><br>
(U) causal mutations<br>
(Ud) mutations affecting disease course<br>
(D) drug targets<br>
(J) role in pathogenesis<br>
(Te) possible therapeutic effect<br>
(Y) polymorphisms alter risk<br>
(G) promotes progression</p>
<p><strong>disease-gene</strong><br>
(Md) biomarkers (diagnostic)<br>
(X) overexpression in disease<br>
(L) improper regulation linked to disease</p>
<p><strong>gene-gene</strong><br>
(B) binding, ligand (esp. receptors)<br>
(W) enhances response<br>
(V+) activates, stimulates<br>
(E+) increases expression/production<br>
(E) affects expression/production (neutral)<br>
(I) signaling pathway<br>
(H) same protein or complex<br>
(Rg) regulation<br>
(Q) production by cell population</p>
Zenodo
2018-01-09
info:eu-repo/semantics/other
1035252
1579893956.955567
6128099520
md5:7a64e65bf4c8a649fe3b5b683230dcfd
https://zenodo.org/records/1134693/files/part-ii-dependency-paths-chemical-disease-sorted.txt
419345650
md5:d3785afb6e7c7b299a72d17e4cdaeabc
https://zenodo.org/records/1134693/files/part-i-chemical-disease-path-theme-distributions.txt
1588172931
md5:6dd060636c07a0bdad618829a681ddd3
https://zenodo.org/records/1134693/files/part-ii-dependency-paths-gene-gene-sorted-with-themes.txt
12027228868
md5:6290f348a9e69996946fd791d83854c6
https://zenodo.org/records/1134693/files/part-ii-dependency-paths-gene-gene-sorted.txt
175849620
md5:dc3b5030dc97dc9e0eeec5831d4472ed
https://zenodo.org/records/1134693/files/part-i-chemical-gene-path-theme-distributions.txt
1338062178
md5:b106988548d752a3c1b736a69dcdf248
https://zenodo.org/records/1134693/files/part-ii-dependency-paths-gene-disease-sorted-with-themes.txt
435637994
md5:d3786e10653360f8302dca8265fdd612
https://zenodo.org/records/1134693/files/part-i-gene-disease-path-theme-distributions.txt
4808161657
md5:7a7903ef58c8c8790e461d75612231f6
https://zenodo.org/records/1134693/files/part-ii-dependency-paths-gene-disease-sorted.txt
355704154
md5:c703b1c533f34a190ff562fe66b34398
https://zenodo.org/records/1134693/files/part-i-gene-gene-path-theme-distributions.txt
614102049
md5:89f88fcd75e8ca6b30c55adeae2696eb
https://zenodo.org/records/1134693/files/part-ii-dependency-paths-chemical-gene-sorted-with-themes.txt
3729806868
md5:d4455a28bf1a55cf36a299c3e427d9c2
https://zenodo.org/records/1134693/files/part-ii-dependency-paths-chemical-gene-sorted.txt
1681292666
md5:02d731b1250ac3a12546a3597aded2da
https://zenodo.org/records/1134693/files/part-ii-dependency-paths-chemical-disease-sorted-with-themes.txt
public
10.5281/zenodo.1035252
isVersionOf
doi