Dataset Open Access

Drug Indications Extracted from FAERS

Stupp, Gregory S; Su, Andrew I


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">faers, drug indications, fda, drugs</subfield>
  </datafield>
  <controlfield tag="005">20200124192508.0</controlfield>
  <controlfield tag="001">1436000</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">The Scripps Research Institute</subfield>
    <subfield code="0">(orcid)0000-0002-9859-4104</subfield>
    <subfield code="a">Su, Andrew I</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">242342</subfield>
    <subfield code="z">md5:4c9d12fcbd78257622a8be91504cd741</subfield>
    <subfield code="u">https://zenodo.org/record/1436000/files/faers_indications.csv</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2018-09-28</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="o">oai:zenodo.org:1436000</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">The Scripps Research Institute</subfield>
    <subfield code="0">(orcid)0000-0002-0644-7212</subfield>
    <subfield code="a">Stupp, Gregory S</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Drug Indications Extracted from FAERS</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">2R01GM089820-06</subfield>
    <subfield code="a">Gene Wiki: expanding the ecosystem of community-intelligence resources</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/publicdomain/zero/1.0/legalcode</subfield>
    <subfield code="a">Creative Commons Zero v1.0 Universal</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;This dataset contains drug indications extracted from the&amp;nbsp;FDA Adverse Event Reporting System (&lt;a href="https://www.fda.gov/drugs/guidancecomplianceregulatoryinformation/surveillance/adversedrugeffects/"&gt;FAERS&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Source code here:&amp;nbsp;&lt;a href="https://github.com/stuppie/faers"&gt;https://github.com/stuppie/faers&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Method Outline&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;Data files are extracted from zip files, parsed from csvs, and imported into a MySQL database (see&amp;nbsp;parser.py).&lt;/li&gt;
	&lt;li&gt;Records are then de-duplicated by keeping only the most recent version for each case ID (see&amp;nbsp;dedupe.py).&lt;/li&gt;
	&lt;li&gt;Indications are normalized to UMLS terms by string matching. Cross-references to Human Phenotype Ontology are pulled from UMLS, and xrefs to Monarch Disease Ontology (MONDO) are pulled from MONDO using the UMLS xrefs. (See&amp;nbsp;normalize_indications.py)&lt;/li&gt;
	&lt;li&gt;Drug names are first normalized by applying a few simple string-cleaning operations (strip, fix slashes and periods). They are then matched to RxNorm by exact string matching. Those that don&amp;#39;t match are run against RxNorm&amp;#39;s &lt;a href="https://rxnav.nlm.nih.gov/RxNormAPIs.html#uLink=RxNorm_REST_getApproximateMatch"&gt;approximate matching service&lt;/a&gt; and are accepted if the score is higher than&amp;nbsp;67/100. The matched RxNorm CUIs are then mapped to their ingredient-level RxNorm IDs. (See&amp;nbsp;normalize_drugs.py)&lt;/li&gt;
	&lt;li&gt;Indications are then retrieved for each drug ingredient and filtered to require a minimum of 20 individual occurrences. (See&amp;nbsp;get_indications.py)&lt;/li&gt;
&lt;/ul&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.1435999</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.1436000</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>
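
The method outline in the description above names three concrete rules: keep the most recent version per case ID, accept approximate drug-name matches only above a score of 67/100, and keep indications seen at least 20 times. The following is a minimal Python sketch of those rules only, not the code from the linked repository. The record layout (`case_id`, `version`), the function names, and the use of a version number as the measure of "most recent" are all assumptions made for illustration.

```python
from collections import Counter

# Thresholds taken from the dataset description.
MIN_SCORE = 67        # approximate matches must score higher than this (out of 100)
MIN_OCCURRENCES = 20  # minimum occurrences required for an indication

def dedupe_latest(reports):
    """Keep one report per case ID: the one with the highest version.

    Assumption: each report is a dict with hypothetical keys
    'case_id' and 'version', and a higher version is more recent.
    """
    latest = {}
    for report in reports:
        current = latest.get(report["case_id"])
        if current is None or report["version"] > current["version"]:
            latest[report["case_id"]] = report
    return list(latest.values())

def accept_match(score):
    """Accept an approximate match only if its score is strictly above 67."""
    return score > MIN_SCORE

def filter_indications(pairs, min_count=MIN_OCCURRENCES):
    """Count (ingredient, indication) pairs and keep those seen at least min_count times."""
    counts = Counter(pairs)
    return {pair: n for pair, n in counts.items() if n >= min_count}
```

A possible usage, with made-up identifiers: `filter_indications([("ingredient_a", "indication_x")] * 25)` keeps the pair with a count of 25, while a pair seen only 19 times would be dropped.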