Poster Open Access
Sara Lafia;
Elizabeth Moss;
Andrea Thomer;
Libby Hemphill
<?xml version='1.0' encoding='UTF-8'?> <record xmlns="http://www.loc.gov/MARC21/slim"> <leader>00000nam##2200000uu#4500</leader> <datafield tag="041" ind1=" " ind2=" "> <subfield code="a">eng</subfield> </datafield> <datafield tag="653" ind1=" " ind2=" "> <subfield code="a">data archive</subfield> </datafield> <datafield tag="653" ind1=" " ind2=" "> <subfield code="a">data citation</subfield> </datafield> <datafield tag="653" ind1=" " ind2=" "> <subfield code="a">data discovery</subfield> </datafield> <datafield tag="653" ind1=" " ind2=" "> <subfield code="a">natural language processing</subfield> </datafield> <datafield tag="653" ind1=" " ind2=" "> <subfield code="a">scholarly communication</subfield> </datafield> <controlfield tag="005">20211203014829.0</controlfield> <controlfield tag="001">5748382</controlfield> <datafield tag="700" ind1=" " ind2=" "> <subfield code="u">University of Michigan, USA</subfield> <subfield code="0">(orcid)0000-0001-5464-8716</subfield> <subfield code="a">Elizabeth Moss</subfield> </datafield> <datafield tag="700" ind1=" " ind2=" "> <subfield code="u">University of Michigan, USA</subfield> <subfield code="0">(orcid)0000-0001-6238-3498</subfield> <subfield code="a">Andrea Thomer</subfield> </datafield> <datafield tag="700" ind1=" " ind2=" "> <subfield code="u">University of Michigan, USA</subfield> <subfield code="0">(orcid)0000-0002-3793-7281</subfield> <subfield code="a">Libby Hemphill</subfield> </datafield> <datafield tag="856" ind1="4" ind2=" "> <subfield code="s">956706</subfield> <subfield code="z">md5:aaf439007aaccb73ff1d202725191003</subfield> <subfield code="u">https://zenodo.org/record/5748382/files/FORCE-11-poster-2021.pdf</subfield> </datafield> <datafield tag="542" ind1=" " ind2=" "> <subfield code="l">open</subfield> </datafield> <datafield tag="260" ind1=" " ind2=" "> <subfield code="c">2021-12-01</subfield> </datafield> <datafield tag="909" ind1="C" ind2="O"> <subfield code="p">openaire</subfield> <subfield 
code="p">user-force2021</subfield> <subfield code="o">oai:zenodo.org:5748382</subfield> </datafield> <datafield tag="100" ind1=" " ind2=" "> <subfield code="u">University of Michigan, USA</subfield> <subfield code="0">(orcid)0000-0002-5896-7295</subfield> <subfield code="a">Sara Lafia</subfield> </datafield> <datafield tag="245" ind1=" " ind2=" "> <subfield code="a">Detecting Informal Data Use in Literature</subfield> </datafield> <datafield tag="980" ind1=" " ind2=" "> <subfield code="a">user-force2021</subfield> </datafield> <datafield tag="540" ind1=" " ind2=" "> <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield> <subfield code="a">Creative Commons Attribution 4.0 International</subfield> </datafield> <datafield tag="650" ind1="1" ind2="7"> <subfield code="a">cc-by</subfield> <subfield code="2">opendefinition.org</subfield> </datafield> <datafield tag="520" ind1=" " ind2=" "> <subfield code="a"><p>The Inter-university Consortium for Political and Social Research (ICPSR) is developing a computational approach to detect informal data use and construct reliable data impact metrics. Formal data citations that use unique identifiers are readily discoverable; however, informal references made to data are challenging to infer and detect as they are described in many ways and tend to occur in article footnotes, tables, figures, or elsewhere where they are not indexed for search. Identifying data citations is an essential step toward characterizing the impact of research data (i.e., who reuses research data and for what purposes). We use features of text including the presence of indicator terms, sections of articles, and frequency of acronyms, to predict the portions of articles that are likely to indicate data use. We then use a natural language processing (NLP) pipeline to extract candidate data references. 
In production, our model will support the review of publications to ingest into the ICPSR Bibliography of Data-related Literature as part of a broader effort to measure the impact of research data.</p></subfield> </datafield> <datafield tag="773" ind1=" " ind2=" "> <subfield code="n">doi</subfield> <subfield code="i">isVersionOf</subfield> <subfield code="a">10.5281/zenodo.5748381</subfield> </datafield> <datafield tag="024" ind1=" " ind2=" "> <subfield code="a">10.5281/zenodo.5748382</subfield> <subfield code="2">doi</subfield> </datafield> <datafield tag="980" ind1=" " ind2=" "> <subfield code="a">poster</subfield> </datafield> </record>
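The abstract names three kinds of text features (indicator terms, article sections, and acronym frequency) that are used to predict which portions of an article indicate data use. The Python sketch below shows one way such features could be computed for a single passage. It is not the ICPSR implementation: the term list, the section names, the function names, and the threshold rule are all hypothetical stand-ins, and the rule merely takes the place of whatever trained model the authors use.

```python
import re

# Hypothetical indicator terms; the record does not list the actual vocabulary.
INDICATOR_TERMS = ("data", "dataset", "survey", "sample", "respondents", "codebook")

# Hypothetical set of sections where data use is assumed to be more likely.
DATA_SECTIONS = {"methods", "data", "acknowledgments", "footnotes"}

ACRONYM_RE = re.compile(r"\b[A-Z]{2,}\b")  # runs of two or more capital letters
WORD_RE = re.compile(r"[A-Za-z]+")


def extract_features(passage: str, section: str) -> dict:
    """Compute the three feature types named in the abstract for one passage."""
    words = [w.lower() for w in WORD_RE.findall(passage)]
    n_words = max(len(words), 1)  # avoid division by zero on empty passages
    acronyms = ACRONYM_RE.findall(passage)
    return {
        "indicator_count": sum(1 for w in words if w in INDICATOR_TERMS),
        "in_data_section": section.lower() in DATA_SECTIONS,
        "acronym_rate": len(acronyms) / n_words,
        "acronyms": sorted(set(acronyms)),
    }


def likely_data_use(features: dict) -> bool:
    """Toy threshold rule (an assumption) standing in for a trained classifier."""
    score = features["indicator_count"]
    score += 1 if features["in_data_section"] else 0
    score += 1 if features["acronyms"] else 0
    return score >= 2


if __name__ == "__main__":
    feats = extract_features(
        "We analyzed survey data from the ANES and the GSS.", "Methods"
    )
    print(feats)
    print(likely_data_use(feats))
```

In a fuller pipeline, passages flagged this way would be passed to the NLP step that extracts candidate data references; the acronyms collected here are only a rough proxy for dataset names.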
Metric | All versions | This version |
---|---|---|
Views | 104 | 104 |
Downloads | 54 | 54 |
Data volume | 51.7 MB | 51.7 MB |
Unique views | 91 | 91 |
Unique downloads | 51 | 51 |