Poster Open Access

Leveraging Open Access publishing to fight fake news

Sylvain Massip; Charles Letaillieur

MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="">
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Open Access</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Text-mining</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Fake News</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Fact-checking</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Word2Vec</subfield>
  </datafield>
  <controlfield tag="005">20200430082022.0</controlfield>
  <controlfield tag="001">3776797</controlfield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="d">11-12 March 2020</subfield>
    <subfield code="a">Open Science Conference 2020</subfield>
    <subfield code="c">Berlin, Germany</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Opscidia</subfield>
    <subfield code="a">Charles Letaillieur</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">331692</subfield>
    <subfield code="z">md5:af071b8dfa807b6598798a60ab090ee2</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">104153</subfield>
    <subfield code="z">md5:770433da2aa7974cd4595f9fdb583f83</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="y">Conference website</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2020-04-30</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="p">user-osc2020</subfield>
    <subfield code="o"></subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Opscidia</subfield>
    <subfield code="a">Sylvain Massip</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Leveraging Open Access publishing to fight fake news</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-osc2020</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u"></subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2"></subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;Since the first experiments in Open Access publishing at the end of the 20th century&lt;br&gt;
(arXiv and PLOS, two pioneers of open access distribution of academic articles, were&lt;br&gt;
created in 1991 and 2001, respectively), Open Access has developed tremendously.&lt;/p&gt;

&lt;p&gt;Today, a significant fraction of research is published open access. Estimates put&lt;br&gt;
it as high as 28% [Piwowar, 2018], and it occupies an ever-growing place in the&lt;br&gt;
scientific debate with the adoption, in 2018, of Plan S, which creates a European-level&lt;br&gt;
mandate for Open Access.&lt;/p&gt;

&lt;p&gt;Open access is ethically desirable per se, and there are also many academic, economic and&lt;br&gt;
societal arguments in its favor. These arguments, based on improved&lt;br&gt;
exploitation and reuse of research results, are well described theoretically in the&lt;br&gt;
literature [Tennant, 2017]. Nevertheless, practical demonstrations of the use of Open&lt;br&gt;
Access outside research communities are uncommon, and few have been reported.&lt;br&gt;
The objective of our project is to illustrate the possible uses of Open Access&lt;br&gt;
outside of academia.&lt;/p&gt;

&lt;p&gt;In this study, we will examine how open access combined with the right machine&lt;br&gt;
learning tools can help fight fake news.&lt;/p&gt;

&lt;p&gt;Natural language processing has been revolutionized in recent years by&lt;br&gt;
neural-network-based language models such as Word2Vec [Mikolov, 2013] and BERT&lt;br&gt;
[Devlin, 2018].&lt;/p&gt;

&lt;p&gt;By building spatial representations of the words and concepts used in texts, these models&lt;br&gt;
are able to take into account the meaning of the texts studied. These methods have been&lt;br&gt;
shown to be useful for creating knowledge bases from corpora of texts [Petroni, 2019] in an&lt;br&gt;
unsupervised manner. More specifically, [Tshitoyan, 2019] has shown that these&lt;br&gt;
methods, applied to a scientific corpus in an unsupervised manner, were able to retrieve&lt;br&gt;
the links between concepts that exist in the texts.&lt;/p&gt;

&lt;p&gt;This study will investigate how these principles can be used to build a text-mining&lt;br&gt;
pipeline that indicates whether a scientific claim is backed by the scientific literature or not.&lt;/p&gt;

&lt;p&gt;In this exploratory phase, the following methods will be applied:&lt;/p&gt;

	&lt;li&gt;Data from the Europe PubMed Central database will be used to train a Word2Vec model.&lt;/li&gt;
	&lt;li&gt;Claims will be restricted to health-related questions of the pattern &amp;ldquo;Does X cure/cause/prevent Y?&amp;rdquo;.&lt;/li&gt;
	&lt;li&gt;Claims will then be classified by exploring the links between X, Y and the concept of cure / cause / prevent as learned in the language model.&lt;/li&gt;

&lt;p&gt;The pipeline will be evaluated with claims taken from expert-based scientific&lt;br&gt;
fact-checking networks such as or&lt;/p&gt;

&lt;p&gt;By validating the principle of fact-checking scientific claims with Open Access&lt;br&gt;
literature, we hope to pave the way for improved automatic fact-checking tools, which&lt;br&gt;
will increase the general public's understanding of research results, and to&lt;br&gt;
show the strong impact of open science on society.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.3776796</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.3776797</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">poster</subfield>
  </datafield>
</record>
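
The abstract in the record above outlines a pipeline: restrict claims to the pattern "Does X cure/cause/prevent Y?", then examine the links between X, Y and the relation in a trained Word2Vec space. The following is only a minimal sketch of that idea, not the authors' implementation. The names `parse_claim`, `cosine`, `link_score` and `TOY_VECTORS` are hypothetical, the vectors are hand-made stand-ins for a model trained on Europe PMC data, and the scoring rule (cosine similarity between X + relation and Y) is an assumption chosen for illustration.

```python
import math
import re

# Claims are restricted to the pattern "Does X cure/cause/prevent Y?".
CLAIM_RE = re.compile(
    r"^Does (?P<x>.+?) (?P<rel>cure|cause|prevent) (?P<y>.+?)\?$",
    re.IGNORECASE,
)


def parse_claim(claim):
    """Return (X, relation, Y) in lower case, or None if the pattern does not match."""
    match = CLAIM_RE.match(claim.strip())
    if match is None:
        return None
    return (
        match.group("x").lower(),
        match.group("rel").lower(),
        match.group("y").lower(),
    )


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_score(vectors, x, rel, y):
    """Hypothetical scoring rule: similarity between (X + relation) and Y."""
    combined = [a + b for a, b in zip(vectors[x], vectors[rel])]
    return cosine(combined, vectors[y])


# Toy 3-dimensional vectors, invented for illustration only (not real model output).
TOY_VECTORS = {
    "aspirin": [1.0, 0.0, 0.0],
    "cure": [0.0, 1.0, 0.0],
    "headache": [1.0, 1.0, 0.0],
    "scurvy": [0.0, 0.0, 1.0],
}

if __name__ == "__main__":
    parsed = parse_claim("Does aspirin cure headache?")
    if parsed is not None:
        x, rel, y = parsed
        print(x, rel, y, round(link_score(TOY_VECTORS, x, rel, y), 3))
```

In a real setting the toy dictionary would be replaced by vectors from a model trained on the chosen corpus, and a decision threshold on the score would have to be calibrated against expert fact-checks.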