Dataset Open Access

Past Written Texts Dataset

John Ellul; Marina Polycarpou


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">social media sensing</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">sentiment analysis</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">text-based sentiment analysis</subfield>
  </datafield>
  <controlfield tag="005">20190513135808.0</controlfield>
  <controlfield tag="001">2670061</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Materia Group Cyprus</subfield>
    <subfield code="a">Marina Polycarpou</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Patras</subfield>
    <subfield code="4">cur</subfield>
    <subfield code="a">Evangelia I. Zacharaki</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">2895</subfield>
    <subfield code="z">md5:a793e34e65c4664a72b09d2031e0b3b0</subfield>
    <subfield code="u">https://zenodo.org/record/2670061/files/Social Media Sensing Texts.csv</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2019-05-07</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="o">oai:zenodo.org:2670061</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">University of Patras</subfield>
    <subfield code="a">John Ellul</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Past Written Texts Dataset</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">690140</subfield>
    <subfield code="a">Sensing and predictive treatment of frailty and associated co-morbidities using advanced personalized patient models and advanced interventions</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">http://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;The dataset consists of features extracted from older adults&amp;rsquo; text.&lt;/p&gt;

&lt;p&gt;The texts were either written by the older person in electronic form (e.g. an older e-mail) or written on paper and transcribed by the project&amp;#39;s clinical nurses.&lt;/p&gt;

&lt;p&gt;The texts were then translated to English using the MyMemory service (https://mymemory.translated.net/), and a series of features were generated that can be used for sentiment analysis.&lt;/p&gt;

&lt;p&gt;The list of fields of this dataset is presented below:&lt;/p&gt;

&lt;p&gt;- &lt;strong&gt;Part_id&lt;/strong&gt;: The user ID, which should be a 4-digit number&lt;/p&gt;

&lt;p&gt;- &lt;strong&gt;Date&lt;/strong&gt;: The recording date, which follows the &amp;ldquo;DD-MM-YY&amp;rdquo; format (e.g. 14 September 2017 is formatted as 14-09-17)&lt;/p&gt;

&lt;p&gt;- &lt;strong&gt;Clinical_visit&lt;/strong&gt;: Several clinical evaluations were performed for each older adult; this number indicates which clinical evaluation the measurements refer to&lt;/p&gt;

&lt;p&gt;- &lt;strong&gt;Transcript&lt;/strong&gt;: Whether the text was written by the older adult (0) or transcribed by a nurse (1)&lt;/p&gt;

&lt;p&gt;- &lt;strong&gt;Language&lt;/strong&gt;: The original language of the text (0 = Greek)&lt;/p&gt;

&lt;p&gt;- &lt;strong&gt;Text_length, Number_of_sentences, Number_of_words, Number_of_words_per_sentence, Text_entropy&lt;/strong&gt;: Statistical Measures&lt;/p&gt;

&lt;p&gt;- &lt;strong&gt;Desc_image_ENG_sentiment, Desc_event_sentiment, Prev_text_ENG_sentiment&lt;/strong&gt;: Sentiment Analysis&lt;/p&gt;

&lt;p&gt;- &lt;strong&gt;Tf-XX&lt;/strong&gt;: Term frequency &amp;ndash; Inverse document frequency&lt;/p&gt;

&lt;p&gt;- &lt;strong&gt;Tf-pos-XX&lt;/strong&gt;: Part of Speech analysis, using tf-idf methodology&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.2670060</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.2670061</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>
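
The field conventions given in the record's description (a 4-digit Part_id and a DD-MM-YY Date) can be checked when reading the CSV. The following is a minimal sketch, not the authors' tooling: the helper names `parse_record_date` and `is_valid_part_id` are hypothetical, and it assumes both fields arrive as plain strings.

```python
from datetime import date, datetime


def parse_record_date(value: str) -> date:
    """Parse a Date field written as DD-MM-YY (e.g. 14-09-17)."""
    # %y maps two-digit years 00-68 to 2000-2068, so "17" becomes 2017.
    return datetime.strptime(value.strip(), "%d-%m-%y").date()


def is_valid_part_id(value: str) -> bool:
    """Return True when the Part_id is exactly four ASCII digits."""
    # Hypothetical check based on the "4-digit number" wording in the description.
    return len(value) == 4 and value.isascii() and value.isdigit()


if __name__ == "__main__":
    print(parse_record_date("14-09-17"))  # 2017-09-14
    print(is_valid_part_id("0123"))       # True
```

Keeping Part_id as a string rather than an integer preserves any leading zeros, which matters if the 4-digit rule is enforced.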