Dataset Open Access

Past Written Texts Dataset

John Ellul; Marina Polycarpou


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:contributor>Evangelia I. Zacharaki</dc:contributor>
  <dc:creator>John Ellul</dc:creator>
  <dc:creator>Marina Polycarpou</dc:creator>
  <dc:date>2019-05-07</dc:date>
  <dc:description>The dataset consists of features extracted from texts written by older adults.

The texts were either written by the older adult in electronic form (e.g. an older e-mail) or written on paper and then transcribed by the project's clinical nurses.

The texts were then translated into English using the MyMemory service (https://mymemory.translated.net/), and a series of features was generated for use in sentiment analysis.

The list of fields of this dataset is presented below:

- Part_id: The user ID, which should be a 4-digit number

- Date: The recording date, which follows the “DD-MM-YY” format (e.g. 14 September 2017 is formatted as 14-09-17)

- Clinical_visit: Since several clinical evaluations were performed on each older adult, this number indicates which clinical evaluation the measurements refer to

- Transcript: Whether the text was written by the older adult (0) or transcribed by a nurse (1)

- Language: The original language of the text (0 = Greek)

- Text_length, Number_of_sentences, Number_of_words, Number_of_words_per_sentence, Text_entropy: Statistical Measures

- Desc_image_ENG_sentiment, Desc_event_sentiment, Prev_text_ENG_sentiment: Sentiment Analysis

- Tf-XX: Term frequency-inverse document frequency (tf-idf) features

- Tf-pos-XX: Part-of-speech analysis, using the tf-idf methodology</dc:description>
  <dc:identifier>https://zenodo.org/record/2670061</dc:identifier>
  <dc:identifier>10.5281/zenodo.2670061</dc:identifier>
  <dc:identifier>oai:zenodo.org:2670061</dc:identifier>
  <dc:relation>info:eu-repo/grantAgreement/EC/H2020/690140/</dc:relation>
  <dc:relation>doi:10.5281/zenodo.2670060</dc:relation>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>http://creativecommons.org/licenses/by/4.0/legalcode</dc:rights>
  <dc:subject>social media sensing</dc:subject>
  <dc:subject>sentiment analysis</dc:subject>
  <dc:subject>text-based sentiment analysis</dc:subject>
  <dc:title>Past Written Texts Dataset</dc:title>
  <dc:type>info:eu-repo/semantics/other</dc:type>
  <dc:type>dataset</dc:type>
</oai_dc:dc>
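The field conventions listed in the description (a 4-digit Part_id, a DD-MM-YY Date, and a 0/1 Transcript flag) can be checked when loading a row. The following is a minimal sketch under those stated conventions only; the function names are hypothetical and are not part of the dataset or of any published loader.

```python
from datetime import date, datetime

# Meaning of the Transcript flag, as given in the dataset description.
TRANSCRIPT_LABELS = {
    0: "written by the older adult",
    1: "transcribed by a nurse",
}


def parse_record_date(value: str) -> date:
    """Parse a Date field in the DD-MM-YY format (e.g. '14-09-17')."""
    # %y is a two-digit year; Python maps 00-68 to 2000-2068.
    return datetime.strptime(value, "%d-%m-%y").date()


def is_valid_part_id(value: str) -> bool:
    """Return True if the Part_id is exactly four ASCII digits."""
    return len(value) == 4 and value.isascii() and value.isdigit()


def describe_transcript(flag: int) -> str:
    """Translate the Transcript flag (0 or 1) into its description."""
    if flag not in TRANSCRIPT_LABELS:
        raise ValueError(f"unexpected Transcript value: {flag!r}")
    return TRANSCRIPT_LABELS[flag]
```

For instance, `parse_record_date("14-09-17")` yields 14 September 2017, matching the example in the description. Treating Part_id as a string rather than an integer is an assumption here, chosen so that any leading zeros survive.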
                  All versions   This version
Views             65             65
Downloads         38             38
Data volume       110.0 kB       110.0 kB
Unique views      50             50
Unique downloads  31             31
