Dataset Open Access

Past Written Texts Dataset

John Ellul; Marina Polycarpou

Data curator(s)
Evangelia I. Zacharaki

The dataset consists of features extracted from texts written by older adults.

The texts were written by the older adults either in electronic form (e.g. an older e-mail) or on paper, and were transcribed by the project's clinical nurses.

The texts were then translated to English using the MyMemory service (https://mymemory.translated.net/), and a series of features were generated that can be used for sentiment analysis.

The list of fields of this dataset is presented below:

- Part_id: The user ID, which should be a 4-digit number

- Date: The recording date, in “DD-MM-YY” format (e.g. 14 September 2017 is formatted as 14-09-17)

- Clinical_visit: Because several clinical evaluations were performed for each older adult, this number indicates which clinical evaluation the measurements refer to

- Transcript: Whether the text was written by the older adult (0) or transcribed by a nurse (1)

- Language: The original language of the text (0 = Greek)

- Text_length, Number_of_sentences, Number_of_words, Number_of_words_per_sentence, Text_entropy: Statistical Measures

- Desc_image_ENG_sentiment, Desc_event_sentiment, Prev_text_ENG_sentiment: Sentiment Analysis

- Tf-XX: Term frequency – Inverse document frequency

- Tf-pos-XX: Part of Speech analysis, using tf-idf methodology
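The field descriptions above can be turned into a small validation step when reading the CSV. The following is a minimal sketch, not the project's own loader: it assumes the CSV header uses exactly the field names listed above, the function name `parse_row` is hypothetical, and the sample row is made up for illustration and is not taken from the dataset.

```python
import csv
import io
from datetime import datetime

# Only the code 0 (Greek) is documented in the field list.
LANGUAGES = {0: "Greek"}


def parse_row(row: dict) -> dict:
    """Validate and convert the identifying fields of one CSV row."""
    part_id = row["Part_id"].strip()
    # Part_id should be a 4-digit number; keep it as a string to
    # preserve any leading zeros.
    if not (len(part_id) == 4 and part_id.isascii() and part_id.isdigit()):
        raise ValueError(f"Part_id should be a 4-digit number, got {part_id!r}")

    # Date follows DD-MM-YY, e.g. 14-09-17 for 14 September 2017.
    recorded = datetime.strptime(row["Date"].strip(), "%d-%m-%y").date()

    # Transcript: 0 = written by the older adult, 1 = transcribed by a nurse.
    transcript = int(row["Transcript"])
    if transcript not in (0, 1):
        raise ValueError(f"Transcript should be 0 or 1, got {transcript}")

    language_code = int(row["Language"])
    return {
        "part_id": part_id,
        "date": recorded,
        "clinical_visit": int(row["Clinical_visit"]),
        "transcribed_by_nurse": transcript == 1,
        "language": LANGUAGES.get(language_code, f"unknown ({language_code})"),
    }


# Made-up sample row, for illustration only (not real data).
SAMPLE = (
    "Part_id,Date,Clinical_visit,Transcript,Language\n"
    "1234,14-09-17,2,1,0\n"
)

rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
print(rows[0]["date"].isoformat())
```

To read the real file, replace `io.StringIO(SAMPLE)` with an open file handle; the remaining feature columns are left untouched by this sketch.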

Files (2.9 kB)
Name: Social Media Sensing Texts.csv
Size: 2.9 kB
md5:a793e34e65c4664a72b09d2031e0b3b0
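The md5 checksum in the file listing can be used to confirm that a downloaded copy is intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `verify_download` are hypothetical, and the local path assumes the file was saved under its listed name in the current directory.

```python
import hashlib
from pathlib import Path

# Checksum as published in the file listing.
EXPECTED_MD5 = "a793e34e65c4664a72b09d2031e0b3b0"


def md5_of_file(path, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected: str = EXPECTED_MD5) -> bool:
    """Compare a file's MD5 digest with the expected checksum."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the CSV was saved.
    csv_path = Path("Social Media Sensing Texts.csv")
    if csv_path.exists():
        print("checksum OK" if verify_download(csv_path) else "checksum mismatch")
```

MD5 is suitable here only as an integrity check against accidental corruption, which is the purpose the listing serves.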