Dataset Open Access

Past Written Texts Dataset

John Ellul; Marina Polycarpou

Data curator(s)
Evangelia I. Zacharaki

The dataset consists of features extracted from older adults’ text.

The texts were either written by the older adults themselves in electronic form (e.g. an older e-mail) or written on paper and then transcribed by the project's clinical nurses.

The texts were then translated to English using the MyMemory service (https://mymemory.translated.net/), and a series of features were generated that can be used for sentiment analysis.

The list of fields of this dataset is presented below:

- Part_id: The user ID, which should be a 4-digit number

- Date: The recording date, in “DD-MM-YY” format (e.g. 14 September 2017 is written as 14-09-17)

- Clinical_visit: Several clinical evaluations were performed for each older adult; this number indicates which clinical evaluation the measurements refer to

- Transcript: Whether the text was written by the older adult (0) or transcribed by a nurse (1)

- Language: The original language of the text (0 = Greek)

- Text_length, Number_of_sentences, Number_of_words, Number_of_words_per_sentence, Text_entropy: Statistical Measures

- Desc_image_ENG_sentiment, Desc_event_sentiment, Prev_text_ENG_sentiment: Sentiment Analysis

- Tf-XX: Term frequency – Inverse document frequency

- Tf-pos-XX: Part-of-speech analysis, using the tf-idf methodology
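
The field descriptions above can be turned into a small parsing routine. The following is a minimal sketch using only the Python standard library. It assumes the CSV is comma-delimited and that its header uses the field names exactly as listed; neither is confirmed by this record. The sample rows and the `parse_row` helper are hypothetical and serve only as illustration.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows; the values are invented for illustration
# and are NOT taken from the published file.
SAMPLE = """Part_id,Date,Clinical_visit,Transcript,Language
1234,14-09-17,1,0,0
5678,02-03-18,2,1,0
"""

# Codes documented in the field list.
TRANSCRIPT_LABELS = {"0": "written by older adult", "1": "transcribed by nurse"}
LANGUAGE_LABELS = {"0": "Greek"}  # only code 0 is documented


def parse_row(row):
    """Validate and decode the identifying fields of one CSV row."""
    part_id = row["Part_id"].strip()
    # Part_id should be a 4-digit number.
    if not (len(part_id) == 4 and part_id.isascii() and part_id.isdigit()):
        raise ValueError(f"Part_id is not a 4-digit number: {part_id!r}")
    return {
        "part_id": part_id,
        # "DD-MM-YY" corresponds to the strptime pattern %d-%m-%y.
        "date": datetime.strptime(row["Date"].strip(), "%d-%m-%y").date(),
        "clinical_visit": int(row["Clinical_visit"]),
        "transcript": TRANSCRIPT_LABELS[row["Transcript"].strip()],
        # Undocumented language codes are reported as "unknown".
        "language": LANGUAGE_LABELS.get(row["Language"].strip(), "unknown"),
    }


records = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for record in records:
    print(record)
```

To read the real file, `io.StringIO(SAMPLE)` would be replaced by an opened file handle; the delimiter and encoding may need adjusting once the actual file has been inspected.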

Files (2.9 kB)
Name: Social Media Sensing Texts.csv
Size: 2.9 kB
md5: a793e34e65c4664a72b09d2031e0b3b0
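
The md5 checksum listed for the file can be used to check that a downloaded copy is intact. The sketch below uses Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_published_checksum` are hypothetical, and the local path passed to them is an assumption about where the file was saved.

```python
import hashlib

# Checksum published on this record for "Social Media Sensing Texts.csv".
EXPECTED_MD5 = "a793e34e65c4664a72b09d2031e0b3b0"


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal md5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path):
    """True if the file's md5 digest equals the published checksum."""
    return md5_of_file(path) == EXPECTED_MD5
```

For example, `matches_published_checksum("Social Media Sensing Texts.csv")` should return `True` for an unmodified download saved under that name in the working directory.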
