1479354
doi
10.5281/zenodo.1479354
oai:zenodo.org:1479354
Rodríguez-Vidal, Javier
UNED
Plaza, Laura
UNED
eDiseases Dataset
Carrillo-de-Albornoz, Jorge
UNED
info:eu-repo/semantics/openAccess
Creative Commons Attribution Non Commercial 4.0 International
https://creativecommons.org/licenses/by-nc/4.0/legalcode
Polarity
Factuality
Health
Social Networks
Patient data
<p>The eDiseases dataset contains patient data from the MedHelp health site (http://www.medhelp.org/), where different communities share information and opinions about diseases. Each community consists of a number of conversations, and each conversation is a sequence of comments posted by patients.</p>
<p>To build the dataset, we automatically extracted 10 conversations from each of the following three communities: allergies, Crohn's disease, and breast cancer. We selected a set of diseases that, according to medical experts, show high heterogeneity in both the degree of medical understanding of the disease and the profile of the users. The conversations were selected randomly, but we automatically filtered out conversations with fewer than 10 posts. In total, we extracted 146 posts for allergies, 191 posts for Crohn's disease, and 142 posts for breast cancer, which include 983 sentences for allergies, 1,780 sentences for Crohn's disease, and 1,029 sentences for breast cancer, covering a six-year interval. Three frequent users of health forums annotated each sentence in the dataset along two dimensions:</p>
<p>Factuality: OPINION, FACT, EXPERIENCE.<br>
Polarity: POSITIVE, NEUTRAL, NEGATIVE.</p>
<p>In case of doubt, the annotators labeled the sentence as NOT_LABELED. As a result, we collected 967 labeled sentences for allergies, 1,709 labeled sentences for Crohn's disease, and 959 labeled sentences for breast cancer.</p>
Zenodo
2018-11-07
info:eu-repo/semantics/other
1479353
2.0
1579893945.940353
105459
md5:29ea0e06b236a4c75e5deb451e4c6aaf
https://zenodo.org/records/1479354/files/eDiseases-dataset-V2.0.rar
public
10.5281/zenodo.1479353
isVersionOf
doi