Published August 10, 2025 | Version v1
Dataset Open

Hillary Clinton's Email-Reply Dataset

Authors/Creators

Description

This dataset contains paired email messages: in each pair, the first message is the original email and the second is its reply. It was created in 2023 as part of my Master’s thesis in Internet systems engineering at the National University of Science and Technology POLITEHNICA Bucharest, with the goal of enabling research on automated email reply generation. It was derived from the publicly available "Hillary Clinton's Emails" dataset from the US State Department, which is available on Kaggle (https://www.kaggle.com/datasets/kaggle/hillary-clinton-emails/data?select=Emails.csv).

The dataset consists of 941 English email-reply pairs in CSV format.

Fields:

EmailSend - text of the original email sent
EmailReply - text of the response email
SubjectSend - subject of the original email sent
SubjectReply - subject of the response email
From - the person who sent the original email
To - the person who received the original email and wrote the reply
Context - the previous lines of the conversation between the two people
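
As an illustration of how the fields above might be read, here is a minimal sketch using Python's standard csv module. It assumes the CSV file has a header row with exactly the field names listed above; this record does not state the name of the CSV file inside the archive or its encoding, so the function works on any open text stream. The helper name read_pairs and the sample row are hypothetical and not taken from the dataset.

```python
import csv
import io

# Column names as listed in the field description above.
FIELDS = ["EmailSend", "EmailReply", "SubjectSend", "SubjectReply",
          "From", "To", "Context"]


def read_pairs(stream):
    """Read email-reply pairs from an open CSV text stream into a list of dicts.

    Assumes the first row is a header containing the names in FIELDS.
    """
    reader = csv.DictReader(stream)
    header = reader.fieldnames or []
    missing = [name for name in FIELDS if name not in header]
    if missing:
        raise ValueError(f"CSV is missing expected columns: {missing}")
    # Keep only the documented fields, in the documented order.
    return [{name: row[name] for name in FIELDS} for row in reader]


if __name__ == "__main__":
    # Made-up placeholder row for demonstration; not taken from the dataset.
    sample = io.StringIO(newline="")
    writer = csv.DictWriter(sample, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerow({
        "EmailSend": "Can we move the call to 3pm?",
        "EmailReply": "Yes, 3pm works.",
        "SubjectSend": "Call",
        "SubjectReply": "Re: Call",
        "From": "Sender",
        "To": "Recipient",
        "Context": "",
    })
    sample.seek(0)
    pairs = read_pairs(sample)
    print(len(pairs), pairs[0]["SubjectReply"])
```

To read the actual data, the extracted CSV file could be passed in as open(path, newline="", encoding="utf-8"), where path is wherever the file was unpacked and the UTF-8 encoding is an assumption to verify against the file.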

Files

hillary-clinton-email-reply-dataset.zip

Files (145.9 kB)

md5:a99d87adf0fe0be7c8fe63812d75831a

Additional details