There is a newer version of the record available.

Published May 14, 2017 | Version v1
Dataset Open

Hansard Speeches and Sentiment V2.0

Creators

Description

A public dataset of speeches recorded in Hansard. The dataset covers every speech made in the House of Commons between the return of parliament after the 1979 general election and the dissolution of parliament for the 2017 general election, with information on the speaking MP, their party, gender and age at the time of the speech. The dataset also includes all speeches made from 1936 to the dissolution of parliament for the 1979 general election. The post-1979 election dataset is labelled 'senti_post_v2' and the pre-1979 election dataset is labelled 'senti_pre_v2'.

The 'senti_post_v2' dataset contains 2,234,229 speeches and 404,589,163 words. The 'senti_pre_v2' dataset contains 2,977,498 speeches and 413,046,298 words. For more details see: http://evanodell.com/datasets/hansard-data/
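The speech and word totals above can be spot-checked on a downloaded copy by streaming the CSV rather than loading several gigabytes into memory. The following is a minimal sketch, not the author's counting method: the column name `speech` is a hypothetical placeholder (this record does not list the column names), and words are counted by simple whitespace splitting, which may not reproduce the published figures exactly.

```python
import csv
import io
from typing import TextIO, Tuple


def count_speeches_and_words(handle: TextIO, text_column: str = "speech") -> Tuple[int, int]:
    """Stream a CSV and return (number of rows, number of whitespace-separated words).

    `text_column` is an assumed column name; adjust it to the real header.
    """
    # Speeches can be long, so raise the per-field limit above the default.
    csv.field_size_limit(10_000_000)
    reader = csv.DictReader(handle)
    if text_column not in (reader.fieldnames or []):
        raise KeyError(f"column {text_column!r} not found in CSV header")
    speeches = 0
    words = 0
    for row in reader:
        speeches += 1
        words += len((row[text_column] or "").split())
    return speeches, words


# Small in-memory stand-in for the real file (the rows are illustrative only).
sample = io.StringIO(
    'speech_date,speech\n'
    '1979-05-09,"I beg to move."\n'
    '1979-05-10,"Order, order."\n'
)
print(count_speeches_and_words(sample))  # (2, 6)
```

For the real file, open it with `open(path, newline="", encoding="utf-8")` and pass the handle to the function; the encoding is likewise an assumption.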

Notes

This release is an update of previously released datasets. The new release improves the consistency of the sentiment calculations, using five different libraries with the same method of calculation applied to each, and corrects several misidentified speeches. It also includes all speeches up to the dissolution of parliament for the 2017 General Election.

Files

senti_post_v2.csv

Files (6.3 GB)

Checksum                                Size
md5:8edca9d5b8e153236e5096dd500721b6    3.3 GB
md5:2a686510c9e40295a10bc33e8c599edf    3.1 GB
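Because each file is several gigabytes, it is worth checking a download against the MD5 values listed above before using it. The sketch below reads the file in chunks so memory use stays small. The two checksums are copied from this record. The helper names and the stand-in file are illustrative assumptions, and since the record does not say which checksum belongs to which file name, the check only tests membership in the published set.

```python
import hashlib
import os
import tempfile

# MD5 values as listed in the Files section of this record.
PUBLISHED_MD5 = {
    "8edca9d5b8e153236e5096dd500721b6",
    "2a686510c9e40295a10bc33e8c599edf",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path):
    """True if the file's MD5 equals one of the checksums listed in the record."""
    return md5_of_file(path) in PUBLISHED_MD5


# Demonstration on a throwaway file standing in for a real download.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"not the real dataset")
    demo_path = tmp.name
try:
    print(matches_published(demo_path))
finally:
    os.remove(demo_path)
```

To check an actual download, call `matches_published` with the local path of the CSV; a `False` result suggests a truncated or corrupted transfer.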

Additional details

Related works