Published April 5, 2021 | Version v1
Dataset Open

A Tweet-based Dataset for Company-Level Stock Return Prediction

  • 1. Imperial College London

Description

Public opinion influences events, especially those related to stock market movement, where a subtle hint can influence the local outcome of the market. In this paper, we present a dataset that allows for company-level analysis of tweet-based impact on one-, two-, three-, and seven-day stock returns. Our dataset consists of 862,231 labelled instances from Twitter in English; we also release a cleaned subset of 85,176 labelled instances to the community. In addition, we provide baselines using standard machine learning algorithms and a multi-view learning-based approach that makes use of different types of features.
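As an illustration of the one-, two-, three-, and seven-day return horizons mentioned above, the following is a minimal sketch of how forward returns could be derived from a series of daily closing prices. It assumes simple (not log) returns and counts horizons in rows of the price series; the description does not state which convention the dataset uses. The function name `forward_returns` and the sample prices are hypothetical.

```python
from typing import Dict, List, Optional

# Horizons named in the description: one, two, three and seven days.
HORIZONS = (1, 2, 3, 7)


def forward_returns(closes: List[float], horizon: int) -> List[Optional[float]]:
    """Simple return from day t to day t + horizon.

    Assumption: simple returns, with the horizon counted in rows of the series.
    Entries are None where the future price is not available.
    """
    out: List[Optional[float]] = []
    for t, price in enumerate(closes):
        j = t + horizon
        if j < len(closes):
            out.append(closes[j] / price - 1.0)
        else:
            out.append(None)
    return out


# Hypothetical closing prices for a single company, for illustration only.
closes = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 110.0, 112.0]

returns_by_horizon: Dict[int, List[Optional[float]]] = {
    h: forward_returns(closes, h) for h in HORIZONS
}
```

With these made-up prices, the first seven-day return compares the eighth price with the first, and every later entry at that horizon is None because the series ends.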

Notes

Contact: k.m.sowinska@gmail.com

Files

full_dataset-release.csv

Files (259.3 MB)

Checksum Size
md5:2af0494e264d188cdfa3ef82087b41ce 236.0 MB
md5:4cb423d3b63b27f62eed22c1634cf99d 23.3 MB
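A downloaded file can be checked against the md5 values listed above. The sketch below uses only Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and since the listing does not show which checksum belongs to which file name, the pairing in the usage comment is an assumption to be confirmed on the record page.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path: str, listed: str) -> bool:
    """Compare a file's MD5 with a listed value such as 'md5:2af0...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Usage (assumed pairing of file name and checksum; verify before relying on it):
# matches_listed_checksum("full_dataset-release.csv",
#                         "md5:2af0494e264d188cdfa3ef82087b41ce")
```

A mismatch usually indicates an incomplete or corrupted download, in which case downloading the file again is the simplest remedy.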