Published September 13, 2022 | Version Preprint
Journal article | Open Access

COVID-Twitter-BERT: A natural language processing model to analyse COVID-19 content on Twitter

  • 1. Digital Epidemiology Lab, EPFL
  • 2. FISABIO-Public Health

Description

In this work, we release COVID-Twitter-BERT (CT-BERT), a transformer-based model pretrained on a large corpus of Twitter messages on the topic of COVID-19. Our model shows a 10–30% marginal improvement over its base model, BERT-LARGE, on five different classification datasets, with the largest improvements on the target domain. Pretrained transformer models such as CT-BERT are trained on a specific target domain and can be used for a wide variety of natural language processing tasks, including classification, question-answering and chatbots. CT-BERT is optimised for use on COVID-19 content, in particular from social media.

Files

2005.07503.pdf (243.2 kB, md5:488263eef6bad223834dc12743ba2572)

Additional details

Funding

European Commission
VACMA - Vaccine Media Analytics (grant no. 797876)