Published December 10, 2021 | Version v.1.0
Dataset Open

kalawinka/minack: 1

Creators

Description

This dataset contains the results of the DZHW-funded project Mining Acknowledgement Texts in Web of Science (MinAck). See readme.txt for a description of the files stored in this repository.

The aim of the MinAck project is to analyse acknowledgement texts from the Web of Science (WoS). The FLAIR NLP framework is used to perform the acknowledged entity recognition task. Our NER model recognizes six entity types: funding agencies (FUND), corporations (COR), universities (UNI), individuals (IND), grant numbers (GRNB) and miscellaneous (MISC).

The NER model was trained on a dataset of over 600 annotated sentences from the acknowledgement texts of scientific articles indexed in WoS. The model was then applied to a corpus of approx. 200,000 acknowledgement texts from WoS. The corpus was restricted by year and scientific discipline: records published from 2014 to 2019 in four disciplines were considered, two from the social sciences (sociology and economics), plus oceanography and computer science for comparison. Additionally, only English-language WoS records of the types “article” and “review” published in a scientific journal were selected.
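The selection criteria above can be read as a single predicate over records. The following is a minimal sketch, not the project's actual code: the record layout and the field names (`year`, `discipline`, `doc_type`, `language`, `source_type`) are hypothetical and do not correspond to the WoS export schema.

```python
# Hypothetical record layout; the field names are illustrative assumptions,
# not the WoS export schema.
DISCIPLINES = {"sociology", "economics", "oceanography", "computer science"}
DOC_TYPES = {"article", "review"}


def is_selected(record: dict) -> bool:
    """Return True if a record meets the selection criteria described above."""
    return (
        2014 <= record["year"] <= 2019
        and record["discipline"].lower() in DISCIPLINES
        and record["doc_type"].lower() in DOC_TYPES
        and record["language"].lower() == "english"
        and record["source_type"].lower() == "journal"
    )


# An invented example record, used only to show the call.
example = {
    "year": 2016,
    "discipline": "Oceanography",
    "doc_type": "Article",
    "language": "English",
    "source_type": "Journal",
}
print(is_selected(example))
```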

 

Files

dev_big.txt

Files (511.6 MB)

Checksum  Size
md5:f12e3246012415c38aba794272a1f9d7  36.6 MB
md5:932c3485cdea058b25037abd528bcde8  58.4 kB
md5:3333fe1e55f1ff58484914b614ccc875  49.0 kB
md5:9983c7e8035781a056ab11d96d36013c  372.3 kB
md5:1db80214ae21ef6119112798e4d624eb  3.3 kB
md5:c58cd31c9c432caf8509d8ce5d138de3  432.2 MB
md5:8c5025237df7837d04fdfaa5432dc412  17.5 MB
md5:40e170a300fa44768213a5760b778268  12.3 MB
md5:58da5f9293b60c4c2ef3c4927650ad2d  6.1 MB
md5:89cb929da239c0f21d5a7819c29923f7  3.5 kB
md5:36e84599be16d9a02645b2544bbc3c0f  68.0 kB
md5:489a4bcc586a045cab5c379011d923a6  57.0 kB
md5:bc4d318fa60b9c8f46b1db8624145261  133.8 kB
md5:0942ec9249be4ea6e3a23430e981fc50  111.9 kB
md5:cc7bd7d17228da66849408d0ad3fd1b7  6.1 MB
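The MD5 checksums listed above can be used to check that a downloaded file arrived intact. A minimal sketch using only the Python standard library follows; the helper names `md5_of_file` and `matches_checksum` are introduced here for illustration and are not part of the repository.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that large files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum given as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Call `matches_checksum` with the local path of a downloaded file and the checksum listed for that file; a `False` result suggests an incomplete or corrupted download.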

Additional details

Related works