Dataset Open Access

l-sized Training and Evaluation Data for Publication "Using Supervised Learning to Classify Metadata of Research Data by Field of Study"

Tobias Weber

Automated classification of metadata of research data by their discipline(s) of research can be used in scientometric research, by repository service providers, and in the context of research data aggregation services. Openly available metadata from the DataCite index for research data were used to compile a large training and evaluation set comprising 609,524 records. This is the cleaned and vectorized version with a feature selection of large size.

Files (2.0 GB)
Name                      Size
l_data_vectorized.tar.gz  2.0 GB
md5:2d0dacc2e0902b6ca69e1b997fb6da51
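
Because the archive is large, it may be worth checking a downloaded copy against the MD5 checksum listed above before unpacking it. The following is a minimal sketch in Python, not part of the dataset itself; the function names md5_of_file and verify_download are hypothetical, and the local file path is an assumption about where the archive was saved.

```python
import hashlib

# Checksum as published on this page for l_data_vectorized.tar.gz.
EXPECTED_MD5 = "2d0dacc2e0902b6ca69e1b997fb6da51"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that a multi-gigabyte archive is never held in memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()
```

Assuming the archive was saved under its original name in the working directory, verify_download("l_data_vectorized.tar.gz") would return True for an intact copy and False for a truncated or corrupted one.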
                  All versions  This version
Views             70            70
Downloads         24            24
Data volume       48.1 GB       48.1 GB
Unique views      64            64
Unique downloads  17            17
