Dataset Open Access
<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="DOI">10.5281/zenodo.3490460</identifier>
  <creators>
    <creator>
      <creatorName>Tobias Weber</creatorName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0003-1815-7041</nameIdentifier>
      <affiliation>Leibniz Supercomputing Centre</affiliation>
    </creator>
  </creators>
  <titles>
    <title>l-sized Training and Evaluation Data for Publication "Using Supervised Learning to Classify Metadata of Research Data by Field of Study"</title>
  </titles>
  <publisher>Zenodo</publisher>
  <publicationYear>2019</publicationYear>
  <subjects>
    <subject>research data</subject>
    <subject>disciplines of research</subject>
    <subject>supervised machine learning</subject>
    <subject>multi-label classification</subject>
    <subject>text processing</subject>
    <subject>data science</subject>
    <subject subjectScheme="url">https://dewey.info/</subject>
  </subjects>
  <dates>
    <date dateType="Issued">2019-10-15</date>
  </dates>
  <resourceType resourceTypeGeneral="Dataset"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/3490460</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="Compiles" resourceTypeGeneral="Dataset">10.5281/zenodo.3490329</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.3490459</relatedIdentifier>
  </relatedIdentifiers>
  <rightsList>
    <rights rightsURI="https://creativecommons.org/licenses/by/4.0/legalcode">Creative Commons Attribution 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">Automated classification of research data metadata by discipline(s) of research can be used in scientometric research, by repository service providers, and in the context of research data aggregation services. Openly available metadata from the DataCite index for research data were used to compile a large training and evaluation set comprising 609,524 records. This is the cleaned and vectorized version with a feature selection of large size.</description>
  </descriptions>
</resource>
Metric | All versions | This version |
---|---|---|
Views | 79 | 79 |
Downloads | 25 | 25 |
Data volume | 50.1 GB | 50.1 GB |
Unique views | 73 | 73 |
Unique downloads | 18 | 18 |