There is a newer version of the record available.

Published May 9, 2019 | Version v2.1
Software Open

angelosalatino/cso-classifier: CSO Classifier v2.1

  • 1. Open University

Description

The CSO Classifier is an application that takes as input the abstract, title, and keywords of a research paper and outputs a list of relevant concepts from the Computer Science Ontology (CSO). This new release (v2.1) aims to improve its scalability. Unlike the previous version (v2.0), the classifier relies on a cached word2vec model that connects the words in the model vocabulary directly to CSO topics. Thanks to this cache, the classifier can quickly retrieve all CSO topics that could be inferred from the given tokens, which reduces processing time. In addition, the cache is lighter (~64 MB) than the actual word2vec model (~366 MB), which also shortens loading time. With these improvements, the CSO Classifier is around 24x faster and can easily be run on large corpora of scholarly data.
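The idea of a cache that maps vocabulary words directly to CSO topics can be illustrated with a minimal sketch. This is not the classifier's actual code: the name `TOKEN_TO_TOPICS`, the function `infer_topics`, the `min_score` threshold, and all entries and scores below are hypothetical and serve only to show why a precomputed lookup avoids querying the full word2vec model at run time.

```python
from collections import defaultdict

# Hypothetical cache: word -> list of (CSO topic, similarity score) pairs.
# In the release such a mapping is precomputed from the word2vec model;
# the entries and scores here are invented for illustration.
TOKEN_TO_TOPICS = {
    "ontology": [("ontology", 1.0), ("semantic web", 0.82)],
    "classifier": [("classification", 0.91), ("machine learning", 0.78)],
    "semantic": [("semantic web", 0.88), ("semantics", 0.95)],
}


def infer_topics(tokens, cache, min_score=0.8):
    """Return topics reachable from the tokens, keeping the best score per topic."""
    best = defaultdict(float)
    for token in tokens:
        # A plain dictionary lookup replaces a similarity search in the model.
        for topic, score in cache.get(token.lower(), []):
            if score >= min_score:
                best[topic] = max(best[topic], score)
    # Highest-scoring topics first; ties are broken alphabetically.
    return sorted(best, key=lambda t: (-best[t], t))


topics = infer_topics(["Semantic", "ontology", "classifier", "the"], TOKEN_TO_TOPICS)
print(topics)
```

Tokens that are absent from the cache (such as "the" above) simply contribute nothing, so the cost per token stays constant regardless of the size of the underlying model.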

Files

angelosalatino/cso-classifier-v2.1.zip

Files (14.3 MB)

Size: 14.3 MB
md5:f23adf41dd8f078417d963c636ca4436

Additional details