Presentation Open Access
The Coleridge Initiative (https://coleridgeinitiative.org/) aims to make data in the social sciences more usable and available by connecting research papers to the underlying data and by creating infrastructure that provides access to data in computational environments via Project Jupyter. This talk will look at one approach taken by the initiative: using a competition (https://coleridgeinitiative.org/richcontextcompetition) to encourage teams to help solve one of the core problems, namely building machine learning models that identify references to data sets that have no standard identifiers.
Data for the competition was supplied by SAGE Publishing and the Bundesbank, and funding came from the Sloan Foundation. The competition received applications with working software code from twenty teams across the world, with team composition ranging from all-undergraduate teams to teams of senior researchers. Submissions were shortlisted and the final six teams were brought together in person to encourage knowledge sharing and collaboration. All of the outputs were made available under open licences. In this talk, we will briefly discuss the wider project and analyse the ways in which the competition was a success, as well as improvements we could make if we were to use this approach again.