PUMA pipeline output
- 1. University of Liverpool
- 2. Newcastle University
- 3. University of Oxford
Description
Output of the PUMA (PUblications Metadata Augmentation) software pipeline which takes a list of journal articles and augments it with metadata from external sources. This augmented metadata is then processed to generate data files and an explorable/searchable set of HTML pages.
The PUMA pipeline is available at: https://github.com/OllyButters/puma and is described at: https://doi.org/10.12688/f1000research.25484.1
These attached files are the result of running the pipeline on the list of publications described at: https://doi.org/10.12688/wellcomeopenres.14986.1 on 2021-01-15. Rerunning the pipeline on this list may result in slightly different outputs due to the changing content of the external metadata sources.
Screenshots of the output HTML pages:
- PUMA_home_2021-01-15.png - Summary of all publications.
- PUMA_2011_2021-01-15.png - All publications from 2011.
- PUMA_map_2021-01-15.png - Choropleth map of first author's country.
- PUMA_asthma_2021-01-15.png - All publications tagged with an asthma MeSH term.
- PUMA_metrics_2021-01-15.png - Simple metrics.
- PUMA_word_cloud_2021-01-15.png - Word cloud of abstract text.
- PUMA_coverage_2021-01-15.png - Table showing completeness of metadata.
Generated data files
- authors.csv - Frequency of authors.
- first_authors.csv - Frequency of first authors.
- first_authors_inst.csv - Frequency of first authors' institutes.
- journals.csv - Frequency of journals published in.
- abstract_lemmatized.csv - Frequency of lemmatized abstract words.
- abstract_lemmatized_by_year.csv - Frequency of lemmatized abstract words broken down by year.
- title_lemmatized.csv - Frequency of lemmatized title words.
- title_lemmatized_by_year.csv - Frequency of lemmatized title words broken down by year.
- keywords_lemmatized.csv - Frequency of lemmatized keywords.
- keywords_lemmatized_by_year.csv - Frequency of lemmatized keywords broken down by year.
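As a rough illustration of how one of the plain frequency files (for example journals.csv) might be inspected, the sketch below reads label/count rows and lists the most frequent entries. The column layout is an assumption, not something documented in this record: it supposes that the first column holds the label and the second an integer count, and it skips any row that does not fit (such as a header). The function names and the sample data are hypothetical.

```python
import csv
import io


def read_frequencies(handle):
    """Parse (label, count) pairs from an open CSV file handle.

    Assumption: column 0 is the label and column 1 is an integer count.
    Rows that do not fit that shape (for example a header row) are skipped.
    """
    frequencies = []
    for row in csv.reader(handle):
        if len(row) < 2:
            continue
        try:
            count = int(row[1])
        except ValueError:
            continue
        frequencies.append((row[0], count))
    return frequencies


def top_entries(frequencies, n=10):
    """Return the n entries with the highest counts, largest first."""
    return sorted(frequencies, key=lambda pair: pair[1], reverse=True)[:n]


# Made-up sample data standing in for a file such as journals.csv.
sample = io.StringIO("journal,count\nJournal A,12\nJournal B,40\nJournal C,7\n")
print(top_entries(read_frequencies(sample), n=2))
```

For a real file, the same functions could be given a handle from `open("journals.csv", newline="", encoding="utf-8")`, provided the file follows the assumed two-column layout. The by-year files presumably have a different shape and would need their own parsing.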
Files
(9.0 MB)
Checksum | Size |
---|---|
md5:8118a40ee843e166e2b49d1364837301 | 117.0 kB |
md5:e303a848305e6c9e3e3258fcfc92b4fe | 593.2 kB |
md5:7ae90b2a77c46a8c86235a78ddb1ff7f | 56.8 kB |
md5:43840efea928e1f8135f6cd01fe9e261 | 8.2 kB |
md5:d925a73c167d38c37a4c570590e4be92 | 5.7 kB |
md5:d1fdf71d8f966e5875826a0764e883d4 | 7.5 kB |
md5:6a3544e6abb69b172559dc2c186f9a47 | 23.1 kB |
md5:c2a280fe52cbda331dffe352641d9ff9 | 118.5 kB |
md5:062603da983c655f5682479fd41692c0 | 4.4 MB |
md5:089e5c73efcd6b1b26d60c6f1e054448 | 1.9 MB |
md5:96e04cf25457d80af9a078c94119f604 | 781.1 kB |
md5:e1be84d09dd8beea9b7d592d8b7e08be | 172.0 kB |
md5:8e21e37dea315d2bb3eb40d839d72ec3 | 169.5 kB |
md5:6f3b149758494486309bd475bfbccfc8 | 210.0 kB |
md5:6cbf6c73d2474f08c0783eb2e7db7ba7 | 264.8 kB |
md5:04e5c6daab6833cc86dd1a2e33ae105e | 28.6 kB |
md5:cd899e69a61ba833df56909835ae7423 | 147.8 kB |
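The md5 values listed above can be used to check that a downloaded file arrived intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and which checksum belongs to which file has to be taken from the record itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a listed value such as 'md5:0123...'.

    The optional 'md5:' prefix is dropped before comparing.
    """
    expected = listed.strip().split(":", 1)[-1].lower()
    return md5_of_file(path) == expected
```

A call such as `matches_listed_checksum(local_path, "md5:<value from the list>")` returns `True` when the local copy matches the listed digest, where `local_path` is wherever the file was saved.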
Additional details
Related works
- Is derived from
- Journal article: 10.12688/f1000research.25484.1 (DOI)
- Journal article: 10.12688/wellcomeopenres.14986.1 (DOI)