Published June 5, 2019 | Version v1
Dataset Open

RedMed: Extending drug lexicons for social media applications

  • 1. Stanford University

Description

Data associated with the RedMed project.

Details for the process behind the data creation can be found in the associated paper:

Lavertu, A. & Altman, R. B. "RedMed: Extending drug lexicons for social media applications"
Journal of Biomedical Informatics, (2019)

https://doi.org/10.1016/j.jbi.2019.103307

RedMed embedding model:

Word vectors trained on comments from health-related subreddits and optimized for drug synonym retrieval.

The RedMed model was trained using only social media data from Reddit and achieves comparable performance on the UMNSRS and MayoSRS similarity tasks. Vectors are 64-dimensional.
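
One common way to retrieve candidate synonyms from word vectors is to rank tokens by cosine similarity to a query token. The sketch below illustrates that idea on invented 3-dimensional toy vectors. The function names and toy tokens are assumptions for illustration, and this is not necessarily the retrieval procedure used in the RedMed paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def nearest_tokens(query, vectors, top_n=3):
    """Rank every other token in `vectors` by cosine similarity to `query`."""
    query_vec = vectors[query]
    scored = [
        (token, cosine_similarity(query_vec, vec))
        for token, vec in vectors.items()
        if token != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy vectors invented for this example; they are NOT taken from the RedMed model.
toy_vectors = {
    "ibuprofen": [0.9, 0.1, 0.0],
    "advil": [0.8, 0.2, 0.1],
    "pizza": [0.0, 0.1, 0.9],
}
print(nearest_tokens("ibuprofen", toy_vectors, top_n=2))
```

With the real 64-dimensional vectors, the same ranking would run over the full vocabulary, where a vectorized library would likely be a better fit than pure Python.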

redmed_model_vectors.tsv.gz - Tab-separated word vectors (token\tdim1\tdim2\t...dim64)

redmed_model.bin - Binary word2vec file saved using gensim; can be loaded in Python with gensim
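
As a minimal sketch of reading the tab-separated file with only the Python standard library: the function name `load_tsv_vectors`, the UTF-8 encoding, and the choice to skip lines that do not match the expected layout (for example a header row, if one exists) are assumptions, not details confirmed by this record.

```python
import gzip


def load_tsv_vectors(path, expected_dim=64):
    """Read token<TAB>dim1<TAB>...<TAB>dimN lines from a gzipped TSV file.

    Returns a dict mapping each token to a list of floats. Lines with the
    wrong number of fields or non-numeric values are skipped (assumption:
    such lines are headers or noise rather than real vectors).
    """
    vectors = {}
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            fields = line.rstrip("\r\n").split("\t")
            token, values = fields[0], fields[1:]
            if len(values) != expected_dim:
                continue
            try:
                vectors[token] = [float(value) for value in values]
            except ValueError:
                continue  # non-numeric fields, e.g. a header row
    return vectors


# Example call, assuming the file has been downloaded to the working directory:
# vectors = load_tsv_vectors("redmed_model_vectors.tsv.gz")
```

For redmed_model.bin, gensim offers `KeyedVectors.load_word2vec_format(path, binary=True)` for the word2vec binary format and `Word2Vec.load(path)` for gensim's native save format. This record does not state which one was used to write the file, so either may apply.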

Other Files:

supp_file_1_sidebar_subreddits.txt - List of health-related subreddits based on "r/Health" and "r/Drugs" sidebars
supp_file_2_enrichment_based_subreddits.txt - List of health-related subreddits based on amount of health-related content
supp_file_3_custom_stopword_list.txt - List of stopwords based on counts derived from Reddit comments

Files

Files (1.8 GB)

Checksum                              Size
md5:16aea66cff0b2d126fff9c06c821211e  2.5 MB
md5:c67fb2ccc34444b89f85878241a0ba89  837.5 MB
md5:f1e21bc54c84671a366f53edcecb7988  1.0 GB
md5:957107ef98ae680a4f72b7b7767cfbf0  2.0 kB
md5:f12386f96e0200720a2206699ad6140c  30.1 kB
md5:be00a1fe09c1b0255ac291ff0f30c49b  1.6 kB

Additional details

Related works

Is cited by
Journal article: 10.1016/j.jbi.2019.103307 (DOI)

References

  • Lavertu, A. & Altman, R. B. "RedMed: Extending drug lexicons for social media applications" Journal of Biomedical Informatics (2019). https://doi.org/10.1016/j.jbi.2019.103307
  • Lavertu, A. & Altman, R. B. "RedMed: Extending drug lexicons for social media applications" bioRxiv (2019). https://doi.org/10.1101/663625