RedMed: Extending drug lexicons for social media applications
Description
Data associated with the RedMed project.
Details of how the data were created can be found in the associated paper:
Lavertu, A. & Altman, R. B. "RedMed: Extending drug lexicons for social media applications"
Journal of Biomedical Informatics, (2019)
https://doi.org/10.1016/j.jbi.2019.103307
RedMed embedding model:
Word vectors trained on comments from health-related subreddits and optimized for drug synonym retrieval.
The RedMed model was trained using only social media data from Reddit and achieves comparable performance on the UMNSRS and MayoSRS similarity tasks. Vectors are 64-dimensional.
redmed_model_vectors.tsv.gz - Tab-separated word vectors (token\tdim1\tdim2\t...dim64)
redmed_model.bin - Binary word2vec file saved with gensim; it can be loaded in Python using gensim
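As a rough illustration of how the tab-separated vectors might be used for synonym retrieval, the sketch below parses lines of the form token, tab, then the vector values, and ranks other tokens by cosine similarity. It uses only the Python standard library. The function names are hypothetical, the file is assumed to be UTF-8 text, and any non-numeric line (such as a header row, if one exists) is simply skipped.

```python
import gzip
import math
from typing import Dict, Iterable, List, Tuple


def parse_vector_lines(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse 'token<TAB>v1<TAB>...<TAB>vN' lines into a token -> vector dict."""
    vectors: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        try:
            vectors[parts[0]] = [float(value) for value in parts[1:]]
        except ValueError:
            continue  # skip non-numeric lines (assumption: e.g. a header row)
    return vectors


def load_tsv_gz(path: str) -> Dict[str, List[float]]:
    """Read a gzip-compressed TSV of word vectors (assumed UTF-8)."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return parse_vector_lines(handle)


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(vectors: Dict[str, List[float]], token: str, k: int = 5) -> List[Tuple[str, float]]:
    """Return the k tokens most similar to `token`, excluding the token itself."""
    query = vectors[token]
    scored = [(other, cosine(query, vec)) for other, vec in vectors.items() if other != token]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

A call might look like `nearest(load_tsv_gz("redmed_model_vectors.tsv.gz"), "ibuprofen")`, where "ibuprofen" is only a hypothetical query token that may or may not be in the vocabulary. This brute-force search holds every vector in memory as Python lists, so for the full file a NumPy array or gensim would likely be more practical.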
Other Files:
supp_file_1_sidebar_subreddits.txt - List of health-related subreddits based on "r/Health" and "r/Drugs" sidebars
supp_file_2_enrichment_based_subreddits.txt - List of health-related subreddits based on amount of health-related content
supp_file_3_custom_stopword_list.txt - List of stopwords based on counts derived from Reddit comments
Files
(1.8 GB)
MD5 checksum | Size |
---|---|
md5:16aea66cff0b2d126fff9c06c821211e | 2.5 MB |
md5:c67fb2ccc34444b89f85878241a0ba89 | 837.5 MB |
md5:f1e21bc54c84671a366f53edcecb7988 | 1.0 GB |
md5:957107ef98ae680a4f72b7b7767cfbf0 | 2.0 kB |
md5:f12386f96e0200720a2206699ad6140c | 30.1 kB |
md5:be00a1fe09c1b0255ac291ff0f30c49b | 1.6 kB |
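Each entry in the file listing above carries an md5 checksum, which can be used to confirm that a download completed intact. The following is a minimal sketch using Python's hashlib. The helper names are hypothetical, and because the listing does not say which checksum belongs to which file, that pairing is left to the reader.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path: str, listed: str) -> bool:
    """Compare a file against a listed value such as 'md5:0123...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listed_md5(local_path, "md5:...")` would return True only when the local file's digest equals the listed value with its `md5:` prefix removed.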
Additional details
Related works
- Is cited by
- Journal article: 10.1016/j.jbi.2019.103307 (DOI)
References
- Lavertu, A. & Altman, R. B. "RedMed: Extending drug lexicons for social media applications" Journal of Biomedical Informatics (2019). https://doi.org/10.1016/j.jbi.2019.103307
- Lavertu, A. & Altman, R. B. "RedMed: Extending drug lexicons for social media applications" bioRxiv (2019). https://doi.org/10.1101/663625