Published April 23, 2018 | Version v1
Dataset Open

Data associated with "A collaborative filtering based approach to biomedical knowledge discovery"

Creators

  • 1. University of British Columbia

Description

This is the data set associated with the publication: "A collaborative filtering based approach to biomedical knowledge discovery" published in Bioinformatics.

The data are sets of cooccurrences of biomedical terms extracted from published abstracts and full text articles. The cooccurrences are then represented in sparse matrix form. There are three different splits of this data denoted by the prefix number on the files.

1. All - All cooccurrences combined in a single file

2. Training/Validation - All cooccurrences in publications before 2010 go in training; all novel cooccurrences in publications from 2010 go in validation

3. Training+Validation/Test - All cooccurrences in publications up to and including 2010 go in training+validation. All novel cooccurrences after 2010 go in test, provided both in year-by-year increments and combined together
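The temporal splits described above can be illustrated with a small sketch. It is not the authors' extraction code: the record layout `(term_a, term_b, year)`, the function names, and the example CUIDs are all hypothetical, and "novel" is assumed here to mean a pair whose earliest observed year is the split year.

```python
def first_seen_years(records):
    """Map each unordered term pair to the earliest year it was observed."""
    first_seen = {}
    for term_a, term_b, year in records:
        # Sort the pair so (a, b) and (b, a) count as the same cooccurrence.
        pair = tuple(sorted((term_a, term_b)))
        if pair not in first_seen or year < first_seen[pair]:
            first_seen[pair] = year
    return first_seen


def split_by_year(records, cutoff_year):
    """Return (training, novel): pairs first seen before cutoff_year,
    and pairs first seen in cutoff_year itself."""
    first_seen = first_seen_years(records)
    training = {p for p, y in first_seen.items() if y < cutoff_year}
    novel = {p for p, y in first_seen.items() if y == cutoff_year}
    return training, novel


# Hypothetical toy records; the identifiers are made up for illustration.
records = [
    ("C0000001", "C0000002", 2005),
    ("C0000002", "C0000001", 2010),  # already known, so not novel in 2010
    ("C0000001", "C0000003", 2010),  # first seen in 2010, so novel
    ("C0000002", "C0000003", 2012),  # after 2010, would belong to test
]
training, validation = split_by_year(records, 2010)
```

In this toy case the pair seen again in 2010 stays in training only, while the pair first seen in 2010 lands in validation.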

 

Furthermore, there are subset files, which are used in some experiments to reduce the computational cost of evaluating the full set. The associated cuids.txt file links each row/column of the matrix to a UMLS Metathesaurus CUID: the first row of cuids.txt corresponds to row/column 0 of the matrix. Note that the matrix is square and symmetric. This work was done with UMLS Metathesaurus 2016AB.
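As a rough sketch of the index mapping, the snippet below builds lookups in both directions between matrix indices and CUIDs. It assumes cuids.txt holds one CUID per line, which the description implies but does not state outright; the function name and the example identifiers are hypothetical, and an in-memory buffer stands in for the real file.

```python
import io


def load_cuids(handle):
    """Return a list in which position i is the CUID for matrix row/column i.

    Assumes one CUID per line; line 1 of the file maps to index 0.
    """
    return [line.rstrip("\r\n") for line in handle]


# Stand-in for open("cuids.txt"); these identifiers are made up.
example_file = io.StringIO("C0000001\nC0000002\nC0000003\n")

cuids = load_cuids(example_file)

# Reverse lookup: CUID -> row/column index in the square, symmetric matrix.
index_of = {cuid: i for i, cuid in enumerate(cuids)}
```

With the real file, `load_cuids(open("cuids.txt"))` would replace the buffer. Because the matrix is symmetric, the same index applies to both rows and columns.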

Files (2.8 GB)

Checksum Size
md5:d29993ed31469009e901fad16651fe64 603.2 MB
md5:f2562e68e712a1600fe1a8f001a024b4 337.0 MB
md5:c07bbde1a00e185452e6b3871796bb69 1.1 MB
md5:4e9e4a4618a2480bb7a542f3abb9c38c 56 Bytes
md5:14406924a0e730017357e4b70349c3c3 30.7 MB
md5:2127e7a9f13590e3b512ec97df08c415 4.5 MB
md5:f90dec532df58826c56a2983cf0bfa74 34.3 MB
md5:92d543391e16d459cecbf30bf6e148ea 117.8 MB
md5:4c848ba42f5385d335d6a0c7f549e87d 4.6 MB
md5:f1397b45d38bd0f3349b3957012471d9 40.3 MB
md5:9d9c23b07c94bf4699e6732462336ec2 138.5 MB
md5:3c4d32d447419ef368519c3a4a678294 4.6 MB
md5:c161d3794219602d7e621cc664a4dd91 44.3 MB
md5:f19ff5b8d6860babedb45fa2c3d06e94 155.5 MB
md5:fb35cff7d715412a78e4cfb6c75c057c 4.6 MB
md5:f95c9882642ba6ae20a456df29a115c3 44.2 MB
md5:7df555b7765c387e40f18daee4fd4a23 165.1 MB
md5:9999d5e0ce7aa7fe2733a5251c83b898 4.6 MB
md5:d3f57bd77351e05aa7c154b606890088 46.0 MB
md5:9bb0540299b179a7d1b6c54d20bef331 177.5 MB
md5:8d73260a9d7a19cd2b0b17d0196d591d 4.6 MB
md5:5e66d32793f401764e73302490a7a336 44.8 MB
md5:444312110c452d4138dc2acb882183f9 183.3 MB
md5:c245679a1da8a60e85a03fea6c98c862 4.6 MB
md5:74168e412b22f080a48797543e494ce9 2.1 MB
md5:0c3a564171c029e6acbcf7c1403970cf 21.1 MB
md5:ea3a060f5709e2ae1da295324d6a3295 2.1 MB
md5:31551bb94d770269b29006cf6d626b04 228.1 MB
md5:011ff7e978065a2755a72f87a3ba4c90 4.7 MB
md5:0476fd25500279965a98f9ae8e083dc1 365.1 MB
md5:7db03ea09725edf84dd866eaee70bd6b 1.1 MB
md5:011e7463a012a44b979bce22e5630348 69 Bytes
md5:a0cdd86bc942ddd46d3c128d4141077c 3.2 MB
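
The listed MD5 checksums can be used to confirm that a download completed intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names are hypothetical, and since the listing above does not pair checksums with file names, matching a local file to its checksum is left to the reader. A throwaway temporary file stands in for a real download.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a listed value such as 'md5:d299...fe64'."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Demonstration on a temporary file rather than a real dataset file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example bytes")
    demo_path = tmp.name

demo_listing = "md5:" + hashlib.md5(b"example bytes").hexdigest()
demo_ok = matches_listing(demo_path, demo_listing)
# A checksum from the listing above will not match the demo file.
demo_bad = matches_listing(demo_path, "md5:d29993ed31469009e901fad16651fe64")
os.remove(demo_path)
```

Reading in chunks keeps memory use flat, which matters for the larger files in this record (several hundred MB each).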