Published April 2, 2023 | Version v1.0.0
Dataset
Open
Protein Function Embeddings: First Beta Release of Datasets
- 1. University of Bonn
- 2. ZB MED - Information Centre for Life Sciences
Contributors
Supervisors:
- 1. Bonn-Aachen International Centre for Information Technology (B-IT), University of Bonn
- 2. University of Cologne
Description
This release contains the datasets generated for a thesis that explores how information about protein function can be captured in embeddings and then used to improve protein function annotations. The underlying hypothesis is that any pair of proteins with high sequence similarity also shares a similar biological function, and that this similarity is reflected in the corresponding protein embeddings. The hypothesis is compared and evaluated using two text-driven embedding approaches: Word2doc2Vec and Hybrid-Word2doc2Vec.
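As a rough illustration of what "reflected by the embeddings" can mean in practice, the sketch below compares two embedding vectors with cosine similarity. This is an assumption for illustration only: the record does not state which similarity measure the thesis uses, and the vectors and the name `cosine_similarity` are made up here rather than taken from the datasets.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (hypothetical, not real protein embeddings).
protein_a = [0.12, -0.40, 0.33, 0.90]
protein_b = [0.10, -0.35, 0.30, 0.95]
protein_c = [-0.80, 0.20, -0.10, 0.05]

# Under the stated hypothesis, functionally similar proteins would be
# expected to score higher than unrelated ones.
print(cosine_similarity(protein_a, protein_b))
print(cosine_similarity(protein_a, protein_c))
```

Values close to 1 indicate vectors pointing in nearly the same direction; values near 0 or below indicate little or no agreement.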
Files
annotations.zip
(7.6 GB)
Checksum | Size |
---|---|
md5:b986686409ca84357f74d06734ef2c10 | 193.7 MB |
md5:3b84b7cb871581570ffe907b42e1af7f | 776.6 MB |
md5:84424e2145ac304b747b2aa05e61a57e | 3.6 GB |
md5:ef61a4d5ed13d4872060974f8c7a1d99 | 2.8 kB |
md5:0ca4e6c05669ceb2bb7f6e9dc6119266 | 2.3 GB |
md5:b16ccdb72d70ce4b39983980f108cf93 | 117.5 kB |
md5:2d95252227718f9d09c07166c89b6532 | 114.5 MB |
md5:97b99720fd4048dc8e2613f9664b0f3c | 171.7 MB |
md5:8384bbe965be628bb4f2948fbfccaa16 | 363.9 MB |
md5:bdae70019b15528bdb22d9df25273c94 | 1.9 MB |
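Because several of the files are multiple gigabytes, it can be worth checking a download against the MD5 value listed for it. The following is a minimal sketch using Python's standard `hashlib`; the function names `md5_of_file` and `matches_listed_checksum` and the path in the usage comment are hypothetical, and the file is read in chunks so that large archives do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a listed value such as 'md5:<hex digest>'."""
    # Accept the value with or without the 'md5:' prefix.
    expected = listed.strip().split(":", 1)[-1].lower()
    return md5_of_file(path) == expected


# Usage (the path is a placeholder for wherever the file was saved):
# ok = matches_listed_checksum("path/to/downloaded_file", "md5:<value from the listing>")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.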
Additional details
Related works
- Is derived from
- Software: 10.5281/zenodo.7781870 (DOI)