Published July 28, 2025 | Version v1
Dataset
Open
Dataset and Code for PIDE Binary Classification Model
Creators
Description
- phage.fasta: 263,843 phage protein sequences (from UniProt).
- bacteria.fasta: 263,843 bacterial protein sequences (from UniProt).
- sequence_*.txt: Embedding files for the train/val/test sets (precomputed from the input sequences).
- label_*.txt: Corresponding binary labels (0: bacteria, 1: phage).
- *.py files: Training and evaluation scripts (input: embedding files).
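As a sanity check on the sequence counts stated above, the records in each FASTA file can be counted after download. The following is a minimal sketch, assuming the files are plain-text FASTA with one `>` header line per sequence; the helper names `read_fasta` and `count_records` are illustrative and are not part of the deposited scripts.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_records(path: str) -> int:
    """Count the FASTA records in the file at `path` (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in read_fasta(handle))
```

If the files match the description, `count_records("phage.fasta")` and `count_records("bacteria.fasta")` should both return 263843.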
Files (1.3 GB)
MD5 checksum | Size |
---|---|
md5:7458115aa1324f763a383f64cc09a74a | 100.4 MB |
md5:32c1abb39149e51d4f467b5a52227ba1 | 4.1 kB |
md5:e7d5064347046215aada65614d05afbd | 4.3 kB |
md5:648021233ea4eb0a8740d70fe306308e | 158.3 kB |
md5:3d20f31053d19db6b3f270ed01feda37 | 738.8 kB |
md5:9a6f6be411e6b6e54f47773ffe8615ae | 158.3 kB |
md5:7c1e19dabce54ea915d6f85a54a99aa5 | 78.3 MB |
md5:36720f0a627e7373ce7f3cd424291e0e | 171.5 MB |
md5:fe9eb180afd42c0835deee2d8a93645b | 800.2 MB |
md5:75456564e793d18d1f9260ff1bccaa24 | 171.5 MB |
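The `md5:` values in the file listing can be used to confirm that a download completed intact. A minimal sketch using Python's standard `hashlib` module follows; the function names are illustrative, and because this listing does not pair checksums with file names, the path and the expected checksum are both passed in by the caller.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, listed: str) -> bool:
    """Compare a file against a listed value such as 'md5:7458...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Reading in chunks matters here because the largest file in the record is 800.2 MB, which is better not loaded into memory at once.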