SialoCat: a curated catalog of bacterial proteins involved in sialic acid metabolism
Description
final_fastas.tar.gz
- initial_fastas – Initially downloaded sequences, with redundancy removed both within and across the source databases used to compile the dataset.
- FASTAS_aligned – Subset of sequences that aligned to at least one reference sequence.
- FASTAS_with_essential_signature – Sequences containing the required essential signature.
- FASTAS_with_essential_aligned – Sequences containing the essential signature that also passed the alignment filter.
- FASTAS_without_extra – Sequences containing the essential signature and no additional non-reference signatures.
- FASTAS_without_extra_aligned – Sequences containing the essential signature, no additional non-reference signatures, and that also satisfied the alignment criteria.
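
To see how many sequences survive each filtering stage listed above, one option is to count FASTA header lines per folder directly from the archive, without unpacking it. The following is a minimal sketch, not part of the dataset's own tooling. The function name `count_fasta_records` is hypothetical, and the internal layout of the archive (one directory per subset, plain-text FASTA files inside) is an assumption.

```python
import posixpath
import tarfile
from collections import Counter


def count_fasta_records(archive_path):
    """Count lines starting with '>' per directory inside a .tar.gz archive.

    Assumption: every regular file in the archive is a plain-text FASTA file,
    so each line starting with '>' marks one sequence record.
    """
    counts = Counter()
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Group by the directory that directly contains the file.
            folder = posixpath.basename(posixpath.dirname(member.name)) or "."
            handle = tar.extractfile(member)
            if handle is None:
                continue
            counts[folder] += sum(1 for line in handle if line.startswith(b">"))
    return counts


if __name__ == "__main__":
    # The path assumes the archive was downloaded to the working directory.
    for folder, n in sorted(count_fasta_records("final_fastas.tar.gz").items()):
        print(f"{folder}\t{n}")
```

If the folder descriptions hold, the counts should shrink from initial_fastas toward the more restrictive subsets, which gives a quick sanity check on a download.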
all_code_dataframes.tar.gz
Contains the mapping between original FASTA headers and the internal IDs used in the FASTA files provided in this repository.
Files
(1.5 GB)
| Name | Size |
|---|---|
| md5:c65fe60a24a644660f17757115b5f54c | 99.2 MB |
| md5:49264ab961cd8255c96e574278f1508e | 1.4 GB |
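
The MD5 checksums in the file listing can be used to confirm that a download completed without corruption. The sketch below is one way to do that with Python's standard library. The helper names `md5_of_file` and `is_published_file` are hypothetical, and because the listing does not show which checksum belongs to which archive, the check only tests membership in the set of published values.

```python
import hashlib

# Checksums copied from the file listing of this record.
PUBLISHED_MD5 = {
    "c65fe60a24a644660f17757115b5f54c",
    "49264ab961cd8255c96e574278f1508e",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_published_file(path):
    """True if the file's MD5 matches one of the checksums listed above."""
    return md5_of_file(path) in PUBLISHED_MD5
```

Reading in chunks keeps memory use flat, which matters for the larger archive. A call such as `is_published_file("final_fastas.tar.gz")` returning `False` would suggest re-downloading the file.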