Genomics NCD-gzip database

EESI lab

doi:10.5281/zenodo.15635653

Published June 10, 2025 | Version v1

Dataset Open

This is a basic database of 4634 genomes sampled from the RefSeq database using the Woltka pipeline. It is intended for testing and evaluating metagenomic classification tools.

Additional Files

fold1_list.txt and fold1_testing_list.txt: Lists of genome TaxIDs used for training and testing, respectively. These are included to support reproducible benchmarking of metagenomic classifiers.
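
As a minimal sketch of how the two lists could be used in a benchmark, the Python below reads both files and checks that the training and testing sets do not overlap. The file layout is an assumption (one TaxID per line, blank lines ignored) and is not confirmed by the record; read_id_list and summarize_split are hypothetical helper names introduced only for this illustration.

```python
def read_id_list(path):
    """Return the non-empty, whitespace-stripped lines of a list file, in order.

    Assumption: each line of the list file holds exactly one TaxID.
    """
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def summarize_split(train_ids, test_ids):
    """Count the unique IDs in each split and report any ID present in both."""
    train = set(train_ids)
    test = set(test_ids)
    return {
        "n_train": len(train),
        "n_test": len(test),
        "overlap": sorted(train & test),
    }
```

Calling summarize_split(read_id_list("fold1_list.txt"), read_id_list("fold1_testing_list.txt")) after downloading the files should give an empty "overlap" list if the split is clean. A non-empty list would mean some genomes appear in both the training and testing sets.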

Files (3.1 GB)

Name	MD5	Size
fold1_list.txt	b6a685d31e9da969dd045318bc45c6ee	27.6 kB
fold1_testing_list.txt	2d21602727c0403f8209cd8a80e0586c	6.9 kB
fold_full_list.txt	b2ff962741d222418a4619aab68ff734	34.5 kB
fold_full_seq.tar.gz	57a08afc0752c8fea51d2dfef7004af0	3.1 GB
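
The MD5 checksums listed above can be used to confirm that a download completed intact, which matters most for the 3.1 GB archive. The following is a minimal sketch using only the Python standard library. The checksum values are copied from the file listing; md5_of_file and verify_downloads are hypothetical helper names, and the download directory is whatever location the files were saved to.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "fold1_list.txt": "b6a685d31e9da969dd045318bc45c6ee",
    "fold1_testing_list.txt": "2d21602727c0403f8209cd8a80e0586c",
    "fold_full_list.txt": "b2ff962741d222418a4619aab68ff734",
    "fold_full_seq.tar.gz": "57a08afc0752c8fea51d2dfef7004af0",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that the large archive is never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Map each expected file name to True if it exists in `directory`
    and its MD5 matches, and to False otherwise."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == checksum
    return results
```

For example, verify_downloads(".") checks the current directory. Any file mapped to False is either missing or differs from the published version and should be downloaded again.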

	All versions	This version
Views	21	21
Downloads	21	21
Data volume	15.7 GB	15.7 GB
