Contents and description of the files included in this repository:

merged_go_data.tsv: final output of the functional annotation pipeline, with the Gene Ontology annotation results for all samples merged into a single long-format table. The file contains three columns: 1) sample run identifiers, 2) Gene Ontology identifiers, 3) occurrence counts.

merged_interpro_data.tsv: final output of the functional annotation pipeline, with the InterPro protein annotation results for all samples merged into a single long-format table. The file contains three columns: 1) sample run identifiers, 2) InterPro identifiers, 3) occurrence counts.

merged_ncbitaxon_data.tsv: final output of the taxonomic annotation pipeline, with the Kraken2 taxonomic annotation results for all samples merged into a single long-format table. The file contains three columns: 1) sample run identifiers, 2) NCBITaxon identifiers, 3) occurrence counts.

sample_metadata.tsv: a list of the accession numbers, NCBI BioSample identifiers, and additional information for the final set of 819 samples used in the manuscript's RDF database.

original_unfiltered_sample_list.csv: a list of the accession numbers and NCBI BioSample identifiers of all the original prokaryote-enriched metagenomic samples collected from the Planet Microbe Database prior to quality filtering.
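
The three merged_*.tsv files share the same long layout (sample run identifier, annotation identifier, occurrence count), so one small reader can reshape any of them into a per-sample lookup. The following is a minimal sketch and not part of the pipeline itself. The function name pivot_long_table and the has_header flag are hypothetical, and the sketch assumes tab-separated rows with integer counts. Whether the files carry a header row is not stated here, so check before choosing the flag. The demo rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def pivot_long_table(handle, has_header=False):
    """Turn (sample, identifier, count) rows into {sample: {identifier: count}}.

    Hypothetical helper: assumes tab-separated rows and integer counts.
    """
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the header row if the file has one
    table = defaultdict(dict)
    for row in reader:
        if len(row) != 3:
            continue  # ignore blank or malformed lines
        sample, identifier, count = row
        # Sum repeated (sample, identifier) pairs rather than overwrite them.
        table[sample][identifier] = table[sample].get(identifier, 0) + int(count)
    return dict(table)


# Made-up rows in the three-column layout (not real data from the files).
demo = io.StringIO(
    "RUN_A\tGO:0008150\t12\n"
    "RUN_A\tGO:0003674\t5\n"
    "RUN_B\tGO:0008150\t7\n"
)
counts = pivot_long_table(demo)
print(counts["RUN_A"]["GO:0008150"])
```

To apply the same function to a real file, open it with open("merged_go_data.tsv", newline="") and pass the handle in place of the in-memory demo.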