Published March 30, 2026 | Version v2
Dataset Open

Fantasia Embeddings GOA2024

Description

Database of UniProt Entries with Corresponding Embeddings

This database contains all UniProt entries and their corresponding embeddings, calculated using the following models:

  • ProstT5
  • ProtT5
  • ESM2

The database is implemented in PostgreSQL and uses the pgvector extension to store and query high-dimensional embedding vectors efficiently. The FANTASIA pipeline uses these embeddings to generate sequence annotations.
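As a rough illustration of how a pgvector-backed table of this kind might be queried, the sketch below builds a parameterised nearest-neighbour query in Python. The table and column names (`sequence_embedding`, `accession`, `embedding`) are hypothetical placeholders, since the actual schema is not described in this record; only the `<->` (Euclidean distance) operator and the `[x,y,z]` text form for vectors come from pgvector itself.

```python
from typing import Sequence, Tuple

# Hypothetical names: the real schema is not documented in this record.
TABLE = "sequence_embedding"
ID_COLUMN = "accession"
VECTOR_COLUMN = "embedding"


def to_pgvector_literal(vector: Sequence[float]) -> str:
    """Format a vector in pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in vector) + "]"


def nearest_neighbours_query(vector: Sequence[float], k: int = 5) -> Tuple[str, tuple]:
    """Build a parameterised k-nearest-neighbour query and its parameters.

    `<->` is pgvector's Euclidean (L2) distance operator; `<=>` would give
    cosine distance instead.
    """
    sql = (
        f"SELECT {ID_COLUMN}, {VECTOR_COLUMN} <-> %s::vector AS distance "
        f"FROM {TABLE} ORDER BY distance LIMIT %s"
    )
    return sql, (to_pgvector_literal(vector), k)


# Toy 3-dimensional vector; real embeddings have far more dimensions.
sql, params = nearest_neighbours_query([0.25, -1.0, 3.0], k=3)
print(sql)
print(params)
```

With a driver that uses `%s` placeholders, such as psycopg, the pair could be passed directly as `cursor.execute(sql, params)` once the names above are replaced with those of the restored database.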

Tags for GOA2022 have been added.

Files

Files (15.2 GB)

Size: 15.2 GB
md5:44d469bfacfa19671f0f6fcff0f83b99