Published April 9, 2025 | Version v4
Journal article Open

Generalized Biological Foundation Model with Unified Nucleic Acid and Protein Language

  • 1. Alibaba Group (China)

Description

The resources of LucaOne include:

  • the model code, training scripts, embedding inference code, and trained checkpoints of LucaOne;
  • the model code, training and evaluation scripts, datasets, and trained checkpoints for downstream tasks, along with additional supplementary materials.

 

Because the pre-training dataset of LucaOne is too large (1.8T) to host here, it has been deposited in the CNGB Sequence Archive (CNSA) under accession number CNP0007266.

 

This version updates the archive `Supplementary.tar.gz`, which now contains additional files and data.


For the latest code of the LucaOne, LucaOneApp, and LucaOneTasks projects, please refer to GitHub: https://github.com/LucaOne?tab=repositories

 

The compressed LucaOne checkpoint archive (`TrainedCheckPoint.tar.gz`, about `33G`) is too large to provide as a single file, so it has been split into three smaller parts. Merge the parts with `cat` first, then decompress the merged archive:

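```
# Merge the downloaded parts into one archive (the shell expands the glob in sorted order).
cat TrainedCheckPoint.tar.gz-part-* > TrainedCheckPoint.tar.gz

# Extract the merged archive.
tar -xzvf TrainedCheckPoint.tar.gz
```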
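For environments where `cat` and `tar` are not available, the same merge-then-extract steps can be sketched in Python using only the standard library. This is a minimal sketch, not part of the LucaOne code: the helper names `merge_parts` and `extract_archive` are hypothetical, and it assumes the part suffixes sort correctly by name (as the shell glob above also assumes).

```python
import glob
import shutil
import tarfile


def merge_parts(pattern: str, output: str) -> list[str]:
    """Concatenate every file matching `pattern` into `output`, in sorted name order."""
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no files match {pattern!r}")
    with open(output, "wb") as merged:
        for part in parts:
            with open(part, "rb") as chunk:
                # Stream in blocks so a multi-gigabyte part is never held in memory.
                shutil.copyfileobj(chunk, merged)
    return parts


def extract_archive(archive: str, dest: str = ".") -> None:
    """Extract a gzip-compressed tar archive into `dest`."""
    with tarfile.open(archive, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions provide extraction filters; "data" rejects unsafe members.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)


# Example usage (file names taken from the instructions above):
# merge_parts("TrainedCheckPoint.tar.gz-part-*", "TrainedCheckPoint.tar.gz")
# extract_archive("TrainedCheckPoint.tar.gz")
```

Streaming with `shutil.copyfileobj` keeps memory use flat regardless of part size, which matters for an archive of about `33G`.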

 

Files (36.5 GB)

| Checksum | Size |
|---|---|
| md5:535d9cd3730c4cb05b8a58e8bcc80703 | 175.1 MB |
| md5:92ed374dd165f5083a6d3806ca01c497 | 1.4 GB |
| md5:c19d1c474a89d87aa9cca7f8608283c4 | 2.5 MB |
| md5:189a3347a7785e2c23e2d94f5b4dbb89 | 3.1 MB |
| md5:d95ad6cd6942bbfb82553c9869aef051 | 2.2 MB |
| md5:4145f7e39963f36560cf4172750357b8 | 3.2 kB |
| md5:6392ede85efcdf04793fd3302375b0e1 | 1.8 GB |
| md5:28ae0350ad14076de5b63c7d116bb738 | 108.7 kB |
| md5:bcb641f447fb77e301d1eb14cfba4986 | 6.4 GB |
| md5:876c6c374503bae48d7c530021c536f3 | 6.4 GB |
| md5:04d83f6674f6b98b6be7dbf28b2d3770 | 6.4 GB |
| md5:37ddca6822d166a1314158f31db740da | 6.4 GB |
| md5:d4e127295fda9d976cbd73241b0f1ccd | 6.4 GB |
| md5:754261613bee60f4ab9f23994beafc6f | 950.3 MB |
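
Each file in the listing carries an `md5:` checksum, so a download can be checked for corruption before it is merged or extracted. The following is a minimal sketch under stated assumptions: the helper names `md5_of` and `matches_checksum` are hypothetical, and because the listing does not show which file name belongs to which checksum, the path in the usage comment is a placeholder.

```python
import hashlib


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in fixed-size blocks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object (EOF).
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file's MD5 digest with `expected`, which may carry an 'md5:' prefix."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of(path) == expected_hex


# Example usage (placeholder path; pair each file with its own checksum from the listing):
# matches_checksum("path/to/downloaded_file", "md5:535d9cd3730c4cb05b8a58e8bcc80703")
```

Checking each part before running `cat` avoids a failed extraction late in the process, when the cause of a corrupted archive is harder to trace back to a single download.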

Additional details

Dates

Available
2015-01-06

Software

Programming language
Python, Shell, SQL