Published 2025
| Version v2
Dataset
Open
Acanthaster whole genome datasets and scripts
Description
These whole-genome datasets were analyzed to assess the connectivity patterns of Crown-of-Thorns Seastars (COTS, Acanthaster cf. solaris) across the Pacific Ocean (Leiva et al., 2025, BMC Biology). Each dataset and script is described below:
- Aca_206_ind_allSNPs_chr1_renamed.vcf.gz: contains genotype calls for all SNPs on the longest scaffold, for all 206 Acanthaster cf. solaris samples. It was produced by ANGSD and filtered with bcftools, and was used with verifyBamID to detect contaminated samples in the dataset.
- Aca_198_ind_thin10kSNPs.beagle.gz: contains genotype likelihoods from 198 non-contaminated Acanthaster cf. solaris samples. It was produced by ANGSD, thinned with vcftools, and converted back to genotype likelihoods with ANGSD. It was used to assess population connectivity, structure and diversity.
- Aca_198_ind_think10kSNPs.recode.vcf: contains genotype calls from 198 non-contaminated Acanthaster cf. solaris samples. It was produced by ANGSD and thinned with vcftools. It was used for population structure analyses.
- T_mod_ANGSD_Haplo_09filt.fasta: fasta file with haplotype calls from the 198 non-contaminated Acanthaster cf. solaris samples and six additional samples: 2 COTS samples from the Gulf of California, 2 Acanthaster planci samples from the Indian Ocean, and 2 Acanthaster benziei samples from the Red Sea. It was produced by ANGSD using the -doHaploCall 2 flag, and then converted to fasta using a custom R script (see "reformating_ANGSD_fasta.R").
- Aca_Hawaii_200scaffolds.vcf.gz: vcf file with genotype calls and genotype likelihoods from the first (longest) 200 scaffolds of the COTS samples from Hawai'i. File used as input for RAiSD.
- Aca_French_Polynesia_200scaffolds.vcf.gz: vcf file with genotype calls and genotype likelihoods from the first (longest) 200 scaffolds of the COTS samples from French Polynesia. File used as input for RAiSD.
- Aca_West_Pacific_200scaffolds.vcf.gz: vcf file with genotype calls and genotype likelihoods from the first (longest) 200 scaffolds of the COTS samples from the West Pacific. File used as input for RAiSD.
- reformating_ANGSD_fasta.R: R script used to convert haplotype calls from ANGSD (-doHaploCall 2) into a fasta file for phylogenetic analyses with iqtree2.
- plot_relatedness2_from_vcftools.R: R script used to plot a dendrogram and a heatmap from the relatedness estimates produced by vcftools --relatedness2.
- pca_and_plot.R: R script used to perform and plot a PCA from the covariance matrix obtained with PCAngsd.
- pairwise_FSTs.R: R script used to calculate and plot pairwise FST values among populations.
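
As a quick sanity check on the genotype-likelihood file described above, the number of individuals and sites can be read directly from the file. The following is a minimal sketch, not one of the deposited scripts. It assumes the usual Beagle text layout written by ANGSD: a header line, three leading columns (marker, allele1, allele2), and three likelihood columns per individual. The function name `summarize_beagle` is hypothetical.

```python
import gzip


def summarize_beagle(path):
    """Count individuals and sites in a gzipped Beagle genotype-likelihood file.

    Assumed layout (not confirmed by the record itself): one header line,
    three leading columns (marker, allele1, allele2), then three
    likelihood columns per individual.
    """
    with gzip.open(path, "rt") as handle:
        header = handle.readline().split()
        n_likelihood_columns = len(header) - 3
        if n_likelihood_columns <= 0 or n_likelihood_columns % 3 != 0:
            raise ValueError("unexpected number of columns for a Beagle file")
        n_individuals = n_likelihood_columns // 3
        # Every remaining non-empty line is one site.
        n_sites = sum(1 for line in handle if line.strip())
    return n_individuals, n_sites


# Example call, once the file has been downloaded to the working directory:
# n_individuals, n_sites = summarize_beagle("Aca_198_ind_thin10kSNPs.beagle.gz")
```

If the layout assumption holds, the first value returned for Aca_198_ind_thin10kSNPs.beagle.gz should be 198, matching the sample count given in the description.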
Files
(39.8 GB)
| Checksum | Size |
|---|---|
| md5:4729fc86b209a7319bd05c154a968066 | 19.1 MB |
| md5:c75c3e262a12567f968c50bd4252e007 | 479.1 MB |
| md5:d8a806f08ba17da34183ddd162ce0be8 | 347.8 MB |
| md5:0d1dd88d6ac20549fc488098dc68420d | 5.9 GB |
| md5:857e0e7bd5ee238e5f5feb3524c23e68 | 230.3 MB |
| md5:ed89a21d81a659813560d2b6d8857679 | 31.8 GB |
| md5:32a7582d271fbdd03fa6fa15227fc154 | 1.5 kB |
| md5:9ff03e1bbae3fc6d1b8de8b626fbc6da | 1.7 kB |
| md5:8279e6c0e9ce14a8840b44cd60ec0985 | 470 Bytes |
| md5:228c3a5847d6f08eb16c97d2c009bec5 | 2.2 kB |
| md5:749d800fff57dc7fe23b6105d7547a7d | 939.2 MB |
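
The md5 values listed above can be used to confirm that a download completed without corruption, which matters most for the multi-gigabyte files. The following is a minimal sketch using only the Python standard library. The function names and the example path are hypothetical. The listing does not pair file names with checksums here, so the caller must supply the pairing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or plain '<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Example call with a hypothetical local path and one checksum from the listing:
# ok = matches_listed_md5("path/to/downloaded_file",
#                         "md5:4729fc86b209a7319bd05c154a968066")
```

Reading in fixed-size chunks keeps memory use constant regardless of file size, so the same function works for the kB-sized scripts and the 31.8 GB file alike.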