AmelHap pilot: raw data
Creators
- 1. University of the Basque Country
- 2. University of Edinburgh
- 3. Beebytes Analytics CIC
- 4. INRAE Toulouse
Description
Honey bee (Apis mellifera) drones are typically haploid: they develop from unfertilized eggs and inherit only their queen’s alleles, with none from the many drones she mated with. Because drones are haploid, the ordered combination or ‘phase’ of their alleles is known directly, which makes them a valuable haplotype resource. We collated whole-genome sequence data for 688 drones, including 45 newly sequenced Scottish drones, which collectively represent 13 countries, 7 subspecies and various hybrid strains. After alignment to the reference assembly Amel_HAv3.1 and haploid variant calling, we identified 18.9M variants.
The whole-genome sequencing data underpinning the dataset are available from the European Nucleotide Archive (ENA), https://www.ebi.ac.uk/ena, under the project accession codes PRJEB16533, PRJNA311274, PRJNA363032, PRJNA516678, PRJNA544324, and PRJEB39369.
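As a rough illustration of how the runs behind these accessions could be listed, the sketch below builds query URLs for ENA's file-report endpoint. The choice of endpoint, the requested fields and the helper name `filereport_url` are assumptions made for this example and are not part of the record. The code only constructs the URLs and does not contact the network.

```python
from urllib.parse import urlencode

# ENA Portal API file-report endpoint (assumed here as the access route).
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"

# Project accessions listed in this record.
PROJECTS = [
    "PRJEB16533", "PRJNA311274", "PRJNA363032",
    "PRJNA516678", "PRJNA544324", "PRJEB39369",
]


def filereport_url(accession: str) -> str:
    """Build a query URL that lists the sequencing runs of one project as TSV."""
    query = {
        "accession": accession,
        "result": "read_run",
        # Illustrative field selection; adjust to what you need.
        "fields": "run_accession,sample_accession,fastq_ftp",
        "format": "tsv",
    }
    return f"{ENA_FILEREPORT}?{urlencode(query)}"


urls = [filereport_url(project) for project in PROJECTS]
```

Each URL can then be fetched with any HTTP client to obtain a tab-separated list of runs and their FASTQ locations.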
Sequencing reads were aligned to the Amel_HAv3.1 reference genome using BWA-MEM v0.7.17. Reads were sorted with SAMtools v1.9, and duplicates were marked with MarkDuplicates from GATK v4.0.11.0. Variants for each sample were called using GATK’s HaplotypeCaller with the following non-default parameters: --ERC GVCF, --sample-ploidy 1 and -A AlleleFraction. Joint variant calling was performed across all samples using GATK’s GenomicsDBImport and GenotypeGVCFs with --sample-ploidy 1 and a window size of 2.5 Mb.
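The calling steps described above can be sketched as command lines assembled in Python. This is a minimal sketch and not the authors' actual scripts: the file names, the contig name and length, and the way the 2.5 Mb windows are applied (here as `-L` intervals passed to GenotypeGVCFs) are all assumptions, and the emit-reference-confidence option is written in GATK's short form `-ERC`. The commands are only built as argument lists, not executed.

```python
def windows(contig: str, length: int, size: int = 2_500_000) -> list[str]:
    """Split a contig into 1-based, inclusive intervals of at most `size` bp."""
    return [
        f"{contig}:{start}-{min(start + size - 1, length)}"
        for start in range(1, length + 1, size)
    ]


def haplotypecaller_cmd(ref: str, bam: str, gvcf: str) -> list[str]:
    """Per-sample haploid calling in GVCF mode, with the AlleleFraction annotation."""
    return [
        "gatk", "HaplotypeCaller",
        "-R", ref, "-I", bam, "-O", gvcf,
        "-ERC", "GVCF",
        "--sample-ploidy", "1",
        "-A", "AlleleFraction",
    ]


def genotype_cmd(ref: str, workspace: str, out_vcf: str, interval: str) -> list[str]:
    """Joint genotyping of one window from a GenomicsDB workspace."""
    return [
        "gatk", "GenotypeGVCFs",
        "-R", ref, "-V", f"gendb://{workspace}",
        "-O", out_vcf, "-L", interval,
        "--sample-ploidy", "1",
    ]


# Hypothetical contig name and length, for illustration only.
intervals = windows("contig_1", 6_000_000)
call = haplotypecaller_cmd("Amel_HAv3.1.fa", "drone1.bam", "drone1.g.vcf.gz")
joint = [
    genotype_cmd("Amel_HAv3.1.fa", "workspace", f"joint_{i}.vcf.gz", interval)
    for i, interval in enumerate(intervals)
]
```

A list such as `call` can be passed to `subprocess.run(call, check=True)` once GATK is installed and the input files exist.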
This dataset is unfiltered, and contains all variants regardless of quality or call rate.
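Because the dataset is unfiltered, users will usually apply their own quality and call-rate thresholds. The sketch below shows one way to do this on individual VCF records in plain Python. The thresholds (`min_qual`, `min_call_rate`) and the function name `passes` are hypothetical and are not recommendations from the authors. It assumes haploid genotypes, so each called GT is a single allele index and a missing call is `.`.

```python
def passes(vcf_line: str, min_qual: float = 30.0, min_call_rate: float = 0.9) -> bool:
    """Return True if a VCF record meets the QUAL and call-rate thresholds."""
    fields = vcf_line.rstrip("\n").split("\t")
    qual = fields[5]  # QUAL is the sixth VCF column
    if qual == "." or float(qual) < min_qual:
        return False
    # Sample columns start at index 9; GT is the first colon-separated subfield.
    genotypes = [sample.split(":")[0] for sample in fields[9:]]
    if not genotypes:
        return False
    called = sum(gt != "." for gt in genotypes)
    return called / len(genotypes) >= min_call_rate


# A made-up record with four haploid samples, one of them missing.
record = "\t".join(
    ["contig_1", "100", ".", "A", "G", "50", ".", ".", "GT", "1", "0", ".", "1"]
)
keep_strict = passes(record)                      # call rate 0.75 < 0.9
keep_lenient = passes(record, min_call_rate=0.7)  # call rate 0.75 >= 0.7
```

For a file of this size (29.7 GB), a dedicated tool that streams compressed VCFs would be more practical; the function above is only meant to make the two criteria concrete.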
Files
(29.7 GB)
| MD5 checksum | Size |
|---|---|
| b45e40f7d2b22ad7264de40108591d57 | 29.7 GB |
| 264e23d32035a775e52f5f8bbcf3dc03 | 195.7 kB |
| d112061516f9cfe745887682da9186d7 | 45 Bytes |
| d44282d8af52c80f64c70590278a8882 | 34.3 kB |
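The MD5 checksums listed above can be used to confirm that a download completed intact. The following is a minimal sketch using Python's standard `hashlib`. The helper names `md5sum` and `verify` are introduced here for illustration, and the local file path is whatever name the downloaded file was saved under.

```python
import hashlib


def md5sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str) -> bool:
    """Compare a file's digest with an expected value, with or without an 'md5:' prefix."""
    return md5sum(path) == expected.strip().lower().removeprefix("md5:")
```

For example, `verify(local_path, "md5:b45e40f7d2b22ad7264de40108591d57")` should return `True` for an uncorrupted copy of the 29.7 GB file. Reading in chunks keeps memory use constant regardless of file size.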