Published January 29, 2026 | Version v1
Dataset Embargoed

Association summary statistics, FUMA inputs, and fine-mapping results of Güler, M., et al. "A meta-analysis of genome-wide association studies identify fourteen new pancreatic cancer risk loci"

Description

Overview

Pancreatic cancer has a substantial inherited component, but the genes and biological pathways mediating genetic susceptibility remain incompletely defined. Previous genome-wide association studies have focused predominantly on populations of European ancestry.

We performed a large genome-wide association study meta-analysis across multiple ancestries, including more than 18,000 pancreatic cancer cases and 1.5 million controls. The association results were integrated with statistical fine-mapping and pancreas-relevant multi-omics resources. Candidate effector genes were prioritized using fine-mapping, functional variant annotation, cis-molecular quantitative trait loci, transcriptome-wide association and summary-data-based Mendelian randomization analyses, enhancer–promoter maps, and single-nucleus pancreas multiome data. Polygenic risk scores were evaluated in the Estonian Biobank and the All of Us Research Program.

We identified 34 genome-wide significant susceptibility loci. Integration of genetic and functional evidence prioritized genes preferentially expressed in pancreatic ductal and acinar cell lineages and implicated pathways related to pancreatic development, epithelial differentiation, and developmental signaling. Incorporating additional susceptibility loci produced limited improvements in polygenic risk prediction, but the overlap between inherited susceptibility signals and pancreatic developmental programs highlights biological processes linking germline genetic variation to pancreatic cancer risk.

This Zenodo record contains association summary statistics, files prepared for FUMA, and statistical fine-mapping results supporting the study. No individual-level genotype, phenotype, or participant data are included.

Contents of this record

Filename or pattern | Analysis | Build | Description
EUR_*clean_metal* | European METAL | GRCh38 | Standardized fixed-effect summary statistics
trans_ancestry_*clean_metal* | Multi-ancestry METAL | GRCh38 | Standardized multi-ancestry fixed-effect statistics
*mrmega*.clean* | MR-MEGA | GRCh38 | Standardized multi-ancestry meta-regression results
fe_eur_hg19* | FUMA input | GRCh37 | European analysis converted for FUMA (variant IDs GRCh38)
fe_ta_hg19* | FUMA input | GRCh37 | Multi-ancestry METAL analysis converted for FUMA (variant IDs GRCh38)
mrmega_hg19* | FUMA input | GRCh37 | MR-MEGA analysis converted for FUMA (variant IDs GRCh38)
*credible_snps* | SuSiE | GRCh37 | Variant-level PIP and credible-set results
*credible_set_info* | SuSiE | GRCh37 | Credible-set coverage and purity
final_credible_sets* | SuSiE | GRCh37 | Merged credible variants across loci
final_credible_info* | SuSiE | GRCh37 | Merged credible-set summaries
*_fm_regionalplot* | SuSiE | GRCh37 | Per-locus regional fine-mapping plots

1. Fixed-effect meta-analysis results generated with METAL

Results are provided for:

  • European-ancestry fixed-effect meta-analysis

  • Multi-ancestry fixed-effect meta-analysis, referred to as “trans-ancestry” in some filenames and analysis scripts

Both native METAL outputs and standardized, analysis-ready versions are included.

The formatted METAL results:

  • use genomic coordinates based on GRCh38

  • represent variants as CHR:POS:REF:ALT

  • align the reported effect to the ALT allele

  • include alternate-allele frequency, effect estimate, standard error, association P value, cohort-specific effect directions, heterogeneity statistics, total sample size, case count, and control count

  • retain variants meeting the study-level contribution and allele-frequency quality-control criteria

Where provided, files with hetp0.05 in their names additionally exclude variants showing evidence of between-study heterogeneity (HetPVal < 0.05).
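
As an illustration of the conventions above, the following minimal Python sketch splits a CHR:POS:REF:ALT identifier and applies a heterogeneity filter to a tab-delimited table. The column names ID and BETA, the toy rows, and both function names are hypothetical; only the HetPVal label and the identifier layout come from this record, so check the actual file headers before reusing it.

```python
import csv
import io


def parse_variant_id(variant_id):
    """Split a CHR:POS:REF:ALT identifier into its four parts."""
    chrom, pos, ref, alt = variant_id.split(":")
    return chrom, int(pos), ref, alt


def filter_heterogeneity(rows, column="HetPVal", threshold=0.05):
    """Keep rows whose heterogeneity P value is at least the threshold."""
    return [row for row in rows if float(row[column]) >= threshold]


# Toy two-variant table; the ID and BETA column names are hypothetical.
example = (
    "ID\tBETA\tHetPVal\n"
    "1:1000:A:G\t0.12\t0.40\n"
    "2:2000:C:T\t-0.05\t0.01\n"
)
rows = list(csv.DictReader(io.StringIO(example), delimiter="\t"))
kept = filter_heterogeneity(rows)
print([parse_variant_id(row["ID"]) for row in kept])
```

For the deposited gzip-compressed files, the same reader can be fed from `gzip.open(path, "rt")` instead of the in-memory string.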

2. Multi-ancestry meta-regression results generated with MR-MEGA

The MR-MEGA results model ancestry-correlated heterogeneity in allelic effects using three ancestry principal components.

Both native MR-MEGA output and a standardized formatted version are included. The formatted file is based on GRCh38 and contains:

  • variant identifier and chromosome position

  • reference and effect alleles

  • effect-allele frequency

  • MR-MEGA association P value

  • meta-regression intercept coefficient and standard error (BETA0 and SE0)

  • cohort-specific effect directions

  • total sample size and number of contributing cohorts

  • ancestry-correlated and residual heterogeneity P values

  • log Bayes factor

  • coefficients and standard errors for the three ancestry principal components

Association P values too small to be represented within the numerical reporting range of the original MR-MEGA implementation were recalculated from the corresponding chi-square statistics and degrees of freedom.
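
One possible way to recompute such a P value without numerical underflow is to work in log space. The sketch below is not the study's script: it assumes an even number of degrees of freedom (for example 4, if the association test covers an intercept plus three principal-component terms, which is an assumption here rather than something this record states) and uses the closed-form chi-square tail that exists in that case. The function name is hypothetical.

```python
import math


def log10_chi2_sf_even_df(chisq, df):
    """Return log10 of the chi-square upper-tail probability for even df.

    Uses sf(x; 2m) = exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!, evaluated in
    log space so that very large statistics do not underflow to zero.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = chisq / 2.0
    term, total = 1.0, 1.0  # k = 0 term of the finite sum
    for k in range(1, df // 2):
        term *= half / k  # (x/2)^k / k! built incrementally
        total += term
    return (-half + math.log(total)) / math.log(10)


# A statistic this large gives a P value far below the smallest positive double.
print(log10_chi2_sf_even_df(2000.0, 4))
```

The returned value is log10(P), so a P value can be reported as a mantissa and exponent string rather than as a floating-point number.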

Because MR-MEGA is a meta-regression model, its coefficients should be interpreted according to the MR-MEGA model and should not be treated as directly interchangeable with a conventional fixed-effect METAL effect estimate.

3. FUMA input files

FUMA-compatible files are provided for:

  • the European-ancestry fixed-effect meta-analysis

  • the multi-ancestry fixed-effect meta-analysis

  • the MR-MEGA multi-ancestry analysis

The primary association analyses were conducted using GRCh38 coordinates. These FUMA input files were lifted over to GRCh37/hg19 because FUMA requires hg19-compatible coordinates. Chromosome, position, and allele consistency were checked following liftover.

These files are downstream, tool-specific derivatives and should not be confused with the primary GRCh38 association summary-statistic files.

4. SuSiE statistical fine-mapping results

SuSiE fine-mapping results are included separately for:

  • the European-ancestry fixed-effect meta-analysis

  • the multi-ancestry fixed-effect meta-analysis

SuSiE was applied using GWAS summary statistics and linkage-disequilibrium information from a UK Biobank reference panel. Locus-specific analyses used approximately 1-Mb windows centered on the selected locus position. Credible sets were calculated at 95% coverage.
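
To make the window and coverage settings concrete, here is a simplified Python sketch. It is an illustration only and not how SuSiE itself builds credible sets: SuSiE derives each set from the posterior probabilities of one modelled effect, whereas this toy version just ranks a single list of probabilities assumed to sum to about 1. Both function names and the example values are hypothetical.

```python
def locus_window(position, width=1_000_000):
    """Return (start, end) of a window of `width` bp centered on `position`."""
    half = width // 2
    return max(1, position - half), position + half


def credible_set(probs, coverage=0.95):
    """Smallest set of variants whose probabilities sum to at least `coverage`.

    `probs` maps variant ID -> posterior probability for ONE signal.
    """
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    chosen, running = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        running += prob
        if running >= coverage:
            break
    return chosen, running


# Hypothetical single-signal probabilities for four variants.
toy = {"1:100:A:G": 0.70, "1:200:C:T": 0.20, "1:300:G:A": 0.06, "1:400:T:C": 0.04}
members, achieved = credible_set(toy)
print(locus_window(5_000_000), members, achieved)
```

The achieved coverage can exceed the 95% target because variants are added whole; that realized value is the kind of quantity a coverage summary reports.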

The deposited fine-mapping outputs include:

  • variant-level results containing posterior inclusion probabilities

  • credible-set membership

  • linkage disequilibrium, reported as \(R^2\), with the lead variant

  • credible-set coverage and purity summaries

  • merged cross-locus credible-set tables

  • regional fine-mapping plots
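
For readers unfamiliar with the purity summary, SuSiE conventionally reports it as the minimum absolute pairwise correlation among the variants in a credible set. The sketch below shows that calculation on a toy correlation matrix; the function name and the matrix values are hypothetical, and the deposited files may carry additional purity columns not covered here.

```python
from itertools import combinations


def purity(corr, members):
    """Minimum absolute pairwise correlation among credible-set members.

    `corr` is a square correlation matrix (list of lists) and `members`
    holds the row/column indices of the variants in the credible set.
    """
    if len(members) < 2:
        return 1.0  # a single-variant set has no pairs to compare
    return min(abs(corr[i][j]) for i, j in combinations(members, 2))


# Hypothetical 3 x 3 correlation matrix for three variants.
toy_corr = [
    [1.0, 0.9, -0.8],
    [0.9, 1.0, -0.7],
    [-0.8, -0.7, 1.0],
]
print(purity(toy_corr, [0, 1, 2]))
```

Note that this uses the signed correlation r, while the variant-level tables report the squared value with the lead variant.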

Each regional plot displays:

  1. regional GWAS association evidence as \(-\log_{10}(P)\)

  2. variant posterior inclusion probabilities

  3. credible-set membership and linkage disequilibrium with the lead variant

  4. genes located within the analyzed region

The SuSiE results included here correspond to the fixed-effect European and multi-ancestry analyses; they are not SuSiE analyses of the MR-MEGA results.

File and data conventions

  • Primary association-result genome build: GRCh38

  • FUMA input genome build: GRCh37/hg19

  • Variant identifier: CHR:POS:REF:ALT

  • Effect allele in formatted association files: ALT

  • Text tables are tab-delimited and may be gzip-compressed

  • “Raw” or “native” files preserve software-specific output

  • “Formatted” or “clean” files use standardized column names, allele orientation, and study quality-control procedures

  • Missing values are represented using the conventions of the corresponding output format

  • The deposited files contain summary-level research results only

Users should verify the genome build and effect-allele convention before combining these files with external resources.
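
A minimal sketch of such an effect-allele check is given below, assuming the external resource also reports an effect allele and a non-effect allele on the same strand and genome build. The function name and arguments are hypothetical, and strand flips and multi-allelic sites are deliberately not handled.

```python
def align_to_external(beta, eaf, effect, other, ext_effect, ext_other):
    """Orient (beta, eaf) to the external resource's effect allele.

    Returns the possibly flipped (beta, eaf), or None when the allele pairs
    do not match. Strand flips and multi-allelic sites are not handled.
    """
    if (effect, other) == (ext_effect, ext_other):
        return beta, eaf
    if (effect, other) == (ext_other, ext_effect):
        # Same variant with alleles swapped: negate the effect and
        # complement the effect-allele frequency.
        return -beta, 1.0 - eaf
    return None


# Here the ALT allele G is the effect allele; the external file uses A.
print(align_to_external(0.1, 0.25, "G", "A", "A", "G"))
```

Variants that return None are best set aside for manual review rather than silently dropped or forced into alignment.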

Analysis code and reproducibility

Scripts used to harmonize cohort-level GWAS results, perform the METAL and MR-MEGA analyses, generate FUMA inputs, conduct statistical fine-mapping, and perform downstream analyses are available from:

PDAC-MA-GWAS GitHub repository

The GitHub repository provides the complete analysis workflow, software requirements, script descriptions, and example commands. The files in this Zenodo record represent archived study outputs, whereas the GitHub repository contains the associated computational workflow.

Citation

Güler, M., et al. A meta-analysis of genome-wide association studies identify fourteen new pancreatic cancer risk loci. Manuscript under review (2026).

The journal citation and article DOI will be added to the Zenodo metadata when available.

Files

Embargoed

The files will be made publicly available on December 1, 2030.

Reason: All data will become publicly available after peer review.