Deciphering colorectal cancer genetics through multi-omic analysis of 100,204 cases and 154,587 controls of European and East Asian ancestries
Creators
- Ceres Fernandez-Rozadilla1
- Maria Timofeeva1
- Zhishan Chen
- Philip Law
- Minta Thomas
- Stephanie Schmit
- Virginia Díez-Obrero
- Li Hsu
- Juan Fernandez-Tajes
- Claire Palles
- Kitty Sherwood
- Sarah Briggs
- Victoria Svinti
- Kevin Donnelly
- Susan Farrington
- James Blackmur
- Peter Vaughan-Shaw
- Xiao-ou Shu
- Jirong Long
- Qiuyin Cai
- Xingyi Guo
- Yingchang Lu
- Peter Broderick
- James Studd
- Jeroen Huyghe
- Tabitha Harrison
- David Conti
- Christopher Dampier
- Mathew Devall
- Fredrick Schumacher
- Marilena Melas
- Gad Rennert
- Mireia Obón-Santacana
- Vicente Martín-Sánchez
- Ferran Moratalla-Navarro
- Jae Hwan Oh
- Jeongseon Kim
- Sun Ha Jee
- Keum Ji Jung
- Sun-Seog Kweon
- Min-Ho Shin
- Aesun Shin
- Yoon-Ok Ahn
- Dong-Hyun Kim
- Isao Oze
- Wanqing Wen
- Keitaro Matsuo
- Koichi Matsuda
- Chizu Tanikawa
- Zefang Ren
- Yu-Tang Gao
- Wei-Hua Jia
- John Hopper
- Mark Jenkins
- Aung Ko Win
- Rish Pai
- Jane Figueiredo
- Robert Haile
- Steven Gallinger
- Michael Woods
- Polly Newcomb
- David Duggan
- Jeremy Cheadle
- Richard Kaplan
- Timothy Maughan
- Rachel Kerr
- David Kerr
- Iva Kirac
- Jan Böhm
- Jukka-Pekka Mecklin
- Pekka Jousilahti
- Paul Knekt
- Lauri Aaltonen
- Harri Rissanen
- Eero Pukkala
- Johan Eriksson
- Tatiana Cajuso
- Ulrika Hänninen
- Johanna Kondelin
- Kimmo Palin
- Tomas Tanskanen
- Laura Renkonen-Sinisalo
- Brent Zanke
- Satu Männistö
- Demetrius Albanes
- Stephanie Weinstein
- Edward Ruiz-Narvaez
- Julie Palmer
- Daniel Buchanan
- Elizabeth Platz
- Kala Visvanathan
- Cornelia Ulrich
- Erin Siegel
- Stefanie Brezina
- Andrea Gsur
- Peter Campbell
- Jenny Chang-Claude
- Michael Hoffmeister
- Hermann Brenner
- Martha Slattery
- John Potter
- Konstantinos Tsilidis
- Matthias Schulze
- Marc Gunter
- Neil Murphy
- Antoni Castells
- Sergi Castellví-Bel
- Leticia Moreira
- Volker Arndt
- Anna Shcherbina
- Mariana Stern
- Bens Pardamean
- Timothy Bishop
- Graham Giles
- Melissa Southey
- Gregory Idos
- Kevin McDonnell
- Zomoroda Abu-Ful
- Joel Greenson
- Katerina Shulman
- Flavio Lejbkowicz
- Kenneth Offit
- Yu-Ru Su
- Robert Steinfelder
- Temitope Keku
- Bethany van Guelpen
- Thomas Hudson
- Heather Hampel
- Rachel Pearlman
- Sonja Berndt
- Richard Hayes
- Marie Elena Martinez
- Sushma Thomas
- Douglas Corley
- Paul Pharoah
- Susanna Larsson
- Yun Yen
- Heinz-Josef Lenz
- Emily White
- Li Li
- Kimberly Doheny
- Elizabeth Pugh
- Tameka Shelford
- Andrew Chan
- Marcia Cruz-Correa
- Annika Lindblom
- David Hunter
- Amit Joshi
- Clemens Schafmayer
- Peter Scacheri
- Anshul Kundaje
- Deborah Nickerson
- Robert Schoen
- Jochen Hampe
- Zsofia Stadler
- Pavel Vodicka
- Ludmila Vodickova
- Veronika Vymetalkova
- Nickolas Papadopoulos
- Christopher Edlund
- William Gauderman
- Duncan Thomas
- David Shibata
- Amanda Toland
- Sanford Markowitz
- Andre Kim
- Stephen Chanock
- Franzel van Duijnhoven
- Edith Feskens
- Lori Sakoda
- Manuela Gago-Dominguez
- Alicja Wolk
- Alessio Naccarati
- Barbara Pardini
- Liesel FitzGerald
- Soo Chin Lee
- Shuji Ogino
- Stephanie Bien
- Charles Kooperberg
- Christopher Li
- Yi Lin
- Ross Prentice
- Conghui Qu
- Stéphane Bézieau
- Catherine Tangen
- Elaine Mardis
- Taiki Yamaji
- Norie Sawada
- Motoki Iwasaki
- Christopher Haiman
- Loic Le Marchand
- Anna Wu
- Chenxu Qu
- Caroline McNeil
- Gerhard Coetzee
- Caroline Hayward
- Ian Deary
- Sarah Harris
- Evropi Theodoratou
- Stuart Reid
- Marion Walker
- Li Yin Ooi
- Victor Moreno
- Graham Casey
- Stephen Gruber
- Ian Tomlinson
- Wei Zheng
- Malcolm Dunlop1
- Richard Houlston
- Ulrike Peters
- 1. UoE
Description
Colorectal cancer (CRC) is a leading cause of mortality worldwide. We conducted a genome-wide association study meta-analysis of 100,204 CRC cases and 154,587 controls of European and East Asian ancestry, identifying 205 independent risk associations, of which 50 were unreported. We performed integrative genomic, transcriptomic and methylomic analyses across large bowel mucosa and other tissues. Transcriptome- and methylome-wide association studies revealed an additional 53 risk associations. We identified 155 high-confidence effector genes functionally linked to CRC risk, many of which had no previously established role in CRC. These genes have diverse functions and specifically indicate that variation in normal colorectal homeostasis, proliferation, cell adhesion, migration, immunity and microbial interactions determines CRC risk. Cross-tissue analyses indicated that over a third of effector genes most likely act outside the colonic mucosa. Our findings provide insights into colorectal oncogenesis and highlight potential targets across tissues for new CRC treatment and chemoprevention strategies.
The data submitted here are the expression and methylation models, with LD reference data, for the transcriptome-wide (TWAS), methylome-wide (MWAS) and transcript isoform-wide (TIsWAS) association studies described in the manuscript "Deciphering colorectal cancer genetics through multi-omic analysis of 100,204 cases and 154,587 controls of European and East Asian ancestries". The methods are detailed in the Methods section and the supplementary information file of the manuscript.
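As an illustration of why prediction models are distributed together with LD reference data, the sketch below combines per-SNP model weights, a SNP covariance matrix and GWAS z-scores into a gene-level z-score, following the summary-statistics formula popularised by S-PrediXcan. This is an assumption about one common way such files are used, not a statement of the exact software run in the manuscript; the function name `summary_twas_z` and all toy values are hypothetical.

```python
import numpy as np

def summary_twas_z(weights, snp_cov, gwas_z):
    """Gene-level z-score from model weights, LD covariance and GWAS z-scores.

    Z_gene = sum_l( w_l * sd_l * z_l ) / sqrt( w' Cov w )
    where sd_l is the reference standard deviation of SNP l.
    """
    snp_sd = np.sqrt(np.diag(snp_cov))        # per-SNP standard deviation
    gene_var = weights @ snp_cov @ weights    # variance of predicted trait
    return float(np.sum(weights * snp_sd * gwas_z) / np.sqrt(gene_var))

# Toy example (hypothetical values): three SNPs in one gene model.
weights = np.array([0.5, -0.2, 0.1])
snp_cov = np.array([[0.42, 0.10, 0.00],
                    [0.10, 0.32, 0.00],
                    [0.00, 0.00, 0.18]])
gwas_z = np.array([3.0, -1.0, 0.5])
gene_z = summary_twas_z(weights, snp_cov, gwas_z)

# With a single SNP the gene z-score reduces to the SNP z-score
# (sign flipped if the weight is negative).
single_pos = summary_twas_z(np.array([0.7]), np.array([[0.4]]), np.array([2.5]))
single_neg = summary_twas_z(np.array([-0.7]), np.array([[0.4]]), np.array([2.5]))
```

The covariance matrix enters twice: its diagonal scales each SNP's contribution, and the full matrix gives the variance of the predicted trait, which is why the reference LD has to match the SNPs in the model.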
TWAS analysis
Gene expression models for the six in-house expression datasets, comprising 1,077 participants in total, were generated using the PredictDB v7 pipeline. Elastic net models were built independently for each dataset with 10-fold cross-validation. The elastic net models for GTEx v8 Colon Transverse were obtained from the PredictDB data repository (http://predictdb.org/) and had been generated using the same pipeline. Models were computed using HapMap2 SNPs within ±1 Mb of each gene, together with covariate factors estimated using PEER (ref. 32), clinical covariates (age, sex and, where appropriate, case-control status, type of polyp and anatomic location in the colorectum), and three principal components (PCs) from each dataset’s SNP genotype data.
Transcript-based TWAS analyses (TIsWAS) were likewise performed using transcript-level data from the SOCCS, BarcUVa-Seq and GTEx Colon Transverse datasets.
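The elastic net model building with 10-fold cross-validation described above can be sketched as follows. This is a minimal NumPy illustration on simulated genotype dosages, not the PredictDB pipeline itself (which is R-based); the penalty strength `lam`, the mixing parameter `alpha = 0.5`, the function names and the simulated data are all assumptions made for the example.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink a scalar towards zero by `threshold` (the L1 proximal step)."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def fit_elastic_net(X, y, lam, alpha=0.5, n_iter=100):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (alpha*||b||_1 + (1-alpha)/2 * ||b||^2).
    X and y are assumed to be centred already."""
    n, p = X.shape
    beta = np.zeros(p)
    residual = y.astype(float).copy()
    col_scale = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            if col_scale[j] == 0.0:        # monomorphic SNP, skip
                continue
            rho = X[:, j] @ residual / n + col_scale[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_scale[j] + lam * (1 - alpha))
            residual += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta

def cross_validated_r2(X, y, lam, alpha=0.5, n_folds=10, seed=0):
    """Squared correlation between held-out predictions and observed values."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    predicted = np.empty(len(y), dtype=float)
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        mu_x, mu_y = X[train].mean(axis=0), y[train].mean()
        b = fit_elastic_net(X[train] - mu_x, y[train] - mu_y, lam, alpha)
        predicted[test_idx] = (X[test_idx] - mu_x) @ b + mu_y
    return np.corrcoef(predicted, y)[0, 1] ** 2

# Simulated cis-window: 200 samples, 30 SNP dosages, two causal SNPs.
rng = np.random.default_rng(1)
dosages = rng.binomial(2, 0.3, size=(200, 30)).astype(float)
expression = 0.8 * dosages[:, 3] - 0.5 * dosages[:, 10] + rng.normal(0, 0.5, 200)

cv_r2 = cross_validated_r2(dosages, expression, lam=0.05)
beta = fit_elastic_net(dosages - dosages.mean(axis=0),
                       expression - expression.mean(), lam=0.05)
```

In a real analysis the covariates listed above (PEER factors, clinical covariates and genotype PCs) would be regressed out of the expression values before fitting, and the cross-validated performance would be used to decide which gene models to retain.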
MWAS analysis
Methylation beta values, which range from 0 to 1, were calculated according to the manufacturer’s standard. Quality control and data normalization were performed in R using the ChAMP software pipeline for the EPIC and 450K arrays. Briefly, we filtered out failed probes with detection P > 0.02 in >5% of samples, probes with <3 beads in >5% of samples per probe and all non-CpG probes. Samples in which the proportion of failed probes was >0.1 were also excluded from downstream analyses. We discarded all probes with SNPs within 10 bp of the interrogated CpG (from the 1000 Genomes Project, CEU population) (ref. 34), and probes that ambiguously mapped to multiple locations in the human genome with up to two mismatches (ref. 33). We only considered probes mapping to autosomes and those overlapping between the EPIC and the 450K arrays. Normalization was achieved using the Beta MIxture Quantile (BMIQ) method. Per-probe methylation models were created using the PredictDB pipeline on the normalized methylation matrix and the same genotypes as in the TWAS eQTL analysis. To optimize power, we restricted our analysis to 263,341-238,443 (for the 450K array) and 377,678 (for the EPIC array) probes annotated to Islands, Shores and Shelves, and discarded “Open Sea” regions.
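The probe- and sample-level thresholds described above can be sketched as a pair of boolean masks. This is a hedged NumPy illustration, not the ChAMP implementation (which is an R package); the function name `qc_filter`, the order of operations (samples first, then probes), the "cg" prefix test for CpG probes and the toy matrices are assumptions made for the example.

```python
import numpy as np

def qc_filter(det_p, beads, probe_ids, det_p_cut=0.02, min_beads=3,
              probe_frac_cut=0.05, sample_frac_cut=0.1):
    """Return boolean masks (keep_probes, keep_samples).

    det_p and beads are probes x samples matrices of detection P values
    and bead counts; probe_ids holds one identifier per probe row.
    """
    failed = det_p > det_p_cut
    # Drop samples in which more than 10% of probes failed detection.
    keep_samples = failed.mean(axis=0) <= sample_frac_cut
    failed = failed[:, keep_samples]
    low_beads = beads[:, keep_samples] < min_beads
    # Assumption: CpG probes are recognised by the "cg" identifier prefix.
    is_cpg = np.array([pid.startswith("cg") for pid in probe_ids])
    keep_probes = (
        (failed.mean(axis=1) <= probe_frac_cut)       # detection P filter
        & (low_beads.mean(axis=1) <= probe_frac_cut)  # bead count filter
        & is_cpg                                      # non-CpG filter
    )
    return keep_probes, keep_samples

# Toy data: 20 probes x 40 samples, everything passing by default.
n_probes, n_samples = 20, 40
det_p = np.full((n_probes, n_samples), 0.001)
beads = np.full((n_probes, n_samples), 10)
probe_ids = [f"cg{i:08d}" for i in range(n_probes - 1)] + ["ch.1.123"]

det_p[0:5, 0] = 0.5     # sample 0: 25% of probes fail, so the sample is dropped
det_p[7, 1:6] = 0.5     # probe 7 fails detection in 5 samples
beads[8, 10:14] = 2     # probe 8 has <3 beads in 4 samples
det_p[9, 20] = 0.5      # probe 9 fails in one sample only, so it is kept

keep_probes, keep_samples = qc_filter(det_p, beads, probe_ids)
removed_probes = np.flatnonzero(~keep_probes).tolist()
```

Filtering samples before probes means that a single poor-quality sample does not cause otherwise good probes to be discarded, which is why probes 0 to 4 survive in this toy example.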
Files
covaraiance_matrix.zip
Additional details
Related works
- Is described by
- Journal article: 10.1038/s41588-022-01222-9 (DOI)
Funding
- SYSCOL – Systems Biology of Colorectal Cancer 258236
- European Commission
- EVOCAN – Why do cancers occur where they do? A genetic and evolutionary approach. 340560
- European Commission
- CRCINTERMPHEN – FUNCTIONAL CHARACTERISATION OF COLORECTAL CANCER PREDISPOSITION GENES AND DEVELOPMENT OF INTERMEDIATE BIOMARKERS OF DISEASE RISK 301077
- European Commission