Analysis of next generation sequencing data in two complementary UK populations with congenital heart disease

Student thesis: PhD

Abstract

Background: The genetic aetiology of congenital heart disease (CHD) is poorly understood. A large number of genes and copy number variants that cause syndromic and non-syndromic CHD have already been identified, but the majority of cases remain unexplained. The 100,000 Genomes Project (100KGP) conducted whole genome sequencing for UK patients with rare disease or cancer, including a number with CHD. This project aimed: (1) to assess the contribution of variation in known CHD-associated genomic elements to CHD aetiology in both the 100KGP and an in-house CHD cohort; and (2) to identify new CHD candidate genes which have a high number of deleterious variants in these cohorts.
Methods: Clinical and phenotypic data from the 100KGP were interrogated to identify a cohort of participants with CHD. Whole genome sequencing data from 2638 individuals with CHD in this cohort, and whole exome sequencing data from 1444 individuals in an in-house CHD cohort, were analysed to identify deleterious variants using annotation and prediction resources including VEP, CADD and gnomAD.
Results: 2638 individuals with CHD were identified from the 100KGP, of whom 88% have "CHD-plus", defined here as CHD accompanied by extra-cardiac abnormalities and/or neurodevelopmental delay. Fewer than 6% of this cohort have either a deleterious variant in a known non-syndromic CHD gene (rated "green" in the Genomics England PanelApp, and routinely screened for reportable variants during the 100KGP) or a known CHD-causing copy number variant. In comparison, 18% of 100KGP CHD cases have a deleterious variant in a gene which causes a recognised genetic syndrome associated with CHD. Interrogation of de novo variants in sporadic CHD cases identified a significant overrepresentation of genes associated with syndromic CHD, alongside novel CHD candidate genes including HNRNPU and AMOTL1. Case-control analysis identified a significantly higher burden of deleterious variants in CHD cases than in controls in ABL1 and RHOT1.
CHD cases in the 100KGP also have a significantly higher burden of uncommon genic deletions and duplications than controls, in particular of large deletions, which are the most likely to be pathogenic. The in-house CHD cohort consists primarily of isolated CHD cases and has a similar proportion of cases with deleterious variants in PanelApp green genes to the 100KGP cohort, but a lower proportion have potentially causative variants in syndromic CHD genes. Clustering analysis of the in-house cohort reveals five genes with a significantly higher burden of deleterious variants than expected: NOTCH1, FLT4, TTN, MYH8 and CEL.
Conclusions: Genetic syndromes are under-recognised among patients with CHD-plus, and the diagnostic yield of the 100KGP could be significantly increased by widening the list of genes screened to include syndromic CHD genes. This study has also identified a number of novel CHD candidate genes, as well as a significant burden of deleterious copy number variants, which may help explain some of the "missing heritability" of CHD.
Date of Award: 1 Aug 2022
Original language: English
Awarding Institution
  • The University of Manchester
Supervisors: Andrew Sharrocks, Bernard Keavney & Kathryn Hentges

Keywords

  • whole exome sequencing
  • congenital heart disease
  • next generation sequencing
  • 100,000 genomes project
  • cardiovascular
  • whole genome sequencing
  • genetics
