
Transcript expression-aware annotation improves rare variant interpretation

  • Genome Aggregation Database Production Team
  • Genome Aggregation Database Consortium
  • The Broad Institute of MIT and Harvard
  • Massachusetts General Hospital
  • Harvard Medical School
  • University Hospital Southampton NHS Foundation Trust
  • European Bioinformatics Institute
  • Boston Children's Hospital
  • Wellcome Sanger Institute
  • National Heart and Lung Institute
  • Royal Brompton and Harefield NHS Trust
  • Inst. Nac. de Cie. Med./Nutr. S. Z.
  • Peninsula College of Medicine and Dentistry University of Exeter
  • Brigham and Women's Hospital
  • University Hospital of Parma
  • University of Haifa
  • Albert Einstein College of Medicine of Yeshiva University
  • Cleveland Clinic Foundation
  • UPMC Univ Paris
  • Framingham Heart Study
  • Boston University School of Medicine
  • Boston University School of Public Health
  • University of Michigan School of Public Health
  • National Human Genome Research Institute (NHGRI)
  • Mount Sinai School of Medicine
  • Wake Forest School of Medicine
  • University of Leicester
  • Glenfield Hospital
  • Imperial College London
  • Ealing Hospital NHS Trust
  • Imperial College Healthcare NHS Trust
  • Chinese University of Hong Kong
  • McLean Hospital
  • University of Mississippi Medical Center
  • Colorado School of Public Health
  • UIC ECE-CSN-Lab
  • Texas Biomedical Research Institute
  • Hospital Del Mar-Instituto Municipal de Asistencia Sanitaria (IMAS)
  • Centro de Investigación Biomédica en Red de Enfermedades Cardiovasculares (CIBERCV)
  • University of Vic-Central University of Catalonia
  • University of Lübeck
  • Partner Site Munich Heart Alliance
  • Universitätsklinikum Schleswig-Holstein Campus Lübeck
  • University of Tartu
  • Helsinki University Central Hospital
  • Christian-Albrechts-University of Kiel
  • Hadassah Hebrew University Medical Center
  • SUNY Upstate Medical University
  • Columbia University Irving Medical Center
  • Instituto Nacional de Salud Publica
  • Lund University
  • Lund University Diabetes Centre
  • The University of Texas Health Science Center at Houston
  • Columbia University
  • University of Kuopio
  • Karolinska Institutet
  • University of Helsinki
  • Korea National Institute of Health
  • Cardiff University School of Medicine
  • National Institute for Health and Welfare
  • Yale University Medical School
  • Emory University School of Medicine
  • Seoul National University Hospital
  • Kuopion Yliopistollinen sairaala
  • Tampere University
  • University of New South Wales
  • Murdoch Children’s Research Institute and University of Melbourne Department of Paediatrics
  • University of Oxford Medical Sciences Division
  • University of Oxford
  • Oxford University Hospitals NHS Foundation Trust
  • Cedars-Sinai Medical Center
  • University of Ottawa Heart Institute
  • University Hospital Malmö
  • Instituto Nacional de Medicina Genómica
  • Ninewells Hospital and Medical School
  • Graduate School of Convergence Science and Technology
  • Keck School of Medicine of USC
  • Johns Hopkins School of Medicine
  • Institute of Cancer Research
  • University of Oulu
  • Université de Montréal, Faculté de Médecine
  • Vanderbilt University Medical Center
  • The University of Pennsylvania
  • University of Pennsylvania
  • Center for Non-Communicable Diseases
  • Vanderbilt School of Medicine
  • King's College London
  • University of North Carolina
  • National University of Singapore
  • Department of Medicine
  • Duke-NUS Medical School
  • Folkhälsan Institute of Genetics
  • Department of Psychiatry
  • University of California
  • The Hebrew University of Jerusalem
  • UNAM
  • University Medical Center Groningen

Research output: Contribution to journal › Article › peer-review

139 Scopus citations

Abstract

The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD) [1], we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the ‘proportion expressed across transcripts’, which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype Tissue Expression (GTEx) project [2] and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have similar effect sizes to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases. Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies.
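
As a rough illustration of the ‘proportion expressed across transcripts’ idea described in the abstract, the sketch below computes, for each tissue, the share of a gene's total transcript expression that comes from the transcripts on which a variant carries a given annotation, and then averages that share across tissues. The abstract does not give the formula, so this reading is an assumption, as are the function names (`pext_per_tissue`, `mean_pext`), the input layout, and the toy expression values; it is not the authors' implementation.

```python
from statistics import mean


def pext_per_tissue(transcript_tpm, annotated_transcripts):
    """Share of gene expression carried by the annotated transcripts, per tissue.

    transcript_tpm: {tissue: {transcript_id: expression}} for ONE gene (hypothetical layout).
    annotated_transcripts: transcript IDs on which the variant has the annotation
    of interest (for example, pLoF).
    """
    result = {}
    for tissue, tpm in transcript_tpm.items():
        gene_total = sum(tpm.values())
        if gene_total == 0:
            # Gene not expressed in this tissue: the proportion is undefined, so skip it.
            continue
        annotated = sum(v for t, v in tpm.items() if t in annotated_transcripts)
        result[tissue] = annotated / gene_total
    return result


def mean_pext(per_tissue):
    """Average the per-tissue proportions; None if the gene is expressed nowhere."""
    return mean(per_tissue.values()) if per_tissue else None


# Toy data (invented): a gene with two isoforms measured in three tissues.
toy_expression = {
    "brain": {"T1": 8.0, "T2": 2.0},
    "liver": {"T1": 1.0, "T2": 9.0},
    "blood": {"T1": 0.0, "T2": 0.0},
}

# Suppose the variant is annotated as pLoF only on isoform T2.
per_tissue = pext_per_tissue(toy_expression, {"T2"})
summary = mean_pext(per_tissue)
```

In this toy case the variant falls on an isoform that contributes little of the gene's expression in brain but most of it in liver, which is the kind of tissue-dependent signal an expression-aware filter could use. Any cutoff for calling a region "weakly expressed" would be a further assumption and is left out here.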

Original language: English
Pages (from-to): 452-458
Number of pages: 7
Journal: Nature
Volume: 581
Issue number: 7809
DOIs
State: Published - 28 May 2020

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being
