Publication | Open Access
The human splicing code reveals new insights into the genetic determinants of disease
Year: 2014. Citations: 1.3K. References: 70.
Genetics, RNA Splicing, Genetic Epidemiology, Pathology, Machine-learning Technique, Molecular Genetics, New Insights, Genomics, Disease Gene Identification, Splicing Variant, Genetic Analysis, Transcriptional Regulation, Human Splicing Code, Public Health, Molecular Diagnostics, Splice Site Alter, Variant Interpretation, Personal Genomics, Bioinformatics, Functional Genomics, Genetic Determinants, Exonic Variants, Complex Disease, Systems Biology, Medicine
The authors developed a machine-learning method that scores how strongly genetic variants affect RNA splicing, with the aim of supporting precision medicine and whole-genome annotation. Analysis of more than 650,000 intronic and exonic variants showed that intronic disease mutations more than 30 nucleotides from any splice site alter splicing nine times as often as common variants, and that the missense exonic disease mutations with the least impact on protein function are five times as likely as others to alter splicing. The method detected tens of thousands of disease-causing mutations, including those involved in cancers and spinal muscular atrophy. Applied to intronic and exonic variants found by whole-genome sequencing of individuals with autism, it revealed misspliced genes with neurodevelopmental phenotypes, providing evidence that the approach can help identify causal variants.
To facilitate precision medicine and whole-genome annotation, we developed a machine-learning technique that scores how strongly genetic variants affect RNA splicing, whose alteration contributes to many diseases. Analysis of more than 650,000 intronic and exonic variants revealed widespread patterns of mutation-driven aberrant splicing. Intronic disease mutations that are more than 30 nucleotides from any splice site alter splicing nine times as often as common variants, and missense exonic disease mutations that have the least impact on protein function are five times as likely as others to alter splicing. We detected tens of thousands of disease-causing mutations, including those involved in cancers and spinal muscular atrophy. Examination of intronic and exonic variants found using whole-genome sequencing of individuals with autism revealed misspliced genes with neurodevelopmental phenotypes. Our approach provides evidence for causal variants and should enable new discoveries in precision medicine.