Exploring the genomic landscape of human disease


Abstract/Contents

Abstract
The advent of high-throughput DNA sequencing has given us an unprecedented ability to read the 3 billion As, Cs, Gs, and Ts that make up our genome. This has led to the discovery of thousands of mutations that are associated with different human diseases. The vast majority of these associations, however, remain exactly that: a single change in our DNA that is statistically associated with a disease. This dissertation describes three studies that aim to understand transcriptional regulatory logic and leverage this understanding to place human mutations in the specific biological pathways and developmental contexts that give rise to different diseases. In the first study, I investigated the evolution of gene regulation in the developing neocortex driven by transposable elements being co-opted into developmental enhancers. In the second study, I examined genes that were mutated in autism spectrum disorders, identified a single protein that regulates the majority of them during a specific period of brain development, and exploited this property to confirm known autism genes and prioritize others for further study. Finally, in the third study, I developed a high-specificity method to map non-protein-coding disease mutations from humans to homologous regions in the DNA of other species, where CRISPR/Cas9 genome editing was used to re-create the human disease so the effects of the mutation could be studied.
The neocortex is a mammalian-specific structure that is responsible for higher functions such as cognition, emotion, and perception. To gain insight into its evolution and the gene regulatory codes that pattern it, in Chapter 2, I studied the overlap of its active developmental enhancers with transposable element families and compared this overlap to uniformly shuffled enhancers. I show a striking enrichment of the MER130 repeat family among active enhancers in the mouse dorsal cerebral wall, which gives rise to the neocortex, at embryonic day 14.5 (E14.5). The MER130 instances preserve a common code of transcriptional regulatory logic, function as enhancers, and are adjacent to critical neocortical genes. MER130, a non-autonomous interspersed transposable element, originates in the tetrapod or possibly Sarcopterygii ancestor, which far predates the appearance of the neocortex. These results show that MER130 elements were recruited, likely through their common regulatory logic, as neocortical enhancers.
In Chapter 3, I discuss genes harboring de novo loss-of-function (LoF) variants in individuals with autism spectrum disorders (ASD) that have been identified through exome sequencing studies. Among these is TBR1, a master regulator of cortical development. ChIP-seq for TBR1 was performed during mouse cortical neurogenesis, and TBR1-bound regions were enriched adjacent to ASD genes. ASD genes were also enriched among genes that are differentially expressed in Tbr1 knockouts, which, together with the ChIP-seq data, suggests direct transcriptional regulation. Of the 9 ASD genes examined, 7 were mis-expressed in the cortices of Tbr1 knockout mice, including 6 with increased expression in the deep cortical layers. ASD genes with adjacent cortical TBR1 ChIP-seq peaks also showed unusually low levels of LoF mutations in a reference human population and among Icelanders. I then leveraged TBR1 binding to identify an appealing subset of candidate ASD genes. My findings highlight a TBR1-regulated network of ASD genes in the developing neocortex that are relatively intolerant to LoF mutations, indicating that these genes may play critical roles in normal cortical development.
Finally, in Chapter 4, I explored the more than 90% of human disease-associated mutations that lie in non-genic regions of the genome, presenting a challenge for their interpretation. I developed a methodology to identify GWAS single nucleotide polymorphisms (SNPs) embedded in deeply conserved non-exonic DNA elements (CNEs) preserved syntenically to powerful model organisms. Applying this method to zebrafish, I found 22 CNE/SNP pairs covering a wide range of human diseases, and 5/8 were validated for enhancer activity. Strikingly, in 3/3 cases, introducing the human risk allele completely abolishes enhancer activity. rs17421627, embedded in CNE1 and associated with retinal vasculature defects, was selected for careful scrutiny. Zebrafish CNE1:EGFP transgenics revealed human CNE1 enhancer activity in the retina. CRISPR/Cas9 deletion of CNE1 in the zebrafish genome led to defects in blood vessel development in the retina, suggesting that CNE1 is functionally associated with retinal vascular caliber in humans. Introducing the 1 base pair (bp) risk allele in the ~1000 bp CNE1 abolished EGFP expression, indicating that SNP rs17421627 is likely the causal mutation responsible for the human phenotype. Since its discovery, rs17421627 was thought to regulate MEF2C. Here we show that CNE1/rs17421627 regulates the neurogenesis modulator, microRNA-9. Consistent with the GWAS results, miR-9 depletion leads to retinal vasculature defects, demonstrating that miR-9 is the conserved gene functionally associated with retinal blood vessel development. This study validates the in silico approach to identify conserved enhancers containing GWAS SNPs and demonstrates how the combination of human and zebrafish genetics reveals enhancer function, the regulatory activity of the SNP, the cis-regulated gene, and the biological processes disrupted in the associated disorders. Similar screens can be performed in valuable model organisms like Medaka and Xenopus to provide a growing trove of cis-targets for in vivo studies.
Together, these studies demonstrate how genomics can extend our understanding of different diseases by offering strategies for combining human genetics and developmental biology to reveal their precise mechanisms of action.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2016
Issuance monographic
Language English

Creators/Contributors

Associated with Notwell, James Henry
Associated with Stanford University, Department of Computer Science.
Primary advisor Bejerano, Gill, 1970-
Thesis advisor Bejerano, Gill, 1970-
Thesis advisor Dill, David L
Thesis advisor McConnell, Susan K
Thesis advisor Mourrain, Philippe
Advisor Dill, David L
Advisor McConnell, Susan K
Advisor Mourrain, Philippe

Subjects

Genre Theses

Bibliographic information

Statement of responsibility James Henry Notwell.
Note Submitted to the Department of Computer Science.
Thesis Thesis (Ph.D.)--Stanford University, 2016.
Location electronic resource

Access conditions

Copyright
© 2016 by James Henry Notwell
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
