Unraveling the genetics of disease using structured probabilistic models

Battle, Alexis Jane; Stanford University, Computer Science Department.

Abstract/Contents

Abstract: Recent technological advances have allowed us to collect genomic data on an unprecedented scale, with the promise of revealing genetic variants, genes, and pathways disrupted in clinically relevant human traits. However, identifying functional variants and ultimately unraveling the genetics of complex disease from such data have presented significant challenges. With millions of genetic factors to consider, spurious associations and lack of statistical power are major hurdles. Further, we cannot easily assess functional roles even for known trait-associated variants, particularly for those that lie outside of protein-coding regions of the genome. To address these challenges in identifying the genetic factors underlying complex traits, we have developed probabilistic machine learning methods that leverage biological structure and prior knowledge. In this thesis, we describe four applications of such models. First, we present a method for reconstructing causal gene networks from interventional genetic interaction data in model organisms. Here, we are able to identify intricate functional dependencies among hundreds of genes affecting a complex trait. We have applied this method to understanding the genetics of protein folding in yeast, where we demonstrate the ability to recapitulate the details, including ordering, of known pathways, and to make novel functional predictions. Second, we present PriorNet, a method for incorporating gene network and pathway information into the analysis of population-level studies of genetic variation in human disease. PriorNet utilizes a flexible Markov Random Field prior to propagate information between functionally related genes and related diseases, in order to improve statistical power in large-scale disease studies. We demonstrate a significant improvement in the discovery of disease-relevant genes in studies of three autoimmune diseases.
Next, we extend the intuitions of PriorNet in a method for identifying interactions between genetic variants in human disease, to begin to understand how genes work together in complex disease processes. Our method, GAIT, leverages gene networks, network structure, and other patterns to adaptively prioritize candidate interactions for testing, and dramatically reduce the burden of multiple hypothesis correction to identify a large number of interactions in diverse human disease studies. Finally, we discuss the identification of functional variants on a large scale through the use of gene expression as a high-resolution cellular phenotype. We have sequenced RNA from 922 genotyped individuals to provide a direct window into the distribution, properties, and consequences of thousands of regulatory variants affecting diverse gene expression traits including splicing and allelic expression. From the identified variants, we also train a model, LRVM, for predicting regulatory consequences based on location and genomic properties of each variant.

Description

Type of resource	text
Form	electronic; electronic resource; remote
Extent	1 online resource.
Publication date	2013
Issuance	monographic
Language	English

Creators/Contributors

Associated with	Battle, Alexis Jane
Associated with	Stanford University, Computer Science Department.
Primary advisor	Koller, Daphne
Thesis advisor	Koller, Daphne
Thesis advisor	Batzoglou, Serafim
Thesis advisor	Levinson, Douglas

Subjects

Genre	Theses

Bibliographic information

Statement of responsibility	Alexis Jane Battle.
Note	Submitted to the Department of Computer Science.
Thesis	Thesis (Ph.D.)--Stanford University, 2013.
Location	electronic resource

Access conditions

License: This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
