Using family sequencing data to understand sequencing errors, meiotic crossovers, and disease risk

Abstract/Contents

Abstract
Despite widespread sequencing efforts, the genetic etiologies of many complex diseases remain poorly understood. In this work, we show that family-based linkage methods, when adapted to handle the large sample sizes and dense marker sets now available, provide an opportunity to explore the disease impact of variants in low linkage disequilibrium with their neighbors, such as rare variants. Linkage methods do not rely on linkage disequilibrium in a population; instead, they exploit genetic inheritance within families to identify risk regions. We have developed a series of methods that use large family-based sequencing cohorts to better understand sequencing error rates, meiotic crossovers, and disease risk. First, we show that familial relationships can be leveraged to produce sample-level estimates of sequencing error rates. Next, we develop a hidden Markov model that identifies meiotic crossovers, shared genetic material between siblings, and inherited deletions in families. Finally, we develop a genome-wide sibling-pair linkage test that leverages sibling identity by descent (IBD) to identify genomic regions harboring risk variants. This method not only increases detection power for rare risk variants, but also enables the use of microarrays, which are widely and affordably available in the consumer market. Applying our method to crowdsourced autism families who have taken Ancestry.com DNA tests, we identify two significant autism risk regions, which we validate with a separate, independent microarray dataset. We show how turning our attention to families provides the power to uncover key events in the genome that cannot be detected otherwise. This thesis provides a framework for extending family-based linkage analysis into the era of next-generation sequencing in order to increase our understanding of genetic risk factors for complex diseases.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2023; ©2023
Publication date 2023
Issuance monographic
Language English

Creators/Contributors

Author Paskov, Kelley Marie
Degree supervisor Wall, Dennis Paul
Thesis advisor Wall, Dennis Paul
Thesis advisor Hastie, Trevor
Thesis advisor Sabatti, Chiara
Degree committee member Hastie, Trevor
Degree committee member Sabatti, Chiara
Associated with Stanford University, School of Medicine
Associated with Stanford University, Department of Biomedical Informatics

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Kelley Paskov
Note Submitted to the Department of Biomedical Informatics
Thesis Thesis (Ph.D.), Stanford University, 2023
Location https://purl.stanford.edu/zd828ty8201

Access conditions

Copyright
© 2023 by Kelley Marie Paskov
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
