Analysis and application of linkage disequilibrium in population and statistical genetics

Abstract/Contents

Abstract
Linkage disequilibrium (LD) is the non-random association of alleles at different genetic loci. This dissertation consists of three projects on the analysis and application of LD in population and statistical genetics. Various measures of LD have been proposed in the literature, each with different arguments favoring its use. Chapter 2 takes a theoretical approach to examine the mathematical properties of five measures of LD. These results help place the use of various LD statistics into their proper contexts and provide a mathematical basis for comparing their values. Next, the presence of LD in genomes can be leveraged for a number of applications in statistical genetics. Chapter 3 examines one such application in genetic imputation. Specifically, we ask how to optimally select a subset of a study sample for sequencing as an internal reference panel for imputation, in order to maximize the eventual imputation accuracy. We compare two algorithms, maximizing phylogenetic diversity (PD) and minimizing average distance to the closest leaf (ADCL), and conclude that while both give better imputation results than randomly selecting haplotypes for the reference panel, imputation accuracy is highest when the panel is selected by minimizing ADCL. Finally, LD in genomes can produce genetic signatures suggestive of certain demographic processes. Genetic linkage preserves homozygous segments in the genome that arise from genomic sharing, which can then be detected as runs of homozygosity (ROH). Chapter 4 analyzes the distribution of ROH lengths in a sample of worldwide Jewish and non-Jewish populations, and employs a model-based clustering method to classify the ROH in a given population into three classes (short, intermediate, and long) based on length.
Furthermore, for a subset of the Jewish populations in this study, we were able to obtain estimates of demographic rates of consanguinity (as indicated by the rates of close-relative unions). We find that the level of consanguinity in those populations is predictive of long ROH, thus finding genetic signatures of mating patterns that existed in a population's history. Making use of theoretical, computational, and statistical approaches, these chapters together provide a wide-ranging account of different aspects of LD, as related to their respective applications within the field.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2018; ©2018
Publication date 2018; 2018
Issuance monographic
Language English

Creators/Contributors

Author Kang, Teng Leng Jonathan
Degree supervisor Rosenberg, Noah
Thesis advisor Rosenberg, Noah
Thesis advisor Feldman, Marcus W
Thesis advisor Tang, Hua
Degree committee member Feldman, Marcus W
Degree committee member Tang, Hua
Associated with Stanford University, Department of Biology.

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Teng Leng Jonathan Kang.
Note Submitted to the Department of Biology.
Thesis Thesis Ph.D. Stanford University 2018.
Location electronic resource

Access conditions

Copyright
© 2018 by Teng Leng Jonathan Kang
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
