Learning human history from sequenced Y chromosomes

Poznik, G. David; Stanford University, Program in Biomedical Informatics.

Abstract/Contents

Abstract: The Y chromosome harbors the longest stretch of non-recombining DNA in the human genome--by orders of magnitude. Consequently, the molecule a man transmits to his son bears a record of mutations that occurred in their paternal-line ancestors. We can therefore utilize variation within a sample to build a detailed phylogenetic tree that is a rich source of information about historical migrations and extant population structure. Two decades of scholarship have revealed many aspects of the tree's topology and the geographic distribution of its clades, but little was known about its branch lengths. The advent of high-throughput sequencing unlocked the chromosome's full potential as an evolutionary marker. With full sequences, we could, in principle, discover variants free of ascertainment bias and with great sensitivity, but the Y chromosome's uniquely complex structure presented challenges for short-read sequencing analysis. In this dissertation, I describe my work developing methods to analyze, interpret, and extract insight from Y-chromosome sequences. Upon segmenting the chromosome to delineate regions amenable to short-read sequencing and developing a pipeline to reliably call genotypes, I characterized the full structure of the tree and used its branch lengths to estimate split times. I first applied these methods to a collection of 69 individuals sampled from nine globally diverse populations and to a study of the phylogenetic and geographic structure of a common yet poorly characterized clade. Second, I extended these methods to gain insight from ancient-DNA (aDNA) specimens. To estimate a split time related to the initial colonization of the Americas, I utilized missing evolution on the lineage of a 12,600-year-old individual buried in direct association with Clovis tools, implementing a Poisson process model for mutations on the tree. 
I also analyzed the Y-chromosome sequence of Kennewick Man, a 9,000-year-old individual whose population affinities had been the subject of scientific debate and legal controversy. Finally, I used Y-chromosome sequence data to help identify the likely origin of a 17th-century enslaved African whose remains were excavated in the Caribbean, and I developed a method to leverage the Y-chromosome phylogeny to estimate the genotyping error rate. I then scaled the methods I had developed in order to apply them to two large-scale sequencing projects. In the third part of this dissertation, I detail my analysis for the Y-chromosome subgroup of the 1000 Genomes Project, whose sample includes 1,244 males from 26 populations. To conclude, I discuss an ongoing effort to investigate the population history of Africa by capturing and sequencing the Y chromosomes of several hundred Africans sampled from diverse populations across the continent.
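Illustrative sketch (not part of the dissertation): the abstract mentions using "missing evolution" on an ancient lineage together with a Poisson process model for mutations to estimate a split time. The Python below is a minimal sketch of that general idea under simplifying assumptions: a constant mutation rate, independent mutations, and exact branch counts. The function names and all numbers other than the 12,600-year sample age are hypothetical and are not taken from the dissertation.

```python
import math


def rate_from_ancient_sample(modern_count, ancient_count, sample_age_years, n_sites):
    """Estimate a per-site, per-year mutation rate from 'missing' mutations.

    Both counts are measured from the same ancestral node. The ancient
    lineage stopped accumulating mutations sample_age_years ago, so the
    difference in counts is attributed to that time span.
    """
    missing = modern_count - ancient_count
    return missing / (sample_age_years * n_sites)


def poisson_log_likelihood(n_mutations, years, rate, n_sites):
    """Log-likelihood of a mutation count under a Poisson process."""
    expected = rate * n_sites * years
    return n_mutations * math.log(expected) - expected - math.lgamma(n_mutations + 1)


def split_time_mle(n_mutations, rate, n_sites):
    """Maximum-likelihood elapsed time for a Poisson count: t = k / (rate * sites)."""
    return n_mutations / (rate * n_sites)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
N_SITES = 1_000_000          # assumed number of callable sites
SAMPLE_AGE = 12_600          # age of the ancient individual, from the abstract
rate = rate_from_ancient_sample(110, 100, SAMPLE_AGE, N_SITES)
t_hat = split_time_mle(100, rate, N_SITES)
print(f"rate = {rate:.3e} per site per year; split time = {t_hat:,.0f} years")
```

The closed-form estimate is the value of the elapsed time that maximizes the Poisson log-likelihood, which is why no numerical optimizer is needed in this simplified setting. A real analysis would also have to account for genotyping error, uncertainty in the sample age, and variation in coverage across sites.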

Description

Type of resource	text
Form	electronic; electronic resource; remote
Extent	1 online resource.
Publication date	2015
Issuance	monographic
Language	English

Creators/Contributors

Associated with	Poznik, G. David
Associated with	Stanford University, Program in Biomedical Informatics.
Primary advisor	Bustamante, Carlos
Thesis advisor	Bustamante, Carlos
Thesis advisor	Pritchard, Jonathan D
Thesis advisor	Tang, Hua
Advisor	Pritchard, Jonathan D
Advisor	Tang, Hua

Subjects

Genre	Theses

Bibliographic information

Statement of responsibility	G. David Poznik.
Note	Submitted to the Program in Biomedical Informatics.
Thesis	Thesis (Ph.D.)--Stanford University, 2015.
Location	electronic resource

Access conditions

License: This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
