Genome level analysis of protein evolution in Drosophila

Abstract/Contents

Abstract
Recent advances in technology have unleashed a breathtaking amount of genome sequence data. My doctoral research uses whole-genome sequence data and analysis to gain new insights into molecular evolution, in particular the molecular evolution and adaptation of proteins. Chapter One of my dissertation focuses on methodology, specifically the impact that the choice of multiple sequence alignment procedure has on the inference of positive selection. I find that selection inference is highly dependent on the choice of alignment procedure. Furthermore, for proteins from the 12 Drosophila genomes and commonly used alignment programs, most of the inferences are false positives caused by misaligned codons. These results call into question the reliability of some previously reported conclusions on adaptation in these proteins, as well as in similar studies of other species. The second chapter relates these alignment errors to specific regions in the proteins, the so-called intrinsically disordered protein regions. Disordered protein regions do not have a stable fold under native physiological conditions, and therefore do not fit the standard assumptions about structure and folding constraints made in codon-based evolutionary analyses. My findings indicate that most alignment errors and ambiguities originate in disordered regions, both at the sites falsely inferred to be under positive selection and throughout the alignments in general. Indels in the alignments also appear predominantly in those regions. Examination of widely used alignment benchmarks indicates that disordered regions have mostly been excluded from them. I discuss the implications of these findings for both evolutionary research and the development of alignment software. The last chapter of my dissertation characterizes the evolution of disordered regions in Drosophila. I survey protein substitution and polymorphism patterns in D. melanogaster and D. simulans, and find that, relative to structured regions, disordered regions have multiple-fold higher rates of replacement polymorphisms and substitutions, as well as fewer rare polymorphisms. Structured regions appear to be significantly more affected by both positive and purifying selection, contrary to indications from previous reports in which within-species data were not incorporated in the analysis.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2011
Issuance monographic
Language English

Creators/Contributors

Associated with Markova-Raina, Penka Vassileva
Associated with Stanford University, Department of Biology.
Primary advisor Petrov, Dmitri Alex, 1969-
Thesis advisor Petrov, Dmitri Alex, 1969-
Thesis advisor Batzoglou, Serafim
Thesis advisor Feldman, Marcus W

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Penka Vassileva Markova-Raina.
Note Submitted to the Department of Biology.
Thesis Ph.D. Stanford University 2011
Location electronic resource

Access conditions

Copyright
© 2011 by Penka Vassileva Markova-Raina
