Computational recognition of protein-coding genes using multiple genomic alignments

Abstract/Contents

Abstract: In this thesis, I describe three main contributions I have made toward creating more accurate systems for the computational recognition of protein-coding genes. First, I present N-SCAN, a gene predictor based on a hidden Markov model that uses Bayesian networks to model multiple alignments. I also describe CONTRAST, a discriminative gene predictor based on a conditional random field and a set of support vector machines for recognizing coding region boundaries. Both N-SCAN and CONTRAST represented substantial improvements over the state-of-the-art at the time they were introduced. Additionally, I give an algorithm for training conditional random fields that maximizes an approximation to labelwise accuracy, as opposed to the usual maximum likelihood approach. This algorithm proved key to CONTRAST's success.

Type of resource	text
Form	electronic; electronic resource; remote
Extent	1 online resource.
Publication date	2010
Issuance	monographic
Language	English

Associated with	Gross, Samuel Solomon
Associated with	Stanford University, Computer Science Department
Primary advisor	Batzoglou, Serafim
Thesis advisor	Batzoglou, Serafim
Thesis advisor	Ng, Andrew Y, 1976-
Thesis advisor	Sidow, Arend
Advisor	Ng, Andrew Y, 1976-
Advisor	Sidow, Arend

Genre	Theses

Statement of responsibility	Samuel Gross.
Note	Submitted to the Department of Computer Science.
Thesis	Ph. D. Stanford University 2010
Location	electronic resource

License: This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
