Computational recognition of protein-coding genes using multiple genomic alignments


Abstract/Contents

Abstract
In this thesis, I describe three main contributions I have made toward creating more accurate systems for the computational recognition of protein-coding genes. First, I present N-SCAN, a gene predictor based on a hidden Markov model that uses Bayesian networks to model multiple alignments. I also describe CONTRAST, a discriminative gene predictor based on a conditional random field and a set of support vector machines for recognizing coding region boundaries. Both N-SCAN and CONTRAST represented substantial improvements over the state of the art at the time they were introduced. Additionally, I give an algorithm for training conditional random fields that maximizes an approximation to labelwise accuracy, as opposed to the usual maximum likelihood approach. This algorithm proved key to CONTRAST's success.

Description

Type of resource: text
Form: electronic; electronic resource; remote
Extent: 1 online resource.
Publication date: 2010
Issuance: monographic
Language: English

Creators/Contributors

Associated with: Gross, Samuel Solomon
Associated with: Stanford University, Computer Science Department
Primary advisor: Batzoglou, Serafim
Thesis advisor: Batzoglou, Serafim
Thesis advisor: Ng, Andrew Y, 1976-
Thesis advisor: Sidow, Arend
Advisor: Ng, Andrew Y, 1976-
Advisor: Sidow, Arend

Subjects

Genre: Theses

Bibliographic information

Statement of responsibility: Samuel Gross.
Note: Submitted to the Department of Computer Science.
Thesis: Ph.D., Stanford University, 2010
Location: electronic resource

Access conditions

Copyright
© 2010 by Samuel Solomon Gross
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
