Algorithms for analyzing third generation sequencing data


Abstract/Contents

Abstract
Sequencing technologies have evolved since the first human genome was released in 2003. First generation sequencing emerged in the mid-1970s, generating reads with high accuracy. In the mid-2000s, second generation sequencing became commercially available, offering higher throughput and lower cost per base. In recent years, third generation sequencing technologies have been able to generate reads tens of kilobases long, though with relatively lower accuracy. These distinctive characteristics of third generation sequencing data, longer reads and higher error rates, have required the development of new algorithms and tools to process the data efficiently. In this thesis, we present two methods for analyzing third generation sequencing reads. The first method (COSINE) is a conceptually novel technique for mapping long DNA sequences with high error rates. As a proof of concept, COSINE is applied to both simulated and real datasets, where it achieves high sensitivity and specificity across a wide range of read accuracies with minimal tuning. The second method (IDP-fusion) is a new approach to accurately characterizing fusion genes using hybrid RNA sequencing. As a proof of concept, IDP-fusion is applied to real PacBio and Illumina datasets from the MCF-7 cell line, where it achieves higher sensitivity and specificity than existing tools. The results also show that IDP-fusion can resolve multiple fusion splices and fusion isoforms within tumorigenesis-relevant fusion genes.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2016
Issuance monographic
Language English

Creators/Contributors

Associated with Tootoonchi Afshar, Pegah
Associated with Stanford University, Department of Electrical Engineering.
Primary advisor Wong, Wing Hung
Thesis advisor Wong, Wing Hung
Thesis advisor Tse, David
Thesis advisor Weissman, Tsachy

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Pegah Tootoonchi Afshar.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis (Ph.D.)--Stanford University, 2016.
Location electronic resource

Access conditions

Copyright
© 2016 by Pegah Tootoonchi Afshar
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported license (CC BY-NC-SA 3.0).

