Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning

Abstract/Contents

Abstract
Single-cell RNA-seq technologies enable high-throughput gene expression measurement of individual cells and allow the discovery of heterogeneity within cell populations. Measuring cell-to-cell gene expression similarity is critical to the identification, visualization and analysis of cell populations. However, single-cell data pose challenges for conventional measures of gene expression similarity because of their high levels of noise, outliers and dropouts. Here, we propose a novel similarity-learning framework, SIMLR (single-cell interpretation via multiple kernel learning), which learns an appropriate distance metric from the data for dimension reduction, clustering and visualization applications. Benchmarking against state-of-the-art methods for these applications, we used SIMLR to re-analyse seven representative single-cell data sets, including high-throughput droplet-based data sets with tens of thousands of cells. We show that SIMLR greatly improves clustering sensitivity and accuracy, as well as the visualization and interpretability of the data.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2017
Issuance monographic
Language English

Creators/Contributors

Author Wang, Bo, (Artificial intelligence scientist)
Primary advisor Batzoglou, Serafim
Thesis advisor Batzoglou, Serafim
Thesis advisor Curtis, Christina
Thesis advisor Kundaje, Anshul, 1980-
Associated with Stanford University, Computer Science Department.

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Bo Wang.
Note Submitted to the Department of Computer Science.
Thesis Thesis (Ph.D.)--Stanford University, 2017.
Location electronic resource

Access conditions

Copyright
© 2017 by Bo Wang
