Data-driven identification of stratifying cellular signatures in high-dimensional cytometry data


Abstract/Contents

Abstract
Recent work in the fields of cancer research, stem cell biology, and immunology has highlighted the role of specific cellular subsets in disease development and progression. However, many high-throughput assays (e.g., microarrays) used to study diseases of the cell quantify molecular targets in bulk tissue samples, thereby obfuscating signal from functionally distinct cellular subsets. Single-cell measurement technologies such as flow cytometry quantify levels of molecular targets and other cellular properties in single cells, thereby enabling the identification, isolation, and investigation of specific cellular subpopulations. Currently, fluorescence-based flow cytometers permit the concurrent measurement of only 12 to 17 molecular targets per cell, which greatly restricts flow cytometry's utility in screening and target discovery studies. Recent instrumentation advancements, specifically mass cytometry, have increased the number of parameters that are simultaneously measurable using flow cytometry to over 40. Additionally, multiplexing techniques such as mass tag cell barcoding have enabled the concurrent measurement of samples subjected to dozens of experimental conditions. While technological and protocol advancements have now made flow cytometry an attractive assay for screening and discovery, methods for analyzing the resulting data have not made equivalent progress. Specifically, hypothesis-driven manual analyses remain standard practice for analyzing flow cytometry data, yet these techniques are biased and do not scale to higher-dimensional data or experiments with many samples. There is a clear need for methods that facilitate scalable and thorough investigation of high-dimensional single-cell datasets. The central theme of this dissertation is the development, evaluation, and application of data-driven methods for analyzing high-dimensional, single-cell measurements.
Among other contributions, we present a novel method, termed Citrus, that identifies relevant cell subsets in multidimensional flow cytometry data. Citrus identifies cell subsets using a novel objective, experimental relevance, whereas many existing automated methods are optimized for reproducing the results of manual population identification efforts. We start by showing Citrus to be a sensitive identifier of manually identified populations in five publicly available datasets. Next, Citrus' capacity to identify experimentally informative cellular subsets is confirmed by identifying and validating blood cells that respond to B cell receptor cross-linking. We then compare Citrus' performance to that of an existing data-driven method and find that our methods are better able to identify rare prognostic cell subsets in HIV-infected patients. Finally, we close with a proposed application of Citrus to a novel dataset in which we aim to evaluate the immune system's role in and response to surgically induced tissue trauma. Importantly, our methods scale well to higher-dimensional datasets with many patients and require no knowledge of existing cellular subsets. Furthermore, these methods are easily applied in diverse experimental settings and provide interpretable results that facilitate hypothesis generation. Thus we anticipate that the methods presented here will aid investigators in performing unbiased and potentially more thorough investigations of complex, high-dimensional single-cell datasets.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2014
Issuance monographic
Language English

Creators/Contributors

Associated with Bruggner, Robert Vernon, Jr
Associated with Stanford University, Program in Biomedical Informatics.
Primary advisor Dill, David L
Primary advisor Nolan, Garry P
Thesis advisor Dill, David L
Thesis advisor Nolan, Garry P
Thesis advisor Tibshirani, Robert
Advisor Tibshirani, Robert

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Robert Vernon Bruggner Jr.
Note Submitted to the Program in Biomedical Informatics.
Thesis Thesis (Ph.D.)--Stanford University, 2014.
Location electronic resource

Access conditions

Copyright
© 2014 by Robert Vernon Bruggner
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
