A computational approach to identification and comparison of cell subsets in flow cytometry data


Abstract/Contents

Abstract
Changes in frequency and/or biomarker expression in small subsets of peripheral blood cells provide key diagnostics for disease presence, status and prognosis. At present, flow cytometry instruments that measure the joint expression of up to 20 markers in/on large numbers of individual cells are used to measure surface and internal marker expression. This technology is routinely used to determine the frequencies of various marker-defined cell subsets in patient samples and is often used to inform therapeutic decision-making. Nevertheless, quantitative methods for comparing data between samples are sorely lacking. There are no reliable computational methods for determining the magnitude of differences among samples from different patients, among samples obtained from the same patient on different days, or between aliquots of the same sample measured before and after response to stimulation or other treatment. This thesis describes novel computational methods that provide reliable indices of change in subset representation and/or marker expression by individual subsets of cells. The methods we have developed utilize a non-parametric clustering algorithm, Density-Based Merging (DBM), that we developed to identify subsets (clusters) of cells that express a common set of markers measured independently for each cell by flow cytometry. To quantitate differences between these subsets, we introduce the application of Earth Mover's Distance (EMD), an algorithm borrowed from the image retrieval literature for comparing multivariate distributions. The resultant methods are highly sensitive and reliable for identifying small marker expression differences between subsets of cells in flow cytometry data sets. We show that these methods are easily applied and readily interpreted.
Importantly, we demonstrate their practical utility with data from an allergy study in which the expression of two markers on very rare blood cells (basophils) in response to stimulation with an offending allergen indicates whether the patient is allergic to the stimulating antigen. In addition, we have developed novel evaluation criteria for assessing the performance of clustering algorithms on flow cytometry data by combining mixtures of cells identifiable by dimensions "hidden" from the algorithm that provide true cluster membership. Thus, we expect that the methods described here will introduce a new approach to using flow cytometry to measure biomarker changes as indices of drug response, disease susceptibility, disease progress and prognosis.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2011
Issuance monographic
Language English

Creators/Contributors

Associated with Zimmerman, Noah
Associated with Stanford University, Program in Biomedical Informatics.
Primary advisor Das, Amar K. (Amar Kumar)
Primary advisor Walther, Guenther
Thesis advisor Das, Amar K. (Amar Kumar)
Thesis advisor Walther, Guenther
Thesis advisor Herzenberg, Leonore A
Advisor Herzenberg, Leonore A

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Noah Zimmerman.
Note Submitted to the Program in Biomedical Informatics.
Thesis Thesis (Ph.D.)--Stanford University, 2011.
Location electronic resource

Access conditions

Copyright
© 2011 by Noah Zimmerman
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
