Computational pathology for genomic medicine
- The medical specialty of pathology is focused on the transformation of information extracted from patient tissue samples into biologically informative and clinically useful diagnoses to guide research and clinical care. Since the mid-19th century, the primary data type used by surgical pathologists has been microscopic images of hematoxylin and eosin stained tissue sections. Over the past several decades, molecular data have been increasingly incorporated into pathological diagnoses. There is now a need for new computational methods that systematically model and integrate these complex data to support the development of data-driven diagnostics for pathology. The overall goal of this dissertation is to develop and apply methods in this new field of Computational Pathology, which is aimed at: 1) the extraction of comprehensive integrated sets of data characterizing disease from a patient's tissue sample; and 2) the application of machine learning-based methods to inform the interpretation of a patient's disease state. The dissertation is centered on three projects, aimed at the development and application of methods in Computational Pathology for the analysis of three primary data types used in cancer diagnostics: 1) morphology; 2) biomarker expression; and 3) genomic signatures. First, we developed the Computational Pathologist (C-Path) system for the quantitative analysis of cancer morphology from microscopic images. We used the system to build a microscopic image-based prognostic model in breast cancer. The C-Path prognostic model outperformed competing approaches and uncovered the prognostic significance of several novel characteristics of breast cancer morphology.
Second, to systematically evaluate the biological informativeness and clinical utility of the two most commonly used protein biomarkers in breast cancer diagnostics, estrogen receptor (ER) and progesterone receptor (PR), we performed an integrative analysis of publicly available expression profiling data, clinical data, and immunohistochemistry data collected from over 4,000 breast cancer patients, extracted from 20 published studies. We validated our findings on an independent integrated breast cancer dataset from over 2,000 breast cancer patients in the Nurses' Health Study. Our analyses demonstrated that the ER-/PR+ disease subtype is rare and non-reproducible. Further, in our genome-wide study we identified hundreds of biomarkers more informative than PR for the stratification of both ER+ and ER- disease. Third, we developed a new computational method, Significance Analysis of Prognostic Signatures (SAPS), for the identification of robust prognostic signatures from clinically annotated Omics data. We applied SAPS to publicly available clinically annotated gene expression data obtained from over 3,800 breast cancer patients from 19 published studies and over 1,700 ovarian cancer patients from 11 published studies. Using these two large meta-datasets, we performed the largest analysis to date of subtype-specific prognostic pathways in breast or ovarian cancer. Our analyses led to the identification of a core set of prognostic biological signatures in breast and ovarian cancer and their molecular subtypes. Further, the SAPS method should be generally useful for future studies aimed at the identification of biologically informative and clinically useful signatures from clinically annotated Omics data. Taken together, these studies provide new insights into the biological factors driving cancer progression, and our methods and models will support the continuing development of the field of Computational Pathology.
|Type of resource
|electronic; electronic resource; remote
|1 online resource.
|Beck, Andrew Hanno
|Stanford University, Department of Biomedical Informatics.
|Butte, Atul J
|Rijn, Matt van de
|Statement of responsibility
|Andrew Hanno Beck.
|Submitted to the Department of Biomedical Informatics.
|Ph.D. Stanford University 2013
- © 2013 by Andrew Hanno Beck
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).