Multi-scale data integration frameworks for predicting outcomes in cancer

Abstract/Contents

Abstract
Cancer research abounds with multi-scale data, from imaging to multi-modal molecular data, such as genomic, epigenomic, transcriptomic, and proteomic data. Prediction models of clinical outcomes, including survival and therapeutic response, could capitalize on the richness of information that these data embody. In practice, however, the lack of effective methods for integrative data analysis leaves much of the latent knowledge untapped. For example, imaging data are routinely obtained for diagnostic purposes but are often underutilized in integrative analyses of cancer outcomes. Once correlations between imaging and molecular data are established, imaging data have the potential to become noninvasive proxies for biopsy-acquired molecular data. Furthermore, traditional methods of data analysis have limited ability to extract knowledge from multi-scale data, which are large and heterogeneous and exhibit complex inter-data interactions. Yet most data integration efforts still rely on outmoded methods, which limit analytic capabilities to a small number of datasets and do not accommodate different data types. In this dissertation, I outline specific approaches to enhance knowledge extraction through integrative analyses that: (1) directly relate imaging data to molecular data, and (2) provide biomedical decision support (prediction of clinical outcomes) from multi-scale data. I applied these approaches, embodied in two frameworks, to the analysis of brain cancers: glioblastoma (GBM) and low-grade glioma (LGG). The frameworks were designed to improve upon current standards of data analysis and enhance extraction of knowledge by incorporating machine learning techniques to boost information capture from each data source, adapting dedicated strategies for the integration of multiple high-dimensional datasets, and including rigorous evaluation and validation strategies to ensure robust performance.
Using the first framework, I identified three novel image-based GBM subtypes and showed that each subtype not only confers differential survival probabilities, but also embodies a unique set of differentially regulated signaling pathways, which could potentially be targeted for therapeutic effect. The use of imaging data to infer molecular information, including potential therapies, without biopsy supports the role of quantitative image features as noninvasive surrogates of underlying molecular activity. Using the second framework, I developed prediction models of survival, built on specific strategies for integrating multi-omics data, that outperformed models built on current standards of analysis. This framework generated predictive markers of survival in GBM and LGG: a mix of previously unknown and known molecular entities with active or putative oncogenic roles. The application of this framework in other cancers is anticipated to facilitate both novel biomarker discovery and biomedical decision support for a variety of clinical outcomes, including treatment response and risk of recurrence.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2016
Issuance monographic
Language English

Creators/Contributors

Associated with Itakura, Haruka
Associated with Stanford University, Program in Biomedical Informatics.
Primary advisor Gevaert, Olivier Michel Simonne
Thesis advisor Gevaert, Olivier Michel Simonne
Thesis advisor Altman, Russ
Thesis advisor Mitchell, Beverly S
Advisor Altman, Russ
Advisor Mitchell, Beverly S

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Haruka Itakura.
Note Submitted to the Program in Biomedical Informatics.
Thesis Thesis (Ph.D.)--Stanford University, 2016.
Location electronic resource

Access conditions

Copyright
© 2016 by Haruka Itakura
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
