Empowering disease diagnostics and deriving biological insight with publicly available gene expression data

Abstract/Contents

Abstract
Public data repositories and other data sharing platforms have been a massive boon to researchers in the biomedical sciences. Data sharing can reduce costs, save valuable time, and help ensure the transparency of public research. In our lab, these public data have become the foundation of most of our analyses; many projects would not be possible without access to a wide array of datasets comprising many different diseases, regions, ethnicities, medical histories, and so on. Furthermore, as we have repeatedly demonstrated, the robustness that we can achieve by integrating large amounts of heterogeneous data has allowed us to make findings that are both significant and durable. However, designing projects around public data can be a double-edged sword. When relevant data are available, they allow for more efficient and more powerful analyses. But when the data are absent or lacking in quality, this can impose severe limitations on the types of analyses that can be done and on the types of questions that can be asked. Here, we designed a novel multi-cohort analysis framework called Multicohort ANalysis of AggregaTed gEne Expression (MANATEE) to integrate large numbers of gene expression datasets for use in generating signatures of disease. MANATEE uses a conormalization method to pool samples across many datasets, as long as each dataset contains healthy control samples. This framework lets us use far more datasets than was previously possible, which not only makes existing analyses substantially more robust but also opens new avenues of exploration that were previously inaccessible using publicly available data. By applying MANATEE to publicly available gene expression datasets, we developed multiple host-response-based signatures of disease, all of which were derived from gene expression in human blood.
These include a diagnostic for differentiating between bacterial and viral infection in febrile individuals, a prognostic for assessing whether a patient with viral infection will have a severe or mild outcome, and a signature both for distinguishing tuberculosis from other conditions and for predicting when patients with latent tuberculosis infection will progress to active tuberculosis. We validated each of these signatures in prospective cohorts, demonstrating their ability to generalize to new data. These results highlight MANATEE's ability to leverage public data in creating signatures that maintain performance across the heterogeneity present in real-world patient populations. Furthermore, the signatures we have developed are in the process of being translated into point-of-care, non-invasive diagnostic and prognostic tests that have the potential to significantly improve clinical practice.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2021; ©2021
Publication date 2021; 2021
Issuance monographic
Language English

Creators/Contributors

Author Rao, Aditya Manohar
Degree supervisor Khatri, Purvesh
Thesis advisor Khatri, Purvesh
Thesis advisor Andrews, Jason
Thesis advisor Jagannathan, Prasanna
Thesis advisor Utz, Paul
Degree committee member Andrews, Jason
Degree committee member Jagannathan, Prasanna
Degree committee member Utz, Paul
Associated with Stanford University, Department of Immunology

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Aditya Manohar Rao.
Note Submitted to the Department of Immunology.
Thesis Thesis Ph.D. Stanford University 2021.
Location https://purl.stanford.edu/qn911xh9571

Access conditions

Copyright
© 2021 by Aditya Manohar Rao
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
