Empowering machine learning systems for drug discovery with mechanistic biological knowledge
Abstract/Contents
- Abstract
- Machine learning has the potential to solve critical tasks in drug discovery, from identifying new therapeutic uses for drugs to personalizing treatment plans for patients. However, datasets in drug discovery are often small due to the high time and labor costs of experiments, limiting the applicability of machine learning systems. Here, our key insight is to infuse machine learning systems with prior information about biology, so they can learn efficiently from small labeled datasets. We structure diverse prior information, from molecular interactions between proteins to clinical annotations about diseases, into a heterogeneous knowledge graph. We then develop two machine learning systems that use the knowledge graph to mechanistically model biological phenomena and achieve strong performance. First, we develop the multiscale interactome, a machine learning system that uses a knowledge graph to model how drugs treat diseases across multiple scales of biology. The multiscale interactome is up to 40% more effective than the prior state of the art at predicting which drugs treat a disease, identifies proteins and biological functions relevant to treatment, and predicts genes that alter a treatment's efficacy and side effects. Second, we develop PLATO, a deep learning system that uses a knowledge graph to achieve strong performance on tabular datasets with orders of magnitude more features than samples (i.e., "small" labeled datasets). In PLATO, the knowledge graph is auxiliary to the tabular dataset and describes input features, like genes. PLATO uses the knowledge graph to infer the weights of a multilayer perceptron, thereby using prior information to learn efficiently from a small labeled dataset. Across 6 datasets, PLATO outperforms the prior state of the art by up to 10.19%. Ultimately, we provide a general framework to empower data-driven machine learning systems with an extensive, mechanistic knowledge of biology.
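The idea of using an auxiliary knowledge graph to infer model weights can be illustrated with a toy sketch. This is not the PLATO implementation: the graph, the relation names, the count-based feature descriptor, and the helper functions `describe`, `infer_weight`, and `predict` are all hypothetical, and a single linear layer stands in for the multilayer perceptron. The point it shows is that each input feature's weight is computed from how the graph describes that feature, so the number of trainable parameters (`theta`) depends on the number of relation types rather than on the number of features.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (head, relation, tail)

# Hypothetical heterogeneous knowledge graph; every name is a placeholder.
EDGES: List[Edge] = [
    ("gene_A", "interacts_with", "gene_B"),
    ("gene_B", "interacts_with", "gene_C"),
    ("gene_A", "associated_with", "disease_X"),
    ("gene_C", "participates_in", "function_F"),
]
RELATIONS = ["interacts_with", "associated_with", "participates_in"]


def describe(feature: str, edges: List[Edge]) -> List[float]:
    """Crude descriptor: number of edges of each relation type touching the feature."""
    return [
        float(sum(1 for h, r, t in edges if r == rel and feature in (h, t)))
        for rel in RELATIONS
    ]


def infer_weight(descriptor: List[float], theta: List[float]) -> float:
    """Map a feature's graph descriptor to its model weight (shared linear map)."""
    return sum(d * p for d, p in zip(descriptor, theta))


def predict(sample: Dict[str, float], theta: List[float], edges: List[Edge]) -> float:
    """Linear prediction whose per-feature weights are inferred from the graph."""
    return sum(
        infer_weight(describe(name, edges), theta) * value
        for name, value in sample.items()
    )


# Assumed values: theta would normally be learned from the labeled dataset.
theta = [0.5, 1.0, -1.0]
sample = {"gene_A": 1.0, "gene_B": 2.0, "gene_C": 4.0}
score = predict(sample, theta, EDGES)
```

Here three shared parameters determine the weights for all three genes, and the same three parameters would cover any further feature that appears in the graph.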
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2023; ©2023 |
Publication date | 2023 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Ruiz, Camilo Andres |
---|---|
Degree supervisor | Leskovec, Jurij |
Thesis advisor | Leskovec, Jurij |
Thesis advisor | Altman, Russ |
Thesis advisor | Snyder, Michael, Ph. D. |
Degree committee member | Altman, Russ |
Degree committee member | Snyder, Michael, Ph. D. |
Associated with | Stanford University, School of Engineering |
Associated with | Stanford University, Department of Bioengineering |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Camilo Ruiz. |
---|---|
Note | Submitted to the Department of Bioengineering. |
Thesis | Thesis Ph.D. Stanford University 2023. |
Location | https://purl.stanford.edu/mc795nz5480 |
Access conditions
- Copyright
- © 2023 by Camilo Andres Ruiz
- License
- This work is licensed under a Creative Commons Attribution 3.0 Unported license (CC BY).