Artificial intelligence methods for discovery in large biobanks
Abstract/Contents
- Abstract
- Large-scale biobanks, housing vast genetic and phenotypic data, are driving scientific discoveries across a wide range of diseases. However, harnessing the full potential of biobanks requires innovative methodologies that address challenges in phenotype recognition, genetic association studies, and multimorbidity. First, to define disease cohorts accurately, we must recognize phenotypes that may not be labelled in the primary data. To address this challenge, we developed an AI-based method called POPDx (Population-based Objective Phenotyping by Deep Extrapolation) that computes disease liabilities for 12,803 ICD-10 codes and 1,538 phenotype codes for all participants in the UK Biobank. Second, the genetic data in biobanks are often used in association studies to understand the genetic architecture of key health traits. These studies are commonly designed as case-control comparisons, but participants labelled "healthy" may become cases over time. We demonstrate that our disease liability estimates from patient phenotyping improve downstream genetic discovery by mapping disease risk to a quantitative scale that provides greater statistical power than dichotomous designs. Finally, multimorbidity (the coexistence of multiple diseases in an individual) provides an opportunity to understand disorders that may share genetic or environmental risk factors. We therefore present ForeSITE (Forecasting Susceptibility to Illness with Transformer Embeddings), an automatic framework powered by a GPT-style architecture that models disease trajectories and can predict likely future diseases. These new capabilities, both alone and in combination, enhance the utility of large-scale biobanks for scientific discovery. In particular, our contributions to phenotype recognition, genetic association studies, and multimorbidity modeling pave the way for improved disease understanding and personalized healthcare interventions.
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | ©2023 |
Publication date | 2023 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Yang, Lu, (Researcher in bioengineering) |
---|---|
Degree supervisor | Altman, Russ |
Thesis advisor | Altman, Russ |
Thesis advisor | Leskovec, Jurij |
Thesis advisor | Wall, Dennis Paul |
Degree committee member | Leskovec, Jurij |
Degree committee member | Wall, Dennis Paul |
Associated with | Stanford University, School of Engineering |
Associated with | Stanford University, Department of Bioengineering |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Lu Yang. |
---|---|
Note | Submitted to the Department of Bioengineering. |
Thesis | Thesis Ph.D. Stanford University 2023. |
Location | https://purl.stanford.edu/jn115rp0937 |
Access conditions
- Copyright
- © 2023 by Lu Yang
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).