Deep learning on local sites for protein structure and function analysis
Abstract/Contents
- Abstract
- Understanding how the three-dimensional structure of a protein leads to its function is important for determining disease mechanisms, developing targeted therapeutics, and engineering new proteins with desired functional characteristics. The expansion of protein structure databases due to experimental and computational advances provides an unprecedented opportunity to learn structure-function relationships in a data-driven manner. Deep learning methods that operate on protein structures have shown promise for specific tasks, but their utility for functional analysis has been limited due to inconsistencies in model training and evaluation, lack of labeled protein function data, and an inability to reconcile global predictions with local biochemical mechanisms. In this dissertation, I explore these challenges and propose a framework for protein analysis based on learning on local sites rather than the entire protein structure. First, to establish standards for model development and evaluation, I present work on (1) developing a suite of benchmark datasets, processing tools, and baseline models, and (2) quantifying the effect of differing structure compositions in the training data. I then describe a self-supervised learning method that leverages evolutionary relationships to learn general-purpose representations of local structural sites, and show how these representations enable improved performance on downstream tasks involving classification, search, and annotation of functional sites. By clustering millions of sites, I propose a framework for protein analysis based on conserved structural motifs which enables the discovery of functional relationships across protein classes. Finally, I present a method for explainable function annotation that predicts the overall function of a protein as well as the individual residues which are responsible.
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | ©2024 |
Publication date | 2024 |
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Derry, Alexander William Fox |
---|---|
Degree supervisor | Altman, Russ |
Thesis advisor | Altman, Russ |
Thesis advisor | Dror, Ron, 1975- |
Thesis advisor | Huang, Possu |
Thesis advisor | Leskovec, Jurij |
Degree committee member | Dror, Ron, 1975- |
Degree committee member | Huang, Possu |
Degree committee member | Leskovec, Jurij |
Associated with | Stanford University, School of Medicine |
Associated with | Stanford University, Department of Biomedical Data Science |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Alexander William Fox Derry. |
---|---|
Note | Submitted to the Department of Biomedical Data Science. |
Thesis | Thesis Ph.D. Stanford University 2024. |
Location | https://purl.stanford.edu/zy646hk2720 |
Access conditions
- Copyright
- © 2024 by Alexander William Fox Derry
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).