Deep learning on local sites for protein structure and function analysis

Abstract/Contents

Abstract
Understanding how the three-dimensional structure of a protein leads to its function is important for determining disease mechanisms, developing targeted therapeutics, and engineering new proteins with desired functional characteristics. The expansion of protein structure databases due to experimental and computational advances provides an unprecedented opportunity to learn structure-function relationships in a data-driven manner. Deep learning methods that operate on protein structures have shown promise for specific tasks, but their utility for functional analysis has been limited due to inconsistencies in model training and evaluation, lack of labeled protein function data, and an inability to reconcile global predictions with local biochemical mechanisms. In this dissertation, I explore these challenges and propose a framework for protein analysis based on learning on local sites rather than the entire protein structure. First, to establish standards for model development and evaluation, I present work on (1) developing a suite of benchmark datasets, processing tools, and baseline models, and (2) quantifying the effect of differing structure compositions in the training data. I then describe a self-supervised learning method that leverages evolutionary relationships to learn general-purpose representations of local structural sites, and show how these representations enable improved performance on downstream tasks involving classification, search, and annotation of functional sites. By clustering millions of sites, I propose a framework for protein analysis based on conserved structural motifs which enables the discovery of functional relationships across protein classes. Finally, I present a method for explainable function annotation that predicts the overall function of a protein as well as the individual residues which are responsible.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2024; ©2024
Publication date 2024
Issuance monographic
Language English

Creators/Contributors

Author Derry, Alexander William Fox
Degree supervisor Altman, Russ
Thesis advisor Altman, Russ
Thesis advisor Dror, Ron, 1975-
Thesis advisor Huang, Possu
Thesis advisor Leskovec, Jurij
Degree committee member Dror, Ron, 1975-
Degree committee member Huang, Possu
Degree committee member Leskovec, Jurij
Associated with Stanford University, School of Medicine
Associated with Stanford University, Department of Biomedical Data Science

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Alexander William Fox Derry.
Note Submitted to the Department of Biomedical Data Science.
Thesis Thesis Ph.D. Stanford University 2024.
Location https://purl.stanford.edu/zy646hk2720

Access conditions

Copyright
© 2024 by Alexander William Fox Derry
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
