Improving and leveraging the interpretability of deep neural networks for genomics
Abstract/Contents
- Abstract
- In recent years, the field of genomics has seen an extraordinary influx of novel high-throughput technologies and techniques. These methods produce enormous datasets that quantify cellular state across several axes of measurement. Given the size and complexity of these data, machine-learning algorithms (particularly deep neural networks) have increasingly been relied upon to ingest and model them. In fitting the data, these neural networks demonstrably learn both the overarching patterns and the subtle nuances of the underlying biology. To be most useful for scientific discovery, however, these models need to convert the principles they learn into a form that human scientists can understand and use. Unfortunately, this transfer of knowledge has been limited by the general uninterpretability of deep neural networks and by the difficulty of distilling a model's individual decisions across a dataset into a set of human-understandable rules. In this thesis, I present my work on addressing these challenges. First, I discuss my development of the Fourier-transform-based attribution prior, which trains deep neural networks to be more stable and interpretable, thereby allowing them to more consistently and reliably reveal the biological patterns driving various genome-regulatory events. Subsequently, I present a computational framework that distills and summarizes a neural network's individual decisions into a small set of global protein-binding rules that a human scientist can then understand. Through several case studies, I show that these methods can be applied in real-world settings for scientific discovery. Although the methods in this thesis are primarily applied to problems in genomics, many of the ideas discussed here are easily (if not directly) applicable to other domains of machine learning.
Together, these developments represent a major advance in how humans extract learned information from deep neural networks. They stand to improve not only how neural networks are used for science, but also our understanding of how such networks learn internally.
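The record above only names the Fourier-transform-based attribution prior without specifying it. As an illustrative sketch only (the function name, frequency cutoff, and penalty form here are assumptions, not the thesis's implementation), one way such a prior can be framed is as a loss penalty on the high-frequency spectral power of a model's input attributions: smooth attribution tracks concentrate power at low frequencies and incur a small penalty, while noisy tracks are penalized heavily.

```python
import numpy as np

def fourier_highfreq_penalty(attributions, freq_cutoff=0.2):
    """Fraction of an attribution track's spectral power that lies above
    a relative frequency cutoff. Smooth (biologically plausible) tracks
    score near 0; noisy tracks score near 1. Adding this term to the
    training loss pushes the model toward smoother attributions."""
    spectrum = np.abs(np.fft.rfft(attributions))  # one-sided spectrum
    power = spectrum ** 2
    cutoff_idx = int(freq_cutoff * len(power))
    total = power.sum()
    if total == 0:
        return 0.0
    return float(power[cutoff_idx:].sum() / total)

# A smooth attribution track (a low-frequency sine) vs. pure noise:
x = np.linspace(0, 2 * np.pi, 200)
smooth = np.sin(x)
noisy = np.random.default_rng(0).normal(size=200)

print(fourier_highfreq_penalty(smooth) < fourier_highfreq_penalty(noisy))  # True
```

In an actual training loop, a weighted version of this penalty would be added to the task loss, with attributions computed by a differentiable method (e.g., input gradients) so the penalty can be backpropagated.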
Description
Type of resource | text
---|---|
Form | electronic resource; remote; computer; online resource
Extent | 1 online resource.
Place | California
Place | [Stanford, California]
Publisher | [Stanford University]
Copyright date | 2022; ©2022
Publication date | 2022
Issuance | monographic
Language | English
Creators/Contributors
Author | Tseng, Alex Michael
---|---|
Degree supervisor | Kundaje, Anshul, 1980-
Thesis advisor | Kundaje, Anshul, 1980-
Thesis advisor | Fordyce, Polly
Thesis advisor | Horowitz, Mark (Mark Alan)
Degree committee member | Fordyce, Polly
Degree committee member | Horowitz, Mark (Mark Alan)
Associated with | Stanford University, Computer Science Department
Subjects
Genre | Theses
---|---|
Genre | Text
Bibliographic information
Statement of responsibility | Alex Michael Tseng.
---|---|
Note | Submitted to the Computer Science Department.
Thesis | Thesis (Ph.D.)--Stanford University, 2022.
Location | https://purl.stanford.edu/jv141vb2060
Access conditions
- Copyright
- © 2022 by Alex Michael Tseng
- License
- This work is licensed under a Creative Commons Attribution 3.0 Unported license (CC BY).