Improving and leveraging the interpretability of deep neural networks for genomics


Abstract/Contents

Abstract
In recent years, the field of genomics has been characterized by an extraordinary influx of novel high-throughput technologies and techniques. These methods produce enormous datasets that quantify cellular state across several axes of measurement. Given the size and complexity of these data, machine-learning algorithms (particularly deep neural networks) have increasingly been relied upon to ingest and model them. In fitting the data, these neural networks demonstrably learn both the overarching patterns and the subtle nuances of the underlying biology. To be most useful for scientific discovery, however, these models must be able to convey the principles they have learned in a form that human scientists can understand and use. Unfortunately, this transfer of knowledge has been limited by the general uninterpretability of deep neural networks and by the difficulty of distilling a model's individual decisions across a dataset into a set of human-understandable rules. In this thesis, I will present my work addressing these challenges. First, I will discuss my development of the Fourier-transform-based attribution prior, which trains deep neural networks to be more stable and interpretable, thereby allowing them to more consistently and reliably reveal the biological patterns driving various genome-regulatory events. Subsequently, I will present my work on a computational framework that distills and summarizes a neural network's individual decisions into a small set of global protein-binding rules that a human scientist can understand. Through several case studies, I will show that these methods can be applied in real-world settings for scientific discovery. Although the methods presented in this thesis are applied primarily to problems in genomics, many of the ideas discussed here are easily (if not directly) applicable to other domains of machine learning. Together, these developments represent a major advancement in how humans extract learned information from deep neural networks, with implications not only for how neural networks are used for science, but also for our understanding of how they learn in general.
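
As a rough illustration of the attribution-prior idea described above, the following is a minimal Python/NumPy sketch of a Fourier-based smoothness penalty on an attribution track. The function name fourier_attribution_penalty, the input attr (per-base attribution scores), and the cutoff freq_limit are hypothetical names introduced here for illustration; the prior developed in the thesis may differ in its exact formulation.

    import numpy as np

    def fourier_attribution_penalty(attr, freq_limit):
        # attr: 1-D array of per-base attribution scores for one input
        #       sequence (hypothetical; in practice these would come from
        #       the model, e.g. via input gradients)
        # freq_limit: assumed index separating low ("smooth") frequencies
        #             from the high frequencies to be penalized
        spectrum = np.abs(np.fft.rfft(attr))  # DFT magnitudes
        total = spectrum.sum()
        if total == 0.0:
            return 0.0
        # Fraction of spectral magnitude above the cutoff; adding this
        # term to the training loss discourages attributions dominated
        # by high-frequency noise
        return spectrum[freq_limit:].sum() / total

    # Toy usage: a noisy attribution track incurs a larger penalty than
    # a smooth one
    rng = np.random.default_rng(0)
    smooth = np.sin(np.linspace(0, 4 * np.pi, 1000))
    noisy = smooth + rng.normal(scale=0.5, size=1000)
    print(fourier_attribution_penalty(smooth, freq_limit=50))  # near 0
    print(fourier_attribution_penalty(noisy, freq_limit=50))   # larger

During training, a penalty of this form would be added with some weight to the model's loss, encouraging smooth, stable attribution tracks.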

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2022
Publication date 2022
Issuance monographic
Language English

Creators/Contributors

Author Tseng, Alex Michael
Degree supervisor Kundaje, Anshul, 1980-
Thesis advisor Kundaje, Anshul, 1980-
Thesis advisor Fordyce, Polly
Thesis advisor Horowitz, Mark (Mark Alan)
Degree committee member Fordyce, Polly
Degree committee member Horowitz, Mark (Mark Alan)
Associated with Stanford University, Computer Science Department

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Alex Michael Tseng.
Note Submitted to the Computer Science Department.
Thesis Thesis (Ph.D.)--Stanford University, 2022.
Location https://purl.stanford.edu/jv141vb2060

Access conditions

Copyright
© 2022 by Alex Michael Tseng
License
This work is licensed under a Creative Commons Attribution 3.0 Unported license (CC BY).
