Efficient and robust deep learning for medical imaging and natural language processing
- Deep learning (DL) has made remarkable progress in fields such as medical imaging and natural language processing (NLP). However, several challenges remain that limit its applicability in real-world settings. First, solving complex tasks requires large neural networks with high expressivity, which poses significant challenges in terms of time and memory efficiency in high-dimensional settings such as accelerated magnetic resonance imaging (MRI) reconstruction. Second, DL algorithms are often sensitive to distribution shifts between training and testing. For example, DL-based MR reconstruction methods fail dramatically under clinically relevant distribution shifts such as noise, scanner-induced drifts, and anatomical changes. Similarly, in NLP, large language models (LLMs) are sensitive to changes in the format of text inputs (prompts), such as the order of words in a prompt. Thus, it is crucial to develop algorithms that are time- and memory-efficient and more robust to distribution shifts. In this thesis, we address the efficiency and robustness issues of existing DL techniques in a series of projects. First, we describe GLEAM, a memory-efficient training strategy for MRI reconstruction that splits an end-to-end neural network into decoupled network modules. GLEAM leads to significant improvements in time and memory efficiency while improving reconstruction performance in high-dimensional settings. Then, we describe Noise2Recon, a consistency training method for noise-robust MRI reconstruction that uses both fully sampled and undersampled scans. We show that Noise2Recon improves robustness over existing DL techniques while using less labeled data in low signal-to-noise ratio settings and when generalizing to out-of-distribution acceleration factors. Next, we discuss methods to improve the robustness of MRI reconstruction using diffusion models.
The first method, termed SMRD, performs automatic hyperparameter selection at test time to enhance robustness under clinically relevant distribution shifts. SMRD improves robustness under out-of-distribution measurement noise levels, acceleration factors, and anatomies, achieving a PSNR improvement of up to 6 dB under measurement noise. The second method, termed RED-diff, uses a variational inference approach based on a measurement consistency loss and a score matching regularization. RED-diff achieves 3× faster inference while using the same amount of memory. Finally, we present ThinkSum, an efficient and robust probabilistic inference method for natural language reasoning. ThinkSum is a two-stage probabilistic inference algorithm that reasons over sets of objects or facts in a structured manner. In the first stage, an LLM is queried in parallel over a set of phrases extracted from the prompt or from an auxiliary model call. In the second stage, the results of these queries are aggregated to make the final prediction. We show that ThinkSum improves performance on difficult NLP tasks and is more robust to prompt design than standard prompting techniques. Additionally, we show that ThinkSum can process its parallel LLM queries simultaneously to improve efficiency.
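The two-stage structure described for ThinkSum can be illustrated with a minimal Python sketch. This is not the thesis implementation: `toy_scorer` is a hypothetical word-overlap stand-in for an LLM likelihood query, `think_sum` is an illustrative name, and averaging is only one possible aggregation rule, chosen here as an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Sequence


def toy_scorer(phrase: str, candidate: str) -> float:
    """Hypothetical stand-in for an LLM query; returns a score in (0, 1]."""
    shared = set(phrase.lower().split()) & set(candidate.lower().split())
    return (1 + len(shared)) / (1 + len(candidate.split()))


def think_sum(
    phrases: Sequence[str],
    candidates: Sequence[str],
    scorer: Callable[[str, str], float],
) -> Dict[str, float]:
    """Score each candidate by aggregating independent per-phrase queries."""
    if not phrases:
        raise ValueError("at least one phrase is required")

    # Stage 1: issue one query per (phrase, candidate) pair, in parallel.
    pairs = [(p, c) for p in phrases for c in candidates]
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(lambda pc: scorer(pc[0], pc[1]), pairs))

    # Stage 2: aggregate the query results per candidate.
    # Averaging is an assumption made for this sketch.
    totals = {c: 0.0 for c in candidates}
    for (_, c), s in zip(pairs, scores):
        totals[c] += s
    return {c: totals[c] / len(phrases) for c in candidates}


phrases = ["a cat is a mammal", "a dog is a mammal"]
candidates = ["mammal", "reptile"]
result = think_sum(phrases, candidates, toy_scorer)
prediction = max(result, key=result.get)
```

Because the stage-one queries are independent of one another, they can be dispatched concurrently, which is the property the abstract points to when it mentions processing the parallel queries simultaneously.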
|Type of resource
|electronic resource; remote; computer; online resource
|1 online resource.
|Ozturkler, Batu Mehmet
|Degree committee member
|Degree committee member
|Stanford University, School of Engineering
|Stanford University, Department of Electrical Engineering
|Statement of responsibility
|Submitted to the Department of Electrical Engineering.
|Thesis Ph.D. Stanford University 2023.
- © 2023 by Batu Mehmet Ozturkler
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).