Label-efficient machine learning for medical image analysis

Hooper, Sarah McIlwaine

Abstract/Contents

Abstract: Medical imaging is an essential tool in healthcare, and radiologists are highly trained to detect and characterize disease in medical images. However, relying solely on human analysis has limitations: it can be time-consuming, variable, and difficult to scale. Automating portions of the medical image analysis pipeline can overcome these limitations to support and expand the capabilities of clinicians and radiologists. In this dissertation, we focus on the potentially transformative role deep learning will play in automated medical image analysis. We pose segmentation as a key tool for deep learning-based image analysis, and we show how segmentation neural networks can achieve high performance on many medical image analysis tasks without large, manually annotated training datasets. We begin by describing two methods for training medical image segmentation neural networks with limited labeled data. In our first method, we adapt weak supervision to segmentation. In our second method, we fuse data augmentation, consistency regularization, and pseudo labeling in a unified semi-supervision pipeline. These methods fold multiple approaches to limited-label training into the same framework, leveraging the strengths of each to achieve high performance while keeping labeling burden low. Next, we evaluate networks trained with limited labeled data on clinically motivated metrics over multi-institution, multi-scanner, multi-disease datasets. We find that our semi-supervised networks achieve improved performance compared to fully supervised networks (trained with over 100x more labeled data) on certain generalization tasks, achieving stronger concordance with a human annotator. However, we uncover data subsets on which the label-efficient methods underperform. We propose an active learning extension to our semi-supervised pipeline to address these error modes, improving semi-supervised performance on a difficult data slice by 18.5%. Through this evaluation, we develop an understanding of how networks trained with limited labeled data perform on clinical tasks, how they compare to networks trained with abundant labeled data, and how to mitigate error modes.
Finally, we apply label-efficient segmentation models to a broader set of medical image analysis tasks. Specifically, we demonstrate how and why segmentation can benefit medical image classification. We first analyze why segmentation and classification models may achieve different performance on the same dataset and task. We then implement methods for using segmentation models to classify medical images, which we call segmentation-for-classification, and compare these methods against traditional classification on three retrospective datasets. Lastly, we use our analysis and experiments to summarize the benefits of using segmentation-for-classification compared to standard classification, including: improved sample efficiency, enabling improved performance with fewer labeled images (up to an order of magnitude fewer), on low-prevalence classes, and on certain rare subgroups (up to 161.1% improved recall); improved robustness to spurious correlations (up to 44.8% improved robust AUROC); and improved model interpretability, evaluation, and error analysis. These results show that leveraging segmentation models can lead to higher-quality medical image classifiers in common settings. In summary, this dissertation focuses on segmentation as a key tool for supporting automated medical image analysis, and we show how to train segmentation networks to achieve high performance on many image analysis tasks without large labeling burdens.

Description

Type of resource	text
Form	electronic resource; remote; computer; online resource
Extent	1 online resource.
Place	California
Place	[Stanford, California]
Publisher	[Stanford University]
Copyright date	©2023
Publication date	2023
Issuance	monographic
Language	English

Creators/Contributors

Author	Hooper, Sarah McIlwaine
Degree supervisor	Langlotz, Curtis P
Degree supervisor	Ré, Christopher
Thesis advisor	Langlotz, Curtis P
Thesis advisor	Ré, Christopher
Thesis advisor	Nishimura, Dwight George
Thesis advisor	Olukotun, Oyekunle Ayinde
Degree committee member	Nishimura, Dwight George
Degree committee member	Olukotun, Oyekunle Ayinde
Associated with	Stanford University, School of Engineering
Associated with	Stanford University, Department of Electrical Engineering

Subjects

Genre	Theses
Genre	Text

Bibliographic information

Statement of responsibility	Sarah Hooper.
Note	Submitted to the Department of Electrical Engineering.
Thesis	Thesis Ph.D. Stanford University 2023.
Location	https://purl.stanford.edu/mx843ys4683

Access conditions

License: This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
