Enhancing machine learning with data-efficient methods



Supervised deep learning has had a substantial impact across many areas of life, including finance, healthcare, and social networks. However, progress is hindered by a major challenge: the dependence on large, high-quality labeled datasets. This issue is particularly acute in areas such as biomedicine, where acquiring and annotating data is both costly and intricate. In response, this thesis introduces data-efficient machine learning strategies that aim to reduce the dependence on extensive labeled datasets while preserving or improving the efficacy of deep learning models. The thesis is divided into two parts, each targeting a key aspect of data-efficient machine learning. Part I is dedicated to developing algorithms that make better use of existing datasets under limited labeling. It introduces a novel machine learning setting for enhancing generalization and robustness in low-label scenarios, proposes an open-world semi-supervised learning framework, and adapts this framework to real-world applications. Part II focuses on augmenting training resources by incorporating supplementary knowledge. It explores the integration of auxiliary tasks to enhance training, examines the use of historical data to improve AutoML search efficiency, and introduces methods for including large datasets that were previously unmanageable due to memory constraints.


Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2024
Publication date 2024
Issuance monographic
Language English


Author Cao, Kaidi
Degree supervisor Leskovec, Jurij
Thesis advisor Leskovec, Jurij
Thesis advisor Koyejo, Oluwasanmi
Thesis advisor Ma, Tengyu
Degree committee member Koyejo, Oluwasanmi
Degree committee member Ma, Tengyu
Associated with Stanford University, School of Engineering
Associated with Stanford University, Computer Science Department


Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Kaidi Cao.
Note Submitted to the Computer Science Department.
Thesis Thesis Ph.D. Stanford University 2024.
Location https://purl.stanford.edu/mt928bm8533

Access conditions

© 2024 by Kaidi Cao
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
