Principled approaches to modern machine learning challenges

Gupta, Neha, (Computer scientist)

Principled approaches to modern machine learning challenges

<a href="https://embed.stanford.edu/iframe/?url=https%3A%2F%2Fpurl.stanford.edu%2Fyd023pm9683" class="su-underline">Show Content</a>

Abstract/Contents

Abstract: Learning based models have been successfully deployed in various sciences and engineering applications. Despite the empirical successes, there is little understanding about the workings and the failure modes of these models. In this thesis, we focus on the following questions: a) when and why do these models generalize beyond the set of training data, and b) what conclusions can one reliably make when the data available is insufficient to learn accurate models. In the first part, we focus on provable generalization guarantees. We take the first steps towards understanding why deep neural networks generalize to unseen data despite having a larger number of parameters than the available data points to learn from. We characterize the ``simple" behaviour of the functions learned by neural networks in certain settings, which in turn can be shown to generalize well on unseen data. We next design algorithms for provably learning decision trees inspired from the top-down heuristics commonly used in practice. The key idea is to use a small subset of features to determine the splitting criterion at each node of the decision tree, instead of a single one as is currently used by practitioners. Our proposed algorithm is both sample and time efficient, and allows us to provide generalization guarantees for decision functions which are well represented by small decision trees. In the second part, we focus on what conclusions can we make when do not have sufficient labeled data. Learning methods, especially deep neural networks, require large amounts of labeled training data to learn meaningful predictive models. In many domains, such labeled data comes as a result of performing extensive experiments, thus restricting the amount of data one can receive. Given such limited data, one is often interested in knowing what useful conclusions can we reliably make when the data is insufficient to learn any reliable predictor. For example, if we are given the test points in advance, is it easier to predict the labels of those given test points rather than learning the entire model? In this part of the thesis, we show that given a particular test point along with a batch of unlabeled data set, one needs much fewer active label queries (independent of the complexity of the function class) to make a good prediction on this test point as compared to learning the complete function for several hypothesis classes.

Description

Type of resource	text
Form	electronic resource; remote; computer; online resource
Extent	1 online resource.
Place	California
Place	[Stanford, California]
Publisher	[Stanford University]
Copyright date	2022; ©2022
Publication date	2022; 2022
Issuance	monographic
Language	English

Creators/Contributors

Author	Gupta, Neha, (Computer scientist)
Degree supervisor	Charikar, Moses
Degree supervisor	Valiant, Gregory
Thesis advisor	Charikar, Moses
Thesis advisor	Valiant, Gregory
Thesis advisor	Tan, Li-Yang
Degree committee member	Tan, Li-Yang
Associated with	Stanford University, Computer Science Department

Subjects

Genre	Theses
Genre	Text

Bibliographic information

Statement of responsibility	Neha Gupta.
Note	Submitted to the Computer Science Department.
Thesis	Thesis Ph.D. Stanford University 2022.
Location	https://purl.stanford.edu/yd023pm9683

Access conditions

License: This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).

Also listed in

View in SearchWorks

Loading usage metrics...