Demystifying unsupervised feature learning

Coates, Adam Paul; Stanford University, Computer Science Department

Abstract/Contents

Abstract: Machine learning is a key component of state-of-the-art systems in many application domains. Applied to many kinds of raw data, however, most learning algorithms are unable to make good predictions. In order to succeed, most learning algorithms are applied instead to "features" that represent higher-level concepts extracted from the raw data. These features, developed by expert practitioners in each field, encode important prior knowledge about the task that the learning algorithm would be unable to discover on its own from (often limited) labeled training examples. Unfortunately, engineering good feature representations for new applications is extremely difficult. For the most challenging applications in AI, like computer vision, the search for good features and higher-level image representations is vast and ongoing. In this work we study a class of algorithms that attempt to learn feature representations automatically from unlabeled data that is often easy to obtain in large quantities. Though many such algorithms have been proposed and have achieved high marks on benchmark tasks, it has not been fully understood what causes some algorithms to perform well and others to perform poorly. It has thus been difficult to identify any key directions in which the algorithms might be improved in order to significantly advance the state of the art. To address this issue, we will present results from an in-depth scientific study of a variety of factors that can affect the performance of feature-learning algorithms. Through a detailed analysis, a surprising picture emerges: we find that many schemes succeed or fail as a result of a few (easily overlooked) factors that are often orthogonal to the particular learning methods involved. In fact, by focusing solely on these factors it is possible to achieve state-of-the-art performance on common benchmarks using quite simple algorithms. 
More importantly, however, a main contribution of this line of research has been to identify very simple yet highly scalable feature learning methods that, by virtue of focusing on the most critical properties identified in our study, are highly successful in many settings: the proposed algorithms consistently achieve top performance on benchmarks, have been successfully deployed in realistic computer vision applications, and are even capable of discovering high-level concepts like object classes without any supervision.

Description

Type of resource	text
Form	electronic; electronic resource; remote
Extent	1 online resource.
Publication date	2012
Issuance	monographic
Language	English

Creators/Contributors

Associated with	Coates, Adam Paul
Associated with	Stanford University, Computer Science Department
Primary advisor	Ng, Andrew Y, 1976-
Thesis advisor	Ng, Andrew Y, 1976-
Thesis advisor	Koller, Daphne
Thesis advisor	Li, Fei Fei, 1976-
Advisor	Koller, Daphne
Advisor	Li, Fei Fei, 1976-

Subjects

Genre	Theses

Bibliographic information

Statement of responsibility	Adam Coates.
Note	Submitted to the Department of Computer Science.
Thesis	Thesis (Ph.D.)--Stanford University, 2012.
Location	electronic resource

Access conditions

License: This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
