Multiclass learning for visual recognition

Gao, Tianshi; Stanford University, Department of Electrical Engineering

Abstract/Contents

Abstract: Extracting semantic information from raw visual input is one of the long-term goals of computer vision. While humans can almost effortlessly recognize hundreds or thousands of object and scene categories, designing machine learning algorithms that can effectively and efficiently learn and recognize a rich set of concepts poses great computational challenges. In this work, we consider three challenges that arise in multiclass visual perception tasks such as object recognition/detection and scene categorization.

First, we address the challenge of scaling up the classifier to a large number of classes in terms of both computational efficiency and statistical accuracy. We propose a learning algorithm that can automatically discover and utilize the hierarchical structure in the semantic label space from data. Empirically, the resulting hierarchical classifier achieves significant improvement over existing methods on both object recognition and scene categorization tasks with hundreds of classes.

Second, while image training data is abundant for some object classes, recent studies confirm the heavy-tailed nature of its distribution over categories, fueling the need for learning algorithms that make efficient use of scarce training data. To this end, we study transfer learning applied to multiclass category-level object detection. Building on the HOG (histogram of oriented gradients) feature and a template-matching detector, we propose a local, spatial, and structured prior that captures correlation structure at the level of individual features and of spatially neighboring pairs of features. We demonstrate improved average performance over all 20 Pascal classes compared to the state-of-the-art deformable part-based model.

Last, the broadness of visual classes usually necessitates an ensemble of heterogeneous features to provide good discriminative power, but at a high computational cost. We propose an active classification scheme to perform efficient inference with an ensemble of features/classifiers, where we view the classification inference task as a sensing problem. Observations, i.e., features/classifiers, are selected dynamically based on previous observations, using a value-theoretic computation that balances an estimate of the expected classification gain from each observation against its computational cost. We show that the active scheme can achieve comparable or even higher classification accuracy at a fraction of the computational cost of traditional methods.
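
The abstract does not spell out the value-theoretic computation, so the following Python sketch only illustrates the general idea of active observation selection under stated assumptions. It assumes each observation is a classifier with a discrete output and a known likelihood table, measures "classification gain" as expected entropy reduction in the class posterior, and subtracts a weighted cost. The names `expected_gain`, `active_classify`, `trade_off`, and the toy observers are hypothetical and are not taken from the thesis.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)

def posterior(belief, likelihood, outcome):
    """Bayes update; likelihood[c][o] = P(outcome o | class c)."""
    unnorm = [belief[c] * likelihood[c][outcome] for c in range(len(belief))]
    z = sum(unnorm)
    return [u / z for u in unnorm] if z > 0 else list(belief)

def expected_gain(belief, likelihood):
    """Expected entropy reduction from seeing one observation's output."""
    gain = entropy(belief)
    for o in range(len(likelihood[0])):
        p_o = sum(belief[c] * likelihood[c][o] for c in range(len(belief)))
        if p_o > 0:
            gain -= p_o * entropy(posterior(belief, likelihood, o))
    return gain

def active_classify(prior, observers, trade_off=1.0):
    """Greedily pick the observation with the best gain-minus-cost value.

    observers maps a name to (cost, likelihood, observe_fn). Selection stops
    once no remaining observation has positive value (an assumed stopping rule).
    """
    belief, remaining, used = list(prior), dict(observers), []
    while remaining:
        values = {name: expected_gain(belief, lik) - trade_off * cost
                  for name, (cost, lik, _) in remaining.items()}
        best = max(values, key=values.get)
        if values[best] <= 0:
            break
        _, lik, observe = remaining.pop(best)
        belief = posterior(belief, lik, observe())  # condition on the output
        used.append(best)
    return belief.index(max(belief)), belief, used

def symmetric(n, hit):
    """Toy likelihood table: probability `hit` of reporting the true class."""
    miss = (1.0 - hit) / (n - 1)
    return [[hit if o == c else miss for o in range(n)] for c in range(n)]

# Toy setup: three classes, true class 1; each observe_fn stands in for
# running a real feature extractor and classifier on the image.
prior = [1 / 3, 1 / 3, 1 / 3]
observers = {
    "cheap": (0.1, symmetric(3, 0.6), lambda: 1),
    "accurate": (0.3, symmetric(3, 0.9), lambda: 1),
    "useless": (0.01, symmetric(3, 1 / 3), lambda: 0),
}
label, belief, used = active_classify(prior, observers)
```

In this toy setup, `used` records which observations were actually evaluated, so it can be compared against evaluating every feature for each input. Raising `trade_off` makes the procedure more reluctant to pay for further observations.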

Description

Type of resource	text
Form	electronic; electronic resource; remote
Extent	1 online resource.
Publication date	2012
Issuance	monographic
Language	English

Creators/Contributors

Associated with	Gao, Tianshi
Associated with	Stanford University, Department of Electrical Engineering
Primary advisor	Koller, Daphne
Primary advisor	Ng, Andrew Hock-soon, 1972-
Thesis advisor	Koller, Daphne
Thesis advisor	Ng, Andrew Hock-soon, 1972-
Thesis advisor	Girod, Bernd
Advisor	Girod, Bernd

Subjects

Genre	Theses

Bibliographic information

Statement of responsibility	Tianshi Gao.
Note	Submitted to the Department of Electrical Engineering.
Thesis	Thesis (Ph.D.)--Stanford University, 2012.
Location	electronic resource

Access conditions

License: This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
