Multiclass learning for visual recognition

Abstract/Contents

Abstract
Extracting semantic information from raw visual input is one of the long-term goals of computer vision. While humans can almost effortlessly recognize hundreds or thousands of object and scene categories, designing machine learning algorithms that can effectively and efficiently learn and recognize a rich set of concepts poses great computational challenges. In this work, we consider three challenges that arise in multiclass visual perception tasks such as object recognition/detection and scene categorization.
First, we address the challenge of scaling up the classifier to a large number of classes in terms of both computational efficiency and statistical accuracy. We propose a learning algorithm that can automatically discover and utilize the hierarchical structure in the semantic label space from data. Empirically, the resulting hierarchical classifier achieves significant improvement over existing methods on both object recognition and scene categorization tasks with hundreds of classes.
Second, while image training data is abundant for some object classes, recent studies confirm that the distribution of training data over categories is heavy-tailed, fueling the need for learning algorithms that make efficient use of scarce training data. To this end, we study transfer learning applied to multiclass category-level object detection. Building on the HOG (histogram of oriented gradients) feature and a template-matching detector, we propose a local, spatial and structured prior that can capture correlation structures at the level of individual features and of spatially neighboring pairs of features. We demonstrate improved average performance over all 20 Pascal classes compared to the state-of-the-art deformable part-based model.
Last, the broadness of visual classes usually necessitates an ensemble of heterogeneous features to provide good discriminative power, but at a high computational cost. We propose an active classification scheme to perform efficient inference with an ensemble of features/classifiers, where we view the classification inference task as a sensing problem. Observations, i.e., features/classifiers, are selected dynamically based on previous observations, using a value-theoretic computation that balances an estimate of the expected classification gain from each observation against its computational cost. We show that the active scheme can achieve comparable or even higher classification accuracy at a fraction of the computational cost of traditional methods.
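The active selection idea in the last part of the abstract can be illustrated with a small sketch. This is not the thesis's algorithm: it assumes each feature/classifier is summarized by a known table P(output | true class), it uses expected entropy reduction as a stand-in for the "expected classification gain", and the names `active_classify`, `observers`, `run` and `trade_off` are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)

def posterior(prior, likelihood, outcome):
    """Bayes update, where likelihood[c][o] = P(outcome o | true class c)."""
    joint = [prior[c] * likelihood[c][outcome] for c in range(len(prior))]
    z = sum(joint)
    # If the model says this outcome is impossible, keep the current belief.
    return [j / z for j in joint] if z > 0 else list(prior)

def expected_gain(prior, likelihood):
    """Expected entropy reduction from observing one classifier's output
    (an assumed proxy for classification gain)."""
    gain = entropy(prior)
    for o in range(len(likelihood[0])):
        p_o = sum(prior[c] * likelihood[c][o] for c in range(len(prior)))
        if p_o > 0:
            gain -= p_o * entropy(posterior(prior, likelihood, o))
    return gain

def active_classify(prior, observers, run, trade_off=1.0):
    """Greedily pick the observation with the best (gain - trade_off * cost);
    stop when no remaining observation is worth its cost.

    observers: dict name -> (likelihood table, cost)
    run: callable that executes the named classifier and returns an outcome index
    """
    belief, used, remaining = list(prior), [], dict(observers)
    while remaining:
        scores = {name: expected_gain(belief, lik) - trade_off * cost
                  for name, (lik, cost) in remaining.items()}
        best = max(scores, key=scores.get)
        if scores[best] <= 0:
            break
        lik, _ = remaining.pop(best)
        belief = posterior(belief, lik, run(best))
        used.append(best)
    return belief.index(max(belief)), belief, used

# Toy usage: two classes, three hypothetical observers with made-up numbers.
observers = {
    "cheap":   ([[0.7, 0.3], [0.3, 0.7]], 0.01),
    "costly":  ([[0.95, 0.05], [0.05, 0.95]], 0.2),
    "useless": ([[0.5, 0.5], [0.5, 0.5]], 0.01),
}
# For illustration, every classifier that is run reports outcome 0.
label, belief, used = active_classify([0.5, 0.5], observers, run=lambda name: 0)
print(label, belief, used)
```

In this toy setup the uninformative observer is never run, because its expected gain does not cover its cost; raising `trade_off` makes the procedure stop earlier and use fewer observations.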

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2012
Issuance monographic
Language English

Creators/Contributors

Associated with Gao, Tianshi
Associated with Stanford University, Department of Electrical Engineering
Primary advisor Koller, Daphne
Primary advisor Ng, Andrew Hock-soon, 1972-
Thesis advisor Koller, Daphne
Thesis advisor Ng, Andrew Hock-soon, 1972-
Thesis advisor Girod, Bernd
Advisor Girod, Bernd

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Tianshi Gao.
Note Submitted to the Department of Electrical Engineering.
Thesis Thesis (Ph.D.)--Stanford University, 2012.
Location electronic resource

Access conditions

Copyright
© 2012 by Tianshi Gao
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
