Multiclass learning for visual recognition
- Extracting semantic information from raw visual input is one of the long-term goals of computer vision. While humans can almost effortlessly recognize hundreds or thousands of object and scene categories, designing machine learning algorithms that can effectively and efficiently learn and recognize a rich set of concepts poses great computational challenges. In this work, we consider three challenges that arise in multiclass visual perception tasks such as object recognition/detection and scene categorization. First, we address the challenge of scaling up the classifier to a large number of classes in terms of both computational efficiency and statistical accuracy. We propose a learning algorithm that automatically discovers and exploits the hierarchical structure in the semantic label space from data. Empirically, the resulting hierarchical classifier achieves significant improvement over existing methods on both object recognition and scene categorization tasks with hundreds of classes. Second, while image training data is abundant for some object classes, recent studies confirm that the distribution of training data over categories is heavy-tailed, fueling the need for learning algorithms that make efficient use of scarce training data. To this end, we study transfer learning applied to multiclass category-level object detection. Building on the HOG (histogram of oriented gradients) feature and a template-matching detector, we propose a local, spatial and structured prior that captures correlation structures at the level of individual features and of spatially neighboring pairs of features. We demonstrate improved average performance over all 20 Pascal classes compared to the state-of-the-art deformable part-based model. Last, the breadth of visual classes usually necessitates an ensemble of heterogeneous features to provide good discriminative power, but at a high computational cost. We propose an active classification scheme to perform efficient inference with an ensemble of features/classifiers, where we view the classification inference task as a sensing problem. Observations, i.e., features/classifiers, are selected dynamically based on previous observations, using a value-theoretic computation that balances an estimate of the expected classification gain from each observation against its computational cost. We show that the active scheme can achieve comparable or even higher classification accuracy at a fraction of the computational cost of traditional methods.
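The active classification idea in the abstract (choose the next feature/classifier by weighing its expected gain against its cost) can be illustrated with a minimal sketch. This is not the thesis's algorithm: it is a generic greedy value-of-information loop that assumes discrete classifier outputs, known likelihood tables P(output | class), conditional independence of the outputs given the class, and entropy reduction as the measure of "classification gain". All names (`expected_gain`, `active_classify`, `trade_off`, the `observe` callback) are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)


def posterior_update(belief, likelihood, outcome):
    """Bayes update; likelihood[y][o] is the assumed P(outcome o | class y)."""
    unnorm = [belief[y] * likelihood[y][outcome] for y in range(len(belief))]
    z = sum(unnorm)
    if z == 0:  # outcome impossible under the current belief: leave it unchanged
        return list(belief)
    return [u / z for u in unnorm]


def expected_gain(belief, likelihood):
    """Expected entropy reduction from observing one classifier's output."""
    gain = entropy(belief)
    for o in range(len(likelihood[0])):
        p_o = sum(belief[y] * likelihood[y][o] for y in range(len(belief)))
        if p_o > 0:
            gain -= p_o * entropy(posterior_update(belief, likelihood, o))
    return gain


def active_classify(prior, observers, observe, trade_off=1.0):
    """Greedily pick observations while expected gain exceeds weighted cost.

    observers: dict mapping a name to (likelihood table, cost).
    observe:   callable that runs the named classifier and returns its outcome.
    Returns (predicted class index, list of observations actually used).
    """
    belief = list(prior)
    remaining = dict(observers)
    used = []
    while remaining:
        # Score each unused observation by expected gain minus weighted cost.
        name, value = max(
            ((n, expected_gain(belief, lik) - trade_off * cost)
             for n, (lik, cost) in remaining.items()),
            key=lambda item: item[1],
        )
        if value <= 0:  # no remaining observation is worth its cost
            break
        lik, _ = remaining.pop(name)
        belief = posterior_update(belief, lik, observe(name))
        used.append(name)
    return belief.index(max(belief)), used
```

In this sketch `trade_off` converts cost into the same units as the gain; a larger value makes the loop stop earlier and use fewer observations.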
|Type of resource
|electronic; electronic resource; remote
|1 online resource.
|Stanford University, Department of Electrical Engineering
|Ng, Andrew Hock-soon, 1972-
|Statement of responsibility
|Submitted to the Department of Electrical Engineering.
|Thesis (Ph.D.)--Stanford University, 2012.
- © 2012 by Tianshi Gao
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).