Visual word recognition with large-scale image retrieval
Abstract/Contents
- Abstract
- OCR techniques that were developed for scanned high-resolution images suffer from poor text recognition accuracy on camera images. In this dissertation, we cast text recognition as a word patch retrieval problem. By comparing visual text queries against a database of labeled word images, we demonstrate a text recognition system that achieves significantly improved performance on camera images. Robust visual text features are the foundation for optimal word patch retrieval. We introduce Text Aggregated Gradients (TAG), a visual text descriptor discriminatively learned using easily obtainable training data. At the query stage, query words can be accurately recognized via Approximate Nearest Neighbor search. In the interest of system scalability to a large number of fonts, a font recognition method is developed to facilitate fast query retrieval. We show that by employing a Bayesian network, surprisingly good font recognition accuracy can be achieved via probabilistic inference. A compact database is highly desirable for our large-scale word patch retrieval. To eliminate inter-font redundancies from the database, we perform feature aggregation within groups of similar fonts. Using Canonical Correlation Analysis, our database can be compressed by a factor of 5 without any loss of word retrieval accuracy.
Description
Type of resource | text |
---|---|
Form | electronic; electronic resource; remote |
Extent | 1 online resource. |
Publication date | 2015 |
Issuance | monographic |
Language | English |
Creators/Contributors
Associated with | Chen, Huizhong | |
---|---|---|
Associated with | Stanford University, Department of Electrical Engineering. | |
Primary advisor | Girod, Bernd | |
Thesis advisor | Girod, Bernd | |
Thesis advisor | Guibas, Leonidas J | |
Thesis advisor | Pauly, John (John M.) | |
Advisor | Guibas, Leonidas J | |
Advisor | Pauly, John (John M.) | |
Subjects
Genre | Theses |
---|---|
Bibliographic information
Statement of responsibility | Huizhong Chen. |
---|---|
Note | Submitted to the Department of Electrical Engineering. |
Thesis | Thesis (Ph.D.)--Stanford University, 2015. |
Location | electronic resource |
Access conditions
- Copyright
- © 2015 by Huizhong Chen
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).