Human-powered data management

Abstract/Contents

Abstract
Fully automated algorithms are inadequate for a number of data analysis tasks, especially those involving images, video, or text. Thus, there is often a need to combine "human computation" (or crowdsourcing) with traditional computation in order to improve the process of understanding and analyzing data. However, most data management applications currently employ crowdsourcing in an ad hoc fashion; these applications are not optimized for low monetary cost, low latency, or high accuracy. In this thesis, we develop a formalism for reasoning about human-powered data management, and use this formalism to design: (a) a toolbox of basic data processing algorithms, optimized for cost, latency, and accuracy, and (b) practical data management systems and applications that use these algorithms. We demonstrate that our techniques lead to algorithms and systems that expend very few resources (e.g., time waiting, human effort, or money spent) while providing results of equally high quality compared to approaches used in practice.

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2013
Issuance monographic
Language English

Creators/Contributors

Associated with Parameswaran, Aditya
Associated with Stanford University, Department of Computer Science.
Primary advisor Garcia-Molina, Hector
Thesis advisor Garcia-Molina, Hector
Thesis advisor Polyzotis, Neoklis
Thesis advisor Widom, Jennifer

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Aditya G. Parameswaran.
Note Submitted to the Department of Computer Science.
Thesis Ph.D. Stanford University 2013
Location electronic resource

Access conditions

Copyright
© 2013 by Aditya Ganesh Parameswaran
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
