Entity resolution and tracking on social networks
- In this thesis we study two aspects of the problem of Entity Resolution (ER). The goal of ER is to identify and merge records that refer to the same underlying entity. The recent rise in adoption of social networks (Facebook, Google+, Twitter, and others) adds two new twists to the traditional ER problem: crowdsourcing and limited information. We first study a hybrid human-machine approach to solving ER problems. Machine learning models can predict the probability that a pair of records refers to the same entity, but machines make mistakes. Humans can help verify whether a pair matches, and social systems like Facebook allow users to help resolve entities on their platforms. We propose hybrid human-machine strategies with theoretical guarantees that leverage transitivity relations (for example, a = c can be inferred from a = b and b = c). Next, we study the problem of ER with limited information. Social systems impose limits on API calls, which constrain access to their full social graphs. We focus on resolving a single node g from one social graph G against a second social graph T: we want to find the best match for g in T by dynamically probing T through a public API, subject to the number of API calls that these social systems allow. We propose two ER strategies that are designed for limited information and can be adapted to different API limits. Finally, we study the problem of updating social graph snapshots with limited information. Effective social network ER requires up-to-date snapshots, but the same API call limits apply, so we want to update a snapshot without re-crawling every node while minimizing the number of API calls. We propose novel snapshot update strategies that are designed for limited information and can be adapted to different levels of staleness.
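The transitivity relation mentioned in the abstract can be illustrated with a small sketch. This is not the thesis's algorithm. It only shows how answers already verified by humans can settle other pairs without asking again. The class name `PairResolver` and its methods are hypothetical. The negative rule (a = b and b ≠ c imply a ≠ c) is an assumption added for illustration, and the sketch assumes the recorded answers are consistent with each other.

```python
class PairResolver:
    """Hypothetical helper: stores verified answers and infers others by transitivity."""

    def __init__(self):
        self.parent = {}        # union-find forest over record ids
        self.non_match = set()  # frozensets of cluster roots known to differ

    def find(self, x):
        """Return the representative (root) of the cluster containing x."""
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def record(self, a, b, same):
        """Store a verified answer for the pair (a, b)."""
        ra, rb = self.find(a), self.find(b)
        if not same:
            self.non_match.add(frozenset((ra, rb)))
        elif ra != rb:
            self.parent[ra] = rb  # merge the two clusters
            # Non-match facts about the old root now apply to the merged root.
            self.non_match = {
                frozenset(rb if r == ra else r for r in pair)
                for pair in self.non_match
            }

    def infer(self, a, b):
        """Return True or False if the pair is already decided, else None."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return True
        if frozenset((ra, rb)) in self.non_match:
            return False
        return None  # undecided: this pair would still need a human answer


resolver = PairResolver()
resolver.record("a", "b", same=True)
resolver.record("b", "c", same=True)
resolver.record("c", "d", same=False)
a_vs_c = resolver.infer("a", "c")  # True: follows from a = b and b = c
a_vs_d = resolver.infer("a", "d")  # False: a and c are merged, and c differs from d
a_vs_e = resolver.infer("a", "e")  # None: nothing is known about e
```

In a hybrid setting, a pair would be sent to a human only when `infer` returns `None`, so each verified answer can save later questions.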
|Type of resource
|electronic; electronic resource; remote
|1 online resource.
|Stanford University, Department of Computer Science.
|Statement of responsibility
|Submitted to the Department of Computer Science.
|Thesis (Ph.D.)--Stanford University, 2016.
- © 2016 by Norases Vesdapunt
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).