Network formation as a choice process

Overgoor, Jan Surya

Abstract/Contents

Abstract: Understanding why networks form and evolve the way they do is a core goal of many scientific disciplines ranging from the social to the physical sciences. Across these disciplines, many kinds of formation models have been employed, several of which can be subsumed under a choice framework, using conditional logit models from discrete choice and random utility theory. Each new edge is viewed as a "choice" made by a node to connect to another node, based on (generic) features of the other nodes available to make a connection. This perspective on network formation unifies existing models such as preferential attachment, triadic closure, and node fitness, which are all special cases, and thereby provides a flexible means for conceptualizing, estimating, and comparing models. The lens of discrete choice theory also provides several new tools for analyzing social network formation. With large network data, logit models run into practical and conceptual issues, since large numbers of alternatives make direct inference intractable and the assumptions underlying the logit model cease to be realistic in large graphs. Importance sampling of non-chosen alternatives reduces the data size significantly, while, under the right conditions, preserving consistency of the estimates. A model simplification technique called "de-mixing", whereby mixture models are reformulated to operate over disjoint choice sets, reduces mixed logit models to conditional logit models. This opens the door to the other approaches to scalability and provides a new analytical toolkit to understand the underlying processes. The flexibility of the logit framework is illustrated with examples that analyze several synthetic and real-world datasets, including data from Flickr, Venmo, and a large citation graph. The logit model provides a rigorous method for estimating preferential attachment models and can separate the effects of preferential attachment and triadic closure. A more substantial application is the identification of the persistent and changing parts of the networking strategies of U.S. college students as they go through their college years. This analysis is done using a rich and large data set of digital social network data from the Facebook platform.
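
As a rough illustration of the abstract's claim that preferential attachment is a special case of the conditional logit model, the sketch below computes choice probabilities when each candidate node's utility is assumed to be theta * log(degree). The function name, the single log-degree feature, and the example degrees are illustrative assumptions, not taken from the thesis. With theta = 1 the logit probabilities reduce to degree-proportional (linear preferential) attachment, and with theta = 0 they become uniform.

```python
import math


def choice_probabilities(degrees, theta):
    """Conditional logit probabilities over candidate nodes.

    Assumes (for illustration only) that the utility of connecting to a
    node is theta * log(degree); all degrees must be positive.
    """
    utilities = [theta * math.log(d) for d in degrees]
    max_u = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - max_u) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical degrees of the nodes available to connect to.
degrees = [1, 2, 3, 4]

# theta = 1: softmax of log-degree equals degree / sum(degrees).
logit_probs = choice_probabilities(degrees, theta=1.0)
pa_probs = [d / sum(degrees) for d in degrees]

# theta = 0: degree is ignored and every node is equally likely.
uniform_probs = choice_probabilities(degrees, theta=0.0)

for d, p, q in zip(degrees, logit_probs, pa_probs):
    print(f"degree={d}  logit={p:.4f}  preferential attachment={q:.4f}")
```

In this framing, estimating theta from observed edges would indicate how strongly new connections favor high-degree nodes; other features (for example, a shared-neighbor indicator for triadic closure) could be added to the utility in the same way.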

Description

Type of resource	text
Form	electronic resource; remote; computer; online resource
Extent	1 online resource.
Place	California
Place	[Stanford, California]
Publisher	[Stanford University]
Copyright date	2021; ©2021
Publication date	2021
Issuance	monographic
Language	English

Creators/Contributors

Author	Overgoor, Jan Surya
Degree supervisor	Ugander, Johan
Thesis advisor	Ugander, Johan
Thesis advisor	Goel, Sharad, 1977-
Thesis advisor	Goldberg, Amir
Degree committee member	Goel, Sharad, 1977-
Degree committee member	Goldberg, Amir
Associated with	Stanford University, Department of Management Science and Engineering

Subjects

Genre	Theses
Genre	Text

Bibliographic information

Statement of responsibility	Jan Surya Overgoor.
Note	Submitted to the Department of Management Science and Engineering.
Thesis	Thesis Ph.D. Stanford University 2021.
Location	https://purl.stanford.edu/xh962qh7961

Access conditions

License: This work is licensed under a Creative Commons Attribution 3.0 Unported license (CC BY).
