Towards expressive and scalable deep representation learning for graphs


Abstract/Contents

Abstract
The ubiquity of graph structures in the sciences and industry necessitates effective and scalable machine learning models capable of capturing the underlying inductive biases of relational data. However, traditional representation learning algorithms for graph-structured data face many limitations. Firstly, traditional methods, including matrix factorization and distributed embeddings, cannot scale to large real-world graphs with billions of nodes and edges because of the size of their parameter space. Secondly, they lack expressiveness compared to recent deep learning architectures. Thirdly, they fail in inductive scenarios, where they are required to make predictions on nodes unseen during training. Finally, what the model learns from data remains difficult for domain experts to interpret. In this thesis I present a series of works that pioneer the use of graph neural networks (GNNs) to tackle the challenges of representation learning on graphs in terms of explainability, scalability, and expressiveness. In the first part, I present GraphSAGE as a general yet powerful overarching graph neural network framework. To tackle the challenge of model interpretability within the GraphSAGE framework, I further introduce an extension model that obtains meaningful explanations from a trained graph neural network. The second part, building on the GraphSAGE framework, presents a series of works that improve the expressive power of GNNs through the use of hierarchical structure, geometric embedding spaces, and multi-hop attention. These GNN-based architectures achieve unprecedented performance improvements over traditional methods on tasks in a variety of contexts, such as graph classification for molecules, hierarchical knowledge graphs, and large-scale citation networks. In the third part, I further demonstrate a variety of applications of GNNs. Based on GraphSAGE, I developed PinSAGE, the first deployed GNN model that scales to billion-sized graphs.
PinSAGE is deployed at Pinterest to make recommendations for billions of users. In the area of graphics and simulation, we apply expressive architectures to accurately predict the physics of different materials and to generalize to unseen dynamical systems. Finally, I discuss BiDyn, a dynamic GNN model for abuse detection, before concluding the thesis.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2021; ©2021
Publication date 2021; 2021
Issuance monographic
Language English

Creators/Contributors

Author Ying, Zhitao
Degree supervisor Leskovec, Jurij
Thesis advisor Leskovec, Jurij
Thesis advisor Ré, Christopher
Thesis advisor Yamins, Daniel
Degree committee member Ré, Christopher
Degree committee member Yamins, Daniel
Associated with Stanford University, Computer Science Department

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Rex (Zhitao) Ying.
Note Submitted to the Computer Science Department.
Thesis Thesis Ph.D. Stanford University 2021.
Location https://purl.stanford.edu/qh673zw4863

Access conditions

Copyright
© 2021 by Zhitao Ying
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
