Zigzag coarsenings, mapper stability and gene-network analyses
- This thesis extends the theory and applications of two standard tools of applied algebraic topology, namely zigzag persistence and the MAPPER algorithm. On the theory side, we provide new stability results that support the application of these tools to point-cloud data. On the applications side, we apply these tools toward efficient shape matching of point clouds and for a novel analysis of gene-interaction networks. The levelset zigzag is a special case of zigzag persistence and is a powerful summary of a real-valued function. The stability of the levelset zigzag is one of its most important properties, since it provides justification for its usage in a continuous setting. We extend this stability of levelset zigzags to both zigzags that are in and not in the so-called pyramid family. The latter class are the coarsened levelset zigzags that are developed in this work, primarily to ease the application of levelset zigzags to point cloud data. MAPPER is the standard algorithm to calculate Reeb graphs from point-cloud data. Reeb graphs are graph-based summaries of topological spaces equipped with functions and have extensive applications in shape matching, surface compression and reconstruction, cancer sub-population discovery etc. As such, MAPPER has been successfully used in many applications. However, a missing piece till date is that there have been no guarantees on the stability of or error bounds on MAPPER. In the current work, we propose one notion of what it means for a Reeb graph calculation algorithm to be stable. Using the language of zigzag persistence, we are able to succinctly formalize the difference between calculated and ideal Reeb graphs. A worst case bound on the Reeb graph reconstruction error is then derived under fairly general conditions. A direct application of this result toward shape matching is demonstrated. We also demonstrate how Reeb graphs summarize vital information about datasets, beyond just point clouds. We apply MAPPER to gene expression networks constructed on cancer datasets. This gives us valuable insights about the structure of these networks. We compare our approach favorably to other state of the art methodologies. We outline the various biological insights that stem from our analysis.
|Type of resource
|electronic; electronic resource; remote
|1 online resource.
|Stanford University, Institute for Computational and Mathematical Engineering.
|Statement of responsibility
|Submitted to the Department of Computational and Mathematical Engineering.
|Thesis (Ph.D.)--Stanford University, 2013.
- © 2013 by Aravindakshan Babu
- This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
Also listed in
Loading usage metrics...