Topics in multivariate density estimation and its applications


Abstract/Contents

Abstract
This thesis centers on partition-based multivariate density estimation and its applications to bioinformatics, data visualisation, and two-sample divergence estimation. It is divided into three parts:
i. Optional Pólya Tree \citep{Wong2010} and Sequential Bayesian Partitioning \citep{Lu2013} are computationally intensive and face challenges in higher-dimensional applications. Because a piecewise constant density is uniform when conditioned on each sub-region, we attempt to control the uniformity in each sub-region directly. Discrepancy, as used in Quasi-Monte Carlo, provides a natural way to quantify this uniformity as well as a theoretical framework for analysing the estimated density function. We demonstrate that our new method is computationally more attractive and that the derived bounds are tight. We also apply it to flow cytometry analysis and multivariate data visualisation.
ii. The density estimate obtained from Optional Pólya Tree, Sequential Bayesian Partitioning, or the discrepancy-based method is a piecewise constant function supported on binary partitions. However, some applications require a continuous density estimate. Inspired by the Finite Element Method for numerical partial differential equations, we triangulate the domain and construct a piecewise linear density function using the linear (i.e., first-order) basis. This construction is then recast as a quadratic programming problem, which can be solved effectively by optimization packages.
iii. Divergences such as the KL divergence play an important role in informatics and statistics \citep{Nguyen2010, Wang2005}. We extend our partition-based density estimation to the two-sample case and construct a partition capable of capturing the difference between the two samples. We demonstrate that our method provides a unified way to estimate three classes of divergences (with KL divergence and total variation distance as special cases) and achieves good convergence. Some higher-dimensional examples, which are rare in previous research papers, are also tested.
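
As a rough illustration of the two-sample idea in part iii, the sketch below estimates KL divergence and total variation distance from cell counts on a shared binary partition. It is not the thesis's algorithm: the midpoint splitting rule, the stopping thresholds, the pseudo-count smoothing, and all function names are assumptions chosen only to keep the example short. Because both samples share the same cells, the cell volumes cancel and only the cell probabilities are needed.

```python
import math
import random


def build_partition(points, lower, upper, max_points=20, depth=0, max_depth=12):
    """Recursively halve the box [lower, upper) along its widest side.

    Hypothetical splitting rule: stop when a cell holds at most
    `max_points` pooled points or the depth limit is reached.
    """
    if len(points) <= max_points or depth >= max_depth:
        return [(list(lower), list(upper))]
    widths = [u - l for l, u in zip(lower, upper)]
    axis = widths.index(max(widths))
    mid = 0.5 * (lower[axis] + upper[axis])
    left_upper = list(upper)
    left_upper[axis] = mid
    right_lower = list(lower)
    right_lower[axis] = mid
    left = [p for p in points if p[axis] < mid]
    right = [p for p in points if p[axis] >= mid]
    return (build_partition(left, lower, left_upper, max_points, depth + 1, max_depth)
            + build_partition(right, right_lower, upper, max_points, depth + 1, max_depth))


def cell_probabilities(points, cells, pseudo_count=0.5):
    """Smoothed fraction of `points` in each cell (pseudo-count is an assumption)."""
    counts = [pseudo_count] * len(cells)
    for p in points:
        for i, (lo, hi) in enumerate(cells):
            if all(l <= x < h for x, l, h in zip(p, lo, hi)):
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]


def divergences(xs, ys, lower, upper):
    """Return (KL(P||Q), TV(P, Q)) estimated on a partition of the pooled sample."""
    cells = build_partition(xs + ys, lower, upper)
    p = cell_probabilities(xs, cells)
    q = cell_probabilities(ys, cells)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    tv = 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))
    return kl, tv
```

For example, calling `divergences(xs, ys, [0.0, 0.0], [1.0, 1.0])` on two lists of 2-D points in the unit square returns the pair of estimates; the samples are assumed to lie inside the half-open box.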

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2015
Issuance monographic
Language English

Creators/Contributors

Associated with Yang, Kun
Associated with Stanford University, Institute for Computational and Mathematical Engineering.
Primary advisor Wong, Wing Hung
Thesis advisor Wong, Wing Hung
Thesis advisor Owen, Art B
Thesis advisor Ying, Lexing
Advisor Owen, Art B
Advisor Ying, Lexing

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Kun Yang.
Note Submitted to the Institute for Computational and Mathematical Engineering.
Thesis Thesis (Ph.D.)--Stanford University, 2015.
Location electronic resource

Access conditions

Copyright
© 2015 by Kun Yang
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
