Topics in two-sample testing

Abstract/Contents

Abstract
Driven by recent advances in the collection of biological data, many studies now draw from heterogeneous data sources. We develop an idea of Jerome Friedman's to conduct two-sample testing using supervised learning procedures. In special cases, this technique generalizes the randomization t-test, for which an asymptotic normality result is known. Using Stein's method of exchangeable pairs, we produce Berry-Esseen-type bounds for the permutation t-statistic for the purpose of statistical inference. We demonstrate the use of kernel methods in two-sample testing on non-vectorial data (text and images), and apply multiple kernel learning (MKL) to the heterogeneous data domain. We show that these techniques can effectively synthesize signals from multiple data sources and produce interpretable weights that highlight the role of each component.
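
For readers unfamiliar with the randomization (permutation) t-test mentioned in the abstract, the following is a minimal illustrative sketch and is not taken from the thesis. It assumes the classical pooled-variance two-sample t-statistic and a two-sided Monte Carlo p-value; the function names `t_statistic` and `permutation_t_test`, the number of permutations, and the seed are all hypothetical choices made for illustration.

```python
import random
import statistics


def t_statistic(x, y):
    """Pooled-variance two-sample t-statistic (assumed form, for illustration)."""
    nx, ny = len(x), len(y)
    vx, vy = statistics.variance(x), statistics.variance(y)
    pooled_var = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)
    diff = statistics.fmean(x) - statistics.fmean(y)
    return diff / (pooled_var * (1.0 / nx + 1.0 / ny)) ** 0.5


def permutation_t_test(x, y, n_perm=2000, seed=0):
    """Return (observed t, two-sided Monte Carlo permutation p-value)."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    nx = len(x)
    observed = t_statistic(x, y)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling the pooled sample is equivalent to permuting group labels.
        rng.shuffle(pooled)
        t = t_statistic(pooled[:nx], pooled[nx:])
        if abs(t) >= abs(observed):
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value
```

Under the null hypothesis that both samples come from the same distribution, the group labels are exchangeable, so the observed statistic is compared against its distribution over relabelings. In the supervised-learning variant described in the abstract, the raw observations would presumably be replaced by scores from a fitted classifier before such a test is applied; that step is not shown here.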

Description

Type of resource text
Form electronic; electronic resource; remote
Extent 1 online resource.
Publication date 2013
Issuance monographic
Language English

Creators/Contributors

Associated with Ray, Nelson C
Associated with Stanford University, Department of Statistics.
Primary advisor Friedman, J. H. (Jerome H.)
Primary advisor Holmes, Susan, 1954-
Thesis advisor Friedman, J. H. (Jerome H.)
Thesis advisor Holmes, Susan, 1954-
Thesis advisor Diaconis, Persi
Thesis advisor Efron, Bradley
Advisor Diaconis, Persi
Advisor Efron, Bradley

Subjects

Genre Theses

Bibliographic information

Statement of responsibility Nelson C. Ray.
Note Submitted to the Department of Statistics.
Thesis Thesis (Ph.D.)--Stanford University, 2013.
Location electronic resource

Access conditions

Copyright
© 2013 by Nelson Chan Ray
License
This work is licensed under a Creative Commons Attribution Non Commercial 3.0 Unported license (CC BY-NC).
