On latent systemic effects in multiple hypotheses

Sun, Yunting; Stanford University, Department of Statistics

Abstract/Contents

Abstract: This dissertation deals with two closely related topics concerning latent systemic effects in multiple hypothesis testing, and also gives an overview of the growing literature in the field. The first part aims at searching for associations with a primary variable among a great many candidate variables in high throughput settings. High throughput hypothesis testing can be made difficult by the presence of systemic effects and other latent variables. Dependencies can change the relative ordering of significance levels among hypotheses. We propose a two stage analysis to counter the effects of latent variables on the ranking of hypotheses. Our method, called LEAPP, statistically isolates the latent variables from the primary one. In simulations it gives better ordering of hypotheses than competing methods such as SVA and EIGENSTRAT. For an illustration, we turn to data from the AGEMAP study relating gene expression to age for 16 tissues in the mouse. LEAPP generates rankings with greater consistency across tissues than the rankings attained by the other methods. The second part studies the detection of DNA copy number variation (CNV) across samples. Experimental artifacts, such as local trends, if not carefully removed, may be misconstrued as significant recurrent regions. We develop an alternating algorithm to adjust for the effects of latent variables on the detection of recurring CNVs. Our method, called CNVlatent, improves accuracy in detecting CNVs for simulated data compared to methods without adjustments for latent effects. We use two data sets for illustration. One is from the chromosome 9p region in 44 pediatric leukemia samples and the other is from a region on cytoband 11 on the q-arm of chromosome 22. CNVlatent successfully detects visible copy number changes and adjusts for the latent effects.
There are many studies regarding segmentation of CNVs, but incorporating copy number information into association tests remains an open problem because copy number genotyping lacks accuracy. We propose a statistical framework for genotyping CNVs on a detected genomic region encompassing the putative CNVs in the analysis of both inherited and somatic copy number variants. To pool information across SNPs, we take into account the different response rates and noise properties of each SNP. We carry out the model calibration with an Expectation-Maximization (EM) based algorithm. Our method achieves higher estimation precision in synthetic data and generates estimators with greater consistency across a data set with replicate samples than existing methods such as CNVtools.

Description

Type of resource	text
Form	electronic; electronic resource; remote
Extent	1 online resource.
Publication date	2011
Issuance	monographic
Language	English

Creators/Contributors

Associated with	Sun, Yunting
Associated with	Stanford University, Department of Statistics
Primary advisor	Owen, Art B
Primary advisor	Zhang, Nancy R. (Nancy Ruonan)
Thesis advisor	Owen, Art B
Thesis advisor	Zhang, Nancy R. (Nancy Ruonan)
Thesis advisor	Efron, Bradley
Thesis advisor	Wong, Wing Hung
Advisor	Efron, Bradley
Advisor	Wong, Wing Hung

Subjects

Genre	Theses

Bibliographic information

Statement of responsibility	Yunting Sun.
Note	Submitted to the Department of Statistics.
Thesis	Thesis (Ph.D.)--Stanford University, 2011.
Location	electronic resource

Access conditions

License: This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC).
