New methods for variable importance testing with applications to genetic studies

Abstract/Contents

Abstract
The objective of this thesis is to develop new practical and principled statistical methodology for the analysis of genome-wide association data, in order to identify, as precisely as possible, the genetic variants that affect complex phenotypes. This problem can be stated as one of testing multiple hypotheses of conditional independence between many possible explanatory variables and a response of interest, within a high-dimensional non-parametric regression setting. This dissertation builds upon previous work on knockoffs, which provides a general framework for addressing such variable importance testing problems. In particular, we study how to generate valid knockoffs for genetic variants, taking into account the particular structure of these data and the hidden Markov models developed by geneticists to describe their distribution. As a result, we obtain an effective and statistically rigorous tool for genetic mapping that controls the false discovery rate under minimal assumptions, while overcoming many of the limitations of existing state-of-the-art methods. Extensive numerical experiments with genetic data confirm the empirical validity and effectiveness of our method, while applications to the analysis of large data sets lead to many new discoveries.

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2020; ©2020
Publication date 2020
Issuance monographic
Language English

Creators/Contributors

Author Sesia, Matteo
Degree supervisor Candès, Emmanuel J. (Emmanuel Jean)
Thesis advisor Candès, Emmanuel J. (Emmanuel Jean)
Thesis advisor Sabatti, Chiara
Thesis advisor Tibshirani, Robert
Degree committee member Sabatti, Chiara
Degree committee member Tibshirani, Robert
Associated with Stanford University, Department of Statistics.

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Matteo Sesia
Note Submitted to the Department of Statistics
Thesis Thesis Ph.D. Stanford University 2020
Location electronic resource

Access conditions

Copyright
© 2020 by Matteo Sesia
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
