Scalable estimation and inference for massive linear mixed models with crossed random effects

Gao, Katelyn; Stanford University, Department of Statistics.

Content: https://purl.stanford.edu/tc942zh5481

Abstract/Contents

Abstract: With modern electronic activity, large crossed data sets, with factors such as users and items, are increasingly common. It is often appropriate to model them with crossed random effects, since specific levels are temporary. Their size poses challenges for statistical analysis. For such large data sets, the computational costs of estimation and inference (time, space, and communication) should grow at most linearly with the sample size, and the algorithms should be parallelizable. Both traditional maximum likelihood estimation and numerous Markov chain Monte Carlo Bayesian algorithms take superlinear time to obtain good parameter estimates in the simple two-factor crossed random effects model and in the linear mixed model with two crossed random effects. We propose moment-based, parallelizable algorithms that, with at most linear cost, estimate regression coefficients and variance components and measure the uncertainties of those estimates. These estimates are consistent and asymptotically Gaussian. When run on simulated normally distributed data, our algorithms perform competitively with maximum likelihood methods. We apply the algorithms to real-world data from Stitch Fix, where the crossed random effects correspond to clients and items. The random effects analysis accounts for the increased variance due to intra-client and intra-item correlations in the data, whereas ignoring the correlation structure can lead to standard errors that are underestimated by more than 10-fold.
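
Illustration (not part of the catalog record): the abstract does not spell out the estimators, so the following is only a minimal sketch of how a moment-based, linear-cost estimate might look for the simple two-factor model y_ij = mu + a_i + b_j + e_ij. It assumes at most one observation per (row, column) pair and matches three sums of squares to their expectations. The function name `moment_estimates`, the choice of sums of squares, and the returned keys are illustrative assumptions, not the thesis's definitions.

```python
import numpy as np

def moment_estimates(rows, cols, y):
    """Hypothetical method-of-moments sketch for y_ij = mu + a_i + b_j + e_ij.

    Assumes each (row, col) pair is observed at most once. Every step is a
    single pass over the data, so the cost is linear in the sample size
    apart from the sort inside np.unique used to relabel the levels.
    """
    y = np.asarray(y, dtype=float)
    n = y.size

    # Relabel row and column levels as 0..R-1 and 0..C-1 and count them.
    _, r_idx, r_cnt = np.unique(np.asarray(rows), return_inverse=True,
                                return_counts=True)
    _, c_idx, c_cnt = np.unique(np.asarray(cols), return_inverse=True,
                                return_counts=True)
    r_cnt = r_cnt.astype(float)
    c_cnt = c_cnt.astype(float)
    n_rows, n_cols = r_cnt.size, c_cnt.size

    # Per-row and per-column means.
    row_mean = np.bincount(r_idx, weights=y) / r_cnt
    col_mean = np.bincount(c_idx, weights=y) / c_cnt

    # Three sums of squares: within rows, within columns, and overall.
    u_a = np.sum((y - row_mean[r_idx]) ** 2)
    u_b = np.sum((y - col_mean[c_idx]) ** 2)
    u_e = np.sum((y - y.mean()) ** 2)

    # Expectations of (u_a, u_b, u_e) are linear in
    # (sigma2_a, sigma2_b, sigma2_e) under the one-observation-per-cell
    # assumption; solve the resulting 3x3 system.
    m = np.array([
        [0.0, n - n_rows, n - n_rows],
        [n - n_cols, 0.0, n - n_cols],
        [n - np.sum(r_cnt ** 2) / n, n - np.sum(c_cnt ** 2) / n, n - 1.0],
    ])
    s2a, s2b, s2e = np.linalg.solve(m, np.array([u_a, u_b, u_e]))
    return {"mu": y.mean(), "sigma2_a": s2a, "sigma2_b": s2b, "sigma2_e": s2e}


if __name__ == "__main__":
    # Tiny 2x2 example in which only the row factor varies.
    est = moment_estimates([0, 0, 1, 1], [0, 1, 0, 1], [0.0, 0.0, 2.0, 2.0])
    print(est)
```

Because the moment equations are solved without constraints, an estimate can come out slightly negative in small samples; truncating at zero is a common remedy but is left out of this sketch.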

Description

Type of resource	text
Form	electronic; electronic resource; remote
Extent	1 online resource.
Publication date	2017
Issuance	monographic
Language	English

Creators/Contributors

Associated with	Gao, Katelyn
Associated with	Stanford University, Department of Statistics.
Primary advisor	Owen, Art B
Thesis advisor	Owen, Art B
Thesis advisor	Mackey, Lester
Thesis advisor	Tibshirani, Robert

Subjects

Genre	Theses

Bibliographic information

Statement of responsibility	Katelyn Gao.
Note	Submitted to the Department of Statistics.
Thesis	Thesis (Ph.D.)--Stanford University, 2017.
Location	electronic resource

Access conditions

License: This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
