Essays on statistical learning and causal inference on panel data


Abstract/Contents

Abstract
Panel data, which provides multiple observations on each individual over time, has become widely available and has received growing interest in many domains. For example, in asset pricing, panel data on asset returns over time is central to the study of how financial assets, such as stocks, bonds, and futures, are priced. In public policy, panel data is valuable for estimating and analyzing the effects of economic and social policies. Panel data can improve the power of analyses, uncover dynamic relationships among variables, and generate more accurate predictions for individual outcomes. The growing interest in panel data in empirical research has spurred the development of new methodologies.
In the first part of this thesis, we develop several novel statistical inference methods for large-dimensional panel data with a large number of units and time periods. An effective way to summarize the information in large-dimensional panel data is the factor model, which has been used successfully in asset pricing, recommendation systems, and many other applications. We focus on latent factor models, in which the factors are unobserved and estimated from the data. Latent factor models can address the concern of model misspecification, i.e., that we cannot fully observe all covariates that affect the outcome. However, latent factors are hard to interpret because they are usually weighted averages of all units. We propose sparse proximate factors as approximations to latent factors. Sparse proximate factors are constructed from the few units with the largest signal-to-noise ratios; they approximate latent factors well while remaining interpretable. When the panel data spans a long time horizon, as macroeconomic data does, it is restrictive to assume that the factor structure is static. We generalize the factor structure to depend on an observed state process. For example, the factor model for stock return data can change with the business cycle. We provide an estimator for this state-varying factor model and develop its inferential theory.
Many studies in the social sciences and healthcare try to answer questions about causal relationships that go beyond statistical analysis. Many of these studies rely on observational data when running experiments is infeasible, and observational panel data has received increasing attention because it can capture changes within units over time. A fundamental problem in estimating causal effects from observational data is estimating the counterfactual outcomes, which can be modeled as missing observations. We connect large-dimensional factor modeling with causal inference. Specifically, we provide an estimator for the latent factor model on large-dimensional panel data with missing observations and derive the inferential theory for our estimator, which can be used to test the effect of a treatment at any time and the effects of general weighted treatments.
An alternative approach to studying treatment effects is to run experiments, which are the gold standard in medical and clinical research and have become increasingly popular for testing new products at large technology companies. In the second part of this thesis, we study optimal multi-period experimental design to increase statistical power, the lack of which is a common hurdle in designing experiments. We show that the structure of the multi-period experimental design depends on how long the effects of interventions last. In the presence of pre-experimentation data, we can further optimize our treatment designs and hence reduce the number of required samples, which lowers the cost of the experiment.
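As a rough illustration of the sparse proximate factors mentioned in the abstract, the sketch below estimates latent factors by principal component analysis on a simulated panel and then rebuilds each factor from only a handful of units. It is not the thesis's estimator. The function names, the simulated data, and the choice to keep the `m` units with the largest absolute loadings (used here as a stand-in for "largest signal-to-noise ratio") are all assumptions made for this example.

```python
import numpy as np


def simulate_panel(T=200, N=100, k=2, seed=0):
    """Simulate a T x N panel X = F L' + E (hypothetical data, for illustration only)."""
    rng = np.random.default_rng(seed)
    F = rng.standard_normal((T, k))   # latent factors
    L = rng.standard_normal((N, k))   # loadings of each unit on the factors
    E = rng.standard_normal((T, N))   # idiosyncratic noise
    return F @ L.T + E


def pca_factors(X, k):
    """Estimate k latent factors and loadings from a T x N panel via SVD."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    loadings = Vt[:k].T               # N x k, orthonormal columns
    factors = X @ loadings            # T x k, weighted averages of all units
    return factors, loadings


def sparse_proximate_factors(X, loadings, m):
    """Build proximate factors from the m units with the largest |loading| per factor.

    Assumption: absolute loading size is used as a proxy for signal-to-noise.
    """
    sparse = np.zeros_like(loadings)
    for j in range(loadings.shape[1]):
        top = np.argsort(np.abs(loadings[:, j]))[-m:]
        sparse[top, j] = loadings[top, j]
    sparse /= np.linalg.norm(sparse, axis=0)
    # Regress each period's cross-section on the sparse loadings.
    proximate = X @ sparse @ np.linalg.inv(sparse.T @ sparse)
    return proximate, sparse


def r_squared(target, basis):
    """Share of variance of each target column explained by the basis columns."""
    coef, *_ = np.linalg.lstsq(basis, target, rcond=None)
    resid = target - basis @ coef
    return 1.0 - resid.var(axis=0) / target.var(axis=0)


if __name__ == "__main__":
    X = simulate_panel()
    factors, loadings = pca_factors(X, k=2)
    proximate, sparse = sparse_proximate_factors(X, loadings, m=10)
    print("nonzero weights per factor:", np.count_nonzero(sparse, axis=0))
    print("R^2 of PCA factors on proximate factors:", r_squared(factors, proximate))
```

Each proximate factor here depends on only `m` units, so it can be read as a small portfolio of named series, and the R² values indicate how much of the dense PCA factors that small set of units recovers.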

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date 2020; ©2020
Publication date 2020
Issuance monographic
Language English

Creators/Contributors

Author Xiong, Ruoxuan
Degree supervisor Pelger, Markus
Thesis advisor Pelger, Markus
Thesis advisor Athey, Susan
Thesis advisor Bayati, Mohsen
Degree committee member Athey, Susan
Degree committee member Bayati, Mohsen
Associated with Stanford University, Department of Management Science and Engineering

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Ruoxuan Xiong
Note Submitted to the Department of Management Science and Engineering
Thesis Thesis Ph.D. Stanford University 2020
Location electronic resource

Access conditions

Copyright
© 2020 by Ruoxuan Xiong
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
