US Youths Reporting LGB Identity: 2019 State-Level Population Estimation with Complex Missingness
Abstract/Contents
- Abstract
- Lesbian, gay, and bisexual (LGB) youths are at higher risk of experiencing violence and developing mental health disorders. Accurately estimating the proportion of high school students identifying as LGB in each US state is crucial for disease rate calculations, school interventions, and resource allocation decisions. One essential measure for estimating the LGB population size is self-reported sexual identity. The Youth Risk Behavior Survey (YRBS) collects data on health-related behaviors among high school students in most US states, including self-reported sexual identity. However, missing data make it challenging to estimate state-level LGB population sizes accurately. To address this issue, we studied the complex missingness and, assuming a missing-not-at-random (MNAR) mechanism, proposed using the Heckman selection model to impute missing values of self-reported sexual identity. Because the Heckman selection model requires an exclusion restriction, that is, an instrumental variable that is included in the selection equation but excluded from the outcome equation, we proposed a framework for identifying valid and strong instruments that is specifically tailored to datasets with binary outcomes and a large number of categorical independent variables. To meet the exclusion restriction, we developed 14 strategies to construct candidate instruments and/or transform the independent variables. We were unable to find enough valid and strong instruments for most states using any of the strategies. We differentiated the strategies by the number of states that had valid and strong instrumental variables with which to fit the Heckman selection model and generate imputations. Even for the strategy that performed best on these criteria, model performance was relatively poor: the estimates of the correlation coefficient between the error terms of the two stages of the Heckman selection model were statistically insignificant (p-value of 1). We discussed limitations of both the data and the methods and suggested that further work is needed. Overall, accurately estimating the proportion of LGB adolescents in each state remains a critical challenge.
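The role of the error correlation mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the thesis's estimator or data: it only generates toy data from the kind of two-equation process a Heckman selection model assumes, with a hypothetical binary instrument `z` that enters the selection equation only. All coefficients, the function name `simulate`, and the values of `rho` are assumptions chosen for illustration.

```python
import math
import random


def simulate(rho, n=100_000, seed=0):
    """Toy selection process; returns (true_share, complete_case_share).

    rho is the assumed correlation between the outcome-equation error
    and the selection-equation error. All numbers are illustrative.
    """
    rng = random.Random(seed)
    n_yes = n_obs = n_obs_yes = 0
    for _ in range(n):
        z = rng.random() < 0.5            # hypothetical binary instrument
        u = rng.gauss(0, 1)               # selection-equation error
        v = rng.gauss(0, 1)               # independent noise
        e = rho * u + math.sqrt(1 - rho ** 2) * v  # outcome error, corr(e, u) = rho
        y = (-1.0 + e) > 0                # binary outcome (z is excluded here)
        observed = (0.5 + 0.8 * z + u) > 0  # response indicator (z is included here)
        n_yes += y
        if observed:
            n_obs += 1
            n_obs_yes += y
    return n_yes / n, n_obs_yes / n_obs


for rho in (0.0, -0.6):
    true_share, cc_share = simulate(rho)
    print(f"rho={rho:+.1f}  true={true_share:.3f}  complete-case={cc_share:.3f}")
```

Under these assumptions, the complete-case share tracks the true share when `rho` is zero and falls below it when `rho` is negative, which is why an estimated correlation indistinguishable from zero gives little evidence that a selection correction changes the estimate.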
Description
Type of resource | text |
---|---|
Publication date | June 5, 2023; June 2, 2023 |
Creators/Contributors
Author | Zhao, Jiayi |
---|---|
Degree granting institution | Stanford University |
Department | Department of Health Policy |
Funder | CDC DASH |
Thesis advisor | Rose, Sherri |
Research team head | Salomon, Joshua A. |
Researcher | Jahagirdar, Deepa |
Subjects
Subject | Imputation |
---|---|
Subject | MNAR |
Subject | Heckman |
Subject | LGB |
Subject | YRBS |
Genre | Text |
Genre | Thesis |
Bibliographic information
Access conditions
- Use and reproduction
- User agrees that, where applicable, content will not be used to identify or to otherwise infringe the privacy or confidentiality rights of individuals. Content distributed via the Stanford Digital Repository may be subject to additional license and use restrictions applied by the depositor.
- License
- This work is licensed under a Creative Commons Attribution 4.0 International license (CC BY).
Preferred citation
- Preferred citation
- Zhao, J. (2023). US Youths Reporting LGB Identity: 2019 State-Level Population Estimation with Complex Missingness. Stanford Digital Repository. Available at https://purl.stanford.edu/jg930km8025. https://doi.org/10.25740/jg930km8025.
Collection
Health Policy Masters Theses