Model selection methods for item response models

Abstract/Contents

Abstract
How skilled is a student at solving equations? Is a child's development on track? Are males more verbally aggressive, on average, than females? Which criminal offenders are experiencing psychopathy? These questions and many more can be answered by interpreting a statistical model fit to survey or assessment data. Item response theory offers a suite of such models, known as item response models. Fundamentally, item response models view the probability of an individual responding affirmatively (or correctly) to an item as a function of the individual's factors and the item's parameters. How many factors should represent the individual? Should each item have a guessing parameter? What mathematical function links the individual's factors and the item's parameters to the probability? Should individuals from different groups have different item parameters? The many possible answers to each of these questions constitute different item response models. Many item response models are possible for any data set, and different models frequently lead to different conclusions.

In the first chapter, I consider model selection in the context of identifying items that may contain bias. I warn against overlooking the model identification problem at the beginning of most methods for detecting potentially biased items. I suggest the following three-step process for flagging potentially biased items: (1) begin by examining raw item response data, (2) compare the results of a variety of methods, and (3) interpret results in light of the possibility of the methods failing. I develop new methods for these steps, including GLIMMER, a graphical method that enables analysts to inspect their raw item response data for potential bias without making strong assumptions.

In the second chapter, I advocate for measuring an item response model's fit by how well it predicts out-of-sample data instead of by whether the model could have produced the data. The fact that item responses are cross-classified within persons and items complicates this discussion. Accordingly, I consider two separate predictive tasks for a model. The first task, "missing responses prediction," is for the model to predict the probability of an affirmative response from in-sample persons responding to in-sample items. The second task, "missing persons prediction," is for the model to predict the vector of responses from an out-of-sample person. I derive a predictive fit metric for each of these tasks and conduct a series of simulation studies to describe their behavior. I use PISA data to demonstrate how to use cross-validation to directly estimate the predictive fit metrics in practice (PISA, 2015).

In the third chapter, I develop new methods for comparing multidimensional item response models and apply them to empirically explore early childhood development. Despite both the abundance of theory and the important implications for parenting, teaching, and health practice, there is surprisingly little large-scale empirical work on early childhood development. My coauthors and I combine cross-sectional survey data and longitudinal mobile app data provided by thousands of parents as their children developed to address this gap. In particular, the mobile app data, provided by Kinedu, Inc., is the result of over 10,000 parents repeatedly reporting on their child's achievement of collections of age-specific developmental milestones. We find that multiple factors best represent early child development. For example, a two-factor model, in which the first factor is mainly physical and the second factor is mainly linguistic, better captures developmental variation than a one-factor model. Accordingly, we conclude that measures of developmental variation should move beyond the assumption that differences and progression in children's development can be represented as a homogeneous process, and toward multidimensional representations.
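As a point of reference for the questions raised above (how many factors, whether to include a guessing parameter, which mathematical link function), a minimal sketch of the standard three-parameter logistic item response model is given below; the notation is illustrative rather than taken from the dissertation itself. Person $i$'s factor $\theta_i$ and item $j$'s discrimination $a_j$, difficulty $b_j$, and guessing parameter $c_j$ determine the probability of an affirmative (or correct) response:

$$
P(y_{ij} = 1 \mid \theta_i) \;=\; c_j + (1 - c_j)\,\frac{1}{1 + \exp\{-a_j(\theta_i - b_j)\}}
$$

Fixing $c_j = 0$ recovers the two-parameter logistic model; replacing the scalar $\theta_i$ with a vector of factors yields multidimensional models of the kind compared in the third chapter; and swapping the logistic function for a different link changes the mathematical function connecting factors and parameters to the probability.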

Description

Type of resource text
Form electronic resource; remote; computer; online resource
Extent 1 online resource.
Place California
Place [Stanford, California]
Publisher [Stanford University]
Copyright date ©2021
Publication date 2021
Issuance monographic
Language English

Creators/Contributors

Author Stenhaug, Ben Alan
Degree supervisor Domingue, Ben
Thesis advisor Domingue, Ben
Thesis advisor Bolt, Daniel
Thesis advisor Frank, Michael
Degree committee member Bolt, Daniel
Degree committee member Frank, Michael
Associated with Stanford University, Graduate School of Education

Subjects

Genre Theses
Genre Text

Bibliographic information

Statement of responsibility Benjamin A. Stenhaug.
Note Submitted to the Graduate School of Education.
Thesis Ph.D. Stanford University 2021.
Location https://purl.stanford.edu/yt267zd9190

Access conditions

Copyright
© 2021 by Ben Alan Stenhaug
License
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).
