Model selection methods for item response models
Abstract/Contents
- Abstract
- How skilled is a student at solving equations? Is a child's development on track? Are males more verbally aggressive, on average, than females? Which criminal offenders are experiencing psychopathy? These questions and many more can be answered by interpreting a statistical model fit to survey or assessment data. Item response theory offers a suite of such models, known as item response models. Fundamentally, item response models view the probability of an individual responding affirmatively (or correctly) to an item as a function of the individual's factors and the item's parameters. How many factors should represent the individual? Should each item have a guessing parameter? What mathematical function links the individual's factors and the item's parameters to the probability? Should individuals from different groups have different item parameters? The many possible answers to each of these questions constitute different item response models. Many item response models are possible for any data set, and different models frequently lead to different conclusions. In the first chapter, I consider model selection in the context of identifying items that may contain bias. I warn against overlooking the model identification problem at the beginning of most methods for detecting potentially biased items. I suggest the following three-step process for flagging potentially biased items: (1) begin by examining raw item response data, (2) compare the results of a variety of methods, and (3) interpret results in light of the possibility of the methods failing. I develop new methods for these steps, including GLIMMER, a graphical method that enables analysts to inspect their raw item response data for potential bias without making strong assumptions. In the second chapter, I advocate for measuring an item response model's fit by how well it predicts out-of-sample data instead of whether the model could have produced the data. 
The fact that item responses are cross-classified within persons and items complicates this discussion. Accordingly, I consider two separate predictive tasks for a model. The first task, "missing responses prediction," is for the model to predict the probability of an affirmative response from in-sample persons responding to in-sample items. The second task, "missing persons prediction," is for the model to predict the vector of responses from an out-of-sample person. I derive a predictive fit metric for each of these tasks and conduct a series of simulation studies to describe their behavior. I use PISA data to demonstrate how to use cross-validation to directly estimate the predictive fit metrics in practice (PISA, 2015). In the third chapter, I develop new methods for comparing multidimensional item response models and apply them to empirically explore early childhood development. Despite both the abundance of theory and the important implications for parenting, teaching, and health practice, there is surprisingly little large-scale empirical work on early childhood development. My coauthors and I combine cross-sectional survey data and longitudinal mobile app data provided by thousands of parents as their children developed to address this gap. In particular, the mobile app data, provided by Kinedu, Inc., is the result of over 10,000 parents repeatedly reporting on their child's achievement of collections of age-specific developmental milestones. We find that multiple factors best represent early child development. For example, a two-factor model, in which the first factor is mainly physical and the second mainly linguistic, better captures developmental variation than a one-factor model. Accordingly, we conclude that measures of developmental variation should move beyond the assumption that differences and progression in children's development can be represented as a homogeneous process, and toward multidimensional representations.
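The two ideas at the heart of the abstract, an item response function mapping person and item parameters to a response probability, and a held-out "missing responses" predictive fit metric estimated by cross-validation, can be illustrated with a minimal Python sketch. This is not code from the dissertation: the function names (`irt_2pl_probability`, `heldout_log_score`), the two-parameter logistic (2PL) form, and the item-mean baseline predictor are all illustrative assumptions standing in for a fitted item response model.

```python
import math
import random

def irt_2pl_probability(theta, a, b):
    """2PL item response function: probability of an affirmative
    response given person ability theta, item discrimination a,
    and item difficulty b. (Illustrative; the dissertation compares
    many functional forms.)"""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def heldout_log_score(responses, predict, holdout_frac=0.2, seed=0):
    """"Missing responses" predictive fit: hold out a random subset
    of (person, item) cells, predict each held-out response from the
    remaining cells, and return the mean held-out log score
    (higher is better)."""
    rng = random.Random(seed)
    cells = list(responses)
    rng.shuffle(cells)
    n_test = max(1, int(holdout_frac * len(cells)))
    test, train_cells = cells[:n_test], cells[n_test:]
    train = {c: responses[c] for c in train_cells}
    total = 0.0
    for person, item in test:
        p = min(max(predict(train, person, item), 1e-12), 1 - 1e-12)
        y = responses[(person, item)]
        total += math.log(p) if y else math.log(1 - p)
    return total / n_test

# Simulate a small cross-classified response matrix from the 2PL.
rng = random.Random(1)
abilities = [rng.gauss(0, 1) for _ in range(20)]
items = [(1.0 + 0.5 * rng.random(), rng.gauss(0, 1)) for _ in range(5)]
responses = {
    (p, i): int(rng.random() < irt_2pl_probability(theta, a, b))
    for p, theta in enumerate(abilities)
    for i, (a, b) in enumerate(items)
}

# Toy predictor standing in for a fitted model: the item's observed
# proportion of affirmative responses among the training cells.
def item_mean_predictor(train, person, item):
    obs = [y for (p, i), y in train.items() if i == item]
    return sum(obs) / len(obs) if obs else 0.5

score = heldout_log_score(responses, item_mean_predictor)
```

In practice one would compare `score` across candidate item response models fit to the training cells; the model with the higher held-out log score predicts the missing responses better, which is the sense of "predictive fit" the second chapter develops.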
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2021; ©2021 |
Publication date | 2021
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Stenhaug, Ben Alan |
---|---|
Degree supervisor | Domingue, Ben |
Thesis advisor | Domingue, Ben |
Thesis advisor | Bolt, Daniel |
Thesis advisor | Frank, Michael |
Degree committee member | Bolt, Daniel |
Degree committee member | Frank, Michael |
Associated with | Stanford University, Graduate School of Education |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Benjamin A. Stenhaug. |
---|---|
Note | Submitted to the Graduate School of Education. |
Thesis | Thesis (Ph.D.), Stanford University, 2021.
Location | https://purl.stanford.edu/yt267zd9190 |
Access conditions
- Copyright
- © 2021 by Ben Alan Stenhaug
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).