Model selection methods for item response models
Abstract/Contents
- Abstract
- How skilled is a student at solving equations? Is a child's development on track? Are males more verbally aggressive, on average, than females? Which criminal offenders are experiencing psychopathy? These questions and many more can be answered by interpreting a statistical model fit to survey or assessment data. Item response theory offers a suite of such models, known as item response models. Fundamentally, item response models view the probability of an individual responding affirmatively (or correctly) to an item as a function of the individual's factors and the item's parameters. How many factors should represent the individual? Should each item have a guessing parameter? What mathematical function links the individual's factors and the item's parameters to the probability? Should individuals from different groups have different item parameters? The many possible answers to each of these questions constitute different item response models. Many item response models are possible for any data set, and different models frequently lead to different conclusions. In the first chapter, I consider model selection in the context of identifying items that may contain bias. I warn against overlooking the model identification problem at the beginning of most methods for detecting potentially biased items. I suggest the following three-step process for flagging potentially biased items: (1) begin by examining raw item response data, (2) compare the results of a variety of methods, and (3) interpret results in light of the possibility of the methods failing. I develop new methods for these steps, including GLIMMER, a graphical method that enables analysts to inspect their raw item response data for potential bias without making strong assumptions. In the second chapter, I advocate for measuring an item response model's fit by how well it predicts out-of-sample data instead of whether the model could have produced the data. 
The fact that item responses are cross-classified within persons and items complicates this discussion. Accordingly, I consider two separate predictive tasks for a model. The first task, "missing responses prediction," is for the model to predict the probability of an affirmative response from in-sample persons responding to in-sample items. The second task, "missing persons prediction," is for the model to predict the vector of responses from an out-of-sample person. I derive a predictive fit metric for each of these tasks and conduct a series of simulation studies to describe their behavior. I use PISA data to demonstrate how to use cross-validation to directly estimate the predictive fit metrics in practice (PISA, 2015). In the third chapter, I develop new methods for comparing multidimensional item response models and apply them to empirically explore early childhood development. Despite both the abundance of theory and the important implications for parenting, teaching, and health practice, there is surprisingly little large-scale empirical work on early childhood development. My coauthors and I combine cross-sectional survey data and longitudinal mobile app data provided by thousands of parents as their children developed to address this gap. In particular, the mobile app data, provided by Kinedu, Inc., is the result of over 10,000 parents repeatedly reporting on their child's achievement of collections of age-specific developmental milestones. We find that multiple factors best represent early child development. For example, a two-factor model, in which the first factor is mainly physical and the second mainly linguistic, better captures developmental variation than a one-factor model. Accordingly, we conclude that measures of developmental variation should move beyond the assumption that differences and progression in children's development can be represented as a homogeneous process, and toward multidimensional representations.
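The two ideas at the heart of the abstract, an item response function mapping person and item parameters to a response probability, and a held-out "missing responses" predictive fit metric estimated by cross-validation, can be illustrated with a minimal Python sketch. This is not code from the dissertation: the function names (`irt_2pl_probability`, `heldout_log_score`), the two-parameter logistic (2PL) form, and the item-mean baseline predictor are all illustrative assumptions standing in for a fitted item response model.

```python
import math
import random

def irt_2pl_probability(theta, a, b):
    """2PL item response function: probability of an affirmative
    response given person ability theta, item discrimination a,
    and item difficulty b. (Illustrative; the dissertation compares
    many functional forms.)"""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def heldout_log_score(responses, predict, holdout_frac=0.2, seed=0):
    """"Missing responses" predictive fit: hold out a random subset
    of (person, item) cells, predict each held-out response from the
    remaining cells, and return the mean held-out log score
    (higher is better)."""
    rng = random.Random(seed)
    cells = list(responses)
    rng.shuffle(cells)
    n_test = max(1, int(holdout_frac * len(cells)))
    test, train_cells = cells[:n_test], cells[n_test:]
    train = {c: responses[c] for c in train_cells}
    total = 0.0
    for person, item in test:
        p = min(max(predict(train, person, item), 1e-12), 1 - 1e-12)
        y = responses[(person, item)]
        total += math.log(p) if y else math.log(1 - p)
    return total / n_test

# Simulate a small cross-classified response matrix from the 2PL.
rng = random.Random(1)
abilities = [rng.gauss(0, 1) for _ in range(20)]
items = [(1.0 + 0.5 * rng.random(), rng.gauss(0, 1)) for _ in range(5)]
responses = {
    (p, i): int(rng.random() < irt_2pl_probability(theta, a, b))
    for p, theta in enumerate(abilities)
    for i, (a, b) in enumerate(items)
}

# Toy predictor standing in for a fitted model: the item's observed
# proportion of affirmative responses among the training cells.
def item_mean_predictor(train, person, item):
    obs = [y for (p, i), y in train.items() if i == item]
    return sum(obs) / len(obs) if obs else 0.5

score = heldout_log_score(responses, item_mean_predictor)
```

In practice one would compare `score` across candidate item response models fit to the training cells; the model with the higher held-out log score predicts the missing responses better, which is the sense of "predictive fit" the second chapter develops.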
Description
Type of resource | text |
---|---|
Form | electronic resource; remote; computer; online resource |
Extent | 1 online resource. |
Place | California |
Place | [Stanford, California] |
Publisher | [Stanford University] |
Copyright date | 2021; ©2021 |
Publication date | 2021
Issuance | monographic |
Language | English |
Creators/Contributors
Author | Stenhaug, Ben Alan |
---|---|
Degree supervisor | Domingue, Ben |
Thesis advisor | Domingue, Ben |
Thesis advisor | Bolt, Daniel |
Thesis advisor | Frank, Michael |
Degree committee member | Bolt, Daniel |
Degree committee member | Frank, Michael |
Associated with | Stanford University, Graduate School of Education |
Subjects
Genre | Theses |
---|---|
Genre | Text |
Bibliographic information
Statement of responsibility | Benjamin A. Stenhaug. |
---|---|
Note | Submitted to the Graduate School of Education. |
Thesis | Thesis (Ph.D.), Stanford University, 2021.
Location | https://purl.stanford.edu/yt267zd9190 |
Access conditions
- Copyright
- © 2021 by Ben Alan Stenhaug
- License
- This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported license (CC BY-NC 3.0).