Item Response Theory. Edition No. 1


Book
384 Pages
August 2021
John Wiley and Sons Ltd
ID: 5840132

A complete discussion of fundamental and advanced topics in Item Response Theory written by pioneers in the field

In Item Response Theory, accomplished psychometricians Darrell Bock and Robert Gibbons deliver a comprehensive and up-to-date exploration of the theoretical foundations and applications of Item Response Theory (IRT). Covering both unidimensional and multidimensional IRT, as well as related adaptive test administration of previously calibrated item banks, the book addresses the growing need for understanding of this topic as the use of IRT spreads to other fields.

The first book on the topic that offers a complete and unified treatment of its subject, Item Response Theory prepares researchers and students to understand and apply IRT and multidimensional IRT to fields like education, mental health, and marketing. Accessible to first-year graduate students with a foundation in the behavioral or social sciences, basic statistics, and generalized linear models, the book walks readers through everything from the logic of IRT to cutting-edge applications of the technique.

Readers will also benefit from the inclusion of:

• A thorough introduction to the foundations of Item Response Theory, including its logic and origins, model-based measurement, psychological scaling, and classical test theory

• An exploration of selected mathematical and statistical results, including points, point sets, and set operations, probability, sampling, and joint, conditional, and marginal probability

• Discussions of unidimensional and multidimensional IRT models, including item parameter estimation with binary and polytomous data

• Analysis of dimensionality, differential item functioning, and multiple group IRT

Perfect for graduate students and researchers studying and working with psychometrics in psychology, quantitative psychology, educational measurement, marketing, and statistics, Item Response Theory will also benefit researchers interested in patient-reported outcomes in health research.

Preface xvii

Acknowledgments xix

1 Foundations 1

1.1 The Logic of Item Response Theory 3

1.2 Model-based Data Analysis 4

1.3 Origins 5

1.3.1 Psychometric Scaling 6

1.3.2 Classical Test Theory 9

1.3.3 Contributions from Statistics 10

1.4 The Population Concept in IRT 11

1.5 Generalizability Theory 14

2 Selected Mathematical and Statistical Results 21

2.1 Points, Point Sets, and Set Operations 21

2.2 Probability 24

2.3 Sampling 25

2.4 Joint, Conditional, and Marginal Probability 26

2.5 Probability Distributions and Densities 28

2.6 Describing Distributions 32

2.7 Functions of Random Variables 34

2.7.1 Linear Functions 34

2.7.2 Nonlinear Functions 37

2.8 Elements of Matrix Algebra 37

2.8.1 Partitioned Matrices 41

2.8.2 The Kronecker Product 42

2.8.3 Row and Column Matrices 43

2.8.4 Matrix Inversion 43

2.9 Determinants 45

2.10 Matrix Differentiation 45

2.10.1 Scalar Functions of Vector Variables 46

2.10.2 Vector Functions of a Vector Variable 47

2.10.3 Scalar Functions of a Matrix Variable 48

2.10.4 Chain Rule for Scalar Functions of a Matrix Variable 49

2.10.5 Matrix Functions of a Matrix Variable 49

2.10.6 Derivatives of a Scalar Function with Respect to a Symmetric Matrix 50

2.10.7 Second-order Differentiation 52

2.11 Theory of Estimation 53

2.11.1 Analysis of Variance 56

2.11.2 Estimating Variance Components 57

2.12 Maximum Likelihood Estimation (MLE) 59

2.12.1 Likelihood Functions 59

2.12.2 The Likelihood Equations 60

2.12.3 Examples of Maximum Likelihood Estimation 60

2.12.4 Sampling Distribution of the Estimator 62

2.12.5 The Fisher-scoring Solution of the Likelihood Equations 63

2.12.6 Properties of the Maximum Likelihood Estimator (MLE) 63

2.12.7 Constrained Estimation 64

2.12.8 Admissibility 64

2.13 Bayes Estimation 65

2.14 The Maximum A Posteriori (MAP) Estimator 68

2.15 Marginal Maximum Likelihood Estimation (MMLE) 69

2.15.1 The Marginal Likelihood Equations 70

2.15.2 Application in the “Normal-Normal” Case 72

2.15.3 The EM Solution 75

2.15.4 The Fisher-scoring Solution 75

2.16 Probit and Logit Analysis 77

2.16.1 Probit Analysis 77

2.16.2 Logit Analysis 79

2.16.3 Logit-linear Analysis 80

2.16.4 Extension of Logit-linear Analysis to Multinomial Data 82

2.16.4.1 Graded Categories 83

2.16.4.2 Nominal Categories 85

2.17 Some Results from Classical Test Theory 88

2.17.1 Test Reliability 90

2.17.2 Estimating Reliability 91

2.17.2.1 Bayes Estimation of True Scores 96

2.17.3 When are the Assumptions of Classical Test Theory Reasonable? 97

3 Unidimensional IRT Models 101

3.1 The General IRT Framework 103

3.2 Item Response Models 104

3.2.1 Dichotomous Categories 105

3.2.1.1 Normal Ogive Model 105

3.2.1.2 2-PL Model 109

3.2.1.3 3-PL Model 111

3.2.1.4 1-PL Model 113

3.2.1.5 Illustration 114

3.2.2 Polytomous Categories 115

3.2.2.1 Graded Categories Model 118

3.2.2.2 Illustration 120

3.2.2.3 The Nominal Categories Model 122

3.2.2.4 Nominal Multiple-Choice Model 130

3.2.2.5 Illustration 132

3.2.2.6 Partial Credit Model 135

3.2.2.7 Generalized Partial Credit Model 136

3.2.2.8 Illustration 136

3.2.2.9 Rating Scale Models 136

3.2.3 Ranking Model 139

4 Item Parameter Estimation - Binary Data 141

4.1 Estimation of Item Parameters Assuming Known Attribute Values of the Respondents 142

4.1.1 Estimation 143

4.1.1.1 The 1-parameter Model 143

4.1.1.2 The 2-parameter Model 144

4.1.1.3 The 3-parameter Model 145

4.2 Estimation of Item Parameters Assuming Unknown Attribute Values of the Respondents 146

4.2.1 Joint Maximum Likelihood Estimation (JML) 147

4.2.1.1 The 1-parameter Logistic Model 147

4.2.1.2 Logit-linear Analysis 148

4.2.1.3 Proportional Marginal Adjustments 153

4.2.2 Marginal Maximum Likelihood Estimation (MML) 158

4.2.2.1 The 2-parameter Model 162

5 Item Parameter Estimation - Polytomous Data 177

5.1 General Results 177

5.2 The Normal Ogive Model 182

5.3 The Nominal Categories Model 183

5.4 The Graded Categories Model 185

5.5 The Generalized Partial Credit Model 188

5.5.1 The Unrestricted Version 189

5.5.2 The EM Solution 190

5.5.2.1 The GPCM Newton-Gauss Joint Solution 191

5.5.3 Rating Scale Models 191

5.5.3.1 The EM Solution for the RSM 192

5.5.3.2 The Newton-Gauss Solution for the RSM 193

5.6 Boundary Problems 194

5.7 Multiple Group Models 196

5.8 Discussion 197

5.9 Conclusions 200

6 Multidimensional IRT Models 201

6.1 Classical Multiple Factor Analysis of Test Scores 202

6.2 Classical Item Factor Analysis 203

6.3 Item Factor Analysis Based on Item Response Theory 205

6.4 Maximum Likelihood Estimation of Item Slopes and Intercepts 206

6.4.1 Estimating Parameters of the Item Response Model 208

6.5 Indeterminacies of Item Factor Analysis 212

6.5.1 Direction of Response 212

6.5.2 Indeterminacy of Location and Scale 212

6.5.3 Rotational Indeterminacy of Factor Loadings in Exploratory Factor Analysis 213

6.5.3.1 Varimax Factor Pattern 214

6.5.3.2 Promax Factor Pattern 214

6.5.3.3 General and Group Factors 215

6.5.3.4 Confirmatory Item Factor Analysis and the Bifactor Pattern 215

6.6 Estimation of Item Parameters and Respondent Scores in Item Bifactor Analysis 218

6.7 Estimating Factor Scores 219

6.8 Example 220

6.8.1 Exploratory Item Factor Analysis 221

6.8.2 Confirmatory Item Bifactor Analysis 223

6.9 Two-tier Model 227

6.10 Summary 230

7 Analysis of Dimensionality 233

7.1 Unidimensional Models and Multidimensional Data 234

7.2 Limited-Information Goodness of Fit Tests 237

7.3 Example 240

7.3.1 Exploratory Item Factor Analysis 240

7.3.2 Confirmatory Item Bifactor Analysis 241

7.4 Discussion 242

8 Computerized Adaptive Testing 243

8.1 What is Computerized Adaptive Testing? 243

8.2 Computerized Adaptive Testing - An Overview 244

8.3 Item Selection 245

8.3.1 Unidimensional Computerized Adaptive Testing (UCAT) 246

8.3.1.1 Fisher Information in IRT Model 246

8.3.1.2 Maximizing Fisher Information (MFI) and Its Limitations 248

8.3.1.3 Modifications to MFI 249

8.3.2 Multidimensional Computerized Adaptive Testing (MCAT) 251

8.3.2.1 Two Conceptualizations of the Information Function in Multidimensional Space 252

8.3.2.2 Selection Methods in MCAT 253

8.3.3 Bifactor IRT 256

8.4 Terminating an Adaptive Test 257

8.5 Additional Considerations 258

8.6 An Example from Mental Health Measurement 260

8.6.1 The CAT-Mental Health 261

8.6.2 Discussion 264

9 Differential Item Functioning 267

9.1 Introduction 267

9.2 Types of DIF 268

9.3 The Mantel-Haenszel Procedure 270

9.4 Lord’s Wald Test 271

9.5 Lagrange Multiplier Test 272

9.6 Logistic Regression 273

9.7 Assessing DIF for the Bifactor Model 275

9.8 Assessing DIF from CAT Data 276

10 Estimating Respondent Attributes 279

10.1 Introduction 279

10.2 Ability Estimation 279

10.2.1 Maximum Likelihood 280

10.2.2 Bayes MAP 281

10.2.3 Bayes EAP 281

10.2.4 Ability Estimation for Polytomous Data 282

10.2.5 Ability Estimation for Multidimensional IRT Models 283

10.2.6 Ability Estimation for the Bifactor Model 284

10.2.7 Estimation of the Ability Distribution 284

10.2.8 Domain Scores 285

11 Multiple Group Item Response Models 287

11.1 Introduction 287

11.2 IRT Estimation when the Grouping Structure is Known: Traditional Multiple Group IRT 288

11.2.1 Example 291

11.3 IRT Estimation when the Grouping Structure is Unknown: Mixtures of Gaussian Components 292

11.3.1 The Mixture Distribution 293

11.3.2 The Likelihood Component 295

11.3.3 Algorithm 296

11.3.4 Unequal Variances 297

11.4 Multivariate Probit Analysis 297

11.4.1 The Model 299

11.4.2 Identification 300

11.4.3 Estimation 300

11.4.4 Tests of Fit 301

11.4.5 Illustration 302

11.5 Multilevel IRT Models 306

11.5.1 The Rasch Model 306

11.5.2 The Two-parameter Logistic Model 308

11.5.3 Estimation 308

11.5.4 Illustration 309

12 Test and Scale Development and Maintenance 311

12.1 Introduction 311

12.2 Item Banking 311

12.3 Item Calibration 314

12.3.1 The OEM Method 315

12.3.2 The MEM Method 315

12.3.3 Stocking’s Method A 315

12.3.4 Stocking’s Method B 316

12.4 IRT Equating 318

12.4.1 Linking, Scale Aligning and Equating 318

12.4.2 Experimental Designs for Equating 319

12.4.2.1 Single Group (SG) Design 319

12.4.2.2 Equivalent Groups (EG) Design 319

12.4.2.3 Counterbalanced (CB) Design 319

12.4.2.4 The Anchor Test or Nonequivalent Groups with Anchor Test (NEAT) Design 319

12.5 Harmonization 320

12.6 Item Parameter Drift 322

12.7 Summary 323

13 Some Interesting Applications 325

13.1 Introduction 325

13.2 Bio-behavioral Synthesis 325

13.3 Mental Health Measurement 328

13.3.1 The CAT-Depression Inventory 328

13.3.2 The CAT-Anxiety Scale 330

13.3.3 The Measurement of Suicidality and the Prediction of Future Suicidal Attempt 331

13.3.4 Clinician and Self-rated Psychosis Measurement 332

13.3.5 Substance Use Disorder 334

13.3.6 Special Populations and Differential Item Functioning 335

13.3.6.1 Perinatal 335

13.3.6.2 Emergency Medicine 336

13.3.6.3 Latinos Taking Tests in Spanish 336

13.3.6.4 Criminal Justice 338

13.3.7 Intensive Longitudinal Data 339

13.4 IRT in Machine Learning 340

Bibliography 343

Index 361

Authors

R. Darrell Bock Robert D. Gibbons University of Illinois at Chicago.
