Administrative Records for Survey Methodology. Edition No. 1. Wiley Series in Survey Methodology

  • Book

  • 384 Pages
  • June 2021
  • John Wiley and Sons Ltd
  • ID: 5840615
ADMINISTRATIVE RECORDS FOR SURVEY METHODOLOGY

Addresses the international use of administrative records for large-scale surveys, censuses, and other statistical purposes

Administrative Records for Survey Methodology is a comprehensive guide to improving the quality, cost-efficiency, and interpretability of surveys and censuses using administrative data research. Contributions from a team of internationally recognized experts provide practical approaches for integrating administrative data in statistical surveys, and discuss the methodological issues, including concerns of privacy, confidentiality, and legality, involved in collecting and analyzing administrative records. Numerous real-world examples highlight technological and statistical innovations, helping readers gain a better understanding of both fundamental methods and advanced techniques for controlling data quality and reducing total survey error.

The book is divided into four sections. The first describes the basics of administrative records research and addresses disclosure limitation and confidentiality protection in linked data. Section two focuses on data quality and linking methodology, covering topics such as quality evaluation, measuring and controlling for non-consent bias, and cleaning and using administrative lists. The third section examines the use of administrative records in surveys and includes case studies of the Swedish register-based census and the administrative records applications used for the US 2020 Census. The book’s final section discusses combining administrative and survey data to improve income measurement, enhancing health surveys with data linkage, and other uses of administrative data in evidence-based policymaking.

This state-of-the-art resource:

  • Discusses important administrative data issues and suggests how administrative data can be integrated with more traditional surveys
  • Describes practical uses of administrative records for evidence-driven decisions in both public and private sectors
  • Emphasizes using interdisciplinary methodology and linking administrative records with other data sources
  • Explores techniques to leverage administrative data to improve the survey frame, reduce nonresponse follow-up, assess coverage error, measure linkage non-consent bias, and perform small area estimation

Administrative Records for Survey Methodology is an indispensable reference and guide for statistical researchers and methodologists in academia, industry, and government, particularly census bureaus and national statistical offices, and an ideal supplemental text for undergraduate and graduate courses in data science, survey methodology, data collection, and data analysis methods.

Table of Contents

Preface xv

Acknowledgments xxi

List of Contributors xxiii

Part I Fundamentals of Administrative Records Research and Applications 1

1 On the Use of Proxy Variables in Combining Register and Survey Data 3
Li-Chun Zhang

1.1 Introduction 3

1.1.1 A Multisource Data Perspective 3

1.1.2 Concept of Proxy Variable 5

1.2 Instances of Proxy Variable 7

1.2.1 Representation 7

1.2.2 Measurement 10

1.3 Estimation Using Multiple Proxy Variables 12

1.3.1 Asymmetric Setting 13

1.3.2 Uncertainty Evaluation: A Case of Two-Way Data 15

1.3.3 Symmetric Setting 17

1.4 Summary 20

References 20

2 Disclosure Limitation and Confidentiality Protection in Linked Data 25
John M. Abowd, Ian M. Schmutte, and Lars Vilhuber

2.1 Introduction 25

2.2 Paradigms of Protection 27

2.2.1 Input Noise Infusion 29

2.2.2 Formal Privacy Models 30

2.3 Confidentiality Protection in Linked Data: Examples 32

2.3.1 HRS-SSA 32

2.3.1.1 Data Description 32

2.3.1.2 Linkages to Other Data 32

2.3.1.3 Disclosure Avoidance Methods 33

2.3.2 SIPP-SSA-IRS (SSB) 34

2.3.2.1 Data Description 34

2.3.2.2 Disclosure Avoidance Methods 35

2.3.2.3 Disclosure Avoidance Assessment 35

2.3.2.4 Analytical Validity Assessment 37

2.3.3 LEHD: Linked Establishment and Employee Records 38

2.3.3.1 Data Description 38

2.3.3.2 Disclosure Avoidance Methods 39

2.3.3.3 Disclosure Avoidance Assessment for QWI 41

2.3.3.4 Analytical Validity Assessment for QWI 42

2.4 Physical and Legal Protections 43

2.4.1 Statistical Data Enclaves 44

2.4.2 Remote Processing 46

2.4.3 Licensing 46

2.4.4 Disclosure Avoidance Methods 47

2.4.5 Data Silos 48

2.5 Conclusions 49

2.A.1 Other Abbreviations 51

2.A.2 Concepts 52

Acknowledgments 54

References 54

Part II Data Quality of Administrative Records and Linking Methodology 61

3 Evaluation of the Quality of Administrative Data Used in the Dutch Virtual Census 63
Piet Daas, Eric S. Nordholt, Martijn Tennekes, and Saskia Ossen

3.1 Introduction 63

3.2 Data Sources and Variables 64

3.3 Quality Framework 66

3.3.1 Source and Metadata Hyper Dimensions 66

3.3.2 Data Hyper Dimension 68

3.4 Quality Evaluation Results for the Dutch 2011 Census 69

3.4.1 Source and Metadata: Application of Checklist 69

3.4.2 Data Hyper Dimension: Completeness and Accuracy Results 72

3.4.2.1 Completeness Dimension 73

3.4.2.2 Accuracy Dimension 75

3.4.2.3 Visualizing with a Tableplot 78

3.4.3 Discussion of the Quality Findings 80

3.5 Summary 81

3.6 Practical Implications for Implementation with Surveys and Censuses 81

3.7 Exercises 82

References 82

4 Improving Input Data Quality in Register-Based Statistics: The Norwegian Experience 85
Coen Hendriks

4.1 Introduction 85

4.2 The Use of Administrative Sources in Statistics Norway 86

4.3 Managing Statistical Populations 89

4.4 Experiences from the First Norwegian Purely Register-Based Population and Housing Census of 2011 91

4.5 The Contact with the Owners of Administrative Registers Was Put into System 93

4.5.1 Agreements on Data Processing 93

4.5.2 Agreements of Cooperation on Data Quality in Administrative Data Systems 95

4.5.3 The Forums for Cooperation 96

4.6 Measuring and Documenting Input Data Quality 96

4.6.1 Quality Indicators 96

4.6.2 Operationalizing the Quality Checks 97

4.6.3 Quality Reports 99

4.6.4 The Approach is Being Adopted by the Owners of Administrative Data 99

4.7 Summary 100

4.8 Exercises 101

References 104

5 Cleaning and Using Administrative Lists: Enhanced Practices and Computational Algorithms for Record Linkage and Modeling/Editing/Imputation 105
William E. Winkler

5.1 Introductory Comments 105

5.1.1 Example 1 105

5.1.2 Example 2 106

5.1.3 Example 3 107

5.2 Edit/Imputation 108

5.2.1 Background 108

5.2.2 Fellegi-Holt Model 110

5.2.3 Imputation Generalizing Little-Rubin 110

5.2.4 Connecting Edit with Imputation 111

5.2.5 Achieving Extreme Computational Speed 112

5.3 Record Linkage 113

5.3.1 Fellegi-Sunter Model 113

5.3.2 Estimating Parameters 116

5.3.3 Estimating False Match Rates 118

5.3.3.1 The Data Files 118

5.3.4 Achieving Extreme Computational Speed 123

5.4 Models for Adjusting Statistical Analyses for Linkage Error 124

5.4.1 Scheuren-Winkler 124

5.4.2 Lahiri-Larsen 125

5.4.3 Chambers and Kim 127

5.4.4 Chipperfield, Bishop, and Campbell 128

5.4.4.1 Empirical Data 130

5.4.5 Goldstein, Harron, and Wade 132

5.4.6 Hof and Zwinderman 133

5.4.7 Tancredi and Liseo 133

5.5 Concluding Remarks 133

5.6 Issues and Some Related Questions 134

References 134

6 Assessing Uncertainty When Using Linked Administrative Records 139
Jerome P. Reiter

6.1 Introduction 139

6.2 General Sources of Uncertainty 140

6.2.1 Imperfect Matching 140

6.2.2 Incomplete Matching 141

6.3 Approaches to Accounting for Uncertainty 142

6.3.1 Modeling Matching Matrix as Parameter 143

6.3.2 Direct Modeling 146

6.3.3 Imputation of Entire Concatenated File 148

6.4 Concluding Remarks 149

6.4.1 Problems to Be Solved 149

6.4.2 Practical Implications 150

6.5 Exercises 150

Acknowledgment 151

References 151

7 Measuring and Controlling for Non-Consent Bias in Linked Survey and Administrative Data 155
Joseph W. Sakshaug

7.1 Introduction 155

7.1.1 What is Linkage Consent? Why is Linkage Consent Needed? 155

7.1.2 Linkage Consent Rates in Large-Scale Surveys 156

7.1.3 The Impact of Linkage Non-Consent Bias on Survey Inference 158

7.1.4 The Challenge of Measuring and Controlling for Linkage Non-Consent Bias 158

7.2 Strategies for Measuring Linkage Non-Consent Bias 159

7.2.1 Formulation of Linkage Non-Consent Bias 159

7.2.2 Modeling Non-Consent Using Survey Information 160

7.2.3 Analyzing Non-Consent Bias for Administrative Variables 162

7.3 Methods for Minimizing Non-Consent Bias at the Survey Design Stage 163

7.3.1 Optimizing Linkage Consent Rates 163

7.3.2 Placement of the Consent Request 163

7.3.3 Wording of the Consent Request 165

7.3.4 Active and Passive Consent Procedures 166

7.3.5 Linkage Consent in Panel Studies 167

7.4 Methods for Minimizing Non-Consent Bias at the Survey Analysis Stage 168

7.4.1 Controlling for Linkage Non-Consent Bias via Statistical Adjustment 169

7.4.2 Weighting Adjustments 169

7.4.3 Imputation 170

7.5 Summary 172

7.5.1 Key Points for Measuring Linkage Non-Consent Bias 172

7.5.2 Key Points for Controlling for Linkage Non-Consent Bias 172

7.6 Practical Implications for Implementation with Surveys and Censuses 173

7.7 Exercises 174

References 174

Part III Use of Administrative Records in Surveys 179

8 A Register-Based Census: The Swedish Experience 181
Martin Axelson, Anders Holmberg, Ingegerd Jansson, and Sara Westling

8.1 Introduction 181

8.2 Background 182

8.3 Census 2011 183

8.4 A Register-Based Census 185

8.4.1 Registers at Statistics Sweden 185

8.4.2 Facilitating a System of Registers 186

8.4.3 Introducing a Dwelling Identification Key 187

8.4.4 The Census Household and Dwelling Populations 188

8.5 Evaluation of the Census 190

8.5.1 Introduction 190

8.5.2 Evaluating Household Size and Type 192

8.5.2.1 Sampling Design 192

8.5.2.2 Data Collection 193

8.5.2.3 Reconciliation 194

8.5.2.4 Results 194

8.5.3 Evaluating Ownership 195

8.5.4 Lessons Learned 198

8.6 Impact on Population and Housing Statistics 199

8.7 Summary and Final Remarks 201

References 203

9 Administrative Records Applications for the 2020 Census 205
Vincent T. Mule Jr. and Andrew Keller

9.1 Introduction 205

9.2 Administrative Record Usage in the U.S. Census 206

9.3 Administrative Record Integration in 2020 Census Research 207

9.3.1 Administrative Record Usage Determinations 207

9.3.2 NRFU Design Incorporating Administrative Records 208

9.3.3 Administrative Records Sources and Data Preparation 210

9.3.4 Approach to Determine Administrative Record Vacant Addresses 212

9.3.5 Extension of Vacant Methodology to Nonexistent Cases 214

9.3.6 Approach to Determine Occupied Addresses 215

9.3.7 Other Aspects and Alternatives of Administrative Record Enumeration 217

9.4 Quality Assessment 219

9.4.1 Microlevel Evaluations of Quality 219

9.4.2 Macrolevel Evaluations of Quality 221

9.5 Other Applications of Administrative Record Usage 224

9.5.1 Register-Based Census 224

9.5.2 Supplement Traditional Enumeration with Adjustments for Estimated Error for Official Census Counts 224

9.5.3 Coverage Evaluation 225

9.6 Summary 226

9.7 Exercises 227

References 228

10 Use of Administrative Records in Small Area Estimation 231
Andreea L. Erciulescu, Carolina Franco, and Partha Lahiri

10.1 Introduction 231

10.2 Data Preparation 233

10.3 Small Area Estimation Models for Combining Information 238

10.3.1 Area-level Models 238

10.3.2 Unit-level Models 247

10.4 An Application 252

10.5 Concluding Remarks 259

10.6 Exercises 259

Acknowledgments 261

References 261

Part IV Use of Administrative Data in Evidence-Based Policymaking 269

11 Enhancement of Health Surveys with Data Linkage 271
Cordell Golden and Lisa B. Mirel

11.1 Introduction 271

11.1.1 The National Center for Health Statistics (NCHS) 271

11.1.2 The NCHS Data Linkage Program 272

11.1.3 Initial Linkages with NCHS Surveys 272

11.2 Examples of NCHS Health Surveys that Were Enhanced Through Linkage 273

11.2.1 National Health Interview Survey (NHIS) 273

11.2.2 National Health and Nutrition Examination Survey (NHANES) 274

11.2.3 National Health Care Surveys 274

11.3 NCHS Health Surveys Linked with Vital Records and Administrative Data 275

11.3.1 National Death Index (NDI) 276

11.3.2 Centers for Medicare and Medicaid Services (CMS) 276

11.3.3 Social Security Administration (SSA) 277

11.3.4 Department of Housing and Urban Development (HUD) 277

11.3.5 United States Renal Data System and the Florida Cancer Data System 278

11.4 NCHS Data Linkage Program: Linkage Methodology and Processing Issues 278

11.4.1 Informed Consent in Health Surveys 278

11.4.2 Informed Consent for Child Survey Participants 279

11.4.3 Adaptive Approaches to Linking Health Surveys with Administrative Data 280

11.4.4 Use of Alternate Records 281

11.4.5 Protecting the Privacy of Health Survey Participants and Maintaining Data Confidentiality 282

11.4.6 Updates Over Time 283

11.5 Enhancements to Health Survey Data Through Linkage 284

11.6 Analytic Considerations and Limitations of Administrative Data 286

11.6.1 Adjusting Sample Weights for Linkage-Eligibility 287

11.6.2 Residential Mobility and Linkages to State Programs and Registries 288

11.7 Future of the NCHS Data Linkage Program 289

11.8 Exercises 291

Acknowledgments 292

Disclaimer 292

References 292

12 Combining Administrative and Survey Data to Improve Income Measurement 297
Bruce D. Meyer and Nikolas Mittag

12.1 Introduction 297

12.2 Measuring and Decomposing Total Survey Error 299

12.3 Generalized Coverage Error 302

12.4 Item Nonresponse and Imputation Error 305

12.5 Measurement Error 307

12.6 Illustration: Using Data Linkage to Better Measure Income and Poverty 311

12.7 Accuracy of Links and the Administrative Data 312

12.8 Conclusions 315

12.9 Exercises 316

Acknowledgments 317

References 317

13 Combining Data from Multiple Sources to Define a Respondent: The Case of Education Data 323
Peter Siegel, Darryl Creel, and James Chromy

13.1 Introduction 323

13.1.1 Options for Defining a Unit Respondent When Data Exist from Sources Instead of or in Addition to an Interview 324

13.1.2 Concerns with Defining a Unit Respondent Without Having an Interview 325

13.2 Literature Review 326

13.3 Methodology 327

13.3.1 Computing Weights for Interview Respondents and for Unit Respondents Who May Not Have Interview Data (Usable Case Respondents) 327

13.3.1.1 How Many Weights Are Necessary? 328

13.3.2 Imputing Data When All or Some Interview Data Are Missing 328

13.3.3 Conducting Nonresponse Bias Analyses to Appropriately Consider Interview and Study Nonresponse 329

13.4 Example of Defining a Unit Respondent for the National Postsecondary Student Aid Study (NPSAS) 330

13.4.1 Overview of NPSAS 330

13.4.2 Usable Case Respondent Approach 333

13.4.2.1 Results 333

13.4.3 Interview Respondent Approach 335

13.4.3.1 Results 336

13.4.4 Comparison of Estimates, Variances, and Nonresponse Bias Using Two Approaches to Define a Unit Respondent 338

13.5 Discussion: Advantages and Disadvantages of Two Approaches to Defining a Unit Respondent 340

13.5.1 Interview Respondents 340

13.5.2 Usable Case Respondents 341

13.6 Practical Implications for Implementation with Surveys and Censuses 342

13.A Appendix 343

13.A.1 NPSAS:08 Study Respondent Definition 343

13.B Appendix 343

References 348

Index 349

Authors

  • Asaph Young Chun, Statistics Research Institute, Korea
  • Michael D. Larsen, Saint Michael's College, United States
  • Gabriele Durrant, Southampton University, UK
  • Jerome P. Reiter, Duke University, United States