
Sampling and Estimation from Finite Populations. Edition No. 1. Wiley Series in Survey Methodology

  • Book

  • 448 Pages
  • February 2020
  • John Wiley and Sons Ltd
  • ID: 5840380

A much-needed reference on survey sampling and its applications that presents the latest advances in the field

Seeking to show that sampling theory is a living discipline with a very broad scope, this book examines the foundations of survey sampling and the modern development of its theory. It offers readers a critical approach to the subject and discusses putting theory into practice. It also explores the treatment of non-sampling errors, covering topics that range from coverage problems to the treatment of non-response. In addition, the book includes real examples, applications, and a large set of exercises with solutions.

Sampling and Estimation from Finite Populations begins with a look at the history of survey sampling. It then offers chapters on: population, sample, and estimation; simple and systematic designs; stratification; sampling with unequal probabilities; balanced sampling; cluster and two-stage sampling; and other topics on sampling, such as spatial sampling, coordination in repeated surveys, and multiple survey frames. The book also includes sections on: post-stratification and calibration on marginal totals; calibration estimation; estimation of complex parameters; variance estimation by linearization; and much more.

  • Provides an up-to-date review of the theory of sampling
  • Discusses the foundation of inference in survey sampling, in particular, the model-based and design-based frameworks
  • Reviews the problems of putting the theory into practice
  • Also deals with the treatment of non-sampling errors

Sampling and Estimation from Finite Populations is an excellent book for methodologists and researchers in survey agencies, and for advanced undergraduate and graduate students in social science, statistics, and survey courses.

Table of Contents

List of Figures xiii

List of Tables xvii

List of Algorithms xix

Preface xxi

Preface to the First French Edition xxiii

Table of Notations xxv

1 A History of Ideas in Survey Sampling Theory 1

1.1 Introduction 1

1.2 Enumerative Statistics During the 19th Century 2

1.3 Controversy on the use of Partial Data 4

1.4 Development of a Survey Sampling Theory 5

1.5 The US Elections of 1936 6

1.6 The Statistical Theory of Survey Sampling 6

1.7 Modeling the Population 8

1.8 Attempt to a Synthesis 9

1.9 Auxiliary Information 9

1.10 Recent References and Development 10

2 Population, Sample, and Estimation 13

2.1 Population 13

2.2 Sample 14

2.3 Inclusion Probabilities 15

2.4 Parameter Estimation 17

2.5 Estimation of a Total 18

2.6 Estimation of a Mean 19

2.7 Variance of the Total Estimator 20

2.8 Sampling with Replacement 22

Exercises 24

3 Simple and Systematic Designs 27

3.1 Simple Random Sampling without Replacement with Fixed Sample Size 27

3.1.1 Sampling Design and Inclusion Probabilities 27

3.1.2 The Expansion Estimator and its Variance 28

3.1.3 Comment on the Variance-Covariance Matrix 31

3.2 Bernoulli Sampling 32

3.2.1 Sampling Design and Inclusion Probabilities 32

3.2.2 Estimation 34

3.3 Simple Random Sampling with Replacement 36

3.4 Comparison of the Designs with and Without Replacement 38

3.5 Sampling with Replacement and Retaining Distinct Units 38

3.5.1 Sample Size and Sampling Design 38

3.5.2 Inclusion Probabilities and Estimation 41

3.5.3 Comparison of the Estimators 44

3.6 Inverse Sampling with Replacement 45

3.7 Estimation of Other Functions of Interest 47

3.7.1 Estimation of a Count or a Proportion 47

3.7.2 Estimation of a Ratio 48

3.8 Determination of the Sample Size 50

3.9 Implementation of Simple Random Sampling Designs 51

3.9.1 Objectives and Principles 51

3.9.2 Bernoulli Sampling 51

3.9.3 Successive Drawing of the Units 52

3.9.4 Random Sorting Method 52

3.9.5 Selection-Rejection Method 53

3.9.6 The Reservoir Method 54

3.9.7 Implementation of Simple Random Sampling with Replacement 56

3.10 Systematic Sampling with Equal Probabilities 57

3.11 Entropy for Simple and Systematic Designs 58

3.11.1 Bernoulli Designs and Entropy 58

3.11.2 Entropy and Simple Random Sampling 60

3.11.3 General Remarks 61

Exercises 61

4 Stratification 65

4.1 Population and Strata 65

4.2 Sample, Inclusion Probabilities, and Estimation 66

4.3 Simple Stratified Designs 68

4.4 Stratified Design with Proportional Allocation 70

4.5 Optimal Stratified Design for the Total 71

4.6 Notes About Optimality in Stratification 74

4.7 Power Allocation 75

4.8 Optimality and Cost 76

4.9 Smallest Sample Size 76

4.10 Construction of the Strata 77

4.10.1 General Comments 77

4.10.2 Dividing a Quantitative Variable in Strata 77

4.11 Stratification Under Many Objectives 79

Exercises 80

5 Sampling with Unequal Probabilities 83

5.1 Auxiliary Variables and Inclusion Probabilities 83

5.2 Calculation of the Inclusion Probabilities 84

5.3 General Remarks 85

5.4 Sampling with Replacement with Unequal Inclusion Probabilities 86

5.5 Nonvalidity of the Generalization of the Successive Drawing without Replacement 88

5.6 Systematic Sampling with Unequal Probabilities 89

5.7 Deville’s Systematic Sampling 91

5.8 Poisson Sampling 92

5.9 Maximum Entropy Design 95

5.10 Rao-Sampford Rejective Procedure 98

5.11 Order Sampling 100

5.12 Splitting Method 101

5.12.1 General Principles 101

5.12.2 Minimum Support Design 103

5.12.3 Decomposition into Simple Random Sampling Designs 104

5.12.4 Pivotal Method 107

5.12.5 Brewer Method 109

5.13 Choice of Method 110

5.14 Variance Approximation 111

5.15 Variance Estimation 114

Exercises 115

6 Balanced Sampling 119

6.1 Introduction 119

6.2 Balanced Sampling: Definition 120

6.3 Balanced Sampling and Linear Programming 122

6.4 Balanced Sampling by Systematic Sampling 123

6.5 Method of Deville, Grosbras, and Roth 124

6.6 Cube Method 125

6.6.1 Representation of a Sampling Design in the form of a Cube 125

6.6.2 Constraint Subspace 126

6.6.3 Representation of the Rounding Problem 127

6.6.4 Principle of the Cube Method 130

6.6.5 The Flight Phase 130

6.6.6 Landing Phase by Linear Programming 133

6.6.7 Choice of the Cost Function 134

6.6.8 Landing Phase by Relaxing Variables 135

6.6.9 Quality of Balancing 135

6.6.10 An Example 136

6.7 Variance Approximation 137

6.8 Variance Estimation 140

6.9 Special Cases of Balanced Sampling 141

6.10 Practical Aspects of Balanced Sampling 141

Exercise 142

7 Cluster and Two-stage Sampling 143

7.1 Cluster Sampling 143

7.1.1 Notation and Definitions 143

7.1.2 Cluster Sampling with Equal Probabilities 146

7.1.3 Sampling Proportional to Size 147

7.2 Two-stage Sampling 148

7.2.1 Population, Primary, and Secondary Units 149

7.2.2 The Expansion Estimator and its Variance 151

7.2.3 Sampling with Equal Probability 155

7.2.4 Self-weighting Two-stage Design 156

7.3 Multi-stage Designs 157

7.4 Selecting Primary Units with Replacement 158

7.5 Two-phase Designs 161

7.5.1 Design and Estimation 161

7.5.2 Variance and Variance Estimation 162

7.6 Intersection of Two Independent Samples 163

Exercises 165

8 Other Topics on Sampling 167

8.1 Spatial Sampling 167

8.1.1 The Problem 167

8.1.2 Generalized Random Tessellation Stratified Sampling 167

8.1.3 Using the Traveling Salesman Method 169

8.1.4 The Local Pivotal Method 169

8.1.5 The Local Cube Method 169

8.1.6 Measures of Spread 170

8.2 Coordination in Repeated Surveys 172

8.2.1 The Problem 172

8.2.2 Population, Sample, and Sample Design 173

8.2.3 Sample Coordination and Response Burden 174

8.2.4 Poisson Method with Permanent Random Numbers 175

8.2.5 Kish and Scott Method for Stratified Samples 176

8.2.6 The Cotton and Hesse Method 176

8.2.7 The Rivière Method 177

8.2.8 The Netherlands Method 178

8.2.9 The Swiss Method 178

8.2.10 Coordinating Unequal Probability Designs with Fixed Size 181

8.2.11 Remarks 181

8.3 Multiple Survey Frames 182

8.3.1 Introduction 182

8.3.2 Calculating Inclusion Probabilities 183

8.3.3 Using Inclusion Probability Sums 184

8.3.4 Using a Multiplicity Variable 185

8.3.5 Using a Weighted Multiplicity Variable 186

8.3.6 Remarks 187

8.4 Indirect Sampling 187

8.4.1 Introduction 187

8.4.2 Adaptive Sampling 188

8.4.3 Snowball Sampling 188

8.4.4 Indirect Sampling 189

8.4.5 The Generalized Weight Sharing Method 190

8.5 Capture-Recapture 191

9 Estimation with a Quantitative Auxiliary Variable 195

9.1 The Problem 195

9.2 Ratio Estimator 196

9.2.1 Motivation and Definition 196

9.2.2 Approximate Bias of the Ratio Estimator 197

9.2.3 Approximate Variance of the Ratio Estimator 198

9.2.4 Bias Ratio 199

9.2.5 Ratio and Stratified Designs 199

9.3 The Difference Estimator 201

9.4 Estimation by Regression 202

9.5 The Optimal Regression Estimator 204

9.6 Discussion of the Three Estimation Methods 205

Exercises 208

10 Post-Stratification and Calibration on Marginal Totals 209

10.1 Introduction 209

10.2 Post-Stratification 209

10.2.1 Notation and Definitions 209

10.2.2 Post-Stratified Estimator 211

10.3 The Post-Stratified Estimator in Simple Designs 212

10.3.1 Estimator 212

10.3.2 Conditioning in a Simple Design 213

10.3.3 Properties of the Estimator in a Simple Design 214

10.4 Estimation by Calibration on Marginal Totals 217

10.4.1 The Problem 217

10.4.2 Calibration on Marginal Totals 218

10.4.3 Calibration and Kullback-Leibler Divergence 220

10.4.4 Raking Ratio Estimation 221

10.5 Example 221

Exercises 224

11 Multiple Regression Estimation 225

11.1 Introduction 225

11.2 Multiple Regression Estimator 226

11.3 Alternative Forms of the Estimator 227

11.3.1 Homogeneous Linear Estimator 227

11.3.2 Projective Form 228

11.3.3 Cosmetic Form 228

11.4 Calibration of the Multiple Regression Estimator 229

11.5 Variance of the Multiple Regression Estimator 230

11.6 Choice of Weights 231

11.7 Special Cases 231

11.7.1 Ratio Estimator 231

11.7.2 Post-stratified Estimator 231

11.7.3 Regression Estimation with a Single Explanatory Variable 233

11.7.4 Optimal Regression Estimator 233

11.7.5 Conditional Estimation 235

11.8 Extension to Regression Estimation 236

Exercise 236

12 Calibration Estimation 237

12.1 Calibrated Methods 237

12.2 Distances and Calibration Functions 239

12.2.1 The Linear Method 239

12.2.2 The Raking Ratio Method 240

12.2.3 Pseudo Empirical Likelihood 242

12.2.4 Reverse Information 244

12.2.5 The Truncated Linear Method 245

12.2.6 General Pseudo-Distance 246

12.2.7 The Logistic Method 249

12.2.8 Deville Calibration Function 249

12.2.9 Roy and Vanheuverzwyn Method 251

12.3 Solving Calibration Equations 252

12.3.1 Solving by Newton’s Method 252

12.3.2 Bound Management 253

12.3.3 Improper Calibration Functions 254

12.3.4 Existence of a Solution 254

12.4 Calibrating on Households and Individuals 255

12.5 Generalized Calibration 256

12.5.1 Calibration Equations 256

12.5.2 Linear Calibration Functions 257

12.6 Calibration in Practice 258

12.7 An Example 259

Exercises 260

13 Model-Based Approach 263

13.1 Model Approach 263

13.2 The Model 263

13.3 Homoscedastic Constant Model 267

13.4 Heteroscedastic Model 1 Without Intercept 267

13.5 Heteroscedastic Model 2 Without Intercept 269

13.6 Univariate Homoscedastic Linear Model 270

13.7 Stratified Population 271

13.8 Simplified Versions of the Optimal Estimator 273

13.9 Completed Heteroscedasticity Model 276

13.10 Discussion 277

13.11 An Approach that is Both Model- and Design-based 277

14 Estimation of Complex Parameters 281

14.1 Estimation of a Function of Totals 281

14.2 Variance Estimation 282

14.3 Covariance Estimation 283

14.4 Implicit Function Estimation 283

14.5 Cumulative Distribution Function and Quantiles 284

14.5.1 Cumulative Distribution Function Estimation 284

14.5.2 Quantile Estimation: Method 1 284

14.5.3 Quantile Estimation: Method 2 285

14.5.4 Quantile Estimation: Method 3 287

14.5.5 Quantile Estimation: Method 4 288

14.6 Cumulative Income, Lorenz Curve, and Quintile Share Ratio 288

14.6.1 Cumulative Income Estimation 288

14.6.2 Lorenz Curve Estimation 289

14.6.3 Quintile Share Ratio Estimation 289

14.7 Gini Index 290

14.8 An Example 291

15 Variance Estimation by Linearization 295

15.1 Introduction 295

15.2 Orders of Magnitude in Probability 295

15.3 Asymptotic Hypotheses 300

15.3.1 Linearizing a Function of Totals 301

15.3.2 Variance Estimation 303

15.4 Linearization of Functions of Interest 303

15.4.1 Linearization of a Ratio 303

15.4.2 Linearization of a Ratio Estimator 304

15.4.3 Linearization of a Geometric Mean 305

15.4.4 Linearization of a Variance 305

15.4.5 Linearization of a Covariance 306

15.4.6 Linearization of a Vector of Regression Coefficients 307

15.5 Linearization by Steps 308

15.5.1 Decomposition of Linearization by Steps 308

15.5.2 Linearization of a Regression Coefficient 308

15.5.3 Linearization of a Univariate Regression Estimator 309

15.5.4 Linearization of a Multiple Regression Estimator 309

15.6 Linearization of an Implicit Function of Interest 310

15.6.1 Estimating Equation and Implicit Function of Interest 310

15.6.2 Linearization of a Logistic Regression Coefficient 311

15.6.3 Linearization of a Calibration Equation Parameter 313

15.6.4 Linearization of a Calibrated Estimator 313

15.7 Influence Function Approach 314

15.7.1 Function of Interest, Functional 314

15.7.2 Definition 315

15.7.3 Linearization of a Total 316

15.7.4 Linearization of a Function of Totals 316

15.7.5 Linearization of Sums and Products 317

15.7.6 Linearization by Steps 318

15.7.7 Linearization of a Parameter Defined by an Implicit Function 318

15.7.8 Linearization of a Double Sum 319

15.8 Binder’s Cookbook Approach 321

15.9 Demnati and Rao Approach 322

15.10 Linearization by the Sample Indicator Variables 324

15.10.1 The Method 324

15.10.2 Linearization of a Quantile 326

15.10.3 Linearization of a Calibrated Estimator 327

15.10.4 Linearization of a Multiple Regression Estimator 328

15.10.5 Linearization of an Estimator of a Complex Function with Calibrated Weights 329

15.10.6 Linearization of the Gini Index 330

15.11 Discussion on Variance Estimation 331

Exercises 331

16 Treatment of Nonresponse 333

16.1 Sources of Error 333

16.2 Coverage Errors 334

16.3 Different Types of Nonresponse 334

16.4 Nonresponse Modeling 335

16.5 Treating Nonresponse by Reweighting 336

16.5.1 Nonresponse Coming from a Sample 336

16.5.2 Modeling the Nonresponse Mechanism 337

16.5.3 Direct Calibration of Nonresponse 339

16.5.4 Reweighting by Generalized Calibration 341

16.6 Imputation 342

16.6.1 General Principles 342

16.6.2 Imputing From an Existing Value 342

16.6.3 Imputation by Prediction 342

16.6.4 Link Between Regression Imputation and Reweighting 343

16.6.5 Random Imputation 345

16.7 Variance Estimation with Nonresponse 347

16.7.1 General Principles 347

16.7.2 Estimation by Direct Calibration 348

16.7.3 General Case 349

16.7.4 Variance for Maximum Likelihood Estimation 350

16.7.5 Variance for Estimation by Calibration 353

16.7.6 Variance of an Estimator Imputed by Regression 356

16.7.7 Other Variance Estimation Techniques 357

17 Summary Solutions to the Exercises 359

Bibliography 379

Author Index 405

Subject Index 411

Authors

Yves Tillé, Université de Neuchâtel.