A practical introduction to epidemiology, biostatistics, and research methodology for the whole health care community
This comprehensive text, extensively revised with new material and additional topics, takes a practical approach to introducing health professionals and students to epidemiology, biostatistics, and research methodology. It draws examples from a wide range of topics and covers all of the main contemporary health research methods, including survival analysis, Cox regression, and systematic reviews and meta-analysis, the explanations of which go beyond introductory concepts. This second edition of Quantitative Methods for Health Research: A Practical Interactive Guide to Epidemiology and Statistics also helps develop critical skills that will prepare students to move on to more advanced and specialized methods.
A clear distinction is made between knowledge and concepts that all students should ensure they understand, and those that can be pursued further by those who wish to do so. Self-assessment exercises throughout the text help students explore and reflect on their understanding. A program of practical exercises in SPSS (using a prepared data set) helps to consolidate the theory and develop skills and confidence in data handling, analysis, and interpretation. Highlights of the book include:
- Combining epidemiology and biostatistics to demonstrate the relevance and strength of statistical methods
- Emphasis on the interpretation of statistics using examples from a variety of public health and health care situations to stress relevance and application
- Use of concepts related to examples of published research to show the application of methods and the balance between ideals and the realities of research in practice
- Integration of practical data analysis exercises to develop skills and confidence
- Supplementation by a student companion website that provides guidance on data handling in SPSS and the study data sets referred to in the text
Quantitative Methods for Health Research, Second Edition is a practical learning resource for students, practitioners and researchers in public health, health care and related disciplines, providing both a course book and a useful introductory reference.
Table of Contents
Preface
About the Companion Website
1 Philosophy of Science and Introduction to Epidemiology
Introduction and Learning Objectives
1.1 Approaches to Scientific Research
1.1.1 History and Nature of Scientific Research
1.1.2 What is Epidemiology?
1.1.3 What are Statistics?
1.1.4 Approach to Learning
1.2 Formulating a Research Question
1.2.1 Importance of a Well-Defined Research Question
1.2.2 Development of Research Ideas
1.3 Rates: Incidence and Prevalence
1.3.1 Why Do We Need Rates?
1.3.2 Measures of Disease Frequency
1.3.3 Prevalence Rate
1.3.4 Incidence Rate
1.3.5 Relationship Between Incidence, Duration, and Prevalence
1.4 Concepts of Prevention
1.4.1 Introduction
1.4.2 Primary, Secondary, and Tertiary Prevention
1.5 Answers to Self-Assessment Exercises
2 Routine Data Sources and Descriptive Epidemiology
Introduction and Learning Objectives
2.1 Routine Collection of Health Information
2.1.1 Deaths (Mortality)
2.1.2 Compiling Mortality Statistics: The Example of England and Wales
2.1.3 Suicide Among Men
2.1.4 Suicide Among Young Women
2.1.5 Variations in Deaths of Very Young Children
2.2 Descriptive Epidemiology
2.2.1 What is Descriptive Epidemiology?
2.2.2 International Variations in Rates of Lung Cancer
2.2.3 Illness (Morbidity)
2.2.4 Sources of Information on Morbidity
2.2.5 Notification of Infectious Disease
2.2.6 Illness Seen in General Practice
2.3 Information on the Environment
2.3.1 Air Pollution and Health
2.3.2 Routinely Available Data on Air Pollution
2.4 Displaying, Describing, and Presenting Data
2.4.1 Displaying the Data
2.4.2 Calculating the Frequency Distribution
2.4.3 Describing the Frequency Distribution
2.4.4 The Relative Frequency Distribution
2.4.5 Scatterplots, Linear Relationships and Correlation
2.5 Routinely Available Health Data
2.5.1 Introduction
2.5.2 Classification of Routine Health Information Sources
2.5.3 Demographic Data
2.5.4 Health Event Data
2.5.5 Population-Based Health Information
2.5.6 Deprivation Indices
2.5.7 Routine Data Sources for Countries Other Than the UK
2.6 Descriptive Epidemiology in Action
2.6.1 The London Smogs of the 1950s
2.6.2 Ecological Studies
2.7 Overview of Epidemiological Study Designs
2.8 Answers to Self-Assessment Exercises
3 Standardisation
Introduction and Learning Objectives
3.1 Health Inequalities in Merseyside
3.1.1 Socio-Economic Conditions and Health
3.1.2 Comparison of Crude Death Rates
3.1.3 Usefulness of a Summary Measure
3.2 Indirect Standardisation: Calculation of the Standardised Mortality Ratio (SMR)
3.2.1 Mortality in Liverpool
3.2.2 Interpretation of the SMR
3.2.3 Dealing With Random Variation: The 95 per cent Confidence Interval
3.2.4 Increasing Precision of the SMR Estimate
3.2.5 Mortality in Sefton
3.2.6 Comparison of SMRs
3.2.7 Indirectly Standardised Mortality Rates
3.3 Direct Standardisation
3.3.1 Introduction
3.3.2 An Example: Changes in Deaths From Stroke Over Time
3.3.3 Using the European Standard Population
3.3.4 Direct or Indirect: Which Method is Best?
3.4 Standardisation for Factors Other Than Age
3.5 Answers to Self-Assessment Exercises
4 Surveys
Introduction and Learning Objectives
Resource Papers
4.1 Purpose and Context
4.1.1 Defining the Research Question
4.1.2 Political Context of Research
4.2 Sampling Methods
4.2.1 Introduction
4.2.2 Sampling
4.2.3 Probability
4.2.4 Simple Random Sampling
4.2.5 Stratified Sampling
4.2.6 Cluster Random Sampling
4.2.7 Multistage Random Sampling
4.2.8 Systematic Sampling
4.2.9 Convenience Sampling
4.2.10 Sampling People Who are Difficult to Contact
4.2.11 Quota Sampling
4.2.12 Sampling in Natsal-3
4.3 The Sampling Frame
4.3.1 Why Do We Need a Sampling Frame?
4.3.2 Losses in Sampling
4.4 Sampling Error, Confidence Intervals, and Sample Size
4.4.1 Sampling Distributions and the Standard Error
4.4.2 The Standard Error
4.4.3 Key Properties of the Normal Distribution
4.4.4 Confidence Interval (CI) for the Sample Mean
4.4.5 Estimating Sample Size
4.4.6 Sample Size for Estimating a Population Mean
4.4.7 Standard Error and 95 per cent CI for a Population Proportion
4.4.8 Sample Size to Estimate a Population Proportion
4.5 Response
4.5.1 Determining the Response Rate
4.5.2 Assessing Whether the Sample is Representative
4.5.3 Maximising the Response Rate
4.6 Measurement
4.6.1 Introduction: The Importance of Good Measurement
4.6.2 Interview or Self-Completed Questionnaire?
4.6.3 Principles of Good Questionnaire Design
4.6.4 Development of a Questionnaire
4.6.5 Checking How Well the Interviews and Questionnaires Have Worked
4.6.6 Assessing Measurement Quality
4.6.7 Overview of Sources of Error
4.7 Data Types and Presentation
4.7.1 Introduction
4.7.2 Types of Data
4.7.3 Displaying and Summarising the Data
4.8 Answers to Self-Assessment Exercises
5 Cohort Studies
Introduction and Learning Objectives
Resource Papers
5.1 Why Do a Cohort Study?
5.1.1 Objectives of the Study
5.1.2 Study Structure
5.2 Obtaining the Sample
5.2.1 Introduction
5.2.2 Sample Size
5.3 Measurement
5.3.1 Importance of Good Measurement
5.3.2 Identifying and Avoiding Measurement Error
5.3.3 The Measurement of Blood Pressure
5.3.4 Case Definition
5.4 Follow-Up
5.4.1 Nature of the Task
5.4.2 Deaths (Mortality)
5.4.3 Non-Fatal Cases (Morbidity)
5.4.4 Challenges Faced with Follow-Up of a Cohort in a Different Setting
5.4.5 Assessment of Changes During Follow-Up Period
5.5 Basic Presentation and Analysis of Results
5.5.1 Initial Presentation of Findings
5.5.2 Relative Risk
5.5.3 Hypothesis Test for Categorical Data: The Chi-Squared Test
5.5.4 Hypothesis Tests for Continuous Data: The z-Test and the t-Test
5.6 How Large Should a Cohort Study Be?
5.6.1 Perils of Inadequate Sample Size
5.6.2 Sample Size for a Cohort Study
5.6.3 Example of Output from Sample Size Calculation
5.7 Assessing Whether an Association is Causal
5.7.1 The Hill Viewpoints
5.7.2 Confounding: What Is It and How Can It Be Addressed?
5.7.3 Does Smoking Cause Heart Disease?
5.7.4 Confounding in the Physical Activity and Cancer Study
5.7.5 Methods for Dealing with Confounding
5.8 Simple Linear Regression
5.8.1 Approaches to Describing Associations
5.8.2 Finding the Best Fit for a Straight Line
5.8.3 Interpreting the Regression Line
5.8.4 Using the Regression Line
5.8.5 Hypothesis Test of the Association Between the Explanatory and Outcome Variables
5.8.6 How Good is the Regression Model?
5.8.7 Interpreting SPSS Output for Simple Linear Regression Analysis
5.8.8 First Table: Variables Entered/Removed
5.9 Introduction to Multiple Linear Regression
5.9.1 Principles of Multiple Regression
5.9.2 Using Multivariable Linear Regression to Study Independent Associations
5.9.3 Investigation of the Effect of Work Stress on Bodyweight
5.9.4 Multiple Regression in the Cancer Study
5.9.5 Overview of Regression Methods for Different Types of Outcome
5.10 Answers to Self-Assessment Exercises
6 Case–Control Studies
Introduction and Learning Objectives
Resource Papers
6.1 Why do a Case–Control Study?
6.1.1 Study Objectives
6.1.2 Study Structure
6.1.3 Approach to Analysis
6.1.4 Retrospective Data Collection
6.1.5 Applications of the Case–Control Design
6.2 Key Elements of Study Design
6.2.1 Selecting the Cases
6.2.2 The Controls
6.2.3 Exposure Assessment
6.2.4 Bias in Exposure Assessment
6.3 Basic Unmatched and Matched Analysis
6.3.1 The Odds Ratio (OR)
6.3.2 Calculation of the OR–Simple Matched Analysis
6.3.3 Hypothesis Tests for Case–Control Studies
6.4 Sample Size for a Case–Control Study
6.4.1 Introduction
6.4.2 What Information is Required?
6.4.3 An Example of Sample Size Calculation Using OpenEpi
6.5 Confounding and Logistic Regression
6.5.1 Introduction
6.5.2 Stratification
6.5.3 Logistic Regression
6.5.4 Example: Multivariable Logistic Regression
6.5.5 Matched Studies – Conditional Logistic Regression
6.5.6 Interpretation of Adjusted Results from the New Zealand Study
6.6 Answers to Self-Assessment Exercises
7 Intervention Studies
Introduction and Learning Objectives
Typology of Intervention Study Designs Described in This Chapter
Terminology
Resource Papers
7.1 Why Do an Intervention Study?
7.1.1 Study Objectives
7.1.2 Structure of a Randomised, Controlled Intervention Study
7.2 Key Elements of Intervention Study Design
7.2.1 Defining Who Should be Included and Excluded
7.2.2 Intervention and Control
7.2.3 Randomisation
7.2.4 Outcome Assessment
7.2.5 Blinding
7.2.6 Ethical Issues for Intervention Studies
7.3 The Analysis of Intervention Studies
7.3.1 Review of Variables at Baseline
7.3.2 Loss to Follow-Up
7.3.3 Compliance with the Treatment Allocation
7.3.4 Analysis by Intention-to-Treat
7.3.5 Analysis per Protocol
7.3.6 What is the Effect of the Intervention?
7.3.7 Drawing Conclusions
7.3.8 Adjustment for Variables Known to Influence the Outcome
7.3.9 Paired Comparisons
7.3.10 The Crossover Trial
7.4 Testing More-Complex Interventions
7.4.1 Introduction
7.4.2 Randomised Trial of Individuals for a Complex Intervention
7.4.3 Factorial Design
7.4.4 Analysis and Interpretation
7.4.5 Departure from the Ideal Blinded RCT Design
7.4.6 The Cluster Randomised Trial
7.4.7 The Community (Cluster) Randomised Trial
7.4.8 Non-Randomised Intervention Designs
7.4.9 The Natural Experiment
7.5 Analysis of Intervention Studies Using a Cluster Design
7.5.1 Why Does the Use of Clusters Make a Difference?
7.5.2 Summarising Clustering Effects: The Intra-Class Correlation Coefficient
7.5.3 Multi-Level Modelling
7.5.4 Analysis of the Cluster RCT of Physical Activity
7.6 How Big Should the Intervention Study Be?
7.6.1 Introduction
7.6.2 Sample Size for a Trial with Categorical Data Outcomes
7.6.3 One-Sided and Two-Sided Tests
7.6.4 Sample Size for a Trial with Continuous Data Outcomes
7.6.5 Sample Size for an Intervention Study Using Cluster Design
7.6.6 Estimation of Sample Size is not a Precise Science
7.7 Intervention Study Registration, Management, and Reporting
7.7.1 Introduction
7.7.2 Registration
7.7.3 Trial Management
7.7.4 Reporting Standards (CONSORT)
7.8 Answers to Self-Assessment Exercises
8 Life Tables, Survival Analysis, and Cox Regression
Introduction and Learning Objectives
Resource Papers
8.1 Survival Analysis
8.1.1 Introduction
8.1.2 Why Do We Need Survival Analysis?
8.1.3 Censoring
8.1.4 Kaplan–Meier Survival Curves
8.1.5 Kaplan–Meier Survival Curves
8.1.6 The Log-Rank Test
8.1.7 Interpretation of the Kaplan–Meier Survival Curve
8.2 Cox Regression
8.2.1 Introduction
8.2.2 The Hazard Function
8.2.3 Assumption of Proportional Hazards
8.2.4 The Cox Regression Model
8.2.5 Checking the Assumption of Proportional Hazards
8.2.6 Interpreting the Cox Regression Model
8.2.7 Prediction
8.2.8 Application of Cox Regression
8.3 Current Life Tables
8.3.1 Introduction
8.3.2 Current Life Tables and Life Expectancy at Birth
8.3.3 Life Expectancy at Other Ages
8.3.4 Healthy or Disability-Free Life Expectancy
8.3.5 Abridged Life Tables
8.3.6 Summary
8.4 Answers to Self-Assessment Exercises
9 Systematic Reviews and Meta-Analysis
Introduction and Learning Objectives
Increasing Power by Combining Studies
Resource Papers
9.1 The Why and How of Systematic Reviews
9.1.1 Why is it Important that Reviews be Systematic?
9.1.2 Method of Systematic Review – Overview and Developing a Protocol
9.1.3 Deciding on the Research Question and Objectives for the Review
9.1.4 Defining Criteria for Inclusion and Exclusion of Studies
9.1.5 Identifying Relevant Studies
9.1.6 Assessment of Methodological Quality
9.1.7 Extracting Data
9.1.8 Describing the Results
9.2 The Methodology of Meta-Analysis
9.2.1 Method of Meta-Analysis – Overview
9.2.2 Assessment of Publication Bias – the Funnel Plot
9.2.3 Heterogeneity
9.2.4 Calculating the Pooled Estimate
9.2.5 Presentation of Results: Forest Plot
9.2.6 Sensitivity Analysis
9.2.7 Statistical Software for the Conduct of Meta-Analysis
9.2.8 Another Example of the Value of Meta-Analysis – Identifying a Dangerous Treatment
9.3 Systematic Reviews and Meta-Analyses of Observational Studies
9.3.1 Introduction
9.3.2 Why Conduct a Systematic Review of Observational Studies?
9.3.3 Approach to Meta-Analysis of Observational Studies
9.3.4 Method of Systematic Review of Observational Studies
9.3.5 Method of Meta-Analysis of Observational Studies
9.4 Reporting and Publishing Systematic Reviews and Meta-Analyses
9.5 The Cochrane Collaboration
9.5.1 Introduction
9.5.2 Cochrane Collaboration Logo
Collaborative Review Groups
9.5.3 Cochrane Library
9.6 Answers to Self-Assessment Exercises
10 Prevention Strategies and Evaluation of Screening
Introduction and Learning Objectives
Resource Papers
10.1 Concepts of Risk
10.1.1 Relative and Attributable Risk
10.1.2 Calculation of AR
10.1.3 Attributable Fraction (AF) for a Dichotomous Exposure
10.1.4 Attributable Fraction for Continuous and Multiple Category Exposures
10.1.5 Years of Life Lost (YLL) and Years Lived with Disability (YLD)
10.1.6 Disability-Adjusted Life Years (DALYs)
10.1.7 Burden Attributable to Specific Risk Factors
10.2 Strategies of Prevention
10.2.1 The Distribution of Risk in Populations
10.2.2 High-Risk and Population Approaches to Prevention
10.2.3 Safety and the Population Strategy
10.2.4 The High-Risk and Population Strategies Revisited
10.2.5 Implications of Genomic Research for Disease Prevention
10.3 Evaluation of Screening Programmes
10.3.1 Purpose of Screening
10.3.2 Criteria for Programme Evaluation
10.3.3 Assessing Validity of a Screening Test
10.3.4 Methodological Issues in Studies of Screening Programme Effectiveness
10.3.5 Are the Wilson–Jungner Criteria Relevant Today?
10.4 Cohort and Period Effects
10.4.1 Analysis of Change in Risk Over Time
10.4.2 Example: Suicide Trends in UK Men and Women
10.5 Answers to Self-Assessment Exercises
11 Probability Distributions, Hypothesis Testing, and Bayesian Methods
Introduction and Learning Objectives
Resource Papers
11.1 Probability Distributions
11.1.1 Probability – A Brief Review
11.1.2 Introduction to Probability Distributions
11.1.3 Types of Probability Distribution
11.1.4 Probability Distributions: Implications for Statistical Methods
11.2 Data That Do Not Fit a Probability Distribution
11.2.1 Robustness of an Hypothesis Test
11.2.2 Transforming the Data
11.2.3 Principles of Non-Parametric Hypothesis Testing
11.3 Hypothesis Testing: Summary of Common Parametric and Non-Parametric Methods
11.3.1 Introduction
11.3.2 Review of Hypothesis Tests
11.3.3 Fundamentals of Hypothesis Testing
11.3.4 Summary: Stages of Hypothesis Testing
11.3.5 Comparing Two Independent Groups
11.3.6 Comparing Two Paired (or Matched) Groups
11.3.7 Testing for Association Between Two Groups
11.3.8 Comparing More Than Two Groups
11.3.9 Association Between Categorical Variables
11.4 Choosing an Appropriate Hypothesis Test
11.4.1 Introduction
11.4.2 Using a Guide Table for Selecting a Hypothesis Test
11.4.3 The Problem of Multiple Significance Testing
11.5 Bayesian Methods
11.5.1 Introduction: A Different Approach to Inference
11.5.2 Bayes’ Theorem and Formula
11.5.3 Application and Relevance
11.6 Answers to Self-Assessment Exercises
Bibliography
Index