Applied Medical Statistics. Edition No. 1


Book
640 Pages
April 2022
John Wiley and Sons Ltd
ID: 5840110

APPLIED MEDICAL STATISTICS

An up-to-date exploration of foundational concepts in statistics and probability for medical students and researchers

Medical journals and researchers are increasingly recognizing the need for improved statistical rigor in medical science. In Applied Medical Statistics, renowned statistician and researcher Dr. Jingmei Jiang delivers a clear, coherent, and accessible introduction to basic statistical concepts, ideal for medical students and medical research practitioners. The book will help readers master foundational concepts in statistical analysis and assist in the development of a critical understanding of the basic rationale of statistical analysis techniques.

The distinguished author presents information without assuming the reader has a background in specialized mathematics, statistics, or probability. All of the described methods are illustrated with up-to-date examples based on real-world medical research, supplemented by exercises and case discussions to help solidify the concepts and give readers an opportunity to critically evaluate different research scenarios.

Readers will also benefit from the inclusion of:

- A thorough introduction to basic concepts in statistics, including foundational terms and definitions, location and spread of data distributions, population parameter estimation, and statistical hypothesis tests
- Explorations of commonly used statistical methods, including t-tests, analysis of variance, and linear regression
- Discussions of advanced analysis topics, including multiple linear regression and correlation, logistic regression, and survival analysis
- Substantive exercises and case discussions at the end of each chapter

Perfect for postgraduate medical students, clinicians, and medical and biomedical researchers, Applied Medical Statistics will also earn a place on the shelf of any researcher with an interest in biostatistics or applying statistical methods to their own field of research.

Preface

Acknowledgments

About the Companion Website

1 What is Biostatistics

1.1 Overview

1.2 Some Statistical Terminology

1.2.1 Population and Sample

1.2.2 Homogeneity and Variation

1.2.3 Parameter and Statistic

1.2.4 Types of Data

1.2.5 Error

1.3 Workflow of Applied Statistics

1.4 Statistics and Its Related Disciplines

1.5 Statistical Thinking

1.6 Summary

1.7 Exercises

2 Descriptive Statistics

2.1 Frequency Tables and Graphs

2.1.1 Frequency Distribution of Numerical Data

2.1.2 Frequency Distribution of Categorical Data

2.2 Descriptive Statistics of Numerical Data

2.2.1 Measures of Central Tendency

2.2.2 Measures of Dispersion

2.3 Descriptive Statistics of Categorical Data

2.3.1 Relative Numbers

2.3.2 Standardization of Rates

2.4 Constructing Statistical Tables and Graphs

2.4.1 Statistical Tables

2.4.2 Statistical Graphs

2.5 Summary

2.6 Exercises

3 Fundamentals of Probability

3.1 Sample Space and Random Events

3.1.1 Definitions of Sample Space and Random Events

3.1.2 Operation of Events

3.2 Relative Frequency and Probability

3.2.1 Definition of Probability

3.2.2 Basic Properties of Probability

3.3 Conditional Probability and Independence of Events

3.3.1 Conditional Probability

3.3.2 Independence of Events

3.4 Multiplication Law of Probability

3.5 Addition Law of Probability

3.5.1 General Addition Law

3.5.2 Addition Law of Mutually Exclusive Events

3.6 Total Probability Formula and Bayes’ Rule

3.6.1 Total Probability Formula

3.6.2 Bayes’ Rule

3.7 Summary

3.8 Exercises

4 Discrete Random Variable

4.1 Concept of the Random Variable

4.2 Probability Distribution of the Discrete Random Variable

4.2.1 Probability Mass Function

4.2.2 Cumulative Distribution Function

4.2.3 Association Between the Probability Distribution and Relative Frequency Distribution

4.3 Numerical Characteristics

4.3.1 Expected Value

4.3.2 Variance and Standard Deviation

4.4 Commonly Used Discrete Probability Distributions

4.4.1 Binomial Distribution

4.4.2 Multinomial Distribution

4.4.3 Poisson Distribution

4.5 Summary

4.6 Exercises

5 Continuous Random Variable

5.1 Concept of Continuous Random Variable

5.2 Numerical Characteristics

5.3 Normal Distribution

5.3.1 Concept of the Normal Distribution

5.3.2 Standard Normal Distribution

5.3.3 Descriptive Methods for Assessing Normality

5.4 Application of the Normal Distribution

5.4.1 Normal Approximation to the Binomial Distribution

5.4.2 Normal Approximation to the Poisson Distribution

5.4.3 Determining the Medical Reference Interval

5.5 Summary

5.6 Exercises

6 Sampling Distribution and Parameter Estimation

6.1 Samples and Statistics

6.2 Sampling Distribution of a Statistic

6.2.1 Sampling Distribution of the Mean

6.2.2 Sampling Distribution of the Variance

6.2.3 Sampling Distribution of the Rate (Normal Approximation)

6.3 Estimation of One Population Parameter

6.3.1 Point Estimation and Its Quality Evaluation

6.3.2 Interval Estimation for the Mean

6.3.3 Interval Estimation for the Variance

6.3.4 Interval Estimation for the Rate (Normal Approximation Method)

6.4 Estimation of Two Population Parameters

6.4.1 Estimation of the Difference in Means

6.4.2 Estimation of the Ratio of Variances

6.4.3 Estimation of the Difference Between Rates (Normal Approximation Method)

6.5 Summary

6.6 Exercises

7 Hypothesis Testing for One Parameter

7.1 Overview

7.1.1 Concepts and Procedures

7.1.2 Type I and Type II Errors

7.1.3 One-sided and Two-sided Hypothesis

7.1.4 Association Between Hypothesis Testing and Interval Estimation

7.2 Hypothesis Testing for One Parameter

7.2.1 Hypothesis Tests for the Mean

7.2.1.1 Power of the Test

7.2.1.2 Sample Size Determination

7.2.2 Hypothesis Tests for the Rate (Normal Approximation Methods)

7.2.2.1 Power of the Test

7.2.2.2 Sample Size Determination

7.3 Further Considerations on Hypothesis Testing

7.3.1 About the Significance Level

7.3.2 Statistical Significance and Clinical Significance

7.4 Summary

7.5 Exercises

8 Hypothesis Testing for Two Population Parameters

8.1 Testing the Difference Between Two Population Means: Paired Samples

8.2 Testing the Difference Between Two Population Means: Independent Samples

8.2.1 t-Test for Means with Equal Variances

8.2.2 F-Test for the Equality of Two Variances

8.2.3 Approximation t-Test for Means with Unequal Variances

8.2.4 Z-Test for Means with Large-Sample Sizes

8.2.5 Power for Comparing Two Means

8.2.6 Sample Size Determination

8.3 Testing the Difference Between Two Population Rates (Normal Approximation Method)

8.3.1 Power for Comparing Two Rates

8.3.2 Sample Size Determination

8.4 Summary

8.5 Exercises

9 One-way Analysis of Variance

9.1 Overview

9.1.1 Concept of ANOVA

9.1.2 Data Layout and Modeling Assumption

9.2 Procedures of ANOVA

9.3 Multiple Comparisons of Means

9.3.1 Tukey’s Test

9.3.2 Dunnett’s Test

9.3.3 Least Significant Difference (LSD) Test

9.4 Checking ANOVA Assumptions

9.4.1 Check for Normality

9.4.2 Test for Homogeneity of Variances

9.4.2.1 Bartlett’s Test

9.4.2.2 Levene’s Test

9.5 Data Transformations

9.6 Summary

9.7 Exercises

10 Analysis of Variance in Different Experimental Designs

10.1 ANOVA for Randomized Block Design

10.1.1 Data Layout and Model Assumptions

10.1.2 Procedure of ANOVA

10.2 ANOVA for Two-factor Factorial Design

10.2.1 Concept of Factorial Design

10.2.2 Data Layout and Model Assumptions

10.2.3 Procedure of ANOVA

10.3 ANOVA for Repeated Measures Design

10.3.1 Characteristics of Repeated Measures Data

10.3.2 Data Layout and Model Assumptions

10.3.3 Procedure of ANOVA

10.3.4 Sphericity Test of Covariance Matrix

10.3.5 Multiple Comparisons of Means

10.4 ANOVA for 2 × 2 Crossover Design

10.4.1 Concept of a 2 × 2 Crossover Design

10.4.2 Data Layout and Model Assumptions

10.4.3 Procedure of ANOVA

10.5 Summary

10.6 Exercises

11 χ² Test

11.1 Contingency Table

11.1.1 General Form of Contingency Table

11.1.2 Independence of Two Categorical Variables

11.1.3 Significance Testing Using the Contingency Table

11.2 χ² Test for a 2 × 2 Contingency Table

11.2.1 Test of Independence

11.2.2 Yates’ Corrected χ² test for a 2 × 2 Contingency Table

11.2.3 Paired Samples Design χ² Test

11.2.4 Fisher’s Exact Tests for Completely Randomized Design

11.2.5 Exact McNemar’s Test for Paired Samples Design

11.3 χ² Test for R × C Contingency Tables

11.3.1 Comparison of Multiple Independent Proportions

11.3.2 Multiple Comparisons of Proportions

11.4 χ² Goodness-of-Fit Test

11.4.1 Normal Distribution Goodness-of-Fit Test

11.4.2 Poisson Distribution Goodness-of-Fit Test

11.5 Summary

11.6 Exercises

12 Nonparametric Tests Based on Rank

12.1 Concept of Order Statistics

12.2 Wilcoxon’s Signed-Rank Test for Paired Samples

12.3 Wilcoxon’s Rank-Sum Test for Two Independent Samples

12.4 Kruskal-Wallis Test for Multiple Independent Samples

12.4.1 Kruskal-Wallis Test

12.4.2 Multiple Comparisons

12.5 Friedman’s Test for Randomized Block Design

12.6 Further Considerations About Nonparametric Tests

12.7 Summary

12.8 Exercises

13 Simple Linear Regression

13.1 Concept of Simple Linear Regression

13.2 Establishment of Regression Model

13.2.1 Least Squares Estimation of a Regression Coefficient

13.2.2 Basic Properties of the Regression Model

13.2.3 Hypothesis Testing of Regression Model

13.3 Application of Regression Model

13.3.1 Confidence Interval Estimation of a Regression Coefficient

13.3.2 Confidence Band Estimation of Regression Model

13.3.3 Prediction Band Estimation of Individual Response Values

13.4 Evaluation of Model Fitting

13.4.1 Coefficient of Determination

13.4.2 Residual Analysis

13.5 Summary

13.6 Exercises

14 Simple Linear Correlation

14.1 Concept of Simple Linear Correlation

14.1.1 Definition of Correlation Coefficient

14.1.2 Interpretation of Correlation Coefficient

14.2 Hypothesis Testing of Correlation Coefficient

14.3 Confidence Interval Estimation for Correlation Coefficient

14.4 Spearman’s Rank Correlation

14.4.1 Concept of Spearman’s Rank Correlation Coefficient

14.4.2 Hypothesis Testing of Spearman’s Rank Correlation Coefficient

14.5 Summary

14.6 Exercises

15 Multiple Linear Regression

15.1 Multiple Linear Regression Model

15.1.1 Concept of the Multiple Linear Regression

15.1.2 Least Squares Estimation of Regression Coefficient

15.1.3 Properties of the Least Squares Estimators

15.1.4 Standardized Partial-Regression Coefficient

15.2 Hypothesis Testing

15.2.1 F-Test for Overall Regression Model

15.2.2 t-Test for Partial-Regression Coefficients

15.3 Evaluation of Model Fitting

15.3.1 Coefficient of Determination and Adjusted Coefficient of Determination

15.3.2 Residual Analysis and Outliers

15.4 Other Aspects of Regression

15.4.1 Multicollinearity

15.4.2 Selection of Independent Variables

15.4.3 Sample Size

15.5 Summary

15.6 Exercises

16 Logistic Regression

16.1 Logistic Regression Model

16.1.1 Linear Probability Model

16.1.2 Probability, Odds, and Logit Transformation

16.1.3 Definition of Logistic Regression

16.1.4 Inference for Logistic Regression

16.1.4.1 Estimation of Model Coefficient

16.1.4.2 Interpretation of Model Coefficient

16.1.4.3 Hypothesis Testing of Model Coefficient

16.1.4.4 Interval Estimation of Model Coefficient

16.1.5 Evaluation of Model Fitting

16.2 Conditional Logistic Regression Model

16.2.1 Characteristics of Conditional Logistic Regression Model

16.2.2 Estimation of Regression Coefficient

16.2.3 Hypothesis Testing of Regression Coefficient

16.3 Additional Remarks

16.3.1 Sample Size

16.3.2 Types of Independent Variables

16.3.3 Selection of Independent Variables

16.3.4 Missing Data

16.4 Summary

16.5 Exercises

17 Survival Analysis

17.1 Overview

17.1.1 Concept of Survival Analysis

17.1.2 Basic Functions of Survival Time

17.2 Description of the Survival Process

17.2.1 Product Limit Method

17.2.2 Life Table Method

17.3 Comparison of Survival Processes

17.3.1 Log-Rank Test

17.3.2 Other Methods for Comparing Survival Processes

17.4 Cox’s Proportional Hazards Model

17.4.1 Concept and Model Assumptions

17.4.2 Estimation of Model Coefficient

17.4.3 Hypothesis Testing of Model Coefficient

17.4.4 Evaluation of Model Fitting

17.5 Other Aspects of Cox’s Proportional Hazard Model

17.5.1 Hazard Index

17.5.2 Sample Size

17.6 Summary

17.7 Exercises

18 Evaluation of Diagnostic Tests

18.1 Basic Characteristics of Diagnostic Tests

18.1.1 Sensitivity and Specificity

18.1.2 Composite Measures of Sensitivity and Specificity

18.1.3 Predictive Values

18.1.4 Sensitivity and Specificity Comparison of Two Diagnostic Tests

18.2 Agreement Between Diagnostic Tests

18.2.1 Agreement of Categorical Data

18.2.2 Agreement of Numerical Data

18.3 Receiver Operating Characteristic Curve Analysis

18.3.1 Concept of an ROC Curve

18.3.2 Area Under the ROC Curve

18.3.3 Comparison of Areas Under ROC Curves

18.4 Summary

18.5 Exercises

19 Observational Study Design

19.1 Cross-Sectional Studies

19.1.1 Types of Cross-Sectional Studies

19.1.2 Probability Sampling Methods

19.1.3 Sample Size for Surveys

19.1.4 Cross-Sectional Studies for Clues of Etiology

19.2 Cohort Studies

19.2.1 Measures of Association in Cohort Studies

19.2.2 Sample Size for Cohort Studies

19.3 Case-Control Studies

19.3.1 Measures of Association in Case-Control Studies

19.3.2 Sample Size for Case-Control Studies

19.4 Summary

19.5 Exercises

20 Experimental Study Design

20.1 Overview

20.1.1 Basic Components of an Experimental Study

20.1.2 Principles of Experimental Study Design

20.1.3 Blinding Procedures in Clinical Trials

20.2 Completely Randomized Design

20.2.1 Concept of Completely Randomized Design

20.2.2 Sample Size for Completely Randomized Design

20.3 Randomized Block Design

20.3.1 Concepts of Randomized Block Design

20.3.2 Sample Size for Randomized Block Design

20.4 Factorial Design

20.5 Crossover Design

20.5.1 Concepts of Crossover Design

20.5.2 Sample Size for 2 × 2 Crossover Design

20.6 Summary

20.7 Exercises

Appendix

References

Index

Authors

Jingmei Jiang, Chinese Academy of Medical Sciences (CAMS); School of Basic Medicine of Peking Union Medical College (PUMC).
