APPLIED BIOSTATISTICS FOR THE HEALTH SCIENCES
In this newly revised edition of Applied Biostatistics for the Health Sciences, accomplished statistician Dr. Richard Rossi delivers a robust and easy-to-understand exploration of statistics in the context of applied health science and biostatistics. The book covers sample design, logistic regression, experimental design, survival analysis, basic statistical computation, and many more topics with a strong focus on the correct use and interpretation of statistics. The author also explains how to assess the quality of observed data, how to collect quality data, and how to use confidence intervals in conjunction with hypothesis and significance tests.
- A thorough introduction to biostatistics, including explanations of fundamental concepts like populations, samples, statistics, biomedical studies, and data set examples
- A comprehensive exploration of population descriptions, including qualitative and quantitative variables, multivariate data, measures of dispersion, and probability
- Practical discussions of random sampling, summarizing random samples, and the measurement of the reliability of statistics
- In-depth examinations of confidence intervals, statistical hypothesis testing, simple and multiple linear regression, and experimental design
Perfect for health science and biostatistics students and professors at the upper undergraduate and graduate levels, Applied Biostatistics for the Health Sciences is also a must-read reference for practitioners and professionals in the fields of pharmacy, biochemistry, nursing, health care informatics, and the applied health sciences.
Table of Contents
Preface xi
Chapter 1 Introduction to Biostatistics 1
1.1 What is Biostatistics? 1
1.2 Populations, Samples, and Statistics 2
1.2.1 The Basic Biostatistical Terminology 3
1.2.2 Biomedical Studies 5
1.2.3 Observational Studies Versus Experiments 7
1.3 Clinical Trials 9
1.3.1 Safety and Ethical Considerations in a Clinical Trial 9
1.3.2 Types of Clinical Trials 10
1.3.3 The Phases of a Clinical Trial 10
1.4 Data Set Descriptions 12
1.4.1 Birth Weight Data Set 12
1.4.2 Body Fat Data Set 12
1.4.3 Coronary Heart Disease Data Set 13
1.4.4 Prostate Cancer Study Data Set 13
1.4.5 Intensive Care Unit Data Set 14
1.4.6 Mammography Experience Study Data Set 14
1.4.7 Benign Breast Disease Study 14
1.4.8 Exerbike Data Sets 15
Glossary 17
Exercises 19
Chapter 2 Describing Populations 24
2.1 Populations and Variables 24
2.1.1 Qualitative Variables 25
2.1.2 Quantitative Variables 26
2.1.3 Multivariate Data 28
2.2 Population Distributions and Parameters 29
2.2.1 Distributions 30
2.2.2 Describing a Population with Parameters 34
2.2.3 Proportions and Percentiles 35
2.2.4 Parameters Measuring Centrality 37
2.2.5 Measures of Dispersion 40
2.2.6 The Coefficient of Variation 43
2.2.7 Parameters for Bivariate Populations 45
2.3 Probability 48
2.3.1 Basic Probability Rules 50
2.3.2 Conditional Probability 52
2.3.3 Independence 54
2.3.4 The Relative Risk and the Odds Ratio 56
2.4 Probability Models 59
2.4.1 The Binomial Probability Model 59
2.4.2 The Normal Probability Model 62
2.4.3 Z Scores 69
Glossary 69
Exercises 71
Chapter 3 Random Sampling 83
3.1 Obtaining Representative Data 83
3.1.1 The Sampling Plan 85
3.1.2 Probability Samples 85
3.2 Commonly Used Sampling Plans 87
3.2.1 Simple Random Sampling 87
3.2.2 Stratified Random Sampling 91
3.2.3 Cluster Sampling 92
3.2.4 Systematic Sampling 94
3.3 Determining the Sample Size 95
3.3.1 The Sample Size for Simple and Systematic Random Samples 96
3.3.2 The Sample Size for a Stratified Random Sample 99
Glossary 105
Exercises 107
Chapter 4 Summarizing Random Samples 115
4.1 Samples and Inferential Statistics 115
4.2 Inferential Graphical Statistics 116
4.2.1 Bar and Pie Charts 116
4.2.2 Boxplots 120
4.2.3 Histograms 126
4.2.4 Normal Probability Plots 132
4.3 Numerical Statistics for Univariate Data Sets 134
4.3.1 Estimating Population Proportions 135
4.3.2 Estimating Population Percentiles 142
4.3.3 Estimating the Mean, Median, and Mode 143
4.3.4 Estimating the Variance and Standard Deviation 149
4.3.5 Linear Transformations 153
4.3.6 The Plug-in Rule for Estimation 156
4.4 Statistics for Multivariate Data Sets 158
4.4.1 Graphical Statistics for Bivariate Data Sets 158
4.4.2 Numerical Summaries for Bivariate Data Sets 160
4.4.3 Fitting Lines to Scatterplots 166
Glossary 167
Exercises 170
Chapter 5 Measuring the Reliability of Statistics 186
5.1 Sampling Distributions 186
5.1.1 Unbiased Estimators 188
5.1.2 Measuring the Accuracy of an Estimator 189
5.1.3 The Bound on the Error of Estimation 191
5.2 The Sampling Distribution of a Sample Proportion 192
5.2.1 The Mean and Standard Deviation of the Sampling Distribution of 𝑝̂ 192
5.2.2 Determining the Sample Size for a Prespecified Value of the Bound on the Error of Estimation 195
5.2.3 The Central Limit Theorem for 𝑝̂ 196
5.2.4 Some Final Notes on the Sampling Distribution of 𝑝̂ 197
5.3 The Sampling Distribution of 𝑥̄ 197
5.3.1 The Mean and Standard Deviation of the Sampling Distribution of 𝑥̄ 198
5.3.2 Determining the Sample Size for a Prespecified Value of the Bound on the Error of Estimation 201
5.3.3 The Central Limit Theorem for 𝑥̄ 202
5.3.4 The t Distribution 204
5.3.5 Some Final Notes on the Sampling Distribution of 𝑥̄ 206
5.4 Two Sample Comparisons 207
5.4.1 Comparing Two Population Proportions 208
5.4.2 Comparing Two Population Means 214
5.5 Bootstrapping the Sampling Distribution of a Statistic 220
Glossary 223
Exercises 223
Chapter 6 Confidence Intervals 235
6.1 Interval Estimation 235
6.2 Confidence Intervals 236
6.3 Single Sample Confidence Intervals 238
6.3.1 Confidence Intervals for Proportions 239
6.3.2 Confidence Intervals for a Mean 242
6.3.3 Large Sample Confidence Intervals for 𝜇 243
6.3.4 Small Sample Confidence Intervals for 𝜇 244
6.3.5 Determining the Sample Size for a Confidence Interval for the Mean 247
6.4 Bootstrap Confidence Intervals 248
6.5 Two Sample Comparative Confidence Intervals 250
6.5.1 Confidence Intervals for Comparing Two Proportions 250
6.5.2 Confidence Intervals for the Relative Risk 254
6.5.3 Confidence Intervals for the Odds Ratio 257
Glossary 259
Exercises 260
Chapter 7 Testing Statistical Hypotheses 272
7.1 Hypothesis Testing 272
7.1.1 The Components of a Hypothesis Test 272
7.1.2 P-Values and Significance Testing 279
7.2 Testing Hypotheses about Proportions 283
7.2.1 Single Sample Tests of a Population Proportion 283
7.2.2 Comparing Two Population Proportions 289
7.2.3 Tests of Independence 293
7.3 Testing Hypotheses About Means 301
7.3.1 t-Tests 301
7.3.2 t-Tests for the Mean of a Population 304
7.3.3 Paired Comparison t-Tests 308
7.3.4 Two Independent Sample t-Tests 313
7.4 Some Final Comments on Hypothesis Testing 318
Glossary 319
Exercises 320
Chapter 8 Simple Linear Regression 340
8.1 Bivariate Data, Scatterplots, and Correlation 340
8.1.1 Scatterplots 340
8.1.2 Correlation 343
8.2 The Simple Linear Regression Model 347
8.2.1 The Simple Linear Regression Model 348
8.2.2 Assumptions of the Simple Linear Regression Model 350
8.3 Fitting a Simple Linear Regression Model 352
8.4 Assessing the Assumptions and Fit of a Simple Linear Regression Model 354
8.4.1 Residuals 355
8.4.2 Residual Diagnostics 356
8.4.3 Estimating 𝜎 and Assessing the Strength of the Linear Relationship 362
8.5 Statistical Inferences based on a Fitted Model 366
8.5.1 Inferences About 𝛽0 366
8.5.2 Inferences About 𝛽1 368
8.6 Inferences about the Response Variable 370
8.6.1 Inferences About 𝜇Y|X 371
8.6.2 Inferences for Predicting Values of Y 372
8.7 Model Validation 374
8.7.1 Selecting the Training and Validation Data Sets 374
8.7.2 Validating a Fitted Model 374
8.8 Some Final Comments on Simple Linear Regression 375
Glossary 377
Exercises 380
Chapter 9 Multiple Regression 396
9.1 Investigating Multivariate Relationships 398
9.2 The Multiple Linear Regression Model 400
9.2.1 The Assumptions of a Multiple Regression Model 401
9.3 Fitting a Multiple Linear Regression Model 403
9.4 Assessing the Assumptions of a Multiple Linear Regression Model 403
9.4.1 Residual Diagnostics 407
9.4.2 Detecting Multivariate Outliers and Influential Observations 413
9.5 Assessing the Adequacy of Fit of a Multiple Regression Model 414
9.5.1 Estimating 𝜎 414
9.5.2 The Coefficient of Determination 414
9.5.3 Multiple Regression Analysis of Variance 416
9.6 Statistical Inferences Based on a Multiple Regression Model 419
9.6.1 Inferences about the Regression Coefficients 419
9.6.2 Inferences About the Response Variable 421
9.7 Comparing Multiple Regression Models 423
9.8 Multiple Regression Models with Categorical Variables 425
9.8.1 Regression Models with Dummy Variables 428
9.8.2 Testing the Importance of Categorical Variables 430
9.9 Variable Selection Techniques 434
9.9.1 Model Selection Using Maximum R²adj 435
9.9.2 Model Selection using BIC 436
9.10 Model Validation 439
9.10.1 Selecting the Training and Validation Data Sets 440
9.10.2 Validating a Fitted Model 440
9.11 Some Final Comments on Multiple Regression 441
Glossary 442
Exercises 444
Chapter 10 Logistic Regression 462
10.1 The Logistic Regression Model 463
10.1.1 Assumptions of the Logistic Regression Model 466
10.2 Fitting a Logistic Regression Model 467
10.3 Assessing the Fit of a Logistic Regression Model 469
10.3.1 Checking the Assumptions of a Logistic Regression Model 470
10.3.2 Testing for the Goodness of Fit of a Logistic Regression Model 471
10.3.3 Model Diagnostics 473
10.4 Statistical Inferences Based on a Logistic Regression Model 478
10.4.1 Inferences about the Logistic Regression Coefficients 479
10.4.2 Comparing Models 480
10.5 Variable Selection 484
10.6 Classification with Logistic Regression 487
10.6.1 The Logistic Classifier 487
10.6.2 Misclassification Errors 488
10.7 Some Final Comments on Logistic Regression 489
Glossary 490
Exercises 492
Chapter 11 Design of Experiments 508
11.1 Experiments Versus Observational Studies 508
11.2 The Basic Principles of Experimental Design 511
11.2.1 Terminology 511
11.2.2 Designing an Experiment 512
11.3 Experimental Designs 514
11.3.1 The Completely Randomized Design 516
11.3.2 The Randomized Block Design 519
11.4 Factorial Experiments 521
11.4.1 Two-Factor Experiments 523
11.4.2 Three-Factor Experiments 525
11.5 Models for Designed Experiments 527
11.5.1 The Model for a Completely Randomized Design 527
11.5.2 The Model for a Randomized Block Design 528
11.5.3 Models for Experimental Designs with a Factorial Treatment Structure 530
11.6 Some Final Comments on Designed Experiments 531
Glossary 532
Exercises 534
Chapter 12 Analysis of Variance 542
12.1 Single-Factor Analysis of Variance 543
12.1.1 Partitioning the Total Experimental Variation 544
12.1.2 The Model Assumptions 546
12.1.3 The F-test 548
12.1.4 Comparing Treatment Means 550
12.2 Randomized Block Analysis of Variance 554
12.2.1 The ANOV Table for the Randomized Block Design 555
12.2.2 The Model Assumptions 557
12.2.3 The F-test 559
12.2.4 Separating the Treatment Means 560
12.3 Multifactor Analysis of Variance 563
12.3.1 Two-Factor Analysis of Variance 563
12.3.2 Three-Factor Analysis of Variance 571
12.4 Selecting the Number of Replicates in Analysis of Variance 575
12.4.1 Determining the Number of Replicates from the Power 575
12.4.2 Determining the Number of Replicates from 𝐷 576
12.5 Some Final Comments on Analysis of Variance 577
Glossary 578
Exercises 579
Chapter 13 Survival Analysis 596
13.1 The Kaplan-Meier Estimate of the Survival Function 597
13.2 The Proportional Hazards Model 603
13.3 Logistic Regression and Survival Analysis 607
13.4 Some Final Comments on Survival Analysis 609
Glossary 610
Exercises 611
References 620
Appendix A 628
Problem Solutions 636
Index 663