Advanced Statistics with Applications in R. Edition No. 1. Wiley Series in Probability and Statistics

  • Book

  • 880 Pages
  • December 2019
  • John Wiley and Sons Ltd
  • ID: 5824518

Advanced Statistics with Applications in R fills the gap between several excellent theoretical statistics textbooks and the many applied statistics books in which teaching reduces to using existing packages. This book looks at what is under the hood. Many problems in statistics, including the recent crisis with the p-value, stem from misunderstanding of statistical concepts caused by the poor theoretical background of practitioners and applied statisticians. This book is the product of forty years of experience in teaching probability and statistics and applying them to real-life problems.

There are more than 442 examples in the book: nearly every probability or statistics concept is illustrated with an example accompanied by R code. Many of the examples are intriguing, such as "Who said π?", "What team is better?", "The fall of the Roman empire", "James Bond chase problem", "Black Friday shopping", and "Free fall equation: Aristotle or Galilei". The examples cover biostatistics, finance, physics and engineering, text and image analysis, epidemiology, spatial statistics, sociology, and other fields.

Advanced Statistics with Applications in R teaches students to use theory to solve real-life problems through computation: the book includes about 500 pieces of R code and 100 datasets. The data can be freely downloaded from the author's website, dartmouth.edu/~eugened.

This book is suitable as a text for senior undergraduate students majoring in statistics or data science, and for graduate students. Researchers who apply statistics on a regular basis will find explanations of many fundamental concepts from a theoretical perspective, illustrated by concrete real-world applications.

Table of Contents

Why I Wrote This Book

1 Discrete random variables

1.1 Motivating example

1.2 Bernoulli random variable

1.3 General discrete random variable

1.4 Mean and variance

1.4.1 Mechanical interpretation of the mean

1.4.2 Variance

1.5 R basics

1.5.1 Scripts/functions

1.5.2 Text editing in R

1.5.3 Saving your R code

1.5.4 for loop

1.5.5 Vectorized computations

1.5.6 Graphics

1.5.7 Coding and help in R

1.6 Binomial distribution

1.7 Poisson distribution

1.8 Random number generation using sample

1.8.1 Generation of a discrete random variable

1.8.2 Random Sudoku

2 Continuous random variables

2.1 Distribution and density functions

2.1.1 Cumulative distribution function

2.1.2 Empirical cdf

2.1.3 Density function

2.2 Mean, variance, and other moments

2.2.1 Quantiles, quartiles, and the median

2.2.2 The tight confidence range

2.3 Uniform distribution

2.4 Exponential distribution

2.4.1 Laplace or double-exponential distribution

2.4.2 R functions

2.5 Moment generating function

2.5.1 Fourier transform and characteristic function

2.6 Gamma distribution

2.6.1 Relationship to Poisson distribution

2.6.2 Computing the gamma distribution in R

2.6.3 The tight confidence range

2.7 Normal distribution

2.8 Chebyshev’s inequality

2.9 The law of large numbers

2.9.1 Four types of stochastic convergence

2.9.2 Integral approximation using simulations

2.10 The central limit theorem

2.10.1 Why the normal distribution is the most natural symmetric distribution

2.10.2 CLT on the relative scale

2.11 Lognormal distribution

2.11.1 Computation of the tight confidence range

2.12 Transformations and the delta method

2.12.1 The delta method

2.13 Random number generation

2.13.1 Cauchy distribution

2.14 Beta distribution

2.15 Entropy

2.16 Benford’s law: the distribution of the first digit

2.16.1 Distributions that almost obey Benford’s law

2.17 The Pearson family of distributions

2.18 Major univariate continuous distributions

3 Multivariate random variables

3.1 Joint cdf and density

3.1.1 Expectation

3.1.2 Bivariate discrete distribution

3.2 Independence

3.2.1 Convolution

3.3 Conditional density

3.3.1 Conditional mean and variance

3.3.2 Mixture distribution and Bayesian statistics

3.3.3 Random sum

3.3.4 Cancer tumors grow exponentially

3.4 Correlation and linear regression

3.5 Bivariate normal distribution

3.5.1 Regression as conditional mean

3.5.2 Variance decomposition and coefficient of determination

3.5.3 Generation of dependent normal observations

3.5.4 Copula

3.6 Joint density upon transformation

3.7 Geometric probability

3.7.1 Meeting problem

3.7.2 Random objects on the square

3.8 Optimal portfolio allocation

3.8.1 Stocks do not correlate

3.8.2 Correlated stocks

3.8.3 Markowitz bullet

3.8.4 Probability bullet

3.9 Distribution of order statistics

3.10 Multidimensional random vectors

3.10.1 Multivariate conditional distribution

3.10.2 Multivariate MGF

3.10.3 Multivariate delta method

3.10.4 Multinomial distribution

4 Four important distributions in statistics

4.1 Multivariate normal distribution

4.1.1 Generation of multivariate normal variables

4.1.2 Conditional distribution

4.1.3 Multivariate CLT

4.2 Chi-square distribution

4.2.1 Noncentral chi-square distribution

4.2.2 Expectations and variances of quadratic forms

4.2.3 Kronecker product and covariance matrix

4.3 t-distribution

4.3.1 Noncentral t-distribution

4.4 F-distribution

5 Preliminary data analysis and visualization

5.1 Comparison of random variables using the cdf

5.1.1 ROC curve

5.1.2 Survival probability

5.2 Histogram

5.3 Q-Q plot

5.3.1 The q-q confidence bands

5.4 Box plot

5.5 Kernel density estimation

5.5.1 Density movie

5.5.2 3D scatterplots

5.6 Bivariate normal kernel density

5.6.1 Bivariate kernel smoother for images

5.6.2 Smoothed scatterplot

5.6.3 Spatial statistics for disease mapping

6 Parameter estimation

6.1 Statistics as inverse probability

6.2 Method of moments

6.2.1 Generalized method of moments

6.3 Method of quantiles

6.4 Statistical properties of an estimator

6.4.1 Unbiasedness

6.4.2 Mean Square Error

6.4.3 Multidimensional MSE

6.4.4 Consistency of estimators

6.5 Linear estimation

6.5.1 Estimation of the mean using linear estimator

6.5.2 Vector representation

6.6 Estimation of variance and correlation coefficient

6.6.1 Quadratic estimation of the variance

6.6.2 Estimation of the covariance and correlation coefficient

6.7 Least squares for simple linear regression

6.7.1 Gauss-Markov theorem

6.7.2 Statistical properties of the OLS estimator under the normal assumption

6.7.3 The lm function and prediction by linear regression

6.7.4 Misinterpretation of the coefficient of determination

6.8 Sufficient statistics and the exponential family of distributions

6.8.1 Uniformly minimum-variance unbiased estimator

6.8.2 Exponential family of distributions

6.9 Fisher information and the Cramér-Rao bound

6.9.1 One parameter

6.9.2 Multiple parameters

6.10 Maximum likelihood

6.10.1 Basic definitions and examples

6.10.2 Circular statistics and the von Mises distribution

6.10.3 Maximum likelihood, sufficient statistics and the exponential family

6.10.4 Asymptotic properties of ML

6.10.5 When maximum likelihood breaks down

6.10.6 Algorithms for log-likelihood function maximization

6.11 Estimating equations and the M-estimator

6.11.1 Robust statistics

7 Hypothesis testing and confidence intervals

7.1 Fundamentals of statistical testing

7.1.1 The p-value and its interpretation

7.1.2 Ad hoc statistical testing

7.2 Simple hypothesis

7.3 The power function of the Z-test

7.3.1 Type II error and the power function

7.3.2 Optimal significance level and the ROC curve

7.3.3 One-sided hypothesis

7.4 The t-test for the means

7.4.1 One-sample t-test

7.4.2 Two-sample t-test

7.4.3 One-sided t-test

7.4.4 Paired versus unpaired t-test

7.4.5 Parametric versus nonparametric tests

7.5 Variance test

7.5.1 Two-sided variance test

7.5.2 One-sided variance test

7.6 Inverse-cdf test

7.6.1 General formulation

7.6.2 The F-test for variances

7.6.3 Binomial proportion

7.6.4 Poisson rate

7.7 Testing for correlation coefficient

7.8 Confidence interval

7.8.1 Unbiased CI and its connection to hypothesis testing

7.8.2 Inverse cdf CI

7.8.3 CI for the normal variance and SD

7.8.4 CI for other major statistical parameters

7.8.5 Confidence region

7.9 Three asymptotic tests and confidence intervals

7.9.1 Pearson chi-square test

7.9.2 Handwritten digit recognition

7.10 Limitations of classical hypothesis testing and the d-value

7.10.1 What the p-value means?

7.10.2 Why α = 0.05?

7.10.3 The null hypothesis is always rejected with a large enough sample size

7.10.4 Parameter-based inference

7.10.5 The d-value for individual inference

8 Linear model and its extensions

8.1 Basic definitions and linear least squares

8.1.1 Linear model with the intercept term

8.1.2 The vector-space geometry of least squares

8.1.3 Coefficient of determination

8.2 The Gauss-Markov theorem

8.2.1 Estimation of regression variance

8.3 Properties of OLS estimators under the normal assumption

8.3.1 The sensitivity of statistical inference to violation of the normal assumption

8.4 Statistical inference with linear models

8.4.1 Confidence interval and region

8.4.2 Linear hypothesis testing and the F-test

8.4.3 Prediction by linear regression and simultaneous confidence band

8.4.4 Testing the null hypothesis and the coefficient of determination

8.4.5 Is X fixed or random?

8.5 The one-sided p- and d-value for regression coefficients

8.5.1 The one-sided p-value for interpretation on the population level

8.5.2 The d-value for interpretation on the individual level

8.6 Examples and pitfalls

8.6.1 Kids drinking and alcohol movie watching

8.6.2 My first false discovery

8.6.3 Height, foot, and nose regression

8.6.4 A geometric interpretation of adding a new predictor

8.6.5 Contrast coefficient of determination against spurious regression

8.7 Dummy variable approach and ANOVA

8.7.1 Dummy variables for categories

8.7.2 Unpaired and paired t-test

8.7.3 Modeling longitudinal data

8.7.4 One-way ANOVA model

8.7.5 Two-way ANOVA

8.8 Generalized linear model

8.8.1 MLE estimation of GLM

8.8.2 Logistic and probit regressions for binary outcome

8.8.3 Poisson regression

9 Nonlinear regression

9.1 Definition and motivating examples

9.2 Nonlinear least squares

9.3 Gauss-Newton algorithm

9.4 Statistical properties of the NLS estimator

9.4.1 Large sample properties

9.4.2 Small sample properties

9.4.3 Asymptotic confidence intervals and hypothesis testing

9.4.4 Three methods of statistical inference in large sample

9.5 The nls function and examples

9.5.1 NLS-cdf estimator

9.6 Studying small sample properties through simulations

9.6.1 Normal distribution approximation

9.6.2 Statistical tests

9.6.3 Confidence region

9.6.4 Confidence intervals

9.7 Numerical complications of the nonlinear least squares

9.7.1 Criteria for existence

9.7.2 Criteria for uniqueness

9.8 Optimal design of experiments with nonlinear regression

9.8.1 Motivating examples

9.8.2 Optimal designs with nonlinear regression

9.9 The Michaelis-Menten model

9.9.1 The NLS solution

9.9.2 The exact solution

10 Appendix

10.1 Notation

10.2 Basics of matrix algebra

10.2.1 Preliminaries and matrix inverse

10.2.2 Determinant

10.2.3 Partition matrices

10.3 Eigenvalues and eigenvectors

10.3.1 Jordan spectral matrix decomposition

10.3.2 SVD: Singular value decomposition of a rectangular matrix

10.4 Quadratic forms and positive definite matrices

10.4.1 Quadratic forms

10.4.2 Positive and nonnegative definite matrices

10.5 Vector and matrix calculus

10.5.1 Differentiation of a scalar-valued function with respect to a vector

10.5.2 Differentiation of a vector-valued function with respect to a vector

10.5.3 Kronecker product

10.5.4 vec operator

10.6 Optimization

10.6.1 Convex and concave functions

10.6.2 Criteria for unconstrained minimization

10.6.3 Gradient algorithms

10.6.4 Constrained optimization: Lagrange multiplier technique

Bibliography

Index

Authors

Eugene Demidenko, Dartmouth Medical School, Lebanon, NH, USA.