Master the fundamentals of regression without learning calculus with this one-stop resource
The newly and thoroughly revised 3rd Edition of Applied Regression Modeling delivers a concise but comprehensive treatment of the application of statistical regression analysis for those with little or no background in calculus. Accomplished instructor and author Dr. Iain Pardoe has reworked many of the more challenging topics, included learning outcomes and additional end-of-chapter exercises, and added coverage of several brand-new topics including multiple linear regression using matrices.
The methods described in the text are clearly illustrated with multi-format datasets available on the book's supplementary website. In addition to a thorough explanation of foundational regression techniques, the book introduces modeling extensions that illustrate advanced regression strategies, including model building, logistic regression, Poisson regression, discrete choice models, multilevel models, Bayesian modeling, and time series forecasting. Illustrations, graphs, and computer software output appear throughout the book to assist readers in understanding and retaining the more complex content. Applied Regression Modeling covers a wide variety of topics, such as:
- Simple linear regression models, including the least squares criterion, how to evaluate model fit, and estimation/prediction
- Multiple linear regression, including testing regression parameters, checking model assumptions graphically, and testing model assumptions numerically
- Regression model building, including predictor and response variable transformations, qualitative predictors, and regression pitfalls
- Three fully described case studies, including one each on home prices, vehicle fuel efficiency, and pharmaceutical patches
Perfect for students of any undergraduate statistics course in which regression analysis is a main focus, Applied Regression Modeling also belongs on the bookshelves of non-statistics graduate students, including MBAs, and of students in vocational, professional, and applied courses such as data science and machine learning.
Table of Contents
Preface xi
Acknowledgments xv
Introduction xvii
I.1 Statistics in Practice xvii
I.2 Learning Statistics xix
About the Companion Website xxi
1 Foundations 1
1.1 Identifying and Summarizing Data 2
1.2 Population Distributions 5
1.3 Selecting Individuals at Random - Probability 9
1.4 Random Sampling 11
1.4.1 Central limit theorem - normal version 12
1.4.2 Central limit theorem - t-version 14
1.5 Interval Estimation 16
1.6 Hypothesis Testing 20
1.6.1 The rejection region method 20
1.6.2 The p-value method 23
1.6.3 Hypothesis test errors 27
1.7 Random Errors and Prediction 28
1.8 Chapter Summary 31
Problems 31
2 Simple Linear Regression 39
2.1 Probability Model for X and Y 40
2.2 Least Squares Criterion 45
2.3 Model Evaluation 50
2.3.1 Regression standard error 51
2.3.2 Coefficient of determination - R² 53
2.3.3 Slope parameter 57
2.4 Model Assumptions 65
2.4.1 Checking the model assumptions 66
2.4.2 Testing the model assumptions 72
2.5 Model Interpretation 72
2.6 Estimation and Prediction 74
2.6.1 Confidence interval for the population mean, E(Y) 74
2.6.2 Prediction interval for an individual Y-value 75
2.7 Chapter Summary 79
2.7.1 Review example 80
Problems 83
3 Multiple Linear Regression 95
3.1 Probability Model for (X1, X2, ...) and Y 96
3.2 Least Squares Criterion 100
3.3 Model Evaluation 106
3.3.1 Regression standard error 106
3.3.2 Coefficient of determination - R² 108
3.3.3 Regression parameters - global usefulness test 115
3.3.4 Regression parameters - nested model test 120
3.3.5 Regression parameters - individual tests 127
3.4 Model Assumptions 137
3.4.1 Checking the model assumptions 137
3.4.2 Testing the model assumptions 143
3.5 Model Interpretation 145
3.6 Estimation and Prediction 146
3.6.1 Confidence interval for the population mean, E(Y) 147
3.6.2 Prediction interval for an individual Y-value 148
3.7 Chapter Summary 151
Problems 152
4 Regression Model Building I 159
4.1 Transformations 161
4.1.1 Natural logarithm transformation for predictors 161
4.1.2 Polynomial transformation for predictors 167
4.1.3 Reciprocal transformation for predictors 171
4.1.4 Natural logarithm transformation for the response 175
4.1.5 Transformations for the response and predictors 179
4.2 Interactions 184
4.3 Qualitative Predictors 191
4.3.1 Qualitative predictors with two levels 192
4.3.2 Qualitative predictors with three or more levels 201
4.4 Chapter Summary 210
Problems 211
5 Regression Model Building II 221
5.1 Influential Points 223
5.1.1 Outliers 223
5.1.2 Leverage 228
5.1.3 Cook’s distance 230
5.2 Regression Pitfalls 234
5.2.1 Nonconstant variance 234
5.2.2 Autocorrelation 237
5.2.3 Multicollinearity 242
5.2.4 Excluding important predictor variables 246
5.2.5 Overfitting 249
5.2.6 Extrapolation 250
5.2.7 Missing data 252
5.2.8 Power and sample size 255
5.3 Model Building Guidelines 256
5.4 Model Selection 259
5.5 Model Interpretation Using Graphics 263
5.6 Chapter Summary 270
Problems 272
Notation and Formulas 287
Univariate Data 287
Simple Linear Regression 288
Multiple Linear Regression 289
Bibliography 293
Glossary 299
Index 305
6 Case studies 533
6.1 Home prices 533
6.1.1 Data description 533
6.1.2 Exploratory data analysis 536
6.1.3 Regression model building 539
6.1.4 Results and conclusions 542
6.1.5 Further questions 551
6.2 Vehicle fuel efficiency 552
6.2.1 Data description 552
6.2.2 Exploratory data analysis 554
6.2.3 Regression model building 556
6.2.4 Results and conclusions 557
6.2.5 Further questions 567
6.3 Pharmaceutical patches 568
6.3.1 Data description 568
6.3.2 Exploratory data analysis 569
6.3.3 Regression model building 570
6.3.4 Model diagnostics 573
6.3.5 Results and conclusions 574
6.3.6 Further questions 578
7 Extensions 579
7.1 Generalized linear models 581
7.1.1 Logistic regression 582
7.1.2 Poisson regression 594
7.2 Discrete choice models 602
7.3 Multilevel models 609
7.4 Bayesian modeling 614
7.4.1 Frequentist inference 614
7.4.2 Bayesian inference 616
Problems 620
A Computer software help 623
Problems 626
B Critical values for t-distributions 631
C Notation and formulas 635
C.1 Univariate data 635
C.2 Simple linear regression 637
C.3 Multiple linear regression 639
D Mathematics refresher 643
D.1 The natural logarithm and exponential functions 643
D.2 Rounding and accuracy 644
E Multiple Linear Regression Using Matrices 647
E.1 Vectors and matrices 647
E.2 Matrix multiplication 649
E.3 Matrix addition 652
E.4 Transpose of a matrix 654
E.5 Inverse of a matrix 656
E.6 Estimated multiple linear regression model equation 657
E.7 Least squares regression parameter estimates 659
E.8 Predicted or fitted values 661
E.9 Residuals and the regression standard error 663
E.10 Coefficient of determination 664
E.11 Regression parameter standard errors and t-statistics 665
E.12 Estimation and prediction 666
E.13 Leverages, standardized and studentized residuals, and Cook's distances 668
F Answers for selected problems 673