Statistics and Probability with Applications for Engineers and Scientists Using MINITAB, R and JMP. Edition No. 2


Book
1040 Pages
March 2020
John Wiley and Sons Ltd
ID: 5840417

Introduces basic concepts in probability and statistics to data science students, as well as engineers and scientists

Aimed at undergraduate/graduate-level engineering and natural science students, this timely, fully updated edition of a popular book on statistics and probability shows how real-world problems can be solved using statistical concepts. It removes Excel exhibits and replaces them with R software throughout, and updates both MINITAB and JMP software instructions and content. A new chapter discussing data mining - including big data, classification, machine learning, and visualization - is featured. Another new chapter covers cluster analysis methodologies, including hierarchical, nonhierarchical, and model-based clustering. The book also offers a chapter on Response Surfaces that previously appeared on the book’s companion website.

Statistics and Probability with Applications for Engineers and Scientists using MINITAB, R and JMP, Second Edition is broken into two parts. Part I covers topics such as: describing data graphically and numerically, elements of probability, discrete and continuous random variables and their probability distributions, distribution functions of random variables, sampling distributions, estimation of population parameters and hypothesis testing. Part II covers: elements of reliability theory, data mining, cluster analysis, analysis of categorical data, nonparametric tests, simple and multiple linear regression analysis, analysis of variance, factorial designs, response surfaces, and statistical quality control (SQC) including phase I and phase II control charts. The appendices contain statistical tables and charts and answers to selected problems.

Features two new chapters - one on Data Mining and another on Cluster Analysis
Now contains R exhibits including code, graphical display, and some results
MINITAB and JMP have been updated to their latest versions
Emphasizes the p-value approach and includes related practical interpretations
Offers a more applied statistical focus, and features modified examples to better exhibit statistical concepts
Supplemented with an instructor-only solutions manual on the book’s companion website

Statistics and Probability with Applications for Engineers and Scientists using MINITAB, R and JMP is an excellent text for graduate-level data science students, engineers, and scientists. It is also an ideal introduction to applied statistics and probability for undergraduate students in engineering and the natural sciences.

Preface xvii

Acknowledgments xxi

About The Companion Site xxiii

1 Introduction 1

1.1 Designed Experiment 2

1.1.1 Motivation for the Study 2

1.1.2 Investigation 3

1.1.3 Changing Criteria 3

1.1.4 A Summary of the Various Phases of the Investigation 5

1.2 A Survey 6

1.3 An Observational Study 6

1.4 A Set of Historical Data 7

1.5 A Brief Description of What is Covered in this Book 7

Part I Fundamentals of Probability and Statistics

2 Describing Data Graphically and Numerically 13

2.1 Getting Started with Statistics 14

2.1.1 What is Statistics? 14

2.1.2 Population and Sample in a Statistical Study 14

2.2 Classification of Various Types of Data 18

2.2.1 Nominal Data 18

2.2.2 Ordinal Data 19

2.2.3 Interval Data 19

2.2.4 Ratio Data 19

2.3 Frequency Distribution Tables for Qualitative and Quantitative Data 20

2.3.1 Qualitative Data 21

2.3.2 Quantitative Data 24

2.4 Graphical Description of Qualitative and Quantitative Data 30

2.4.1 Dot Plot 30

2.4.2 Pie Chart 31

2.4.3 Bar Chart 33

2.4.4 Histograms 37

2.4.5 Line Graph 44

2.4.6 Stem-and-Leaf Plot 45

2.5 Numerical Measures of Quantitative Data 50

2.5.1 Measures of Centrality 51

2.5.2 Measures of Dispersion 56

2.6 Numerical Measures of Grouped Data 67

2.6.1 Mean of a Grouped Data 67

2.6.2 Median of a Grouped Data 68

2.6.3 Mode of a Grouped Data 69

2.6.4 Variance of a Grouped Data 69

2.7 Measures of Relative Position 70

2.7.1 Percentiles 71

2.7.2 Quartiles 72

2.7.3 Interquartile Range (IQR) 72

2.7.4 Coefficient of Variation 73

2.8 Box-Whisker Plot 75

2.8.1 Construction of a Box Plot 75

2.8.2 How to Use the Box Plot 76

2.9 Measures of Association 80

2.10 Case Studies 84

2.10.1 About St. Luke’s Hospital 85

2.11 Using JMP 86

Review Practice Problems 87

3 Elements of Probability 97

3.1 Introduction 97

3.2 Random Experiments, Sample Spaces, and Events 98

3.2.1 Random Experiments and Sample Spaces 98

3.2.2 Events 99

3.3 Concepts of Probability 103

3.4 Techniques of Counting Sample Points 108

3.4.1 Tree Diagram 108

3.4.2 Permutations 110

3.4.3 Combinations 110

3.4.4 Arrangements of n Objects Involving Several Kinds of Objects 111

3.5 Conditional Probability 113

3.6 Bayes’s Theorem 116

3.7 Introducing Random Variables 120

Review Practice Problems 122

4 Discrete Random Variables and Some Important Discrete Probability Distributions 128

4.1 Graphical Descriptions of Discrete Distributions 129

4.2 Mean and Variance of a Discrete Random Variable 130

4.2.1 Expected Value of Discrete Random Variables and Their Functions 130

4.2.2 The Moment-Generating Function: Expected Value of a Special Function of X 133

4.3 The Discrete Uniform Distribution 136

4.4 The Hypergeometric Distribution 137

4.5 The Bernoulli Distribution 141

4.6 The Binomial Distribution 142

4.7 The Multinomial Distribution 146

4.8 The Poisson Distribution 147

4.8.1 Definition and Properties of the Poisson Distribution 147

4.8.2 Poisson Process 148

4.8.3 Poisson Distribution as a Limiting Form of the Binomial 148

4.9 The Negative Binomial Distribution 153

4.10 Some Derivations and Proofs (Optional) 156

4.11 A Case Study 156

4.12 Using JMP 157

Review Practice Problems 157

5 Continuous Random Variables and Some Important Continuous Probability Distributions 164

5.1 Continuous Random Variables 165

5.2 Mean and Variance of Continuous Random Variables 168

5.2.1 Expected Value of Continuous Random Variables and Their Functions 168

5.2.2 The Moment-Generating Function and Expected Value of a Special Function of X 171

5.3 Chebyshev’s Inequality 173

5.4 The Uniform Distribution 175

5.4.1 Definition and Properties 175

5.4.2 Mean and Standard Deviation of the Uniform Distribution 178

5.5 The Normal Distribution 180

5.5.1 Definition and Properties 180

5.5.2 The Standard Normal Distribution 182

5.5.3 The Moment-Generating Function of the Normal Distribution 187

5.6 Distribution of Linear Combination of Independent Normal Variables 189

5.7 Approximation of the Binomial and Poisson Distributions by the Normal Distribution 193

5.7.1 Approximation of the Binomial Distribution by the Normal Distribution 193

5.7.2 Approximation of the Poisson Distribution by the Normal Distribution 196

5.8 A Test of Normality 196

5.9 Probability Models Commonly used in Reliability Theory 201

5.9.1 The Lognormal Distribution 202

5.9.2 The Exponential Distribution 206

5.9.3 The Gamma Distribution 211

5.9.4 The Weibull Distribution 214

5.10 A Case Study 218

5.11 Using JMP 219

Review Practice Problems 220

6 Distribution of Functions of Random Variables 228

6.1 Introduction 229

6.2 Distribution Functions of Two Random Variables 229

6.2.1 Case of Two Discrete Random Variables 229

6.2.2 Case of Two Continuous Random Variables 232

6.2.3 The Mean Value and Variance of Functions of Two Random Variables 233

6.2.4 Conditional Distributions 235

6.2.5 Correlation between Two Random Variables 238

6.2.6 Bivariate Normal Distribution 241

6.3 Extension to Several Random Variables 244

6.4 The Moment-Generating Function Revisited 245

Review Practice Problems 249

7 Sampling Distributions 253

7.1 Random Sampling 253

7.1.1 Random Sampling from an Infinite Population 254

7.1.2 Random Sampling from a Finite Population 256

7.2 The Sampling Distribution of the Sample Mean 258

7.2.1 Normal Sampled Population 258

7.2.2 Nonnormal Sampled Population 258

7.2.3 The Central Limit Theorem 259

7.3 Sampling from a Normal Population 264

7.3.1 The Chi-Square Distribution 264

7.3.2 The Student t-Distribution 271

7.3.3 Snedecor’s F-Distribution 276

7.4 Order Statistics 279

7.4.1 Distribution of the Largest Element in a Sample 280

7.4.2 Distribution of the Smallest Element in a Sample 281

7.4.3 Distribution of the Median of a Sample and of the kth Order Statistic 282

7.4.4 Other Uses of Order Statistics 284

7.5 Using JMP 286

Review Practice Problems 286

8 Estimation of Population Parameters 289

8.1 Introduction 290

8.2 Point Estimators for the Population Mean and Variance 290

8.2.1 Properties of Point Estimators 292

8.2.2 Methods of Finding Point Estimators 295

8.3 Interval Estimators for the Mean μ of a Normal Population 301

8.3.1 σ² Known 301

8.3.2 σ² Unknown 304

8.3.3 Sample Size is Large 306

8.4 Interval Estimators for the Difference of Means of Two Normal Populations 313

8.4.1 Variances are Known 313

8.4.2 Variances are Unknown 314

8.5 Interval Estimators for the Variance of a Normal Population 322

8.6 Interval Estimator for the Ratio of Variances of Two Normal Populations 327

8.7 Point and Interval Estimators for the Parameters of Binomial Populations 331

8.7.1 One Binomial Population 331

8.7.2 Two Binomial Populations 334

8.8 Determination of Sample Size 338

8.8.1 One Population Mean 339

8.8.2 Difference of Two Population Means 339

8.8.3 One Population Proportion 340

8.8.4 Difference of Two Population Proportions 341

8.9 Some Supplemental Information 343

8.10 A Case Study 343

8.11 Using JMP 343

Review Practice Problems 344

9 Hypothesis Testing 352

9.1 Introduction 353

9.2 Basic Concepts of Testing a Statistical Hypothesis 353

9.2.1 Hypothesis Formulation 353

9.2.2 Risk Assessment 355

9.3 Tests Concerning the Mean of a Normal Population Having Known Variance 358

9.3.1 Case of a One-Tail (Left-Sided) Test 358

9.3.2 Case of a One-Tail (Right-Sided) Test 362

9.3.3 Case of a Two-Tail Test 363

9.4 Tests Concerning the Mean of a Normal Population Having Unknown Variance 372

9.4.1 Case of a Left-Tail Test 372

9.4.2 Case of a Right-Tail Test 373

9.4.3 The Two-Tail Case 374

9.5 Large Sample Theory 378

9.6 Tests Concerning the Difference of Means of Two Populations Having Distributions with Known Variances 380

9.6.1 The Left-Tail Test 380

9.6.2 The Right-Tail Test 381

9.6.3 The Two-Tail Test 383

9.7 Tests Concerning the Difference of Means of Two Populations Having Normal Distributions with Unknown Variances 388

9.7.1 Two Population Variances are Equal 388

9.7.2 Two Population Variances are Unequal 392

9.7.3 The Paired t-Test 395

9.8 Testing Population Proportions 401

9.8.1 Test Concerning One Population Proportion 401

9.8.2 Test Concerning the Difference Between Two Population Proportions 405

9.9 Tests Concerning the Variance of a Normal Population 410

9.10 Tests Concerning the Ratio of Variances of Two Normal Populations 414

9.11 Testing of Statistical Hypotheses using Confidence Intervals 418

9.12 Sequential Tests of Hypotheses 422

9.12.1 A One-Tail Sequential Testing Procedure 422

9.12.2 A Two-Tail Sequential Testing Procedure 427

9.13 Case Studies 430

9.14 Using JMP 431

Review Practice Problems 431

Part II Statistics in Actions

10 Elements of Reliability Theory 445

10.1 The Reliability Function 446

10.1.1 The Hazard Rate Function 446

10.1.2 Employing the Hazard Function 455

10.2 Estimation: Exponential Distribution 457

10.3 Hypothesis Testing: Exponential Distribution 465

10.4 Estimation: Weibull Distribution 467

10.5 Case Studies 472

10.6 Using JMP 474

Review Practice Problems 474

11 On Data Mining 476

11.1 Introduction 476

11.2 What is Data Mining? 477

11.2.1 Big Data 477

11.3 Data Reduction 478

11.4 Data Visualization 481

11.5 Data Preparation 490

11.5.1 Missing Data 490

11.5.2 Outlier Detection and Remedial Measures 491

11.6 Classification 492

11.6.1 Evaluating a Classification Model 493

11.7 Decision Trees 499

11.7.1 Classification and Regression Trees (CART) 500

11.7.2 Further Reading 511

11.8 Case Studies 511

11.9 Using JMP 512

Review Practice Problems 512

12 Cluster Analysis 518

12.1 Introduction 518

12.2 Similarity Measures 519

12.2.1 Common Similarity Coefficients 524

12.3 Hierarchical Clustering Methods 525

12.3.1 Single Linkage 526

12.3.2 Complete Linkage 531

12.3.3 Average Linkage 534

12.3.4 Ward’s Hierarchical Clustering 536

12.4 Nonhierarchical Clustering Methods 538

12.4.1 K-Means Method 538

12.5 Density-Based Clustering 544

12.6 Model-Based Clustering 547

12.7 A Case Study 552

12.8 Using JMP 553

Review Practice Problems 553

13 Analysis of Categorical Data 558

13.1 Introduction 558

13.2 The Chi-Square Goodness-of-Fit Test 559

13.3 Contingency Tables 568

13.3.1 The 2 × 2 Case with Known Parameters 568

13.3.2 The 2 × 2 Case with Unknown Parameters 570

13.3.3 The r × s Contingency Table 572

13.4 Chi-Square Test for Homogeneity 577

13.5 Comments on the Distribution of the Lack-of-Fit Statistics 581

13.6 Case Studies 583

13.7 Using JMP 584

Review Practice Problems 585

14 Nonparametric Tests 591

14.1 Introduction 591

14.2 The Sign Test 592

14.2.1 One-Sample Test 592

14.2.2 The Wilcoxon Signed-Rank Test 595

14.2.3 Two-Sample Test 598

14.3 Mann-Whitney (Wilcoxon) W Test for Two Samples 604

14.4 Runs Test 608

14.4.1 Runs above and below the Median 608

14.4.2 The Wald-Wolfowitz Run Test 611

14.5 Spearman Rank Correlation 614

14.6 Using JMP 618

Review Practice Problems 618

15 Simple Linear Regression Analysis 622

15.1 Introduction 623

15.2 Fitting the Simple Linear Regression Model 624

15.2.1 Simple Linear Regression Model 624

15.2.2 Fitting a Straight Line by Least Squares 627

15.2.3 Sampling Distribution of the Estimators of Regression Coefficients 631

15.3 Unbiased Estimator of σ² 637

15.4 Further Inferences Concerning Regression Coefficients (β0, β1), E(Y), and Y 639

15.4.1 Confidence Interval for β1 with Confidence Coefficient (1 - α) 639

15.4.2 Confidence Interval for β0 with Confidence Coefficient (1 - α) 640

15.4.3 Confidence Interval for E(Y | X) with Confidence Coefficient (1 - α) 642

15.4.4 Prediction Interval for a Future Observation Y with Confidence Coefficient (1 - α) 645

15.5 Tests of Hypotheses for β0 and β1 652

15.5.1 Test of Hypotheses for β1 652

15.5.2 Test of Hypotheses for β0 652

15.6 Analysis of Variance Approach to Simple Linear Regression Analysis 659

15.7 Residual Analysis 665

15.8 Transformations 674

15.9 Inference About ρ 681

15.10 A Case Study 683

15.11 Using JMP 684

Review Practice Problems 684

16 Multiple Linear Regression Analysis 693

16.1 Introduction 694

16.2 Multiple Linear Regression Models 694

16.3 Estimation of Regression Coefficients 699

16.3.1 Estimation of Regression Coefficients Using Matrix Notation 701

16.3.2 Properties of the Least-Squares Estimators 703

16.3.3 The Analysis of Variance Table 704

16.3.4 More Inferences about Regression Coefficients 706

16.4 Multiple Linear Regression Model Using Quantitative and Qualitative Predictor Variables 714

16.4.1 Single Qualitative Variable with Two Categories 714

16.4.2 Single Qualitative Variable with Three or More Categories 716

16.5 Standardized Regression Coefficients 726

16.5.1 Multicollinearity 728

16.5.2 Consequences of Multicollinearity 729

16.6 Building Regression Type Prediction Models 730

16.6.1 First Variable to Enter into the Model 730

16.7 Residual Analysis and Certain Criteria for Model Selection 734

16.7.1 Residual Analysis 734

16.7.2 Certain Criteria for Model Selection 735

16.8 Logistic Regression 740

16.9 Case Studies 745

16.10 Using JMP 748

Review Practice Problems 748

17 Analysis of Variance 757

17.1 Introduction 758

17.2 The Design Models 758

17.2.1 Estimable Parameters 758

17.2.2 Estimable Functions 760

17.3 One-Way Experimental Layouts 761

17.3.1 The Model and Its Analysis 761

17.3.2 Confidence Intervals for Treatment Means 767

17.3.3 Multiple Comparisons 773

17.3.4 Determination of Sample Size 780

17.3.5 The Kruskal-Wallis Test for One-Way Layouts (Nonparametric Method) 781

17.4 Randomized Complete Block (RCB) Designs 785

17.4.1 The Friedman Fr-Test for Randomized Complete Block Design (Nonparametric Method) 792

17.4.2 Experiments with One Missing Observation in an RCB-Design Experiment 794

17.4.3 Experiments with Several Missing Observations in an RCB-Design Experiment 795

17.5 Two-Way Experimental Layouts 798

17.5.1 Two-Way Experimental Layouts with One Observation per Cell 800

17.5.2 Two-Way Experimental Layouts with r > 1 Observations per Cell 801

17.5.3 Blocking in Two-Way Experimental Layouts 810

17.5.4 Extending Two-Way Experimental Designs to n-Way Experimental Layouts 811

17.6 Latin Square Designs 813

17.7 Random-Effects and Mixed-Effects Models 820

17.7.1 Random-Effects Model 820

17.7.2 Mixed-Effects Model 822

17.7.3 Nested (Hierarchical) Designs 824

17.8 A Case Study 831

17.9 Using JMP 832

Review Practice Problems 832

18 The 2^k Factorial Designs 847

18.1 Introduction 848

18.2 The Factorial Designs 848

18.3 The 2^k Factorial Designs 850

18.4 Unreplicated 2^k Factorial Designs 859

18.5 Blocking in the 2^k Factorial Design 867

18.5.1 Confounding in the 2^k Factorial Design 867

18.5.2 Yates’s Algorithm for the 2^k Factorial Designs 875

18.6 The 2^k Fractional Factorial Designs 877

18.6.1 One-half Replicate of a 2^k Factorial Design 877

18.6.2 One-quarter Replicate of a 2^k Factorial Design 882

18.7 Case Studies 887

18.8 Using JMP 889

Review Practice Problems 889

19 Response Surfaces 897

19.1 Introduction 897

19.1.1 Basic Concepts of Response Surface Methodology 898

19.2 First-Order Designs 903

19.3 Second-Order Designs 917

19.3.1 Central Composite Designs (CCDs) 918

19.3.2 Some Other First-Order and Second-Order Designs 928

19.4 Determination of Optimum or Near-Optimum Point 936

19.4.1 The Method of Steepest Ascent 937

19.4.2 Analysis of a Fitted Second-Order Response Surface 941

19.5 ANOVA Table for a Second-Order Model 946

19.6 Case Studies 948

19.7 Using JMP 950

Review Practice Problems 950

20 Statistical Quality Control - Phase I Control Charts 958

21 Statistical Quality Control - Phase II Control Charts 960

Appendices 961

Appendix A Statistical Tables 962

Appendix B Answers to Selected Problems 969

Appendix C Bibliography 992

Index 1003

Authors

Bhisham C. Gupta is Professor in the Department of Mathematics and Statistics at the University of Southern Maine. Irwin Guttman, University of Toronto. Kalanka P. Jayalath.
