
Probability with R. An Introduction with Computer Science Applications. Edition No. 2

  • Book

  • 496 Pages
  • February 2020
  • John Wiley and Sons Ltd
  • ID: 5841945

Provides a comprehensive introduction to probability with an emphasis on computing-related applications

This self-contained new and extended edition outlines a first course in probability applied to computer-related disciplines. As in the first edition, experimentation and simulation are favoured over mathematical proofs. The freely downloadable statistical programming language R is used throughout the text, not only as a tool for calculation and data analysis, but also to illustrate concepts of probability and to simulate distributions. The examples in Probability with R: An Introduction with Computer Science Applications, Second Edition cover a wide range of computer science applications, including testing program performance; measuring response time and CPU time; estimating the reliability of components and systems; and evaluating algorithms and queuing systems.

Chapters cover the R language; summarizing statistical data; graphical displays; the fundamentals of probability; reliability; discrete and continuous distributions; and more.

This second edition includes:

  • improved R code throughout the text, as well as new procedures, packages and interfaces;
  • updated and additional examples, exercises and projects covering recent developments in computing;
  • an introduction to bivariate discrete distributions together with the R functions used to handle large matrices of conditional probabilities, which are often needed in machine translation;
  • an introduction to linear regression with particular emphasis on its application to machine learning using testing and training data;
  • a new section on spam filtering using Bayes' theorem to develop the filters;
  • an extended range of Poisson applications such as network failures, website hits, virus attacks and accessing the cloud;
  • use of new allocation functions in R to deal with hash table collisions, server overload and the general allocation problem.

The book is supplemented with a Wiley Book Companion Site featuring data and solutions to exercises within the book.

Primarily addressed to students of computer science and related areas, Probability with R: An Introduction with Computer Science Applications, Second Edition is also an excellent text for students of engineering and the general sciences. Computing professionals who need to understand the relevance of probability in their areas of practice will find it useful.

Table of Contents

Preface to the Second Edition xiii

Preface to the First Edition xvii

Acknowledgments xxi

About the Companion Website xxiii

I The R Language 1

1 Basics of R 3

1.1 What is R? 3

1.2 Installing R 4

1.3 R Documentation 4

1.4 Basics 5

1.5 Getting Help 6

1.6 Data Entry 7

1.7 Missing Values 11

1.8 Editing 12

1.9 Tidying Up 12

1.10 Saving and Retrieving 13

1.11 Packages 13

1.12 Interfaces 14

1.13 Project 16

2 Summarizing Statistical Data 17

2.1 Measures of Central Tendency 17

2.2 Measures of Dispersion 21

2.3 Overall Summary Statistics 24

2.4 Programming in R 25

2.5 Project 30

3 Graphical Displays 31

3.1 Boxplots 31

3.2 Histograms 36

3.3 Stem and Leaf 40

3.4 Scatter Plots 40

3.5 The Line of Best Fit 43

3.6 Machine Learning and the Line of Best Fit 44

3.7 Graphical Displays Versus Summary Statistics 49

3.8 Projects 53

II Fundamentals of Probability 55

4 Probability Basics 57

4.1 Experiments, Sample Spaces, and Events 58

4.2 Classical Approach to Probability 61

4.3 Permutations and Combinations 64

4.4 The Birthday Problem 71

4.5 Balls and Bins 76

4.6 R Functions for Allocation 79

4.7 Allocation Overload 81

4.8 Relative Frequency Approach to Probability 83

4.9 Simulating Probabilities 84

4.10 Projects 89

5 Rules of Probability 91

5.1 Probability and Sets 91

5.2 Mutually Exclusive Events 92

5.3 Complementary Events 93

5.4 Axioms of Probability 94

5.5 Properties of Probability 96

6 Conditional Probability 104

6.1 Multiplication Law of Probability 107

6.2 Independent Events 108

6.3 Independence of More than Two Events 110

6.4 The Intel Fiasco 113

6.5 Law of Total Probability 115

6.6 Trees 118

6.7 Project 123

7 Posterior Probability and Bayes 124

7.1 Bayes’ Rule 124

7.2 Hardware Fault Diagnosis 131

7.3 Machine Learning and Classification 132

7.4 Spam Filtering 135

7.5 Machine Translation 137

8 Reliability 142

8.1 Series Systems 142

8.2 Parallel Systems 143

8.3 Reliability of a System 143

8.4 Series-Parallel Systems 150

8.5 The Design of Systems 153

8.6 The General System 158

III Discrete Distributions 161

9 Introduction to Discrete Distributions 163

9.1 Discrete Random Variables 163

9.2 Cumulative Distribution Function 168

9.3 Some Simple Discrete Distributions 170

9.4 Benford’s Law 174

9.5 Summarizing Random Variables: Expectation 175

9.6 Properties of Expectations 180

9.7 Simulating Discrete Random Variables and Expectations 183

9.8 Bivariate Distributions 187

9.9 Marginal Distributions 189

9.10 Conditional Distributions 190

9.11 Project 194

10 The Geometric Distribution 196

10.1 Geometric Random Variables 198

10.2 Cumulative Distribution Function 203

10.3 The Quantile Function 207

10.4 Geometric Expectations 209

10.5 Simulating Geometric Probabilities and Expectations 210

10.6 Amnesia 217

10.7 Simulating Markov 219

10.8 Projects 224

11 The Binomial Distribution 226

11.1 Binomial Probabilities 227

11.2 Binomial Random Variables 229

11.3 Cumulative Distribution Function 233

11.4 The Quantile Function 235

11.5 Reliability: The General System 238

11.6 Machine Learning 241

11.7 Binomial Expectations 245

11.8 Simulating Binomial Probabilities and Expectations 248

11.9 Projects 254

12 The Hypergeometric Distribution 255

12.1 Hypergeometric Random Variables 257

12.2 Cumulative Distribution Function 260

12.3 The Lottery 262

12.4 Hypergeometric or Binomial? 266

12.5 Projects 273

13 The Poisson Distribution 274

13.1 Death by Horse Kick 274

13.2 Limiting Binomial Distribution 275

13.3 Random Events in Time and Space 281

13.4 Probability Density Function 283

13.5 Cumulative Distribution Function 287

13.6 The Quantile Function 289

13.7 Estimating Software Reliability 290

13.8 Modeling Defects in Integrated Circuits 292

13.9 Simulating Poisson Probabilities 293

13.10 Projects 298

14 Sampling Inspection Schemes 299

14.1 Introduction 299

14.2 Single Sampling Inspection Schemes 300

14.3 Acceptance Probabilities 301

14.4 Simulating Sampling Inspection Schemes 303

14.5 Operating Characteristic Curve 308

14.6 Producer’s and Consumer’s Risks 310

14.7 Design of Sampling Schemes 311

14.8 Rectifying Sampling Inspection Schemes 315

14.9 Average Outgoing Quality 316

14.10 Double Sampling Inspection Schemes 318

14.11 Average Sample Size 319

14.12 Single Versus Double Schemes 320

14.13 Projects 324

IV Continuous Distributions 325

15 Introduction to Continuous Distributions 327

15.1 Introduction to Continuous Random Variables 328

15.2 Probability Density Function 328

15.3 Cumulative Distribution Function 331

15.4 The Uniform Distribution 332

15.5 Expectation of a Continuous Random Variable 336

15.6 Simulating Continuous Variables 338

16 The Exponential Distribution 341

16.1 Modeling Waiting Times 341

16.2 Probability Density Function of Waiting Times 342

16.3 Cumulative Distribution Function 344

16.4 Modeling Lifetimes 347

16.5 Quantiles 349

16.6 Exponential Expectations 351

16.7 Simulating Exponential Probabilities and Expectations 353

16.8 Amnesia 356

16.9 Simulating Markov 360

16.10 Project 369

17 Queues 370

17.1 The Single Server Queue 370

17.2 Traffic Intensity 371

17.3 Queue Length 372

17.4 Average Response Time 376

17.5 Extensions of the M/M/1 Queue 378

17.6 Project 382

18 The Normal Distribution 383

18.1 The Normal Probability Density Function 385

18.2 The Cumulative Distribution Function 387

18.3 Quantiles 389

18.4 The Standard Normal Distribution 391

18.5 Achieving Normality: Limiting Distributions 394

18.6 Projects 405

19 Process Control 407

19.1 Control Charts 407

19.2 Cusum Charts 411

19.3 Charts for Defective Rates 412

19.4 Project 416

V Tailing Off 417

20 The Inequalities of Markov and Chebyshev 419

20.1 Markov’s Inequality 420

20.2 Algorithm Runtime 426

20.3 Chebyshev’s Inequality 427

Appendix A: Data: Examination Results 433

Appendix B: The Line of Best Fit: Coefficient Derivations 437

Appendix C: Variance Derivations 440

Appendix D: Binomial Approximation to the Hypergeometric 446

Appendix E: Normal Tables 448

Appendix F: The Inequalities of Markov and Chebyshev 450

Index to R Commands 453

Index 457

Postface

Authors

Jane M. Horgan, Dublin City University, Ireland.