An Introduction to Correspondence Analysis. Edition No. 1. Wiley Series in Probability and Statistics

  • Book

  • 240 Pages
  • April 2021
  • John Wiley and Sons Ltd
  • ID: 5839688

Master the fundamentals of correspondence analysis with this illuminating resource

An Introduction to Correspondence Analysis assists researchers in improving their familiarity with the concepts, terminology, and application of several variants of correspondence analysis. The accomplished academics and authors deliver a comprehensive and insightful treatment of the fundamentals of correspondence analysis, including the statistical and visual aspects of the subject.

Written in three parts, the book begins by describing two variants of correspondence analysis that can be applied to two-way contingency tables of nominal categorical variables. Part Two shifts the discussion to ordinal categorical variables and demonstrates how their ordered structure can be incorporated into a correspondence analysis. Part Three describes the analysis of multiple nominal categorical variables, including both multiple correspondence analysis and multi-way correspondence analysis.

Readers will benefit from explanations of a wide variety of specific topics, for example:

  • Simple correspondence analysis, including reducing multidimensional space, measuring symmetric association with the Pearson Ratio, constructing low-dimensional displays, and detecting statistically significant points
  • Non-symmetrical correspondence analysis, including quantifying asymmetric associations
  • Simple ordinal correspondence analysis, including how to decompose the Pearson Residual for ordinal variables
  • Multiple correspondence analysis, including crisp coding and the indicator matrix, the Burt Matrix, and stacking
  • Multi-way correspondence analysis, including symmetric multi-way analysis

Perfect for researchers who seek to improve their understanding of key concepts in the graphical analysis of categorical data, An Introduction to Correspondence Analysis will also assist readers already familiar with correspondence analysis who wish to review the theoretical and foundational underpinnings of crucial concepts.

Table of Contents

Preface

1 Introduction

1.1 Data Visualisation

1.2 Correspondence Analysis in a “Nutshell”

1.3 Data Sets

1.3.1 Traditional European Food Data

1.3.2 Temperature Data

1.3.3 Shoplifting Data

1.3.4 Alligator Data

1.4 Symmetrical Versus Asymmetrical Association

1.5 Notation

1.5.1 The Two-way Contingency Table

1.5.2 The Three-way Contingency Table

1.6 Formal Test of Symmetrical Association

1.6.1 Test of Independence for Two-way Contingency Tables

1.6.2 The Chi-squared Statistic for a Two-way Table

1.6.3 Analysis of the Traditional European Food Data

1.6.4 The Chi-squared Statistic for a Three-way Table

1.6.5 Analysis of the Alligator Data

1.7 Formal Test of Asymmetrical Association

1.7.1 Test of Predictability for Two-way Contingency Tables

1.7.2 The Goodman-Kruskal tau Index

1.7.3 Analysis of the Traditional European Food Data

1.7.4 Test of Predictability for Three-way Contingency Tables

1.7.5 Marcotorchino’s Index

1.7.6 Analysis of the Alligator Data

1.7.7 The Gray-Williams Index and Delta Index

1.8 Correspondence Analysis and R

1.9 Overview of the Book

Part I Classical Analysis of Two Categorical Variables

2 Simple Correspondence Analysis

2.1 Introduction

2.2 Reducing Multi-dimensional Space

2.2.1 Profiles Cloud of Points

2.2.2 Profiles for the Traditional European Food Data

2.2.3 Weighted Centred Profiles

2.3 Measuring Symmetric Association

2.3.1 The Pearson Ratio

2.3.2 Analysis of the Traditional European Food Data

2.4 Decomposing the Pearson Residual for Nominal Variables

2.4.1 The Generalised SVD of 𝛾ij - 1

2.4.2 SVD of the Pearson Ratios

2.4.3 GSVD and the Traditional European Food Data

2.5 Constructing a Low-Dimensional Display

2.5.1 Standard Coordinates

2.5.2 Principal Coordinates

2.6 Practicalities of the Low-Dimensional Plot

2.6.1 The Two-Dimensional Correspondence Plot

2.6.2 What is NOT Being Shown in a Two-Dimensional Correspondence Plot?

2.6.3 The Three-Dimensional Correspondence Plot

2.7 The Biplot Display

2.7.1 Definition

2.7.2 Isometric Biplots of the Traditional European Food Data

2.7.3 What is NOT Being Shown in a Two-Dimensional Biplot?

2.8 The Case for No Visual Display

2.9 Detecting Statistically Significant Points

2.9.1 Confidence Circles and Ellipses

2.9.2 Confidence Ellipses for the Traditional European Food Data

2.10 Approximate p-values

2.10.1 The Hypothesis Test and its p-value

2.10.2 P-values and the Traditional European Food Data

2.11 Final Comments

3 Non-Symmetrical Correspondence Analysis

3.1 Introduction

3.2 Quantifying Asymmetric Association

3.2.1 The Goodman-Kruskal tau Index

3.2.2 The 𝜏 Index and the Traditional European Food Data

3.2.3 Weighted Centred Column Profile

3.2.4 Profiles of the Traditional European Food Data

3.3 Decomposing 𝜋i|j for Nominal Variables

3.3.1 The Generalised SVD of 𝜋i|j

3.3.2 GSVD and the Traditional Food Data

3.4 Constructing a Low-Dimensional Display

3.4.1 Standard Coordinates

3.4.2 Principal Coordinates

3.5 Practicalities of the Low-Dimensional Plot

3.5.1 The Two-Dimensional Correspondence Plot

3.5.2 The Three-Dimensional Correspondence Plot

3.6 The Biplot Display

3.6.1 Definition

3.6.2 The Column Isometric Biplot for the Traditional Food Data

3.6.3 The Three-Dimensional Biplot

3.7 Detecting Statistically Significant Points

3.7.1 Confidence Circles and Ellipses

3.7.2 Confidence Ellipses for the Traditional Food Data

3.8 Final Comments

Part II Ordinal Analysis of Two Categorical Variables

4 Simple Ordinal Correspondence Analysis

4.1 Introduction

4.2 A Simple Correspondence Analysis of the Temperature Data

4.3 On the Mean and Variation of Profiles with Ordered Categories

4.3.1 Profiles of the Temperature Data

4.3.2 Defining Scores

4.3.3 On the Mean of the Profiles

4.3.4 On the Variation of the Profiles

4.3.5 Mean and Variation of Profiles for the Temperature Data

4.4 Decomposing the Pearson Residual for Ordinal Variables

4.4.1 The Bivariate Moment Decomposition of 𝛾ij - 1

4.4.2 BMD and the Temperature Data

4.5 Constructing a Low-Dimensional Display

4.5.1 Standard Coordinates

4.5.2 Principal Coordinates

4.5.3 Practicalities of the Ordered Principal Coordinates

4.6 The Biplot Display

4.6.1 Definition

4.6.2 Ordered Column Isometric Biplot

4.6.3 Ordered Row Isometric Biplot

4.6.4 Ordered Isometric Biplots for the Temperature Data

4.7 Final Comments

5 Ordered Non-symmetrical Correspondence Analysis

5.1 Introduction

5.2 The Goodman-Kruskal tau Index Revisited

5.3 Decomposing 𝜋i|j for Ordinal and Nominal Variables

5.3.1 The Hybrid Decomposition of 𝜋i|j

5.3.2 Hybrid Decomposition and the Shoplifting Data

5.4 Constructing a Low-Dimensional Display

5.4.1 Standard Coordinates

5.4.2 Principal Coordinates

5.5 The Biplot

5.5.1 An Overview

5.5.2 Column Isometric Biplot

5.5.3 Column Isometric Biplot of the Shoplifting Data

5.5.4 Row Isometric Biplot

5.5.5 Row Isometric Biplot of the Shoplifting Data

5.5.6 Distance Measures and the Row Isometric Biplots

5.6 Some Final Words

Part III Analysis of Multiple Categorical Variables

6 Multiple Correspondence Analysis

6.1 Introduction

6.2 Crisp Coding and the Indicator Matrix

6.2.1 Crisp Coding

6.2.2 The Indicator Matrix

6.2.3 Crisp Coding and the Alligator Data

6.2.4 Application of Multiple Correspondence Analysis using the Indicator Matrix

6.3 The Burt Matrix

6.4 Stacking

6.4.1 A Definition

6.4.2 Stacking and the Alligator Data - Lake(SizeFood

6.4.3 Stacking and the Alligator Data - Food(SizeLake

6.5 Final Comments

7 Multi-way Correspondence Analysis

7.1 An Introduction

7.2 Pearson’s Residual 𝛾ijk - 1 and the Partition of X²

7.2.1 The Pearson Residual

7.2.2 The Partition of X²

7.2.3 Partition of X² for the Alligator Data

7.3 Symmetric Multi-way Correspondence Analysis

7.3.1 Tucker3 Decomposition of 𝛾ijk - 1

7.3.2 T3D and the Analysis of Two Variables

7.3.3 On the Choice of the Number of Components

7.3.4 Tucker3 Decomposition of 𝛾ijk - 1 and the Alligator Data

7.4 Constructing a Low-Dimensional Display

7.4.1 Principal Coordinates

7.4.2 The Interactive Biplot

7.4.3 Column-Tube Interactive Biplot for the Alligator Data

7.4.4 Row Interactive Biplot for the Alligator Data

7.5 The Marcotorchino Residual 𝜋i|j,k and the Partition of 𝜏M

7.5.1 The Marcotorchino Residual

7.5.2 The Partition of 𝜏M

7.5.3 Partition of 𝜏M for the Alligator Data

7.6 Non-symmetrical Multi-way Correspondence Analysis

7.6.1 Tucker3 Decomposition of 𝜋i|j,k

7.6.2 Tucker3 Decomposition of 𝜋i|j,k and the Alligator Data

7.7 Constructing a Low-Dimensional Display

7.7.1 On the Choice of Coordinates

7.7.2 Column-Tube Interactive Biplot for the Alligator Data

7.8 Final Comments

References

Author Index

Subject Index

Authors

Eric J. Beh, University of Newcastle, Australia.

Rosaria Lombardo, University of Campania Luigi Vanvitelli, Italy.