Deep Learning Approaches for Security Threats in IoT Environments. Edition No. 1

  • Book

  • 384 Pages
  • November 2022
  • John Wiley and Sons Ltd
  • ID: 5841891
An expert discussion of the application of deep learning methods in the IoT security environment

In Deep Learning Approaches for Security Threats in IoT Environments, a team of distinguished cybersecurity educators delivers an insightful and robust exploration of how to approach and measure the security of Internet-of-Things (IoT) systems and networks. In this book, readers will examine critical concepts in artificial intelligence (AI) and IoT, and apply effective strategies to help secure and protect IoT networks. The authors discuss supervised, semi-supervised, and unsupervised deep learning techniques, as well as reinforcement and federated learning methods for privacy preservation.

This book applies deep learning approaches to IoT networks, addresses the security problems that professionals frequently encounter when working in the field of IoT, and describes ways in which smart devices can help solve cybersecurity issues.

Readers will also get access to a companion website with PowerPoint presentations, links to supporting videos, and additional resources. They’ll also find:

  • A thorough introduction to artificial intelligence and the Internet of Things, including key concepts like deep learning, security, and privacy
  • Comprehensive discussions of the architectures, protocols, and standards that form the foundation of deep learning for securing modern IoT systems and networks
  • In-depth examinations of the architectural design of cloud, fog, and edge computing networks
  • Fulsome presentations of the security requirements, threats, and countermeasures relevant to IoT networks

Perfect for professionals working in the AI, cybersecurity, and IoT industries, Deep Learning Approaches for Security Threats in IoT Environments will also earn a place in the libraries of undergraduate and graduate students studying deep learning, cybersecurity, privacy preservation, and the security of IoT networks.

Table of Contents

About the Authors

1 Introducing Deep Learning for IoT Security

1.1 Introduction

1.2 Internet of Things (IoT) Architecture

1.2.1 Physical Layer

1.2.2 Network Layer

1.2.3 Application Layer

1.3 Internet of Things’ Vulnerabilities and Attacks

1.3.1 Passive Attacks

1.3.2 Active Attacks

1.4 Artificial Intelligence

1.5 Deep Learning

1.6 Taxonomy of Deep Learning Models

1.6.1 Supervision Criterion

1.6.1.1 Supervised Deep Learning

1.6.1.2 Unsupervised Deep Learning

1.6.1.3 Semi-Supervised Deep Learning

1.6.1.4 Deep Reinforcement Learning

1.6.2 Incrementality Criterion

1.6.2.1 Batch Learning

1.6.2.2 Online Learning

1.6.3 Generalization Criterion

1.6.3.1 Model-Based Learning

1.6.3.2 Instance-Based Learning

1.6.4 Centralization Criterion

1.7 Supplementary Materials

References

2 Deep Neural Networks

2.1 Introduction

2.2 From Biological Neurons to Artificial Neurons

2.2.1 Biological Neurons

2.2.2 Artificial Neurons

2.3 Artificial Neural Network

2.3.1 Input Layer

2.3.2 Hidden Layer

2.3.3 Output Layer

2.4 Activation Functions

2.4.1 Types of Activation

2.4.1.1 Binary Step Function

2.4.1.2 Linear Activation Function

2.4.1.3 Nonlinear Activation Functions

2.5 The Learning Process of ANN

2.5.1 Forward Propagation

2.5.2 Backpropagation (Gradient Descent)

2.6 Loss Functions

2.6.1 Regression Loss Functions

2.6.1.1 Mean Absolute Error (MAE) Loss

2.6.1.2 Mean Squared Error (MSE) Loss

2.6.1.3 Huber Loss

2.6.1.4 Mean Bias Error (MBE) Loss

2.6.1.5 Mean Squared Logarithmic Error (MSLE)

2.6.2 Classification Loss Functions

2.6.2.1 Binary Cross Entropy (BCE) Loss

2.6.2.2 Categorical Cross Entropy (CCE) Loss

2.6.2.3 Hinge Loss

2.6.2.4 Kullback-Leibler Divergence (KL) Loss

2.7 Supplementary Materials

References

3 Training Deep Neural Networks

3.1 Introduction

3.2 Gradient Descent Revisited

3.2.1 Gradient Descent

3.2.2 Stochastic Gradient Descent

3.2.3 Mini-batch Gradient Descent

3.3 Gradient Vanishing and Explosion

3.4 Gradient Clipping

3.5 Parameter Initialization

3.5.1 Zero Initialization

3.5.2 Random Initialization

3.5.3 LeCun Initialization

3.5.4 Xavier Initialization

3.5.5 Kaiming (He) Initialization

3.6 Faster Optimizers

3.6.1 Momentum Optimization

3.6.2 Nesterov Accelerated Gradient

3.6.3 AdaGrad

3.6.4 RMSProp

3.6.5 Adam Optimizer

3.7 Model Training Issues

3.7.1 Bias

3.7.2 Variance

3.7.3 Overfitting Issues

3.7.4 Underfitting Issues

3.7.5 Model Capacity

3.8 Supplementary Materials

References

4 Evaluating Deep Neural Networks

4.1 Introduction

4.2 Validation Dataset

4.3 Regularization Methods

4.3.1 Early Stopping

4.3.2 L1 and L2 Regularization

4.3.3 Dropout

4.3.4 Max-Norm Regularization

4.3.5 Data Augmentation

4.4 Cross-Validation

4.4.1 Hold-Out Cross-Validation

4.4.2 k-Folds Cross-Validation

4.4.3 Stratified k-Folds’ Cross-Validation

4.4.4 Repeated k-Folds’ Cross-Validation

4.4.5 Leave-One-Out Cross-Validation

4.4.6 Leave-p-Out Cross-Validation

4.4.7 Time Series Cross-Validation

4.4.8 Rolling Cross-Validation

4.4.9 Block Cross-Validation

4.5 Performance Metrics

4.5.1 Regression Metrics

4.5.1.1 Mean Absolute Error (MAE)

4.5.1.2 Root Mean Squared Error (RMSE)

4.5.1.3 Coefficient of Determination (R²)

4.5.1.4 Adjusted R²

4.5.2 Classification Metrics

4.5.2.1 Confusion Matrix

4.5.2.2 Accuracy

4.5.2.3 Precision

4.5.2.4 Recall

4.5.2.5 Precision-Recall Curve

4.5.2.6 F1-Score

4.5.2.7 Beta F1 Score

4.5.2.8 False Positive Rate (FPR)

4.5.2.9 Specificity

4.5.2.10 Receiver Operating Characteristic (ROC) Curve

4.6 Supplementary Materials

References

5 Convolutional Neural Networks

5.1 Introduction

5.2 Shift from Fully Connected to Convolutional

5.3 Basic Architecture

5.3.1 The Cross-Correlation Operation

5.3.2 Convolution Operation

5.3.3 Receptive Field

5.3.4 Padding and Stride

5.3.4.1 Padding

5.3.4.2 Stride

5.4 Multiple Channels

5.4.1 Multi-Channel Inputs

5.4.2 Multi-Channel Output

5.4.3 Convolutional Kernel 1 × 1

5.5 Pooling Layers

5.5.1 Max Pooling

5.5.2 Average Pooling

5.6 Normalization Layers

5.6.1 Batch Normalization

5.6.2 Layer Normalization

5.6.3 Instance Normalization

5.6.4 Group Normalization

5.6.5 Weight Normalization

5.7 Convolutional Neural Networks (LeNet)

5.8 Case Studies

5.8.1 Handwritten Digit Classification (One Channel Input)

5.8.2 Dog vs. Cat Image Classification (Multi-Channel Input)

5.9 Supplementary Materials

References

6 Dive Into Convolutional Neural Networks

6.1 Introduction

6.2 One-Dimensional Convolutional Network

6.2.1 One-Dimensional Convolution

6.2.2 One-Dimensional Pooling

6.3 Three-Dimensional Convolutional Network

6.3.1 Three-Dimensional Convolution

6.3.2 Three-Dimensional Pooling

6.4 Transposed Convolution Layer

6.5 Atrous/Dilated Convolution

6.6 Separable Convolutions

6.6.1 Spatially Separable Convolutions

6.6.2 Depth-wise Separable (DS) Convolutions

6.7 Grouped Convolution

6.8 Shuffled Grouped Convolution

6.9 Supplementary Materials

References

7 Advanced Convolutional Neural Network

7.1 Introduction

7.2 AlexNet

7.3 Block-wise Convolutional Network (VGG)

7.4 Network in Network

7.5 Inception Networks

7.5.1 GoogLeNet

7.5.2 Inception Network v2 (Inception v2)

7.5.3 Inception Network v3 (Inception v3)

7.6 Residual Convolutional Networks

7.7 Dense Convolutional Networks

7.8 Temporal Convolutional Network

7.8.1 One-Dimensional Convolutional Network

7.8.2 Causal and Dilated Convolution

7.8.3 Residual Blocks

7.9 Supplementary Materials

References

8 Introducing Recurrent Neural Networks

8.1 Introduction

8.2 Recurrent Neural Networks

8.2.1 Recurrent Neurons

8.2.2 Memory Cell

8.2.3 Recurrent Neural Network

8.3 Different Categories of RNNs

8.3.1 One-to-One RNN

8.3.2 One-to-Many RNN

8.3.3 Many-to-One RNN

8.3.4 Many-to-Many RNN

8.4 Backpropagation Through Time

8.5 Challenges Facing Simple RNNs

8.5.1 Vanishing Gradient

8.5.2 Exploding Gradient

8.5.2.1 Truncated Backpropagation Through Time (TBPTT)

8.5.2.2 Penalty on the Recurrent Weights W_hh

8.5.2.3 Clipping Gradients

8.6 Case Study: Malware Detection

8.7 Supplementary Material

References

9 Dive Into Recurrent Neural Networks

9.1 Introduction

9.2 Long Short-Term Memory (LSTM)

9.2.1 LSTM Gates

9.2.2 Candidate Memory Cells

9.2.3 Memory Cell

9.2.4 Hidden State

9.3 LSTM with Peephole Connections

9.4 Gated Recurrent Units (GRU)

9.4.1 GRU Cell Gates

9.4.2 Candidate State

9.4.3 Hidden State

9.5 ConvLSTM

9.6 Unidirectional vs. Bidirectional Recurrent Network

9.7 Deep Recurrent Network

9.8 Insights

9.9 Case Study of Malware Detection

9.10 Supplementary Materials

References

10 Attention Neural Networks

10.1 Introduction

10.2 From Biological to Computerized Attention

10.2.1 Biological Attention

10.2.2 Queries, Keys, and Values

10.3 Attention Pooling: Nadaraya-Watson Kernel Regression

10.4 Attention-Scoring Functions

10.4.1 Masked Softmax Operation

10.4.2 Additive Attention (AA)

10.4.3 Scaled Dot-Product Attention

10.5 Multi-Head Attention (MHA)

10.6 Self-Attention Mechanism

10.6.1 Self-Attention (SA) Mechanism

10.6.2 Positional Encoding

10.7 Transformer Network

10.8 Supplementary Materials

References

11 Autoencoder Networks

11.1 Introduction

11.2 Introducing Autoencoders

11.2.1 Definition of Autoencoder

11.2.2 Structural Design

11.3 Convolutional Autoencoder

11.4 Denoising Autoencoder

11.5 Sparse Autoencoders

11.6 Contractive Autoencoders

11.7 Variational Autoencoders

11.8 Case Study

11.9 Supplementary Materials

References

12 Generative Adversarial Networks (GANs)

12.1 Introduction

12.2 Foundation of Generative Adversarial Network

12.3 Deep Convolutional GAN

12.4 Conditional GAN

12.5 Supplementary Materials

References

13 Dive Into Generative Adversarial Networks

13.1 Introduction

13.2 Wasserstein GAN

13.2.1 Distance Functions

13.2.2 Distance Function in GANs

13.2.3 Wasserstein Loss

13.3 Least-Squares GAN (LSGAN)

13.4 Auxiliary Classifier GAN (ACGAN)

13.5 Supplementary Materials

References

14 Disentangled Representation GANs

14.1 Introduction

14.2 Disentangled Representations

14.3 InfoGAN

14.4 StackedGAN

14.5 Supplementary Materials

References

15 Introducing Federated Learning for Internet of Things (IoT)

15.1 Introduction

15.2 Federated Learning in the Internet of Things

15.3 Taxonomic View of Federated Learning

15.3.1 Network Structure

15.3.1.1 Centralized Federated Learning

15.3.1.2 Decentralized Federated Learning

15.3.1.3 Hierarchical Federated Learning

15.3.2 Data Partition

15.3.3 Horizontal Federated Learning

15.3.4 Vertical Federated Learning

15.3.5 Federated Transfer Learning

15.4 Open-Source Frameworks

15.4.1 TensorFlow Federated

15.4.2 PySyft and PyGrid

15.4.3 FedML

15.4.4 LEAF

15.4.5 PaddleFL

15.4.6 Federated AI Technology Enabler (FATE)

15.4.7 OpenFL

15.4.8 IBM Federated Learning

15.4.9 NVIDIA Federated Learning Application Runtime Environment (NVIDIA FLARE)

15.4.10 Flower

15.4.11 Sherpa.ai

15.5 Supplementary Materials

References

16 Privacy-Preserved Federated Learning

16.1 Introduction

16.2 Statistical Challenges in Federated Learning

16.2.1 Nonindependent and Identically Distributed (Non-IID) Data

16.2.1.1 Class Imbalance

16.2.1.2 Distribution Imbalance

16.2.1.3 Size Imbalance

16.2.2 Model Heterogeneity

16.2.2.1 Extracting the Essence of a Subject

16.2.3 Block Cycles

16.3 Security Challenge in Federated Learning

16.3.1 Untargeted Attacks

16.3.2 Targeted Attacks

16.4 Privacy Challenges in Federated Learning

16.4.1 Secure Aggregation

16.4.1.1 Homomorphic Encryption (HE)

16.4.1.2 Secure Multiparty Computation

16.4.1.3 Blockchain

16.4.2 Perturbation Method

16.5 Supplementary Materials

References

Index

Authors

  • Mohamed Abdel-Basset, Zagazig University, Egypt
  • Nour Moustafa, University of New South Wales, UNSW Canberra, Australia
  • Hossam Hawash, Zagazig University, Egypt