Interest is growing in applying machine learning algorithms to chemical safety and health-related model development, in areas including property and toxicity prediction, consequence prediction, and fault detection. This book is the first to review the current status of machine learning in chemical safety and health research and to provide guidance for implementing machine learning techniques and algorithms in that research.
Written by an international team of authors and edited by renowned experts in process safety and occupational and environmental health, the work covers topics including:
- An introduction to the fundamentals of machine learning, including regression, classification and cross-validation, and an overview of software and tools
- Detailed reviews of various applications in the areas of chemical safety and health, including flammability prediction, consequence prediction, asset integrity management, predictive nanotoxicity and environmental exposure assessment, and more
- Perspective on the possible future development of this field
Machine Learning in Chemical Safety and Health serves as an essential guide on both the fundamentals and applications of machine learning for industry professionals and researchers in the fields of process safety, chemical safety, occupational and environmental health, and industrial hygiene.
Table of Contents
List of Contributors xiii
Preface xvii
1 Introduction 1
Pingfan Hu and Qingsheng Wang
1.1 Background 2
1.2 Current State 5
1.2.1 Flammability Characteristics Prediction Using Quantitative Structure-Property Relationship 5
1.2.2 Consequence Prediction Using Quantitative Property-Consequence Relationship 6
1.2.3 Machine Learning in Process Safety and Asset Integrity Management 6
1.2.4 Machine Learning for Process Fault Detection and Diagnosis 7
1.2.5 Intelligent Method for Chemical Emission Source Identification 7
1.2.6 Machine Learning and Deep Learning Applications in Medical Image Analysis 7
1.2.7 Predictive Nanotoxicology: Nanoinformatics Approach to Toxicity Analysis of Nanomaterials 8
1.2.8 Machine Learning in Environmental Exposure Assessment 8
1.2.9 Air Quality Prediction Using Machine Learning 8
1.3 Software and Tools 9
1.3.1 R 9
1.3.2 Python 12
References 13
2 Machine Learning Fundamentals 19
Yan Yan
2.1 What Is Learning? 19
2.1.1 Machine Learning Applications and Examples 20
2.1.2 Machine Learning Tasks 21
2.2 Concepts of Machine Learning 22
2.3 Machine Learning Paradigms 24
2.4 Probably Approximately Correct Learning 25
2.4.1 Deterministic Setting 26
2.4.2 Stochastic Setting 29
2.5 Estimation and Approximation 31
2.6 Empirical Risk Minimization 32
2.6.1 Empirical Risk Minimizer 32
2.6.2 VC-dimension Generalization Bound 33
2.6.3 General Loss Functions 34
2.7 Regularization 35
2.7.1 Regularized Loss Minimization 35
2.7.2 Constrained and Regularized Problem 36
2.7.3 Trade-off Between Estimation and Approximation Error 37
2.8 Maximum Likelihood Principle 38
2.8.1 Maximum Likelihood Estimation 39
2.8.2 Cross Entropy Minimization 40
2.9 Optimization 41
2.9.1 Linear Regression: An Example 42
2.9.2 Closed-form Solution 42
2.9.3 Gradient Descent 43
2.9.4 Stochastic Gradient Descent 45
References 46
3 Flammability Characteristics Prediction Using QSPR Modeling 47
Yong Pan and Juncheng Jiang
3.1 Introduction 47
3.1.1 Flammability Characteristics 47
3.1.2 QSPR Application 48
3.1.2.1 Concept of QSPR 48
3.1.2.2 Trends and Characteristics of QSPR 48
3.2 Flowchart for Flammability Characteristics Prediction 49
3.2.1 Dataset Preparation 51
3.2.2 Structure Input and Molecular Simulation 52
3.2.3 Calculation of Molecular Descriptors 53
3.2.4 Preliminary Screening of Molecular Descriptors 54
3.2.5 Descriptor Selection and Modeling 55
3.2.6 Model Validation 57
3.2.6.1 Model Fitting Ability Evaluation 57
3.2.6.2 Model Stability Analysis 59
3.2.6.3 Model Predictivity Evaluation 60
3.2.7 Model Mechanism Explanation 61
3.2.8 Summary of QSPR Process 61
3.3 QSPR Review for Flammability Characteristics 62
3.3.1 Flammability Limits 62
3.3.1.1 LFLT and LFL 62
3.3.1.2 UFLT and UFL 64
3.3.2 Flash Point 65
3.3.3 Auto-ignition Temperature 68
3.3.4 Heat of Combustion 69
3.3.5 Minimum Ignition Energy 70
3.3.6 Gas-liquid Critical Temperature 70
3.3.7 Other Properties 72
3.4 Limitations 72
3.5 Conclusions and Future Prospects 73
References 73
4 Consequence Prediction and Quantitative Property-Consequence Relationship Models 81
Zeren Jiao and Qingsheng Wang
4.1 Introduction 81
4.2 Conventional Consequence Prediction Methods 82
4.2.1 Empirical Method 82
4.2.2 Computational Fluid Dynamics (CFD) Method 83
4.2.3 Integral Method 84
4.3 Machine Learning and Deep Learning-Based Consequence Prediction Models 84
4.4 Quantitative Property-Consequence Relationship Models 86
4.4.1 Consequence Database 88
4.4.2 Property Descriptors 89
4.4.3 Machine Learning and Deep Learning Algorithms 89
4.5 Challenges and Future Directions 90
References 91
5 Machine Learning in Process Safety and Asset Integrity Management 93
Ming Yang, Hao Sun and Rustam Abubarkirov
5.1 Opportunities and Threats 93
5.2 State-of-the-Art Reviews 95
5.2.1 Artificial Neural Networks (ANNs) 95
5.2.2 Principal Component Analysis (PCA) 97
5.2.3 Genetic Algorithm (GA) 97
5.3 Case Study of Asset Integrity Assessment 98
5.4 Data-Driven Model of Asset Integrity Assessment 105
5.4.1 Condition Monitoring Data Collection 106
5.4.2 Data Processing and Storage 106
5.4.3 Data Mining for Risk Quantification and Monitoring Control 107
5.4.4 AIM Application 107
5.4.5 The Application of the Framework 108
5.5 Conclusion 109
References 109
6 Machine Learning for Process Fault Detection and Diagnosis 113
Rajeevan Arunthavanathan, Salim Ahmed, Faisal Khan and Syed Imtiaz
6.1 Background 113
6.2 Machine Learning Approaches in Fault Detection and Diagnosis 114
6.3 Supervised Methods for Fault Detection and Diagnosis 115
6.3.1 Neural Network 115
6.3.1.1 Neural Network Theory and Algorithm 115
6.3.1.2 Neural Network Learning for Fault Classification 117
6.3.1.3 Algorithm for Fault Classification Using Neural Network 118
6.3.2 Support Vector Machine 118
6.3.2.1 Support Vector Machine Theory and Algorithm 118
6.3.3 Support Vector Machine Model Selection and Algorithm 120
6.3.4 Support Vector Machine Multiclass Classification 121
6.4 Unsupervised Learning Models for Fault Detection and Diagnosis 122
6.4.1 K-Nearest Neighbors 122
6.4.2 One-Class Support Vector Machine 123
6.4.3 One-Class Neural Network 124
6.4.4 Comparison Between Deep Learning with Machine Learning in Fault Detection and Diagnosis 126
6.5 Intelligent FDD Using Machine Learning 127
6.5.1 Model Development 127
6.5.2 Data Collection 129
6.5.2.1 Model Development Steps 129
6.5.2.2 Result Comparison 130
6.6 Concluding Remarks 134
References 134
7 Intelligent Method for Chemical Emission Source Identification 139
Denglong Ma
7.1 Introduction 139
7.1.1 Development of Detecting Gas Emission 139
7.1.2 Development of Source Term Identification 140
7.2 Intelligent Methods for Recognizing Gas Emission 141
7.2.1 Leakage Recognition of Sequestrated CO2 in the Atmosphere 141
7.2.1.1 Gas Leakage Recognition for CO2 Geological Sequestration 142
7.2.1.2 Case Studies for CO2 Recognition 144
7.2.2 Emission Gas Identification with Artificial Olfactory 149
7.2.2.1 Features of Responses in AOS 150
7.2.2.2 Support Vector Machine Models for Gas Identification 150
7.2.2.3 Deep Learning Models for Gas Identification 155
7.3 Intelligent Methods for Identifying Emission Sources 158
7.3.1 Source Estimation with Intelligent Optimization Method 158
7.3.1.1 Principle of Source Estimation with Optimization Method 158
7.3.1.2 Case Studies of Source Estimation with Optimization Method 159
7.3.2 Source Estimation with MRE-PSO Method 159
7.3.2.1 Principle of PSO-MRE for Source Estimation 161
7.3.2.2 Case Studies 163
7.3.3 Source Estimation with PSO-Tikhonov Regularization Method 164
7.3.3.1 Principle of PSO-Tikhonov Regularization Hybrid Method 164
7.3.3.2 Case Study 167
7.3.4 Source Estimation with MCMC-MLA Method 168
7.3.4.1 Forward Gas Dispersion Model Based on MLA 168
7.3.4.2 Source Estimation with MCMC-MLA Method 169
7.3.4.3 Case Study 172
7.4 Conclusions and Future Work 173
7.4.1 Conclusions 173
7.4.2 Limitations and Future Work 177
References 178
8 Machine Learning and Deep Learning Applications in Medical Image Analysis 183
Pingfan Hu, Changjie Cai, Yu Feng and Qingsheng Wang
8.1 Introduction 183
8.1.1 Machine Learning in Medical Imaging 183
8.1.2 Deep Learning in Medical Imaging 183
8.2 CNN-Based Models for Classification 184
8.2.1 ResNet50 184
8.2.2 YOLOv4 (Darknet53) 185
8.2.3 Grad-CAM 186
8.3 Case Study 186
8.3.1 Background 186
8.3.2 Study Design 187
8.3.3 Training and Testing Database Preparation 187
8.3.4 Results 190
8.3.4.1 Classification Performance of the Modified ResNet50 Model 190
8.3.4.2 Classification Performance of the YOLOv4 Model 190
8.3.4.3 Post-Processing Via Grad-CAM Model and HSV 193
8.3.5 Conclusion 194
8.4 Limitations and Future Work 194
References 195
9 Predictive Nanotoxicology: Nanoinformatics Approach to Toxicity Analysis of Nanomaterials 199
Bilal M. Khan and Yoram Cohen
9.1 Predictive Nanotoxicology 199
9.1.1 Introduction 199
9.1.2 Nano Quantitative Structure-Activity Relationship (QSAR) 200
9.1.3 Importance of Data for Nanotoxicology 204
9.2 Machine Learning Modeling for Predictive Nanotoxicology 205
9.2.1 Overview 205
9.2.2 Unsupervised Learning 211
9.2.2.1 Data Exploration Via Self-Organizing Maps (SOMs) 211
9.2.2.2 Evaluating Associations among Sublethal Toxicity Responses 214
9.2.3 Supervised Learning 215
9.2.3.1 Random Forest Models 216
9.2.3.2 Support Vector Machines 216
9.2.3.3 Bayesian Networks 216
9.2.3.4 Supervised Classification and Regression-Based Models for Nano-(Q)SARs 218
9.2.4 Predictive Nano-(Q)SARs for the Assessment of Causal Relationships 220
9.3 Development of Machine Learning Based Models for Nano-(Q)SARs 224
9.3.1 Overview 224
9.3.1.1 Data-Driven Models 224
9.3.1.2 Mechanistic/Theoretical Models 225
9.3.2 Data Generation, Collection, and Preprocessing 225
9.3.3 Descriptor Selection 226
9.3.4 Model Selection and Training 229
9.3.5 Model Validation 230
9.3.5.1 Descriptor Importance 231
9.3.5.2 Applicability Domain 231
9.3.6 Model Diagnosis and Debugging 231
9.4 Nanoinformatics Approaches to Predictive Nanotoxicology 234
9.5 Summary 235
References 238
10 Machine Learning in Environmental Exposure Assessment 251
Gregory L. Watson
10.1 Introduction 251
10.2 Environmental Exposure Modeling 252
10.3 Machine Learning Exposure Models 254
10.4 Model Evaluation 257
10.5 Case Study 258
10.6 Other Topics 260
10.6.1 Bias and Fairness 260
10.6.2 Wearable Sensors 260
10.6.3 Interpretability 260
10.6.4 Extreme Events 260
10.7 Conclusion 261
References 261
11 Air Quality Prediction Using Machine Learning 267
Lan Gao, Changjie Cai and Xiao-Ming Hu
11.1 Introduction 267
11.2 Air Quality and Climate Data Acquisition 269
11.2.1 Earth Satellite Observation Datasets 269
11.2.1.1 Basics of Earth Satellite Observations 269
11.2.1.2 Earth Satellite Products 270
11.2.2 Ground-Based In Situ Observation Datasets 276
11.2.2.1 Basics of the Ground-Based In Situ Observations 276
11.2.2.2 Ground-Based In Situ Products 277
11.3 Applications of Machine Learning in Air Quality Study 279
11.3.1 Shallow Learning 280
11.3.2 Deep Learning 280
11.4 An Application Practice Example 281
11.4.1 Satellite Data Acquisition and Variable Selections 282
11.4.2 Machine Learning and Deep Learning Algorithms 282
References 283
12 Current Challenges and Perspectives 289
Changjie Cai and Qingsheng Wang
12.1 Current Challenges 289
12.1.1 Data Development and Cleaning 289
12.1.2 Hardware Issues 290
12.1.3 Data Confidentiality 290
12.1.4 Other Challenges 291
12.2 Perspectives 291
12.2.1 Real-Time Monitoring and Forecast of Chemical Hazards 291
12.2.2 Toolkits for Dummies 292
12.2.3 Physics-Informed Machine Learning 292
References 293
Index