This book is a collective work by leading scientists, computer experts, analysts, engineers, mathematicians, probabilists and statisticians who work at the forefront of data analysis and related applications. Its chapters represent a cross-section of current concerns, developments and research interests in these areas. The material is divided into sections that give the reader both theoretical and applied information on data analysis methods, models and techniques, along with related applications.
Table of Contents
Preface xvii
Konstantinos N. ZAFEIRIS, Yiannis DIMOTIKALIS, Christos H. SKIADAS, Alex KARAGRIGORIOU and Christiana KARAGRIGORIOU-VONTA
Part 1 1
Chapter 1. Performance of Evaluation of Diagnosis of Various Thyroid Diseases Using Machine Learning Techniques 3
Burcu Bektas GÜNEŞ, Evren BURSUK and Rüya ŞAMLI
1.1. Introduction 3
1.2. Data understanding 5
1.3. Modeling 6
1.4. Findings 8
1.5. Conclusion 10
1.6. References 10
Chapter 2. Exploring Chronic Diseases’ Spatial Patterns: Thyroid Cancer in Sicilian Volcanic Areas 13
Francesca BITONTI and Angelo MAZZA
2.1. Introduction 14
2.2. Epidemiological data and territory 16
2.3. Methodology 18
2.3.1. Spatial inhomogeneity and spatial dependence 18
2.3.2. Standardized incidence ratio (SIR) 19
2.3.3. Local Moran’s I statistic 21
2.4. Spatial distribution of TC in eastern Sicily 22
2.4.1. SIR geographical variation 22
2.4.2. Estimate of the spatial attraction 24
2.5. Conclusion 25
2.6. References 26
Chapter 3. Analysis of Blockchain-based Databases in Web Applications 31
Orhun Ceng BOZO and Rüya ŞAMLI
3.1. Introduction 31
3.2. Background 32
3.2.1. Blockchain 32
3.2.2. Blockchain types 32
3.2.3. Blockchain-based web applications 33
3.2.4. Blockchain consensus algorithms 33
3.2.5. Other consensus algorithms 34
3.3. Analysis stack 34
3.3.1. Art Shop web application 34
3.3.2. SQL-based application 34
3.3.3. NoSQL-based application 35
3.3.4. Blockchain-based application 35
3.4. Analysis 36
3.4.1. Adding records 36
3.4.2. Query 38
3.4.3. Functionality 39
3.4.4. Security 39
3.5. Conclusion 41
3.6. References 41
Chapter 4. Optimization and Asymptotic Analysis of Insurance Models 43
Ekaterina BULINSKAYA
4.1. Introduction 43
4.2. Discrete-time model with reinsurance and bank loans 44
4.2.1. Model description 44
4.2.2. Optimization problem 45
4.2.3. Model stability 46
4.3. Continuous-time insurance model with dividends 48
4.3.1. Model description 48
4.3.2. Optimal barrier strategy 49
4.3.3. Special form of claim distribution 50
4.3.4. Numerical analysis 54
4.4. Conclusion and further research directions 55
4.5. References 56
Chapter 5. Statistical Analysis of Traffic Volume in the 25 de Abril Bridge 57
Frederico CAEIRO, Ayana MATEUS and Conceicao VEIGA de ALMEIDA
5.1. Introduction 57
5.2. Data 58
5.3. Methodology 60
5.3.1. Main limit results 60
5.3.2. Block maxima method 61
5.3.3. Largest order statistics method 62
5.3.4. Estimation of other tail parameters 63
5.4. Results and conclusion 63
5.5. Acknowledgements 65
5.6. References 65
Chapter 6. Predicting the Risk of Gestational Diabetes Mellitus through Nearest Neighbor Classification 67
Louisa TESTA, Mark A. CARUANA, Maria KONTORINAKI and Charles SAVONA-VENTURA
6.1. Introduction 67
6.2. Nearest neighbor methods 69
6.2.1. Background of the NN methods 69
6.2.2. The k-nearest neighbors method 70
6.2.3. The fixed-radius NN method 70
6.2.4. The kernel-NN method 71
6.2.5. Algorithms of the three considered NN methods 72
6.2.6. Parameter and distance metric selection 74
6.3. Experimental results 75
6.3.1. Dataset description 75
6.3.2. Variable selection and data splitting 75
6.3.3. Results 76
6.3.4. A discussion and comparison of results 78
6.4. Conclusion 79
6.5. References 79
Chapter 7. Political Trust in National Institutions: The Significance of Items’ Level of Measurement in the Validation of Constructs 81
Anastasia CHARALAMPI, Eva TSOUPAROPOULOU, Joanna TSIGANOU and Catherine MICHALOPOULOU
7.1. Introduction 82
7.2. Methods 83
7.2.1. Participants 83
7.2.2. Instrument 84
7.2.3. Statistical analyses 85
7.3. Results 87
7.3.1. EFA results 87
7.3.2. CFA results 88
7.3.3. Scale construction and assessment 91
7.4. Conclusion 94
7.5. Funding 95
7.6. References 95
Chapter 8. The State of the Art in Flexible Regression Models for Univariate Bounded Responses 99
Agnese Maria DI BRISCO, Roberto ASCARI, Sonia MIGLIORATI and Andrea ONGARO
8.1. Introduction 100
8.2. Regression model for bounded responses 101
8.2.1. Augmentation 102
8.2.2. Main distributions on the bounded support 103
8.2.3. Inference and fit 106
8.3. Case studies 107
8.3.1. Stress data 107
8.3.2. Reading data 110
8.4. References 112
Chapter 9. Simulation Studies for a Special Mixture Regression Model with Multivariate Responses on the Simplex 115
Agnese Maria DI BRISCO, Roberto ASCARI, Sonia MIGLIORATI and Andrea ONGARO
9.1. Introduction 115
9.2. Dirichlet and EFD distributions 116
9.3. Dirichlet and EFD regression models 118
9.3.1. Inference and fit 118
9.4. Simulation studies 119
9.4.1. Comments 124
9.5. References 131
Part 2 133
Chapter 10. Numerical Studies of Implied Volatility Expansions Under the Gatheral Model 135
Marko DIMITROV, Mohammed ALBUHAYRI, Ying NI and Anatoliy MALYARENKO
10.1. Introduction 135
10.2. Asymptotic expansions of implied volatility 137
10.3. Performance of the asymptotic expansions 139
10.4. Calibration using the asymptotic expansions 141
10.4.1. A partial calibration procedure 142
10.4.2. Calibration to synthetic and market data 143
10.5. Conclusion and future work 147
10.6. References 148
Chapter 11. Performance Persistence of Polish Mutual Funds: Mobility Measures 149
Dariusz FILIP
11.1. Introduction 149
11.2. Literature review 150
11.3. Dataset and empirical design 153
11.4. Empirical results 155
11.5. Monthly perspective 156
11.6. Quarterly perspective 157
11.7. Yearly perspective 158
11.8. Conclusion 159
11.9. References 159
Chapter 12. Invariant Description for a Batch Version of the UCB Strategy with Unknown Control Horizon 163
Sergey GARBAR
12.1. Introduction 163
12.2. UCB strategy 165
12.3. Batch version of the strategy 165
12.4. Invariant description with a unit control horizon 166
12.5. Simulation results 169
12.6. Conclusion 170
12.7. Affiliations 171
12.8. References 171
Chapter 13. A New Non-monotonic Link Function for Beta Regressions 173
Gloria GHENO
13.1. Introduction 174
13.2. Model 175
13.3. Estimation 178
13.4. Comparison 179
13.5. Conclusion 184
13.6. References 184
Chapter 14. A Method of Big Data Collection and Normalization for Electronic Engineering Applications 187
Naveenbalaji GOWTHAMAN and Viranjay M. SRIVASTAVA
14.1. Introduction 187
14.2. Machine learning (ML) in electronic engineering 189
14.2.1. Data acquisition 190
14.2.2. Accessing the data repositories 191
14.2.3. Data storage and management 192
14.3. Electronic engineering applications - data science 193
14.4. Conclusion and future work 195
14.5. References 195
Chapter 15. Stochastic Runge-Kutta Solvers Based on Markov Jump Processes and Applications to Non-autonomous Systems of Differential Equations 199
Flavius GUIAŞ
15.1. Introduction 199
15.2. Description of the method 201
15.2.1. The direct simulation method 201
15.2.2. Picard iterations 201
15.2.3. Runge-Kutta steps 202
15.3. Numerical examples 203
15.3.1. The Lorenz system 203
15.3.2. A combustion model 204
15.4. Conclusion 206
15.5. References 206
Chapter 16. Interpreting a Topological Measure of Complexity for Decision Boundaries 207
Alan HYLTON, Ian LIM, Michael MOY and Robert SHORT
16.1. Introduction 207
16.2. Persistent homology 209
16.3. Methodology 213
16.3.1. Neural networks and binary classification 213
16.3.2. Persistent homology of a decision boundary 213
16.3.3. Procedure 214
16.4. Experiments and results 215
16.4.1. Three-dimensional binary classification 215
16.4.2. Data divided by a hyperplane 217
16.5. Conclusion and discussion 219
16.6. References 220
Chapter 17. The Minimum Renyi’s Pseudodistance Estimators for Generalized Linear Models 223
María JAENADA and Leandro PARDO
17.1. Introduction 223
17.2. The minimum RP estimators for the GLM model: asymptotic distribution 225
17.3. Example: Poisson regression model 230
17.3.1. Real data application 230
17.4. Conclusion 232
17.5. Acknowledgments 232
17.6. Appendix 232
17.6.1. Proof of Theorem 1 232
17.7. References 234
Chapter 18. Data Analysis based on Entropies and Measures of Divergence 237
Christos MESELIDIS, Alex KARAGRIGORIOU and Takis PAPAIOANNOU
18.1. Introduction 237
18.2. Divergence measures 238
18.3. Tests of fit based on Φ-divergence measures 241
18.4. Simulations 246
18.5. References 254
Part 3 259
Chapter 19. Geographically Weighted Regression for Official Land Prices and their Temporal Variation in Tokyo 261
Yuta KANNO and Takayuki SHIOHAMA
19.1. Introduction 261
19.2. Models and methodology 263
19.3. Data analysis 266
19.3.1. Data 266
19.3.2. Results 268
19.4. Conclusion 272
19.5. Acknowledgments 273
19.6. References 273
Chapter 20. Software Cost Estimation Using Machine Learning Algorithms 275
Sukran EBREN KARA and Rüya ŞAMLI
20.1. Introduction 275
20.2. Methodology 276
20.2.1. Dataset 276
20.2.2. Model 277
20.2.3. Evaluating the performance of the model 278
20.3. Results and discussion 279
20.4. Conclusion 282
20.5. References 283
Chapter 21. Monte Carlo Accuracy Evaluation of Laser Cutting Machine 285
Samuel KOSOLAPOV
21.1. Introduction 286
21.2. Mathematical model of a pintograph 286
21.3. Monte Carlo simulator 291
21.4. Simulation results 294
21.5. Conclusion 295
21.6. Acknowledgments 295
21.7. References 295
Chapter 22. Using Parameters of Piecewise Approximation by Exponents for Epidemiological Time Series Data Analysis 297
Samuel KOSOLAPOV
22.1. Introduction 298
22.2. Deriving equations for moving exponent parameters 298
22.3. Validation of derived equations by using synthetic data 300
22.4. Using derived equations to analyze real-life Covid-19 data 302
22.5. Conclusion 305
22.6. References 306
Chapter 23. The Correlation Between Oxygen Consumption and Excretion of Carbon Dioxide in the Human Respiratory Cycle 307
Anatoly KOVALENKO, Konstantin LEBEDINSKII and Verangelina MOLOSHNEVA
23.1. Introduction 308
23.2. Respiratory function physiology: ventilation-perfusion ratio 309
23.3. The basic principle of operation of artificial lung ventilation devices: patient monitoring parameters 310
23.4. The algorithm for monitoring the carbon emissions and oxygen consumption 312
23.5. Results 314
23.6. Conclusion 316
23.7. References 316
Part 4 317
Chapter 24. Approximate Bayesian Inference Using the Mean-Field Distribution 319
Antonin DELLA NOCE and Paul-Henry COURNÈDE
24.1. Introduction 319
24.2. Inference problem in a symmetric population system 321
24.2.1. Example of a symmetric system describing plant competition 321
24.2.2. Inference problem of the Schneider system, in a more general setting 323
24.3. Properties of the mean-field distribution 325
24.4. Mean-field approximated inference 327
24.4.1. Case of systems admitting a mean-field limit 327
24.5. Conclusion 330
24.6. References 330
Chapter 25. Pricing Financial Derivatives in the Hull-White Model Using Cubature Methods on Wiener Space 333
Hossein NOHROUZIAN, Anatoliy MALYARENKO and Ying NI
25.1. Introduction and outline 333
25.2. Cubature formulae on Wiener space 335
25.2.1. A simple example of classical Monte Carlo estimates 335
25.2.2. Modern Monte Carlo estimates via cubature method 336
25.2.3. An application in the Black-Scholes SDE 338
25.2.4. Trajectories of the cubature formula of degree 5 on Wiener space 339
25.2.5. Trajectories of price process given in equation [25.7] 340
25.2.6. An application on path-dependent derivatives 341
25.2.7. Trinomial tree (model) via cubature formulae of degree 5 342
25.3. Interest-rate models and Hull-White one-factor model 343
25.3.1. Equilibrium models 343
25.3.2. No-arbitrage models 344
25.3.3. Forward rate models 345
25.3.4. Hull-White one-factor model 345
25.3.5. Discretization of the Hull-White model via Euler scheme 346
25.3.6. Hull-White model for bond prices 346
25.4. The Hull-White model via cubature method 349
25.4.1. Simulating SDE [25.15] and ODE [25.24] 350
25.4.2. The Hull-White interest-rate tree via iterated cubature formulae: some examples 353
25.5. Discussion and future works 354
25.6. References 355
Chapter 26. Differences in the Structure of Infectious Morbidity of the Population during the First and Second Half of 2020 in St. Petersburg 359
Vasilii OREL, Olga NOSYREVA, Tatiana BULDAKOVA, Natalya GUREVA, Viktoria SMIRNOVA, Andrey KIM and Lubov SHARAFUTDINOVA
26.1. Introduction 360
26.2. Materials and methods 360
26.2.1. Characteristics of the territory of the district 360
26.2.2. Demographic characteristics of the area 360
26.2.3. Characteristics of the district medical service 361
26.2.4. The procedure for collecting primary information on cases of diseases of the population with a new coronavirus infection 361
26.3. Results of the analysis of the incidence of acute respiratory viral infectious diseases, new coronavirus infection Covid-19 and community-acquired pneumonia 362
26.4. Conclusion 367
26.5. References 368
Chapter 27. High Speed and Secured Network Connectivity for Higher Education Institutions Using Software Defined Networks 371
Lincoln S. PETER and Viranjay M. SRIVASTAVA
27.1. Introduction 372
27.2. Existing model review 373
27.3. Selection of a suitable model 374
27.4. Conclusion and future recommendations 376
27.5. References 376
Chapter 28. Reliability of a Double Redundant System Under the Full Repair Scenario 379
Vladimir RYKOV and Nika IVANOVA
28.1. Introduction 379
28.2. Problem statement, assumptions and notations 381
28.3. Reliability function 384
28.4. Time-dependent system state probabilities 386
28.4.1. General representation of t.d.s.p.s 386
28.4.2. T.d.s.p.s in a separate regeneration period 387
28.5. Steady-state probabilities 392
28.6. Conclusion 393
28.7. References 393
Chapter 29. Predicting Changes in Depression Levels Following the European Economic Downturn of 2008 395
Eleni SERAFETINIDOU and Georgia VERROPOULOU
29.1. Introduction 396
29.1.1. Aims of the study 398
29.2. Data and methods 398
29.2.1. Sample 398
29.2.2. Measures 398
29.3. Results 400
29.3.1. Descriptive findings 400
29.3.2. Non-respondents compared to respondents at baseline (wave 2) 403
29.3.3. Descriptive findings for respondents - analysis by gender 405
29.3.4. Findings regarding decreasing depression levels - analysis for the total sample and by gender 408
29.3.5. Findings regarding increasing depression levels - analysis for the total sample and by gender 410
29.4. Discussion 413
29.5. Conclusion 414
29.6. Acknowledgments 415
29.7. References 415
List of Authors 419
Index 425
Summary of Volume 2 429