Data Analysis and Related Applications, Volume 1. Computational, Algorithmic and Applied Economic Data Analysis. Edition No. 1

  • Book

  • 480 Pages
  • October 2022
  • John Wiley and Sons Ltd
  • ID: 5839184
The scientific field of data analysis is constantly expanding due to the rapid growth of the computer industry and the wide applicability of computational and algorithmic techniques, in conjunction with new advances in statistical, stochastic and analytic tools. There is a constant need for new, high-quality publications to cover the recent advances in all fields of science and engineering.

This book is a collective work by a number of leading scientists, computer experts, analysts, engineers, mathematicians, probabilists and statisticians who have been working at the forefront of data analysis and related applications. The chapters of this collaborative work represent a cross-section of current concerns, developments and research interests in the above scientific areas. The collected material has been divided into appropriate sections to provide the reader with both theoretical and applied information on data analysis methods, models and techniques, along with related applications.

Table of Contents

Preface xvii
Konstantinos N. ZAFEIRIS, Yiannis DIMOTIKALIS, Christos H. SKIADAS, Alex KARAGRIGORIOU and Christiana KARAGRIGORIOU-VONTA

Part 1 1

Chapter 1. Performance of Evaluation of Diagnosis of Various Thyroid Diseases Using Machine Learning Techniques 3
Burcu Bektas GÜNEŞ, Evren BURSUK and Rüya ŞAMLI

1.1. Introduction 3

1.2. Data understanding 5

1.3. Modeling 6

1.4. Findings 8

1.5. Conclusion 10

1.6. References 10

Chapter 2. Exploring Chronic Diseases’ Spatial Patterns: Thyroid Cancer in Sicilian Volcanic Areas 13
Francesca BITONTI and Angelo MAZZA

2.1. Introduction 14

2.2. Epidemiological data and territory 16

2.3. Methodology 18

2.3.1. Spatial inhomogeneity and spatial dependence 18

2.3.2. Standardized incidence ratio (SIR) 19

2.3.3. Local Moran’s I statistic 21

2.4. Spatial distribution of TC in eastern Sicily 22

2.4.1. SIR geographical variation 22

2.4.2. Estimate of the spatial attraction 24

2.5. Conclusion 25

2.6. References 26

Chapter 3. Analysis of Blockchain-based Databases in Web Applications 31
Orhun Ceng BOZO and Rüya ŞAMLI

3.1. Introduction 31

3.2. Background 32

3.2.1. Blockchain 32

3.2.2. Blockchain types 32

3.2.3. Blockchain-based web applications 33

3.2.4. Blockchain consensus algorithms 33

3.2.5. Other consensus algorithms 34

3.3. Analysis stack 34

3.3.1. Art Shop web application 34

3.3.2. SQL-based application 34

3.3.3. NoSQL-based application 35

3.3.4. Blockchain-based application 35

3.4. Analysis 36

3.4.1. Adding records 36

3.4.2. Query 38

3.4.3. Functionality 39

3.4.4. Security 39

3.5. Conclusion 41

3.6. References 41

Chapter 4. Optimization and Asymptotic Analysis of Insurance Models 43
Ekaterina BULINSKAYA

4.1. Introduction 43

4.2. Discrete-time model with reinsurance and bank loans 44

4.2.1. Model description 44

4.2.2. Optimization problem 45

4.2.3. Model stability 46

4.3. Continuous-time insurance model with dividends 48

4.3.1. Model description 48

4.3.2. Optimal barrier strategy 49

4.3.3. Special form of claim distribution 50

4.3.4. Numerical analysis 54

4.4. Conclusion and further research directions 55

4.5. References 56

Chapter 5. Statistical Analysis of Traffic Volume in the 25 de Abril Bridge 57
Frederico CAEIRO, Ayana MATEUS and Conceicao VEIGA de ALMEIDA

5.1. Introduction 57

5.2. Data 58

5.3. Methodology 60

5.3.1. Main limit results 60

5.3.2. Block maxima method 61

5.3.3. Largest order statistics method 62

5.3.4. Estimation of other tail parameters 63

5.4. Results and conclusion 63

5.5. Acknowledgements 65

5.6. References 65

Chapter 6. Predicting the Risk of Gestational Diabetes Mellitus through Nearest Neighbor Classification 67
Louisa TESTA, Mark A. CARUANA, Maria KONTORINAKI and Charles SAVONA-VENTURA

6.1. Introduction 67

6.2. Nearest neighbor methods 69

6.2.1. Background of the NN methods 69

6.2.2. The k-nearest neighbors method 70

6.2.3. The fixed-radius NN method 70

6.2.4. The kernel-NN method 71

6.2.5. Algorithms of the three considered NN methods 72

6.2.6. Parameter and distance metric selection 74

6.3. Experimental results 75

6.3.1. Dataset description 75

6.3.2. Variable selection and data splitting 75

6.3.3. Results 76

6.3.4. A discussion and comparison of results 78

6.4. Conclusion 79

6.5. References 79

Chapter 7. Political Trust in National Institutions: The Significance of Items’ Level of Measurement in the Validation of Constructs 81
Anastasia CHARALAMPI, Eva TSOUPAROPOULOU, Joanna TSIGANOU and Catherine MICHALOPOULOU

7.1. Introduction 82

7.2. Methods 83

7.2.1. Participants 83

7.2.2. Instrument 84

7.2.3. Statistical analyses 85

7.3. Results 87

7.3.1. EFA results 87

7.3.2. CFA results 88

7.3.3. Scale construction and assessment 91

7.4. Conclusion 94

7.5. Funding 95

7.6. References 95

Chapter 8. The State of the Art in Flexible Regression Models for Univariate Bounded Responses 99
Agnese Maria DI BRISCO, Roberto ASCARI, Sonia MIGLIORATI and Andrea ONGARO

8.1. Introduction 100

8.2. Regression model for bounded responses 101

8.2.1. Augmentation 102

8.2.2. Main distributions on the bounded support 103

8.2.3. Inference and fit 106

8.3. Case studies 107

8.3.1. Stress data 107

8.3.2. Reading data 110

8.4. References 112

Chapter 9. Simulation Studies for a Special Mixture Regression Model with Multivariate Responses on the Simplex 115
Agnese Maria DI BRISCO, Roberto ASCARI, Sonia MIGLIORATI and Andrea ONGARO

9.1. Introduction 115

9.2. Dirichlet and EFD distributions 116

9.3. Dirichlet and EFD regression models 118

9.3.1. Inference and fit 118

9.4. Simulation studies 119

9.4.1. Comments 124

9.5. References 131

Part 2 133

Chapter 10. Numerical Studies of Implied Volatility Expansions Under the Gatheral Model 135
Marko DIMITROV, Mohammed ALBUHAYRI, Ying NI and Anatoliy MALYARENKO

10.1. Introduction 135

10.2. Asymptotic expansions of implied volatility 137

10.3. Performance of the asymptotic expansions 139

10.4. Calibration using the asymptotic expansions 141

10.4.1. A partial calibration procedure 142

10.4.2. Calibration to synthetic and market data 143

10.5. Conclusion and future work 147

10.6. References 148

Chapter 11. Performance Persistence of Polish Mutual Funds: Mobility Measures 149
Dariusz FILIP

11.1. Introduction 149

11.2. Literature review 150

11.3. Dataset and empirical design 153

11.4. Empirical results 155

11.5. Monthly perspective 156

11.6. Quarterly perspective 157

11.7. Yearly perspective 158

11.8. Conclusion 159

11.9. References 159

Chapter 12. Invariant Description for a Batch Version of the UCB Strategy with Unknown Control Horizon 163
Sergey GARBAR

12.1. Introduction 163

12.2. UCB strategy 165

12.3. Batch version of the strategy 165

12.4. Invariant description with a unit control horizon 166

12.5. Simulation results 169

12.6. Conclusion 170

12.7. Affiliations 171

12.8. References 171

Chapter 13. A New Non-monotonic Link Function for Beta Regressions 173
Gloria GHENO

13.1. Introduction 174

13.2. Model 175

13.3. Estimation 178

13.4. Comparison 179

13.5. Conclusion 184

13.6. References 184

Chapter 14. A Method of Big Data Collection and Normalization for Electronic Engineering Applications 187
Naveenbalaji GOWTHAMAN and Viranjay M. SRIVASTAVA

14.1. Introduction 187

14.2. Machine learning (ML) in electronic engineering 189

14.2.1. Data acquisition 190

14.2.2. Accessing the data repositories 191

14.2.3. Data storage and management 192

14.3. Electronic engineering applications - data science 193

14.4. Conclusion and future work 195

14.5. References 195

Chapter 15. Stochastic Runge-Kutta Solvers Based on Markov Jump Processes and Applications to Non-autonomous Systems of Differential Equations 199
Flavius GUIAŞ

15.1. Introduction 199

15.2. Description of the method 201

15.2.1. The direct simulation method 201

15.2.2. Picard iterations 201

15.2.3. Runge-Kutta steps 202

15.3. Numerical examples 203

15.3.1. The Lorenz system 203

15.3.2. A combustion model 204

15.4. Conclusion 206

15.5. References 206

Chapter 16. Interpreting a Topological Measure of Complexity for Decision Boundaries 207
Alan HYLTON, Ian LIM, Michael MOY and Robert SHORT

16.1. Introduction 207

16.2. Persistent homology 209

16.3. Methodology 213

16.3.1. Neural networks and binary classification 213

16.3.2. Persistent homology of a decision boundary 213

16.3.3. Procedure 214

16.4. Experiments and results 215

16.4.1. Three-dimensional binary classification 215

16.4.2. Data divided by a hyperplane 217

16.5. Conclusion and discussion 219

16.6. References 220

Chapter 17. The Minimum Renyi’s Pseudodistance Estimators for Generalized Linear Models 223
María JAENADA and Leandro PARDO

17.1. Introduction 223

17.2. The minimum RP estimators for the GLM model: asymptotic distribution 225

17.3. Example: Poisson regression model 230

17.3.1. Real data application 230

17.4. Conclusion 232

17.5. Acknowledgments 232

17.6. Appendix 232

17.6.1. Proof of Theorem 1 232

17.7. References 234

Chapter 18. Data Analysis based on Entropies and Measures of Divergence 237
Christos MESELIDIS, Alex KARAGRIGORIOU and Takis PAPAIOANNOU

18.1. Introduction 237

18.2. Divergence measures 238

18.3. Tests of fit based on Φ-divergence measures 241

18.4. Simulations 246

18.5. References 254

Part 3 259

Chapter 19. Geographically Weighted Regression for Official Land Prices and their Temporal Variation in Tokyo 261
Yuta KANNO and Takayuki SHIOHAMA

19.1. Introduction 261

19.2. Models and methodology 263

19.3. Data analysis 266

19.3.1. Data 266

19.3.2. Results 268

19.4. Conclusion 272

19.5. Acknowledgments 273

19.6. References 273

Chapter 20. Software Cost Estimation Using Machine Learning Algorithms 275
Sukran EBREN KARA and Rüya ŞAMLI

20.1. Introduction 275

20.2. Methodology 276

20.2.1. Dataset 276

20.2.2. Model 277

20.2.3. Evaluating the performance of the model 278

20.3. Results and discussion 279

20.4. Conclusion 282

20.5. References 283

Chapter 21. Monte Carlo Accuracy Evaluation of Laser Cutting Machine 285
Samuel KOSOLAPOV

21.1. Introduction 286

21.2. Mathematical model of a pintograph 286

21.3. Monte Carlo simulator 291

21.4. Simulation results 294

21.5. Conclusion 295

21.6. Acknowledgments 295

21.7. References 295

Chapter 22. Using Parameters of Piecewise Approximation by Exponents for Epidemiological Time Series Data Analysis 297
Samuel KOSOLAPOV

22.1. Introduction 298

22.2. Deriving equations for moving exponent parameters 298

22.3. Validation of derived equations by using synthetic data 300

22.4. Using derived equations to analyze real-life Covid-19 data 302

22.5. Conclusion 305

22.6. References 306

Chapter 23. The Correlation Between Oxygen Consumption and Excretion of Carbon Dioxide in the Human Respiratory Cycle 307
Anatoly KOVALENKO, Konstantin LEBEDINSKII and Verangelina MOLOSHNEVA

23.1. Introduction 308

23.2. Respiratory function physiology: ventilation-perfusion ratio 309

23.3. The basic principle of operation of artificial lung ventilation devices: patient monitoring parameters 310

23.4. The algorithm for monitoring the carbon emissions and oxygen consumption 312

23.5. Results 314

23.6. Conclusion 316

23.7. References 316

Part 4 317

Chapter 24. Approximate Bayesian Inference Using the Mean-Field Distribution 319
Antonin DELLA NOCE and Paul-Henry COURNÈDE

24.1. Introduction 319

24.2. Inference problem in a symmetric population system 321

24.2.1. Example of a symmetric system describing plant competition 321

24.2.2. Inference problem of the Schneider system, in a more general setting 323

24.3. Properties of the mean-field distribution 325

24.4. Mean-field approximated inference 327

24.4.1. Case of systems admitting a mean-field limit 327

24.5. Conclusion 330

24.6. References 330

Chapter 25. Pricing Financial Derivatives in the Hull-White Model Using Cubature Methods on Wiener Space 333
Hossein NOHROUZIAN, Anatoliy MALYARENKO and Ying NI

25.1. Introduction and outline 333

25.2. Cubature formulae on Wiener space 335

25.2.1. A simple example of classical Monte Carlo estimates 335

25.2.2. Modern Monte Carlo estimates via cubature method 336

25.2.3. An application in the Black-Scholes SDE 338

25.2.4. Trajectories of the cubature formula of degree 5 on Wiener space 339

25.2.5. Trajectories of price process given in equation [25.7] 340

25.2.6. An application on path-dependent derivatives 341

25.2.7. Trinomial tree (model) via cubature formulae of degree 5 342

25.3. Interest-rate models and Hull-White one-factor model 343

25.3.1. Equilibrium models 343

25.3.2. No-arbitrage models 344

25.3.3. Forward rate models 345

25.3.4. Hull-White one-factor model 345

25.3.5. Discretization of the Hull-White model via Euler scheme 346

25.3.6. Hull-White model for bond prices 346

25.4. The Hull-White model via cubature method 349

25.4.1. Simulating SDE [25.15] and ODE [25.24] 350

25.4.2. The Hull-White interest-rate tree via iterated cubature formulae: some examples 353

25.5. Discussion and future works 354

25.6. References 355

Chapter 26. Differences in the Structure of Infectious Morbidity of the Population during the First and Second Half of 2020 in St. Petersburg 359
Vasilii OREL, Olga NOSYREVA, Tatiana BULDAKOVA, Natalya GUREVA, Viktoria SMIRNOVA, Andrey KIM and Lubov SHARAFUTDINOVA

26.1. Introduction 360

26.2. Materials and methods 360

26.2.1. Characteristics of the territory of the district 360

26.2.2. Demographic characteristics of the area 360

26.2.3. Characteristics of the district medical service 361

26.2.4. The procedure for collecting primary information on cases of diseases of the population with a new coronavirus infection 361

26.3. Results of the analysis of the incidence of acute respiratory viral infectious diseases, new coronavirus infection Covid-19 and community-acquired pneumonia 362

26.4. Conclusion 367

26.5. References 368

Chapter 27. High Speed and Secured Network Connectivity for Higher Education Institutions Using Software Defined Networks 371
Lincoln S. PETER and Viranjay M. SRIVASTAVA

27.1. Introduction 372

27.2. Existing model review 373

27.3. Selection of a suitable model 374

27.4. Conclusion and future recommendations 376

27.5. References 376

Chapter 28. Reliability of a Double Redundant System Under the Full Repair Scenario 379
Vladimir RYKOV and Nika IVANOVA

28.1. Introduction 379

28.2. Problem statement, assumptions and notations 381

28.3. Reliability function 384

28.4. Time-dependent system state probabilities 386

28.4.1. General representation of t.d.s.p.s 386

28.4.2. T.d.s.p.s in a separate regeneration period 387

28.5. Steady-state probabilities 392

28.6. Conclusion 393

28.7. References 393

Chapter 29. Predicting Changes in Depression Levels Following the European Economic Downturn of 2008 395
Eleni SERAFETINIDOU and Georgia VERROPOULOU

29.1. Introduction 396

29.1.1. Aims of the study 398

29.2. Data and methods 398

29.2.1. Sample 398

29.2.2. Measures 398

29.3. Results 400

29.3.1. Descriptive findings 400

29.3.2. Non-respondents compared to respondents at baseline (wave 2) 403

29.3.3. Descriptive findings for respondents - analysis by gender 405

29.3.4. Findings regarding decreasing depression levels - analysis for the total sample and by gender 408

29.3.5. Findings regarding increasing depression levels - analysis for the total sample and by gender 410

29.4. Discussion 413

29.5. Conclusion 414

29.6. Acknowledgments 415

29.7. References 415

List of Authors 419

Index 425

Summary of Volume 2 429

Authors

Konstantinos N. Zafeiris, Christos H. Skiadas, Yiannis Dimotikalis, Alex Karagrigoriou and Christiana Karagrigoriou-Vonta