Table of Contents
Preface xix
Acknowledgement xxi
Part 1 The Commencement of Machine Learning Solicitation to Bioinformatics 1
1 Introduction to Supervised Learning 3
Rajat Verma, Vishal Nagar and Satyasundara Mahapatra
1.1 Introduction 4
1.2 Learning Process & its Methodologies 5
1.3 Classification and its Types 10
1.4 Regression 12
1.5 Random Forest 18
1.6 K-Nearest Neighbor 20
1.7 Decision Trees 21
1.8 Support Vector Machines 22
1.9 Neural Networks 24
1.10 Comparison of Numerical Interpretation 26
1.11 Conclusion & Future Scope 27
References 28
2 Introduction to Unsupervised Learning in Bioinformatics 35
Nancy Anurag Parasa, Jaya Vinay Namgiri, Sachi Nandan Mohanty and Jatindra Kumar Dash
2.1 Introduction 36
2.2 Clustering in Unsupervised Learning 37
2.3 Clustering in Bioinformatics - Genetic Data 38
2.4 Conclusion 46
References 47
3 A Critical Review on the Application of Artificial Neural Network in Bioinformatics 51
Vrs Jhalia and Tripti Swarnkar
3.1 Introduction 52
3.2 Biological Datasets 57
3.3 Building Computational Model 58
3.4 Literature Review 64
3.5 Critical Analysis 72
3.6 Conclusion 73
References 73
Part 2 Machine Learning and Genomic Technology, Feature Selection and Dimensionality Reduction 77
4 Dimensionality Reduction Techniques: Principles, Benefits, and Limitations 79
Hemanta Kumar Palo, Santanu Sahoo and Asit Kumar Subudhi
4.1 Introduction 80
4.2 The Benefits and Limitations of Dimension Reduction Methods 81
4.3 Components of Dimension Reduction 83
4.4 Methods of Dimensionality Reduction 86
4.5 Conclusion 104
References 105
5 Plant Disease Detection Using Machine Learning Tools With an Overview on Dimensionality Reduction 109
Saurav Roy, Ratula Ray, Satya Ranjan Dash and Mrunmay Kumar Giri
5.1 Introduction 110
5.2 Flowchart 112
5.3 Machine Learning (ML) in Rapid Stress Phenotyping 113
5.4 Dimensionality Reduction 114
5.5 Literature Survey 116
5.6 Types of Plant Stress 128
5.7 Implementation I: Numerical Dataset 130
5.8 Implementation II: Image Dataset 134
5.9 Conclusion 140
References 141
6 Gene Selection Using Integrative Analysis of Multi-Level Omics Data: A Systematic Review 145
S. Mahapatra and T. Swarnkar
6.1 Introduction 146
6.2 Approaches for Gene Selection 147
6.3 Multi-Level Omics Data Integration 152
6.4 Machine Learning Approaches for Multi-Level Data Integration 153
6.5 Critical Observation 165
6.6 Conclusion 166
References 166
7 Random Forest Algorithm in Imbalance Genomics Classification 173
Sudhansu Shekhar Patra, Om Prakash Jena, Gaurav Kumar, Sreyashi Pramanik, Chinmaya Misra and Kamakhya Narain Singh
7.1 Introduction 173
7.2 Methodological Issues 175
7.3 Biological Terminologies 181
7.4 Proposed Model 183
7.5 Experimental Analysis 186
7.6 Current and Future Scope of ML in Genomics 188
7.7 Conclusion 189
References 189
8 Feature Selection and Random Forest Classification for Breast Cancer Disease 191
Shubham Raj, Swati Singh, Avinash Kumar, Sobhangi Sarkar and Chittaranjan Pradhan
8.1 Introduction 192
8.2 Literature Survey 192
8.3 Machine Learning 196
8.4 Feature Engineering 202
8.5 Methodology 204
8.6 Result Analysis 209
8.7 Conclusion 210
References 210
9 A Comprehensive Study on the Application of Grey Wolf Optimization for Microarray Data 211
Swati Sucharita, Barnali Sahu and Tripti Swarnkar
9.1 Introduction 212
9.2 Microarray Data 213
9.3 Grey Wolf Optimization (GWO) Algorithm 214
9.4 Studies on GWO Variants 220
9.5 Application of GWO in Medical Domain 232
9.6 Application of GWO in Microarray Data 232
9.7 Conclusion and Future Work 232
References 243
10 The Cluster Analysis and Feature Selection: Perspective of Machine Learning and Image Processing 249
Aradhana Behura
10.1 Introduction 251
10.2 Various Image Segmentation Techniques 254
10.3 How to Deal With Image Dataset 256
10.4 Class Imbalance Problem 264
10.5 Optimization of Hyperparameter 267
10.6 Case Study 270
10.7 Using AI to Detect Coronavirus 273
10.8 Using Artificial Intelligence (AI), CT Scan and X-Ray 274
10.9 Conclusion 276
References 276
Part 3 Machine Learning and Healthcare Applications 281
11 Artificial Intelligence and Machine Learning for Healthcare Solutions 283
Ashok Sharma, Parveen Singh and Gowhar Dar
11.1 Introduction 284
11.2 Using Machine Learning Approaches for Different Purposes 284
11.3 Various Resources of Medical Data Set for Research 286
11.4 Deep Learning in Healthcare 287
11.5 Various Projects in Medical Imaging and Diagnostics 288
11.6 Conclusion 289
References 290
12 Forecasting of Novel Corona Virus Disease (COVID-19) Using LSTM and XG Boosting Algorithms 293
V. Aakash, S. Sridevi, G. Ananthi and S. Rajaram
12.1 Introduction 294
12.2 Machine Learning Algorithms for Forecasting 296
12.3 Proposed Method 300
12.4 Implementation 304
12.5 Results and Discussion 307
12.6 Conclusion and Future Work 310
References 310
13 An Innovative Machine Learning Approach to Diagnose Cancer at Early Stage 313
Poongodi, P., Udayakumar, E., Srihari, K. and Sachi Nandan Mohanty
13.1 Introduction 314
13.2 Related Work 317
13.3 Materials and Methods 320
13.4 System Design 322
13.5 Results and Discussion 331
13.6 Conclusion 335
References 335
14 A Study of Human Sleep Staging Behavior Based on Polysomnography Using Machine Learning Techniques 339
Santosh Kumar Satapathy and D. Loganathan
14.1 Introduction 340
14.2 Polysomnography Signal Analysis 341
14.3 Case Study on Automated Sleep Stage Scoring 349
14.4 Summary and Conclusion 356
References 357
15 Detection of Schizophrenia Using EEG Signals 359
Shalini Mahato, Laxmi Kumari Pathak and Kajal Kumari
15.1 Introduction 360
15.2 Methodology 367
15.3 Literature Review 372
15.4 Discussion 372
15.5 Conclusion 388
References 388
16 Performance Analysis of Signal Processing Techniques in Bioinformatics for Medical Applications Using Machine Learning Concepts 391
G. Aparna, G. Anitha Mary and G. Sumana
16.1 Introduction 392
16.2 Basic Definition of Anatomy and Cell at Micro Level 397
16.3 Signal Processing - Genome Signal Processing 403
16.4 Hotspots Identification Algorithm 414
16.5 Results - Experimental Investigations 416
16.6 Analysis Using Machine Learning Metrics 418
16.7 Conclusion 424
Appendix 424
A.1 Hotspot Identification Code 424
A.2 Performance Metrics Code 425
References 427
17 Survey of Various Statistical Numerical and Machine Learning Ontological Models on Infectious Disease Ontology 431
Yuvaraj Natarajan, Srihari Kannan and Sachi Nandan Mohanty
17.1 Introduction 432
17.2 Disease Ontology 432
17.3 Infectious Disease Ontology 433
17.4 Biomedical Ontologies on IDO 434
17.5 Various Methods on IDO 435
17.6 Machine Learning-Based Ontology for IDO 436
17.7 Recommendation or Suggestions for Future Study 437
17.8 Conclusions 438
References 438
18 An Efficient Model for Predicting Liver Disease Using Machine Learning 443
Ritesh Choudhary, T. Gopalakrishnan, D. Ruby, A. Gayathri, Vishnu Srinivasa Murthy and Rishabh Shekhar
18.1 Introduction 444
18.2 Related Works 445
18.3 Proposed Model 446
18.4 Results and Analysis 454
18.5 Conclusion 456
References 456
Part 4 Bioinformatics and Market Analysis 459
19 A Novel Approach for Prediction of Stock Market Behavior Using Bioinformatics Techniques 461
Prakash Kumar Sarangi, Birendra Kumar Nayak and Sachidananda Dehuri
19.1 Introduction 462
19.2 Literature Review 463
19.3 Proposed Work 466
19.4 Experimental Study 470
19.5 Conclusion and Future Work 482
References 484
20 Stock Market Price Behavior Prediction Using Markov Models: A Bioinformatics Approach 485
Prakash Kumar Sarangi, Birendra Kumar Nayak and Sachidananda Dehuri
20.1 Introduction 486
20.2 Literature Survey 487
20.3 Proposed Work 488
20.4 Experimental Work 497
20.5 Conclusions and Future Work 504
References 505
Index 507