Data Analytics in Bioinformatics. A Machine Learning Perspective. Edition No. 1

  • Book

  • 544 Pages
  • March 2021
  • John Wiley and Sons Ltd
  • ID: 5840773
Machine learning techniques are increasingly used to address problems in computational biology and bioinformatics. Novel machine learning techniques for analyzing high-throughput data in the form of sequences, gene and protein expression, pathways, and images are becoming vital for understanding diseases and for future drug discovery. Techniques such as Markov models, support vector machines, neural networks, and graphical models have been successful in analyzing life science data because of their ability to handle randomness, uncertainty, and noise in data and to generalize. The book compiles recent machine learning methods and their applications to contemporary problems in bioinformatics, including classification and prediction of disease, feature selection, dimensionality reduction, gene selection, and classification of microarray data, among others.

Table of Contents

Preface xix

Acknowledgement xxi

Part 1 The Commencement of Machine Learning Solicitation to Bioinformatics 1

1 Introduction to Supervised Learning 3
Rajat Verma, Vishal Nagar and Satyasundara Mahapatra

1.1 Introduction 4

1.2 Learning Process & its Methodologies 5

1.3 Classification and its Types 10

1.4 Regression 12

1.5 Random Forest 18

1.6 K-Nearest Neighbor 20

1.7 Decision Trees 21

1.8 Support Vector Machines 22

1.9 Neural Networks 24

1.10 Comparison of Numerical Interpretation 26

1.11 Conclusion & Future Scope 27

References 28

2 Introduction to Unsupervised Learning in Bioinformatics 35
Nancy Anurag Parasa, Jaya Vinay Namgiri, Sachi Nandan Mohanty and Jatindra Kumar Dash

2.1 Introduction 36

2.2 Clustering in Unsupervised Learning 37

2.3 Clustering in Bioinformatics - Genetic Data 38

2.4 Conclusion 46

References 47

3 A Critical Review on the Application of Artificial Neural Network in Bioinformatics 51
Vrs Jhalia and Tripti Swarnkar

3.1 Introduction 52

3.2 Biological Datasets 57

3.3 Building Computational Model 58

3.4 Literature Review 64

3.5 Critical Analysis 72

3.6 Conclusion 73

References 73

Part 2 Machine Learning and Genomic Technology, Feature Selection and Dimensionality Reduction 77

4 Dimensionality Reduction Techniques: Principles, Benefits, and Limitations 79
Hemanta Kumar Palo, Santanu Sahoo and Asit Kumar Subudhi

4.1 Introduction 80

4.2 The Benefits and Limitations of Dimension Reduction Methods 81

4.3 Components of Dimension Reduction 83

4.4 Methods of Dimensionality Reduction 86

4.5 Conclusion 104

References 105

5 Plant Disease Detection Using Machine Learning Tools With an Overview on Dimensionality Reduction 109
Saurav Roy, Ratula Ray, Satya Ranjan Dash and Mrunmay Kumar Giri

5.1 Introduction 110

5.2 Flowchart 112

5.3 Machine Learning (ML) in Rapid Stress Phenotyping 113

5.4 Dimensionality Reduction 114

5.5 Literature Survey 116

5.6 Types of Plant Stress 128

5.7 Implementation I: Numerical Dataset 130

5.8 Implementation II: Image Dataset 134

5.9 Conclusion 140

References 141

6 Gene Selection Using Integrative Analysis of Multi-Level Omics Data: A Systematic Review 145
S. Mahapatra and T. Swarnkar

6.1 Introduction 146

6.2 Approaches for Gene Selection 147

6.3 Multi-Level Omics Data Integration 152

6.4 Machine Learning Approaches for Multi-Level Data Integration 153

6.5 Critical Observation 165

6.6 Conclusion 166

References 166

7 Random Forest Algorithm in Imbalance Genomics Classification 173
Sudhansu Shekhar Patra, Om Prakash Jena, Gaurav Kumar, Sreyashi Pramanik, Chinmaya Misra and Kamakhya Narain Singh

7.1 Introduction 173

7.2 Methodological Issues 175

7.3 Biological Terminologies 181

7.4 Proposed Model 183

7.5 Experimental Analysis 186

7.6 Current and Future Scope of ML in Genomics 188

7.7 Conclusion 189

References 189

8 Feature Selection and Random Forest Classification for Breast Cancer Disease 191
Shubham Raj, Swati Singh, Avinash Kumar, Sobhangi Sarkar and Chittaranjan Pradhan

8.1 Introduction 192

8.2 Literature Survey 192

8.3 Machine Learning 196

8.4 Feature Engineering 202

8.5 Methodology 204

8.6 Result Analysis 209

8.7 Conclusion 210

References 210

9 A Comprehensive Study on the Application of Grey Wolf Optimization for Microarray Data 211
Swati Sucharita, Barnali Sahu and Tripti Swarnkar

9.1 Introduction 212

9.2 Microarray Data 213

9.3 Grey Wolf Optimization (GWO) Algorithm 214

9.4 Studies on GWO Variants 220

9.5 Application of GWO in Medical Domain 232

9.6 Application of GWO in Microarray Data 232

9.7 Conclusion and Future Work 232

References 243

10 The Cluster Analysis and Feature Selection: Perspective of Machine Learning and Image Processing 249
Aradhana Behura

10.1 Introduction 251

10.2 Various Image Segmentation Techniques 254

10.3 How to Deal With Image Dataset 256

10.4 Class Imbalance Problem 264

10.5 Optimization of Hyperparameter 267

10.6 Case Study 270

10.7 Using AI to Detect Coronavirus 273

10.8 Using Artificial Intelligence (AI), CT Scan and X-Ray 274

10.9 Conclusion 276

References 276

Part 3 Machine Learning and Healthcare Applications 281

11 Artificial Intelligence and Machine Learning for Healthcare Solutions 283
Ashok Sharma, Parveen Singh and Gowhar Dar

11.1 Introduction 284

11.2 Using Machine Learning Approaches for Different Purposes 284

11.3 Various Resources of Medical Data Set for Research 286

11.4 Deep Learning in Healthcare 287

11.5 Various Projects in Medical Imaging and Diagnostics 288

11.6 Conclusion 289

References 290

12 Forecasting of Novel Corona Virus Disease (COVID-19) Using LSTM and XG Boosting Algorithms 293
V. Aakash, S. Sridevi, G. Ananthi and S. Rajaram

12.1 Introduction 294

12.2 Machine Learning Algorithms for Forecasting 296

12.3 Proposed Method 300

12.4 Implementation 304

12.5 Results and Discussion 307

12.6 Conclusion and Future Work 310

References 310

13 An Innovative Machine Learning Approach to Diagnose Cancer at Early Stage 313
Poongodi, P., Udayakumar, E., Srihari, K. and Sachi Nandan Mohanty

13.1 Introduction 314

13.2 Related Work 317

13.3 Materials and Methods 320

13.4 System Design 322

13.5 Results and Discussion 331

13.6 Conclusion 335

References 335

14 A Study of Human Sleep Staging Behavior Based on Polysomnography Using Machine Learning Techniques 339
Santosh Kumar Satapathy and D. Loganathan

14.1 Introduction 340

14.2 Polysomnography Signal Analysis 341

14.3 Case Study on Automated Sleep Stage Scoring 349

14.4 Summary and Conclusion 356

References 357

15 Detection of Schizophrenia Using EEG Signals 359
Shalini Mahato, Laxmi Kumari Pathak and Kajal Kumari

15.1 Introduction 360

15.2 Methodology 367

15.3 Literature Review 372

15.4 Discussion 372

15.5 Conclusion 388

References 388

16 Performance Analysis of Signal Processing Techniques in Bioinformatics for Medical Applications Using Machine Learning Concepts 391
G. Aparna, G. Anitha Mary and G. Sumana

16.1 Introduction 392

16.2 Basic Definition of Anatomy and Cell at Micro Level 397

16.3 Signal Processing - Genome Signal Processing 403

16.4 Hotspots Identification Algorithm 414

16.5 Results - Experimental Investigations 416

16.6 Analysis Using Machine Learning Metrics 418

16.7 Conclusion 424

Appendix 424

A.1 Hotspot Identification Code 424

A.2 Performance Metrics Code 425

References 427

17 Survey of Various Statistical Numerical and Machine Learning Ontological Models on Infectious Disease Ontology 431
Yuvaraj Natarajan, Srihari Kannan and Sachi Nandan Mohanty

17.1 Introduction 432

17.2 Disease Ontology 432

17.3 Infectious Disease Ontology 433

17.4 Biomedical Ontologies on IDO 434

17.5 Various Methods on IDO 435

17.6 Machine Learning-Based Ontology for IDO 436

17.7 Recommendation or Suggestions for Future Study 437

17.8 Conclusions 438

References 438

18 An Efficient Model for Predicting Liver Disease Using Machine Learning 443
Ritesh Choudhary, T. Gopalakrishnan, D. Ruby, A. Gayathri, Vishnu Srinivasa Murthy and Rishabh Shekhar

18.1 Introduction 444

18.2 Related Works 445

18.3 Proposed Model 446

18.4 Results and Analysis 454

18.5 Conclusion 456

References 456

Part 4 Bioinformatics and Market Analysis 459

19 A Novel Approach for Prediction of Stock Market Behavior Using Bioinformatics Techniques 461
Prakash Kumar Sarangi, Birendra Kumar Nayak and Sachidananda Dehuri

19.1 Introduction 462

19.2 Literature Review 463

19.3 Proposed Work 466

19.4 Experimental Study 470

19.5 Conclusion and Future Work 482

References 484

20 Stock Market Price Behavior Prediction Using Markov Models: A Bioinformatics Approach 485
Prakash Kumar Sarangi, Birendra Kumar Nayak and Sachidananda Dehuri

20.1 Introduction 486

20.2 Literature Survey 487

20.3 Proposed Work 488

20.4 Experimental Work 497

20.5 Conclusions and Future Work 504

References 505

Index 507

Authors

Rabinarayan Satpathy, National Institute of Technology - Rourkela.
Tanupriya Choudhury, Deetya Soft Pvt. Ltd., Noida.
Suneeta Satpathy, Utkal University, Bhubaneswar, Odisha.
Sachi Nandan Mohanty, IIT Kharagpur.
Xiaobo Zhang, Guangdong University of Technology, China.