
Handbook of Statistical Analysis. AI and ML Applications. Edition No. 3

  • Book

  • December 2024
  • Elsevier Science and Technology
  • ID: 5947813

Handbook of Statistical Analysis: AI and ML Applications, third edition, is a comprehensive introduction to all stages of data analysis, data preparation, model building, and model evaluation. This valuable resource is useful to students and professionals across a variety of fields and settings: business analysts, scientists, engineers, and researchers in academia and industry. General descriptions of algorithms together with case studies help readers understand technical and business problems, weigh the strengths and weaknesses of modern data analysis algorithms, and employ the right analytical methods for practical application. This resource is an ideal guide for users who want to address massive and complex datasets with many standard analytical approaches and be able to evaluate analyses and solutions objectively. It includes clear, intuitive explanations of the principles and tools for solving problems using modern analytic techniques; offers accessible tutorials; and discusses their application to real-world problems.

Table of Contents

Part I Introduction
1. Historical Background to Analytics
2. Theory
3. Data Mining and Predictive Analytic Process
4. Data Science Tool Types: Which One Is Best?

Part II Data Preparation
5. Data Access
6. Data Understanding
7. Data Visualization
8. Data Cleaning
9. Data Conditioning
10. Feature Engineering
11. Feature Selection
12. Data Preparation Cookbook

Part III Modeling
13. Algorithms
14. Modeling
15. Model Evaluation and Enhancement
16. Ensembles & Complexity
17. Deep Learning vs. Traditional ML
18. Explainable AI (XAI)
19. Human in the Loop

Part IV Applications
20. General Overview of an Application: Healthcare Delivery and Medical Informatics
21. Specific Application: Business: Customer Response
22. Specific Application: Education: Learning Analytics
23. Specific Application: Medical Informatics: Colon Cancer Screening
24. Specific Application: Financial: Credit Risk
25. Specific Future Application: The ‘Intelligence Age (Revolution)’: LLMs like ChatGPT, Tiny ML, H.U.M.A.N.E., etc.

Part V Right Models, Luck, and Ethics of Analytics
26. Right Model for the Right Use
27. Ethics in Data Science
28. Significance of Luck

Part VI Tutorials and Case Studies
Tutorial A Example of Data Mining Recipes Using Statistica Data Miner 13
Tutorial B Analysis of Hurricane Data (Hurrdata.sta) Using the Statistica Data Miner 13
Tutorial C Predicting Student Success at High-Stakes Nursing Examinations (NCLEX) Using SPSS Modeler and Statistica Data Miner 13
Tutorial D Constructing a Histogram Using MidWest Company Personality Data Using KNIME
Tutorial E Feature Selection Using KNIME
Tutorial F Medical/Business Tutorial Using Statistica Data Miner 13
Tutorial G A KNIME Exercise, Using Alzheimer’s Training Data of Tutorial F
Tutorial H Data Prep 1-1: Merging Data Sources Using KNIME
Tutorial I Data Prep 1-2: Data Description Using KNIME
Tutorial J Data Prep 2-1: Data Cleaning and Recoding Using KNIME
Tutorial K Data Prep 2-2: Dummy Coding Category Variables Using KNIME
Tutorial L Data Prep 2-3: Outlier Handling Using KNIME
Tutorial M Data Prep 3-1: Filling Missing Values With Constants Using KNIME
Tutorial N Data Prep 3-2: Filling Missing Values With Formulas Using KNIME
Tutorial O Data Prep 3-3: Filling Missing Values With a Model Using KNIME

Back Matter:
Appendix A Listing of Tutorials and Other Resources on This Book’s Companion Web Page
Appendix B Instructions on How to Use This Book’s Companion Web Page

Authors

Robert Nisbet, Researcher-Medical Informatics, H.H. Chao Comprehensive Digestive Disease Center, University of California Irvine Medical Center; Private Consulting, Santa Barbara, CA, USA. Bob Nisbet, PhD, is a data scientist, currently modeling precancerous colon polyp presence with clinical data at the UC-Irvine Medical Center. He has experience in predictive modeling in telecommunications, insurance, credit, and banking. His academic experience includes teaching in ecology and in data science. His industrial experience includes predictive modeling at AT&T, NCR, and FICO. He has also worked in the insurance, credit, membership organization (e.g., AAA), education, and health care industries. He retired as an Assistant Vice President of Santa Barbara Bank & Trust in charge of business intelligence reporting and customer relationship management (CRM) modeling.

Gary D. Miner, CEO, M&M Predictive Analytics LLC; UCI Adjunct Professor for Continuing Education, Predictive Analytics Program; Associate Editor, The Journal of Geriatric Psychiatry and Neurology; Private Consulting, Tulsa, OK, USA. Dr. Gary Miner received a B.S. from Hamline University, St. Paul, MN, with biology, chemistry, and education majors; an M.S. in zoology and population genetics from the University of Wyoming; and a Ph.D. in biochemical genetics from the University of Kansas as the recipient of a NASA pre-doctoral fellowship. He pursued additional National Institutes of Health postdoctoral studies at the University of Minnesota and the University of Iowa, eventually becoming immersed in the study of affective disorders and Alzheimer's disease.

In 1985, he and his wife, Dr. Linda Winters-Miner, founded the Familial Alzheimer's Disease Research Foundation, which became a leading force in organizing both local and international scientific meetings, bringing together leaders in the field of the genetics of Alzheimer's from several countries and resulting in the first major book on the genetics of Alzheimer's disease. In the mid-1990s, Dr. Miner turned his data analysis interests to the business world, joining the team at StatSoft and deciding to specialize in data mining. He started developing what eventually became the Handbook of Statistical Analysis and Data Mining Applications (co-authored with Drs. Robert A. Nisbet and John Elder), which received the 2009 American Publishers Award for Professional and Scholarly Excellence (PROSE). Their follow-up collaboration, Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications, also received a PROSE award in February 2013. Gary was also a co-author of Practical Predictive Analytics and Decisioning Systems for Medicine (Academic Press, 2015). Overall, Dr. Miner's career has focused on medicine and health issues, and on the use of data analytics (statistics and predictive analytics) in analyzing medical data to separate fact from fiction.

Gary has also served as Merit Reviewer for PCORI (Patient Centered Outcomes Research Institute), which awards grants for predictive analytics research into the comparative effectiveness and heterogeneous treatment effects of medical interventions, including drugs, among different genetic groups of patients. Additionally, he teaches online classes in 'Introduction to Predictive Analytics', 'Text Analytics', 'Risk Analytics', and 'Healthcare Predictive Analytics' for the University of California-Irvine. Until his 'official retirement' 18 months ago, he spent most of his time in his primary role as Senior Analyst-Healthcare Applications Specialist for Dell | Information Management Group, Dell Software (through Dell's acquisition of StatSoft (www.StatSoft.com) in April 2014). Currently Gary is working on two new short popular books on 'Healthcare Solutions for the USA' and 'Patient-Doctor Genomics Stories'.

Keith McCormick, Consultant, USA. Keith McCormick is a highly accomplished professional consultant, mentor, and trainer who has served as keynote speaker and moderator at international conferences focused on analytic practitioners and leadership alike. Keith has used statistical software since 1990 and has deep expertise in popular advanced analytics solutions such as IBM SPSS Statistics, IBM SPSS Modeler, and KNIME, as well as open-source and other tools for text and big data analytics.