GenAI on AWS. A Practical Approach to Building Generative AI Applications on AWS. Edition No. 1. Tech Today

  • Book

  • 368 Pages
  • April 2025
  • John Wiley and Sons Ltd
  • ID: 5979363

The definitive guide to leveraging AWS for generative AI

GenAI on AWS: A Practical Approach to Building Generative AI Applications on AWS is an essential guide for anyone looking to dive into the world of generative AI with the power of Amazon Web Services (AWS). Crafted by a team of experienced cloud and software engineers, this book offers a direct path to developing innovative AI applications. It lays down a hands-on roadmap filled with actionable strategies, enabling you to write secure, efficient, and reliable generative AI applications utilizing the latest AI capabilities on AWS.

This comprehensive guide starts with the basics, making it accessible to both novices and seasoned professionals. You'll explore the history of artificial intelligence, understand the fundamentals of machine learning, and get acquainted with deep learning concepts. It also demonstrates how to harness AWS's extensive suite of generative AI tools effectively. Through practical examples and detailed explanations, the book empowers you to bring your generative AI projects to life on the AWS platform.

In the book, you'll:

  • Gain invaluable insights from practicing cloud and software engineers on developing cutting-edge generative AI applications using AWS
  • Discover beginner-friendly introductions to AI and machine learning, coupled with advanced techniques for leveraging AWS's AI tools
  • Learn from a resource that's ideal for a broad audience, from technical professionals like cloud engineers and software developers to non-technical business leaders looking to innovate with AI

Whether you're a cloud engineer, software developer, business leader, or simply an AI enthusiast, GenAI on AWS is your gateway to mastering generative AI development on AWS. Seize this opportunity for an enduring competitive advantage in the rapidly evolving field of AI. Embark on your journey to building practical, impactful AI applications by grabbing a copy today.

Table of Contents

Acknowledgments xv

About the Authors xvii

Foreword xix

Introduction xxi

Chapter 1: A Brief History of AI 1

The Precursors of the Mechanical or “Formal” Reasoning 2

The Digital Computer Era 4

Cybernetics and the Beginning of the Robotic Era 6

Birth of AI and Symbolic AI (1955-1985) 10

Subsymbolic AI Era (1985-2010) 14

Deep Learning and LLM (2010-Present) 16

Key Takeaways 17

Chapter 2: Machine Learning 19

What Is Machine Learning? 19

Types of Machine Learning 20

Supervised Learning 21

Unsupervised and Semi-Supervised Learning 22

Reinforcement Learning 23

Methodology for Machine Learning 24

Implementation of Machine Learning 26

Machine Learning Applications 27

Natural Language Processing (NLP) 27

Computer Vision 27

Recommender System 27

Predictive Analytics 28

Fraud Detection 28

Machine Learning Frameworks and Libraries 28

TensorFlow 28

PyTorch 31

Scikit-learn 34

Keras 35

Apache Spark MLlib 37

Future Trends in Machine Learning 40

Rise of Edge Computing and Edge AI 40

Convergence with Emerging Technologies 40

Advancements in Unsupervised Learning, Reinforcement Learning, and Generative Models 41

Increased Specialization and Customization 41

Explainable and Trustworthy AI 42

Key Takeaways 42

References 43

Chapter 3: Deep Learning 45

Deep Learning vs. Machine Learning 45

Computer Vision Example 46

Natural Language Processing Example 47

The History of Deep Learning 47

Understanding Deep Learning 52

Neurons 52

Weights and Biases 54

Layers 54

Activation Function(s) 55

An Introduction to the Perceptron 58

Overcoming Perceptron Limitations 59

FeedForward Neural Networks 60

Backpropagation 60

Parameters vs. Hyperparameters 60

Hyperparameters in Artificial Neural Networks 64

Loss Functions - a Measure of Success of a Neural Network 64

Optimization Algorithms 64

Neural Network Architectures 68

Putting It All Together 71

Deep Learning on AWS 71

Chipsets and EC2 Instances 71

AWS P5 Instances 72

AWS Inferentia 72

Amazon Elastic Inference 73

Prebuilt Containers: Deep Learning AMIs and Containers 74

Deep Learning AMIs 74

Deep Learning Containers 74

Managed Services for Building, Training, and Deployment 74

Pre-trained Services 75

Key Takeaways 77

References 77

Chapter 4: Introduction to Generative AI 79

Generative AI Core Technologies 80

Neural Networks 80

Generative Adversarial Networks (GANs) 80

Variational Autoencoders (VAEs) 81

Recurrent Neural Networks (RNNs) and Long Short-Term Memory Networks (LSTMs) 82

Limitations of Recurrent Neural Networks 84

Transformer Models 85

Self-Attention 86

Parallelism 86

Diffusion Models 86

Autoregressive Models 87

Reinforcement Learning (RL) 87

Transfer Learning and Fine-Tuning 87

Optimization Algorithms 87

Transformer Architecture: Deep Dive 87

Deep Dive 89

Step 1: Tokenization (Preprocessing) 89

Step 2: Embedding 89

Step 3: Encoder 92

Step 4: Encoder Output to Decoder Input 97

Step 5: Decoder 98

Step 6: Translation Generation 99

Step 7: Detokenization 99

Terminology in Generative AI 99

Prompt 104

Inference 105

Context Window 106

Prompt Engineering 106

In-Context Learning (ICL) 107

Zero-Shot/One-Shot/Few-Shot Inference 108

Inference Configuration 109

Maximum Length 110

Diversity (Top P/Nucleus Sampling) 111

Top K 111

Randomness (Temperature) 112

System Prompts 112

Prompt Engineering 113

Key Elements of a Prompt 113

Designing Effective Prompts 114

Prompting Techniques 115

Zero-Shot Prompting 115

Few-Shot Prompting 115

Chain-Of-Thought Prompting 116

Advanced Prompting Techniques 117

Self-Consistency 118

Tree of Thoughts (ToT) 119

Retrieval-Augmented Generation (RAG) 120

Automatic Reasoning and Tool-Use (ART) 122

ReAct Prompting 123

Coherence Enhancement 124

Progressive Prompting 126

Handling Prompt Misuse 127

Prompt Injection 127

Prompt Leaking 128

Mitigating Bias 129

Mitigating Bias in Prompt Engineering 130

Generative AI Business Value 133

Building Value Within Your Enterprises 135

Technology: Creating a Flexible and Strong System 135

People: Training and Adapting the Team 135

Processes: Good Management and Fair Use of AI 136

Why a Solid Foundation Is Crucial 136

References 137

Chapter 5: Introduction to Foundation Models 139

Definition and Overview of Foundation Models 139

Characteristics of Foundation Models 142

Examples of Foundation Models 144

Types of Foundation Models 147

The Large Language Model (LLM) 154

Natural Language Processing 155

Early Approaches to NLP 156

Evolution toward Text-Based Foundation Model 160

Applications of Foundation Models 162

Challenges and Considerations 163

Infrastructure 163

Ethics 164

Areas of Evolution 165

Key Takeaways 167

References 168

Chapter 6: Introduction to Amazon SageMaker 169

Data Preparation and Processing 172

Data Preparation 172

Data Processing 173

Model Development 174

Model Training and Tuning 175

Model Deployment 177

Model Management 178

Security 179

Compliance and Governance 180

Model Explainability and Responsible AI 181

MLOps with Amazon SageMaker 181

Boost Your Generative AI Development with SageMaker JumpStart 182

No-Code ML with Amazon SageMaker Canvas 182

Amazon Bedrock 184

Choosing the Right Strategy for the Development of Your Generative AI Application with Amazon SageMaker 186

Conclusion 187

References 188

Chapter 7: Generative AI on AWS 191

AWS Services for Generative AI 192

Generative AI Trade-Off Triangle 192

How AWS Solves the Generative AI Trade-Off Triangle 192

Generative AI on AWS: The Fundamentals 193

Infrastructure for FM Training and Inference 194

Models and Tools to Build Generative AI Apps 194

Applications to Boost Productivity 195

Amazon Bedrock 196

Foundation Models with Bedrock 197

AI21 Labs - Jurassic 197

Amazon Titan 198

Anthropic’s Claude 3 199

Cohere’s Family of Models 201

Key Features of Cohere 201

Cohere Models on Amazon Bedrock 203

Meta’s Family of Models - Llama 204

When to Use Which Model 207

Mistral’s Family of Models 208

When to Use Which Model 209

Stability.ai’s Family of Models - Stable Diffusion XL 1.0 209

Poolside Family of Models 210

Luma’s Family of Models 211

Amazon’s Nova Family of Models 212

Model Evaluation in Amazon Bedrock 213

Common Approaches to Customizing Your FMs 214

Amazon Bedrock Prompt Management 214

Amazon Bedrock Flows 216

Data Automation in Amazon Bedrock 219

GraphRAG in Amazon Bedrock 220

Knowledge Bases in Amazon Bedrock 222

How Knowledge Bases Work 223

Pre-Processing Data 224

Runtime Execution 224

Creating a Knowledge Base in Amazon Bedrock 225

Agents for Amazon Bedrock 225

How Agents Work 226

Components of an Agent at Build Time 226

Components of an Agent at Runtime 228

Guardrails for Amazon Bedrock 230

Security in Amazon Bedrock 231

Amazon Q 232

Amazon Q Business 232

Amazon Q in QuickSight 235

Amazon Q Developer 237

Amazon Q Connect 239

Amazon Q in AWS Supply Chain 240

Summary 241

Chapter 8: Customization of Your Foundation Model 243

Introduction to LLM Customization 244

Continued Pre-Training (Domain Adaptation Fine-Tuning) 244

Fine-Tuning 245

Prompt Engineering 245

Retrieval Augmented Generation (RAG) 246

Choosing Between These Customization Techniques 246

Cost of Customization 249

Customizing Foundation Models with AWS 250

Continuous Pre-Training with Amazon Bedrock 250

Creation of a Training and a Validation Dataset 250

Launch of a Continued Pre-Training Job 251

Analysis of Our Results and Adjustment of Our Hyperparameters 252

Deployment of Our Model 254

Use Your Customized Model 255

Instruction Fine-Tuning with Amazon Bedrock 257

Instruction Fine-Tuning with Amazon SageMaker JumpStart 257

Conclusion 260

Chapter 9: Retrieval-Augmented Generation 263

What Is RAG? 263

Background and Motivation 264

Overview of RAG 266

Building a RAG Solution 269

Design Considerations 269

Best Practices 270

Common Patterns 271

Performance Optimization 271

Scaling Considerations 272

The Future of RAG Implementations 273

Retrieval Module 274

Retrieval Techniques and Algorithms 276

Augmentation Module 278

Generation Module 280

RAG on AWS 282

Custom Data Pipeline to Build RAG 284

Core Components of a RAG Pipeline 284

Implementation Approaches 286

Basic Solution: LangChain Implementation 286

Advanced Solution: Spark-Based Pipeline 287

Data Ingestion (Examples) 288

Parallel Processing (Example) 289

Case Studies and Applications 290

Question-Answering Systems 290

Dialogue Systems 290

Knowledge-Intensive Tasks 291

Implementation Considerations and Best Practices 291

Challenges and Future Directions 292

Example Notebooks 293

References 293

Chapter 10: Generative AI on AWS Labs 295

Lab 1: Introduction to Generative AI with Bedrock 295

Option 1: PartyRock Prompt Engineering Guide (for Non-Technical and Technical Audiences) 297

Option 2: Amazon Bedrock Labs (for Technical Audiences) 298

Overview of Amazon Bedrock and Streamlit 298

Supported Regions 298

Costs When Running from Your Own Account 298

Quotas When Running from Your Own Account 299

Time to Complete 299

Lab 2: Dive Deep into Gen AI with Amazon Bedrock 299

Lab 3: Building an Agentic LLM Assistant on AWS 300

What Is an Agentic LLM Assistant? 300

Why Build an Agentic LLM Assistant? 301

About This Workshop 301

Architecture 301

Labs 302

Lab 4: Retrieval-Augmented Generation Workshop 303

Managed RAG Workshop 304

Naive RAG Workshop 304

Advanced RAG Workshop 304

Audience 304

Lab 5: Amazon Q for Business 304

Next Steps 307

Lab 6: Building a Natural Language Query Engine for Data Lakes 308

Reference 310

Chapter 11: Next Steps 311

The Future of Generative AI: Key Dimensions and Staying Informed 311

Technical Evolution and Capabilities 312

The Evolution of Scale and Architecture 312

The Multimodal Revolution 312

The Efficiency Breakthrough 313

The Context Window Revolution 313

Real-time Processing and Generation 313

The Future Technological Landscape 314

Application Domains 314

Enterprise Applications: The Quiet Revolution 315

The Scientific Frontier: Accelerating Discovery 315

Healthcare: Personalized Medicine and Diagnosis 315

Education and Training: Personalizing Learning 316

Environmental Applications: Tackling Global Challenges 316

The Future of Applications 317

Ethical and Societal Implications 317

Digital Identity and Deep Fakes: The Crisis of Trust 318

Labor Markets and Economic Disruption 318

Privacy and Data Rights in the Age of AI 318

Bias and Fairness: The Hidden Challenges 319

Democratic Access and Digital Divides 319

Environmental and Sustainability Concerns 319

The Path Forward: Governance and Responsibility 319

Looking to the Future 320

Staying Current in the Rapidly Evolving AI Landscape 320

Glossary 323

Index

Authors

Olivier Bergeret, Asif Abbasi, Joel Farvault