Patterns for Performance and Operability
Building and Testing Enterprise Software
Chris Ford
Ido Gileadi
Sanjiv Purba
Mike Moerman
Auerbach Publications, Taylor & Francis Group
Boca Raton New York
Auerbach Publications is an imprint of the Taylor & Francis Group, an informa business
Contents
Dedications v
The Purpose of This Book xv
Acknowledgments xvii
About the Authors xix
1 Introduction 1
Production Systems in the Real World 1
Case 1—The Case of the Puzzlingly Poor Performance 2
Case 2—The Case of the Disappearing Database 5
Why Should I Read This Book? 7
The Non-Functional Systems Challenge 8
What Is Covered by Non-Functional Testing 9
Planning for the Unexpected 10
Patterns for Operability in Application Design 11
Ensuring Data and Transaction Integrity 11
Capturing and Reporting Exception Conditions in a Consistent Fashion 11
Automated Recovery from Exception Conditions 14
Application Availability and Health 14
Summary 14
2 Planning and Project Initiation 17
The Business Case for Non-Functional Testing 17
What Should Be Tested 17
How Far Should the System Be Tested? 19
Justifying the Investment 20
Negative Reasoning 21
Scoping and Estimating 22
Determining the Scope of Non-Functional Testing 22
Estimating Effort and Resource 26
Estimating the Delivery Timeline 29
Test and Resource Planning 33
Test Types and Base Requirements 33
Test Environments 36
The Test Team 37
Communication Planning 39
Setting Expectations 39
Summary 40
3 Non-Functional Requirements 41
What Are Non-Functional Requirements? 43
Do I Need Non-Functional Requirements? 43
Roles and Responsibilities 44
Challenging Requirements 45
Establishing a Business Usage Model 46
Quantifying Human and Machine Inputs 46
Expressing Load Scenarios 54
Non-Functional Requirements 56
An Important Clarification 56
Performance Requirements 58
Operability Requirements 62
Availability Requirements 64
Archive Requirements 65
Summary 67
4 Designing for Operability 69
Error Categorization 70
Design Patterns 71
Retry for Fault Tolerance 71
Software Fuses 74
Software Valves 75
System Health Checks 78
The Characteristics of a Robust System 80
Simple Is Better 80
Application Logging 81
Transparency: Visibility into System State 83
Traceability and Reconciliation 84
Resume versus Abort 86
Exception Handling 87
Infrastructure Services 91
Design Reviews 91
The Design Checklist 91
The Operability Review 92
Summary 94
5 Designing for Performance 95
Requirements 95
The "Ilities" 95
Architecture 101
Hotspots 101
Patterns 102
Divide and Conquer 102
Load Balancing 102
Parallelism 103
Synchronous versus Asynchronous Execution 107
Caching 109
Antipatterns 112
Overdesign 114
Overserialization 114
Oversynchronization 117
User Session Memory Consumption 118
Algorithms 119
Technology 120
Programming Languages 120
Distributed Processing 123
XML 125
Software 126
Databases 127
Application Servers 129
Messaging Middleware 129
ETLs 132
Hardware Infrastructure 134
Resources 134
Summary 136
6 Test Planning 139
Defining Your Scope 140
System Boundaries 140
Scope of Operability 142
Scope of Performance 145
Load Testing Software 145
Product Features 146
Vendor Products 147
Additional Testing Apparatus 149
Test Beds 150
Test-Case Data 150
Test Environments 151
Isolation 151
Capacity 153
Change Management 154
Historical Data 154
Summary 157
7 Test Preparation and Execution 159
Preparation Activities 159
Script Development 160
Validating the Test Environment 164
Establishing Mixed Load 164
Seeding the Test Bed 167
Tuning the Load 167
Performance Testing 171
Priming Effects 172
Performance Acceptance 173
Reporting Performance Results 176
Performance Regression: Baselining 177
Stress Testing 181
Operability Testing 181
Boundary Condition Testing 182
Failover Testing 183
Fault Tolerance Testing 186
Sustainability Testing 188
Challenges 192
Repeatable Results 193
Limitations 193
Summary 194
8 Deployment Strategies 195
Procedure Characteristics 196
Packaging 197
Configuration 197
Deployment Rehearsal 198
Rollout Strategies 198
The Pilot Strategy 198
The Phased Rollout Strategy 199
The Big Bang Strategy 199
The Leapfrog Strategy 200
Case Study: Online Banking 200
Case Study: The Banking Front Office 202
Back-Out Strategies 204
Complete Back-Out 204
Partial Back-Out 204
Logical Back-Out 204
Summary 205
9 Resisting Pressure from the Functional Requirements Stream 207
A Question of Degree 208
Pressures from the Functional Requirements Stream 209
Attention 212
Human Resources 212
Hardware Resources 213
Software Resources 213
Issue Resolution 213
Defining Success 213
Setting the Stage for Success 214
Framework 215
Roles and Responsibilities 216
Raw Resources Required by the Non-Functional Requirements Stream 216
Performance Metrics 221
Setting Expectations 222
Controls 222
The Impact of Not Acting 223
Summary 223
10 Operations Trending and Monitoring 225
Monitoring 225
Attributes of Effective Monitoring 227
Monitoring Scope 228
Infrastructure Monitoring 230
Container Monitoring 233
Application Monitoring 238
End-User Monitoring 239
Trending and Reporting 241
Historical Reporting 241
Performance Trending 241
Error Reporting 243
Reconciliation 244
Business Usage Reporting 245
Capacity Planning 245
Planning Inputs 245
Best Practice 248
Case Study: Online Dating 248
Maintaining the Model 255
Completing a Capacity Plan 255
Summary 256
11 Troubleshooting and Crisis Management 257
Reproducing the Issue 257
Determining Root Cause 258
Troubleshooting Strategies 259
Understanding Changes in the Environment 259
Gathering All Possible Inputs 261
Approach Based on Type of Failure 263
Predicting Related Failures 265
Discouraging Bias 268
Pursuing Parallel Paths 268
Considering System Age 269
Working Around the Problem 269
Applying a Fix 270
Fix versus Mitigation versus Tolerance 270
Assessing Level of Testing 271
Post-Mortem Review 272
Reviewing the Root Cause 272
Reviewing Monitoring 272
Summary 275
12 Common Impediments to Good Design 277
Design Dependencies 277
What Is the Definition of Good Design? 279
What Are the Objectives of Design Activities? 279
Rating a Design 281
Testing a Design 286
Contributors to Bad Design 287
Common Impediments to Good Design 287
Confusing Architecture with Design 288
Insufficient Time/Tight Timeframes 288
Missing Design Skills on the Project Team 288
Lack of Design Standards 288
Personal Design Preferences 289
Insufficient Information 289
Constantly Changing Technology 289
Fad Designs 290
Trying to Do Too Much 290
The 80/20 Rule 290
Minimalistic Viewpoint 290
Lack of Consensus 291
Constantly Changing Requirements 291
Bad Decisions/Incorrect Decisions 291
Lack of Facts 291
External Impacts 291
Insufficient Testing 291
Lack of Design Tools 292
Design Patterns Matter 292
Lack of Financial Resources 292
Design Principles 292
Summary 293
References 295
Index 297