plab system owners meeting v2
TRANSCRIPT
Goal – to answer four questions about pLab:
• What is pLab
• What is Performance
• Approach/Strategy
• How to use pLab
pLab – History, Context and Strategy
• 2007: Inception
• 2008: Execution
• 2009: Growth and Maturity
• 2010 and beyond: Expansion
EE
• System Engineering
• Performance Lab
• Software Infrastructure Engineering
• Security Engineering
• Institutionalize performance engineering for critical products
• Promote performance awareness across the enterprise
• Educate, mentor and consult with product teams
• Benchmark new technologies and hardware
• Build a shared performance testing environment; foster performance testing
What is Performance?
Response time • Stability • Scalability • Efficiency
Performance is the degree to which a software system or software component meets its objectives for response time, stability, scalability and resource consumption.
Performance of large-scale systems is a make-or-break quality. The cost of failure:
• Increased operational cost
• Increased development cost
• Increased hardware cost
• Canceled projects
• Damaged customer relations
• Lost income
• Reduced competitiveness
Source: SPE (Lloyd G. Williams & Connie U. Smith)
Performance Engineering gives us increased uptime
[Chart: uptime, 2009 vs. 2010 – Air Services, target 99.66%; Customer Access, target 99.83%]
What is Performance Engineering?
The art and science of quantitatively measuring, understanding and tuning the latency, throughput, and utilization of computer systems
Performance testing is the central PE activity for uncovering performance problems
[Diagram: cycle – Plan, Assess Risk, Performance Testing, Analyze, Optimize]
Performance testing activities:
• Environment setup
• Test harness creation
• Test execution
• Test report generation
What is Performance Testing?
Performance testing types:
• Load test
• Soak test
• Destructive test
• Impulse test
• Resiliency test
• Capacity impact test
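A load test and a soak test can share the same driver: the load test runs at the expected peak rate for a short window, while the soak test reuses the driver for hours to expose leaks and slow degradation. A minimal sketch of such a fixed-rate driver (the function and parameter names are illustrative, not pLab's actual harness):

```python
import time
import threading

def run_load_test(request_fn, rate_per_sec, duration_sec):
    """Drive request_fn at a fixed rate for duration_sec seconds.

    A load test uses a short duration at expected peak rate; a soak
    test reuses the same driver with a much longer duration.
    """
    results = []          # (latency_sec, ok) tuples
    lock = threading.Lock()
    interval = 1.0 / rate_per_sec
    deadline = time.monotonic() + duration_sec

    def fire():
        start = time.monotonic()
        try:
            request_fn()
            ok = True
        except Exception:
            ok = False
        with lock:
            results.append((time.monotonic() - start, ok))

    threads = []
    next_shot = time.monotonic()
    while next_shot < deadline:
        # Pace requests by wall clock, not by completion, so slow
        # responses do not reduce the offered load.
        time.sleep(max(0.0, next_shot - time.monotonic()))
        t = threading.Thread(target=fire)
        t.start()
        threads.append(t)
        next_shot += interval
    for t in threads:
        t.join()
    return results

# Example: 20 requests/sec for 1 second against a stub "service".
stats = run_load_test(lambda: time.sleep(0.01), rate_per_sec=20, duration_sec=1)
```

Real drivers (LoadRunner, or pLab's custom ones) add think times, ramp-up and multi-host coordination; the pacing-by-wall-clock idea is the common core.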
Impulse Testing
[Chart: load over time – steady at peak load, a short spike to 3x peak load, a dip to 0.2x peak load, then return to peak load]
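The impulse profile above (steady peak, a 3x-peak burst, a 0.2x-peak recovery window, then back to peak) can be expressed as a step function fed to a load driver. A sketch, with illustrative durations:

```python
def impulse_profile(peak, baseline_sec, impulse_sec, recovery_sec):
    """Yield (duration_sec, target_load) steps for an impulse test:
    steady peak load, a burst at 3x peak, a dip to 0.2x peak while
    the system recovers, then a return to peak to verify recovery."""
    yield (baseline_sec, peak)        # steady state at peak load
    yield (impulse_sec, 3 * peak)     # impulse: 3x peak burst
    yield (recovery_sec, 0.2 * peak)  # recovery window at 0.2x peak
    yield (baseline_sec, peak)        # confirm return to steady state

# Example: 50 req/s peak, 10-minute baseline, 1-minute impulse.
steps = list(impulse_profile(peak=50, baseline_sec=600,
                             impulse_sec=60, recovery_sec=120))
```

The pass criterion for an impulse test is typically that response times and error rates return to baseline during the final step, not that the system survives the burst unscathed.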
pLab Environment
• Dedicated, isolated environment in CyrusOne, Lewisville
• Hardware configuration for the various applications is the same as production
• Environment includes:
  • F5, SAN and NAS capabilities
  • Fabric components (MOM, USG, SSG, NOFEP, BBIS, ICE Simulator)
  • ATSE components (Intellisell, MIP, IS, ASv2, DSSv2, Pricing, Oracle, Instrumentation Database, App Console)
  • LoadRunner and custom-built load drivers
  • Automated performance testing, monitoring and data collection framework
Performance Testing Lifecycle (showing typical break-up of activities for consulting engagements)
• Performance and Reliability Goal Setting: document response time requirements; document availability requirements; document service level expectations (GC, error rate etc.)
• Establishing Test Environment: hardware acquisition; platform setup (OS, infrastructure); application setup; database setup; possible use of Flex Lab
• Test Planning: load, stress and soak test requirements; workload characterization
• Test Harness Preparation: load drivers; setup data; test harness design; mocks/simulators (pLab mock framework); reporting/charting/data visualization
• Test Execution: run planned tests and collect results; hardware, middleware and application monitoring (pLab monitoring framework)
• Performance Optimization: analysis of test results; tuning (some support on tuning available)
• Release Signoff: go/no-go decision; generate report
Ownership of these activities is split between pLab and the application team.
Performance Engineering Test Plan
1. Understand business/application needs
2. Identify test environment
3. Identify acceptance criteria
4. Plan and design tests
5. Execute the tests
6. Analyze results, report and retest
What to Measure?
• Load driver: response time; throughput; error types & percentages
• Application: application metrics; JVM / ESSM / CLR metrics
• Database: connections; sessions; errors; resource utilization
• System: CPU; memory; network; disk
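On the load-driver side, raw per-request samples are usually reduced to a handful of headline numbers: throughput, response-time percentiles and error percentage. A minimal reduction sketch (the function name and output keys are illustrative):

```python
def summarize(samples, duration_sec):
    """Reduce raw load-driver samples to headline metrics.

    samples: list of (latency_sec, ok) tuples from one test run.
    Returns throughput, nearest-rank percentiles and error rate.
    """
    latencies = sorted(lat for lat, _ in samples)
    errors = sum(1 for _, ok in samples if not ok)

    def pct(p):
        # Nearest-rank percentile over the sorted latencies.
        idx = max(0, int(round(p / 100.0 * len(latencies))) - 1)
        return latencies[idx]

    return {
        "throughput_rps": len(samples) / duration_sec,
        "p50_sec": pct(50),
        "p95_sec": pct(95),
        "p99_sec": pct(99),
        "error_pct": 100.0 * errors / len(samples),
    }

# 100 samples over 10 s: 99 fast requests plus one slow outlier.
data = [(0.010, True)] * 99 + [(0.500, True)]
report = summarize(data, duration_sec=10)
```

Averages hide exactly the outlier shown in the example, which is why percentile targets (not means) are the usual acceptance criteria.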
pLab Monitoring Framework
Yaketystats
Sample usage: http://plabptl020.dev.sabre.com/yaketystats/jart/index.php?pl=IndividualServers/plab202
Collector – the client; it collects stats and sends them to the server.
Stuffer – the server; it accepts stats from clients and writes them to the file system. Every 5 minutes it "stuffs" these stats into the RRD files.
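To make the collector/stuffer split concrete, here is a sketch of how a collector might batch named stats into timestamped records before shipping them. The one-record-per-line format here is an assumption for illustration only, not Yaketystats' actual wire protocol:

```python
import time

def format_stats(hostname, stats, ts=None):
    """Render collected stats as 'host.metric value timestamp' lines.

    NOTE: this line format and the push model are illustrative
    assumptions, not the real Yaketystats protocol; the real stuffer
    aggregates into RRD files every 5 minutes.
    """
    ts = int(ts if ts is not None else time.time())
    return "\n".join(f"{hostname}.{name} {value} {ts}"
                     for name, value in stats.items())

# Example batch from a hypothetical pLab host.
batch = format_stats("plab202",
                     {"cpu.user": 12.5, "mem.free_mb": 2048},
                     ts=1700000000)
```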
Performance Engineering Test Plan
http://wiki.sabre.com/confluence/display/EOP/Performance
pLab PET Plans
Performance Tuning
• OS tuning: system library tuning; kernel tuning; TCP/IP tuning
• JVM tuning: garbage collection tuning
• Application tuning: profiling; memory allocation tuning; thread contention; algorithm optimization
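As a concrete example of the JVM tuning layer, a garbage-collection tuning pass typically fixes the heap size, selects a collector and turns on GC logging so that pause times can be read from the test run. The flag values below are placeholders to be derived from measured allocation rates, not recommendations (the flags themselves are standard HotSpot options of this era):

```shell
# Illustrative HotSpot GC-tuning invocation; sizes are placeholders.
java \
  -Xms4g -Xmx4g \
  -XX:+UseConcMarkSweepGC \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
  -Xloggc:gc.log \
  -jar app.jar
```

Setting -Xms equal to -Xmx avoids heap-resize pauses during a test run, and the gc.log output feeds directly into the "GC, error rate etc." service-level expectations mentioned earlier.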
Performance Lab Service Model
• Performance Testing: release testing; project-centered testing; ad hoc testing; Flex Lab
• Benchmarking: cookie-cutter architecture; appliances; software
• Optimization/Tuning: code optimization; profiling; platform tuning (OS tuning, JVM tuning)
• Performance-Oriented Design/Consulting: PE planning and test harness design; architecture review; patterns; anti-patterns
Triaging applications based on both Business Criticality and Technical Quality to address the urgent and the important
• Category A: highest focus area; pLab takes ownership of "Reliability Growth" through performance testing
• Category B: closely monitored and supported by pLab
• Category C: lower priority applications
[Matrix: Technical Quality (Low to High) on one axis, Business Criticality (Low to High) on the other; cells as shown on the slide:
  A A B
  A B C
  C C]
Illustrative application placements: Air Crews, Movement Manager, Rev Accounting, Centiva, SSCI (Kiosk, Web), Schedule Manager, SSW2, Crew Control, Flightline, Load Manager
Airline Cutover Support
• Solution Review: engineering/capacity planning reviews; joint performance and E2E engineering risk assessment
• Planning: deliver Performance Engineering Test (PET) Plan; plan for 4 rounds of performance testing in CERT; identify high-risk Ops products
• Execution: create a test harness for every injection point on the critical path; execute tests, gather metrics for all systems, analyze; certify release (Go/No-Go)
• Post Cutover Support: monitor production performance; repeat performance tests if necessary