sfeldman performance bb_worldemea07
Posted 27-Jan-2015
Principles of Performance Engineering
Steve Feldman, Director of Performance Engineering and Architecture (sfeldman@blackboard.com)
Agenda
• Questions of the Mind…
• PE @ Bb
• Process and Methodology
• What projects are coming out of the PerfEng Lab in ’07
• Links
Part 1: Questions of the Mind
Questions of the Mind…
• What is Performance?
• What is Scalability?
• What is Performance Engineering?
• Why does Blackboard invest in Performance Engineering?
What is Performance?
• Performance = Response Times
• Response Times affect the User Experience.
• If the User Experience is acceptable, Abandonment becomes less likely, which can positively affect Adoption.
• As Adoption increases, so does the need for Scalability.
What is Scalability?
• Scalability = Heavy Adoption/Usage
  – If Heavy Adoption = Positive User Experience
  – If Positive User Experience = Acceptable Response Times
• Patterns of predictable usage
  – Present Population
  – User Behavior
• Unpredictable Behavior and New Populations
  – Make the unpredictable predictable.
• Changes in Adoption Growth Behavior
What is Performance Engineering?
• Engineering discipline with a primary focus on performance and scalability.
• A combination of Software and System activities with the intent of improving Performance and Scalability.
• Performance Design → Performance Development → Performance Verification → Performance Benchmarking
Why does Blackboard invest in Performance Engineering?
• Some of the world’s largest application deployments outside of commercial portals and e-commerce sites are Blackboard deployments.
• Desire to add more applications, sub-systems and features to drive adoption and increase growth.
• Differentiate Blackboard from other LMS/CMS ISVs and the Open Source Community.
Part 2: PE @ Bb
PerfEng at Bb…
• Triangle of Priorities
  – Performance: Response Times (Apdex)
  – Scalability: High Session Activity with Low Abandonment (PAR)
  – User Experience: High Performance and Heavy Integration (Reference Architecture)
• Investment and Relationships
  – Software Tool Set
  – Lab Sponsors
Triangle of Priorities
PerfEng at Bb…
End to End Performance Integration in the Blackboard Software Development Lifecycle
• Requirements: Assess Performance Risk; Mitigate Performance Risk; Identify Critical Use Cases for Analysis
• Develop/Design: Performance Workbooks; Review Technical Design Document; Reference acceptable design patterns; Warn about unacceptable anti-patterns; Model/Prototype
• Functional Testing: Baseline as functionality can be tested; Profile for inefficient calls/executions; Identify scalability issues in time to refactor
• Regression Testing: High-Watermark Load Testing; Common Scenario Load Testing; Conditional Scenario Load Testing; Java Performance Scenarios
• Certification/Integrated Testing: Performance Verification; Focus Verification; QA Cyclical Requirements
• Performance Benchmarking: Platform Configurations; Advanced Configurations; Partner Benchmarking (Vendor Kits)
• General Availability: Sizing and Capacity Guidance
Performance: Response Times (Apdex)
• Every transaction in the application is defined as a state, an action state or an action.
• Each transaction is assigned a ranking: critical, essential or trivial.
• Each transaction is assigned an abandonment policy: low, medium, high or very high.
  – Abandonment represents the (4) dimensions of Apdex.
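The classification above can be sketched as a small lookup table. This is a hypothetical illustration: the transaction names are invented, and the threshold windows are taken from the Apdex dimensions defined later in the deck.

```python
# Hypothetical sketch of the transaction taxonomy: each transaction gets a
# kind (state / action state / action), a ranking, and an abandonment policy.
# Policy windows are (target T, frustration F) in seconds, per the deck's
# four Apdex dimensions.
ABANDONMENT_POLICIES = {
    "low":       (2, 8),
    "medium":    (5, 20),
    "high":      (8, 32),
    "very_high": (12, 48),
}

# Invented example transactions for illustration only.
transactions = {
    "discussion_board.forum_list":  {"kind": "state",  "rank": "essential", "policy": "medium"},
    "discussion_board.add_message": {"kind": "action", "rank": "critical",  "policy": "low"},
}

t, f = ABANDONMENT_POLICIES[transactions["discussion_board.add_message"]["policy"]]
print(t, f)  # 2 8
```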
State/Action Modeling
• State: A condition or point of reference within a sub-system in which an actor has the option to move to a sub-state, perform an action or move to a super-state.
  – States are often considered navigation items, easily identified by a breadcrumb.
  – States can also be considered pages if and only if a sub-state or action is branched from the state.
    • Example: Discussion Board Forum List Widget (a state of the Discussion Board sub-system)
State/Action Modeling
• Action State: The navigation to an action.
  – For example, an instructor wants to create a message for a topic. The instructor selects Add Message and is brought to a page that requires the user to input information, followed by a submit to actually create the message.
State/Action Modeling
• Action: An actor-driven process that occurs within a state.
  – Actions occur when an actor cannot move into a sub-state.
  – Most often associated with a use case.
    • Example: Replying to a thread within a Discussion Board Message.
What is Apdex (Application Performance Index)?
• Apdex is an open standard developed by an alliance of companies that defines a standardized method to report, benchmark, and track application performance.
• http://www.apdex.org
Apdex and User Abandonment
• Bb defines (4) Apdex Dimensions:
  – Low (2s-8s)
  – Medium (5s-20s)
  – High (8s-32s)
  – Very High (12s-48s)
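The standard Apdex formula (satisfied samples plus half the tolerating samples, divided by total samples, with the frustration threshold F = 4T) can be sketched as follows. This is a minimal illustration of the published formula, not Blackboard's tooling; the sample response times are invented.

```python
def apdex(response_times, t):
    """Compute an Apdex score for a list of response times (seconds)
    against target threshold t. Per the Apdex standard, a sample is
    'satisfied' if rt <= T, 'tolerating' if T < rt <= F where F = 4T,
    and 'frustrated' otherwise."""
    f = 4 * t
    satisfied = sum(1 for rt in response_times if rt <= t)
    tolerating = sum(1 for rt in response_times if t < rt <= f)
    return (satisfied + tolerating / 2) / len(response_times)

# Bb's "Low" dimension corresponds to T = 2s, F = 8s.
times = [1.2, 1.8, 2.5, 3.0, 9.5]  # invented samples
print(apdex(times, t=2.0))  # (2 satisfied + 2 tolerating / 2) / 5 = 0.6
```

Note how the 2s-8s "Low" window maps directly onto the T and F = 4T relationship; the other three dimensions follow the same pattern.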
Apdex and Workload Variations
• Apdex scores are taken across each data model variation.
• Expect scores of 85% or higher for under-loaded systems.
• During Verification Testing and Benchmarks, expect scores of 75% or higher.
• If scores fall below acceptable levels, move to instrumentation and profiling (Method-R).
Scalability: High Session Activity with Low Abandonment (PAR)
• What is a PAR?
  – Performance Archetype Ratio
  – A scoring method to determine the resource requirements of a deployment based on a given system workload.
• Any component of the deployment can have a PAR score.
Scalability: High Session Activity with Low Abandonment (PAR)
[Chart: Resource Utilization (Y-axis) vs. Iterations (X-axis), plotting CPU utilization against a Resource Utilization Threshold Line and marking the Optimal Workload.]
PAR Process: Step 1 Calibration
• Calibrate Workloads with User Abandonment
  – Peak of Concurrency (POC): The virtual user workload at which response times are acceptable and the highest volume of virtual users are part of the scenario.
  – Level of Concurrency (LOC): The virtual user workload at which response times are acceptable and the steadiest volume of virtual users are participating in the scenario.
  – Average Concurrency: The average of the POC and LOC workloads combined.
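The calibration output described above can be sketched as a simple calculation. The virtual-user counts below are invented for illustration; only the averaging rule comes from the slide.

```python
def average_concurrency(poc_users, loc_users):
    """Step-1 calibration output: the Average Concurrency workload is the
    mean of the Peak of Concurrency (POC) and Level of Concurrency (LOC)
    virtual-user counts measured in the abandonment-enabled runs."""
    return (poc_users + loc_users) / 2

# e.g. a POC run that peaked at 1,500 virtual users and an LOC run that
# held steady at 900 yield an Average Concurrency workload of 1,200.
print(average_concurrency(1500, 900))  # 1200.0
```

All three workloads (POC, LOC, and the average) are then re-run with abandonment disabled in Step 2.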
PAR Process: Step 2 App. Saturation
• Take the workloads from the abandonment run and disable abandonment.
  – Run based on the Peak of Concurrency workload (Abandonment Disabled)
  – Run based on the Level of Concurrency workload (Abandonment Disabled)
  – Run based on the Average Concurrency workload (Abandonment Disabled)
• Example Metrics
  – Response times consistently lower than ~5 seconds
  – Application CPU saturation close to X > 90%, where X = CPU utilization + (1) standard deviation of the CPU utilization
  – Total Sessions
  – Total Transactions
  – Application Server Hits Per Second
  – Database CPU saturation
• Strategies
  – Clustering and Virtualization
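The saturation metric used in Steps 2 and 3 (X = mean CPU utilization plus one standard deviation, compared against a 90% application-tier or 80% database-tier threshold) can be sketched as follows. The sample utilization values are invented for illustration.

```python
import statistics

def saturated(cpu_samples, threshold=0.90):
    """Saturation check per the PAR metric: X = mean CPU utilization plus
    one (sample) standard deviation; the tier is considered saturated when
    X exceeds the threshold (0.90 for app servers, 0.80 for the database)."""
    x = statistics.mean(cpu_samples) + statistics.stdev(cpu_samples)
    return x > threshold

# App tier hovering near 88% with some variance: mean ~0.884 + stdev ~0.027
# pushes X just past the 90% line, so this run counts as saturated.
print(saturated([0.85, 0.88, 0.92, 0.90, 0.87]))          # True
print(saturated([0.60, 0.62, 0.58, 0.61, 0.59]))          # False
print(saturated([0.75, 0.78, 0.82, 0.80], threshold=0.80))  # DB-tier check
```

Adding a standard deviation to the mean is a conservative choice: it flags a tier as saturated when its utilization merely brushes the threshold during bursts, not only when the average itself crosses it.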
PAR Process: Step 3 DB Saturation
• Multiply the workload from Step 2 across identical application servers.
  – Typically want 90% CPU utilization and sub-5-second response times.
• Example Metrics
  – Database CPU saturation close to X > 80%, where X = CPU utilization + (1) standard deviation of the CPU utilization
  – Memory Utilization
  – Database Shadow Processes
  – I/O operations per second
• Strategies
  – Increase CPU speed and count
  – Optimize storage configuration
PAR Process: Additional Steps
• Hypothesis and Proof
  – An essential part of this process is determining theoretical performance.
  – Understanding linear, sub-linear or super-linear performance.
  – Simulate to determine actual performance.
• PARs can be gathered for other peripherals such as Load-Balancers, Storage Sub-Systems, Memory, CPUs, etc.
User Experience: High Performance and Heavy Integration (Reference Architecture)
• Insert Visio Here
Investment and Relationships
• PerfEng Team (10 Team Members)
  – Combined both teams as part of the merger and increased head count.
• Software Tools
  – Mercury LoadRunner
  – Quest Product Suite
  – Homegrown Tools: Simulation, Log Parsing, Modeling and Sampling
  – Hotsos Oracle Profiler
Investment and Relationships
• Performance Lab Sponsors
  – Dell: Servers and Remote Lab
  – Sun: Servers, Storage and Remote Lab
  – Intel: Servers
  – Coradiant: TrueSight Device
  – Quest: All Software Products
  – NetApp: Storage
Part 3: Process and Methodology
Process and Methodology
SPE Overview…
• Assess Performance Risk: Assess the performance risk at the outset of the project (during Requirements). Identify, qualify and mitigate: Rapid Cognition. Factors affecting risk: http://lightwave.blackboard.com/Engineering/1899
• Identify Critical Use Cases: Identify use cases where the risk of performance goals not being met causes the system to fail or be less than successful. Rank use cases based on workload variation, execution paths, processing considerations and utility.
• Select Key Performance Scenarios: The most frequently executed scenarios, or those critical to the perceived performance of the system. Each performance scenario corresponds to a workload characterization. Define execution models, behavior models, cognition models, data models and processing models.
• Establish Performance Objectives: Specify the quantitative criteria for evaluating the performance characteristics of the system under development. Objectives must be specified prior to any simulations or analysis.
• Construct Performance Models: Modeling techniques for representing the software processing steps for the performance model: Sequence Diagramming, Markovian Probability Models and Discrete Simulation Models.
• Software Execution Model: Determination of software resource utilization to appropriately measure the effect of software as it scales in usage. Identification of Performance Anti-Patterns targeted for refactoring. Method-R Analytics and Problem Solving via Decision Tree and Pattern Recognition.
• System Execution Model: Determination of system resource requirements utilized by the software under a given workload. Used for sizing and capacity models. Method-R Analytics and Problem Solving via Decision Tree and Pattern Recognition.
Method-R: Requirements of a Good Methodology (Millsap, Cary)
• Predictive Capacity: A method must enable the analyst to predict the impact of a proposed remedy.
• Reliability: A method must identify the correct root cause of the problem, no matter what that root cause may be.
• Determinism: A method must guide the analyst through an unambiguous sequence of steps that always rely upon documented axioms, not experience or intuition.
Method-R: Requirements of a Good Methodology (Millsap, Cary)
• Finiteness: A method must have a well-defined terminating condition, such as a proof of optimality.
• Practicality: A method must be usable in any reasonable operating condition. It would be unacceptable for a performance improvement method to rely upon tools that exist in some other operating environment but not others.
Method-R: Response Time Performance Improvement
• A practical way of thinking.
• Often, asking someone to be practical is in itself impractical.
  – Select the user actions for which the business needs improved performance.
  – Collect properly scoped diagnostic data that will allow you to identify the causes of response time consumption while the action performs sub-optimally.
  – Execute the candidate optimization activity that will have the greatest net payoff.
  – Suspend your improvement activities until something changes.
Part 4: Performance Lab ‘07
What projects are coming out of the PerfEng Lab in ‘07
• Blackboard Performance Sizing and Certification Program.
• Special Projects
  – Virtualization: Xen, LDOMs/Containers and VMware
  – Scalent Management Suite
  – Monitoring and Management
  – User Experience and Incident Management: Coradiant
  – Storage Protocols: NFS, IP-SAN and FC-SAN
What is the BPSC?
• The BPSC is a benchmarking program designed to showcase the enterprise architecture, performance and scalability of the Blackboard Application Suite.
• The BPSC will help Blackboard customers make the appropriate purchasing decisions (hardware and software) to support their Blackboard implementation.
• The BPSC is a joint effort by Blackboard and members of the Blackboard Technology Family. This includes ISVs such as Microsoft, Oracle and Quest, as well as OEMs such as Dell, Sun and Coradiant.
Special Projects
• Virtualization
  – VMware, Xen and LDOMs
• Management
  – Scalent and Quest
• Monitoring
  – Coradiant and Quest
• Storage Protocols
  – FC/SAN, IP/SAN and NFS
• Scale-Up and Scale-Out Databases
  – Oracle RAC
  – 64-bit SQL Server
Questions?
Links/References
• Blackboard Academic Suite Hardware Sizing Guide (Behind the Blackboard)
• Performance and Capacity Planning Guidelines for the Blackboard Academic Suite (Behind the Blackboard)
• http://www.perfeng.com
• http://www.spec.org/sfs97r1/results/sfs97r1.html
• http://www.storageperformance.org
• http://www.coradiant.com
• http://www.quest.com
• http://www.bmc.com
• Menasce, Daniel. Performance by Design: Computer Capacity Planning By Example
• Menasce, Daniel. Scaling for E-Business: Technologies, Models, Performance, and Capacity Planning
• Fink, Jason. Linux Performance Tuning and Capacity Planning
• Deveriya, Anand. Network Administrators Survival Guide
• Cockcroft, Adrian. Capacity Planning for Internet Services
• http://www.blackboard.com/docs/r6/6_3/en_US/admin/bbas_performance_capacity.pdf
• http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnbda/html/bdadotnetarch081.asp
• http://developers.sun.com/solaris/articles/systemslowdowns.html
• http://www.oracle.com/technology/deploy/performance/index.html
• http://tpc.org/tpc_app/default.asp (TPC-App)
• http://tpc.org/tpcw/default.asp (TPC-W)
• http://java.sun.com/docs/performance/
• http://support.microsoft.com/kb/224587
• http://www.javaperformancetuning.com
• http://www.oraperf.com
• http://www.ixora.com.au
• http://www.hotsos.com
• http://perl.apache.org/docs/1.0/guide/performance.html
• Sherlog, Webalizer, WebTrends, Analog
• http://dir.yahoo.com/Computers_and_Internet/Software/Internet/World_Wide_Web/Servers/Log_Analysis_Tools/
• http://www.serverwatch.com/tutorials/article.php/3518061
• http://www-106.ibm.com/developerworks/rational/library/4250.html
• http://www.keynote.com/downloads/articles/tradesecrets.pdf
• Whalen, Edward. Oracle Database 10g: Linux Administration. ISBN 0-07-223053-3
• Millsap, Cary. Optimizing Oracle Performance. ISBN 0-596-00527-X
• DeLuca, Steve. Microsoft SQL Server 2000 Performance Tuning Technical Reference. ISBN 0735612706
• McGehee, B. “SQL Server Configuration Performance Checklist” http://sql-server-performance.com/sql_server_performance_audit5.asp
• http://www.sql-server-performance.com/jc_sql_server_quantative_analysis1.asp
Thank you