sfeldman performance bb_worldemea07

Post on 27-Jan-2015


Principles of Performance Engineering

Steve Feldman, Director of Performance Engineering and Architecture
sfeldman@blackboard.com

Agenda

• Questions of the Mind…
• PE @ Bb
• Process and Methodology
• What projects are coming out of the PerfEng Lab in ’07
• Links

Part 1: Questions of the Mind

Questions of the Mind…
• What is Performance?
• What is Scalability?
• What is Performance Engineering?
• Why does Blackboard invest in Performance Engineering?

What is Performance?

• Performance = Response Times
• Response Times affect the User Experience.

• If the User Experience is acceptable, Abandonment becomes less likely, which can positively affect Adoption.

• As Adoption increases, so does the need for Scalability.

What is Scalability?

• Scalability = Heavy Adoption/Usage
– If Heavy Adoption = Positive User Experience
– If Positive User Experience = Acceptable Response Times

• Patterns of predictable usage
– Present Population
– User Behavior

• Unpredictable Behavior and New Populations.
– Make the unpredictable predictable.

• Changes in Adoption Growth Behavior

What is Performance Engineering?

• Engineering discipline with a primary focus on performance and scalability.

• A combination of Software and System activities with the intent of improving Performance and Scalability.

• Performance Design
• Performance Development
• Performance Verification
• Performance Benchmarking

Why does Blackboard invest in Performance Engineering?

• Some of the world’s largest application deployments outside of commercial portals and e-commerce sites run on Blackboard.

• Desire to add more applications, sub-systems and features to drive adoption and increase growth.

• Differentiate Blackboard from other LMS/CMS ISVs and the Open Source Community.

Part 2: PE @ Bb

PerfEng at Bb…
• Triangle of Priorities
– Performance: Response Times (Apdex)
– Scalability: High Session Activity with Low Abandonment (PAR)
– User Experience: High Performance and Heavy Integration (Reference Architecture)

• Investment and Relationships
– Software Tool Set
– Lab Sponsors

Triangle of Priorities

PerfEng at Bb…

[Diagram: SDLC flow: Requirement Development → Design → Develop → Functional Testing → Regression Testing → Certification/Integrated Testing → General Availability, with Performance Benchmarking alongside]

- Assess Performance Risk
- Mitigate Performance Risk
- Identify Critical Use Cases for Analysis

- Performance Workbooks
- Review Technical Design Document
- Reference acceptable design patterns
- Warn about unacceptable anti-patterns
- Model/Prototype

- Baseline as functionality can be tested
- Profile for inefficient calls/executions
- Identify scalability issues in time to refactor

- High-Watermark Load Testing
- Common Scenario Load Testing
- Conditional Scenario Load Testing
- Java Performance Scenarios

- Platform Configurations
- Advanced Configurations
- Partner Benchmarking (Vendor Kits)

- Sizing and Capacity Guidance

End to End Performance Integration in the Blackboard Software Development Lifecycle

- Performance Verification
- Focus Verification
- QA Cyclical Requirements

Performance: Response Times (Apdex)

• Every transaction in the application is defined as a state, an action state or an action.

• Each transaction is assigned a ranking: critical, essential or trivial.

• Each transaction is assigned an abandonment policy: low, medium, high and very high.
– Abandonment represents (4) dimensions of Apdex.
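The classification scheme above can be sketched in code. This is a minimal illustration only; the transaction name and type/policy assignments are hypothetical, not Blackboard identifiers:

```python
from dataclasses import dataclass

# Rankings and abandonment policies as described above; the abandonment
# policy corresponds to the (4) Apdex dimensions defined elsewhere in the deck.
RANKINGS = ("critical", "essential", "trivial")
ABANDONMENT_POLICIES = ("low", "medium", "high", "very_high")

@dataclass
class Transaction:
    name: str         # a state, action state or action identifier (hypothetical)
    ranking: str
    abandonment: str

    def __post_init__(self):
        # Reject assignments that fall outside the defined taxonomy.
        if self.ranking not in RANKINGS:
            raise ValueError(f"unknown ranking: {self.ranking}")
        if self.abandonment not in ABANDONMENT_POLICIES:
            raise ValueError(f"unknown abandonment policy: {self.abandonment}")

# Hypothetical transaction, echoing the Discussion Board examples.
tx = Transaction("discussion_board.submit_reply", "critical", "medium")
print(tx.ranking, tx.abandonment)
```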

State/Action Modeling

• State: A condition or point of reference within a sub-system in which an actor has the option to move to a sub-state, perform an action or move to a super-state.
– States are often considered navigation items easily identified by a bread crumb.
– States can also be considered pages if and only if a sub-state or action is branched from the state.
• Example: Discussion Board Forum List Widget (a State of the Discussion Board sub-system)

State/Action Modeling

• Action State: The navigation to an action.
– For example, an instructor wants to create a message for a topic. The instructor selects Add Message and is brought to a page that requires the user to input information, followed by a submit to actually create the message.

State/Action Modeling

• Action: An actor-driven process that occurs within a state.
– Actions occur when an actor cannot move into a sub-state.
– Most often associated with a use case.
• Example: Replying to a thread within a Discussion Board Message.

What is Apdex (Application Performance Index)?

• What is Apdex?
– Apdex is an open standard developed by an alliance of companies that defines a standardized method to report, benchmark, and track application performance.

• http://www.apdex.org

Apdex and User Abandonment

• Bb defines (4) Apdex Dimensions:
– Low (2s-8s)
– Medium (5s-20s)
– High (8s-32s)
– Very High (12s-48s)

Apdex and Workload Variations
• Apdex scores are taken across each data model variation.
• Expect scores of 85% or higher for under-loaded systems.
• During Verification Testing and Benchmarks, expect scores of 75% or higher.
• If scores fall below acceptable levels, move to instrumentation and profiling (Method-R).
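The scoring above follows the Apdex formula from the open standard: score = (satisfied + tolerating/2) / total, where "satisfied" samples finish at or under the T threshold and "tolerating" ones between T and F. A minimal sketch using the four Bb dimensions; the sample response times are made up for illustration:

```python
# Apdex per the open standard (apdex.org):
#   score = (satisfied + tolerating / 2) / total samples
# "satisfied" means response time <= T; "tolerating" means T < time <= F.
# The (T, F) pairs mirror the four Bb dimensions listed above.
DIMENSIONS = {
    "low": (2.0, 8.0),
    "medium": (5.0, 20.0),
    "high": (8.0, 32.0),
    "very_high": (12.0, 48.0),
}

def apdex(response_times, dimension):
    t, f = DIMENSIONS[dimension]
    satisfied = sum(1 for r in response_times if r <= t)
    tolerating = sum(1 for r in response_times if t < r <= f)
    return (satisfied + tolerating / 2) / len(response_times)

# Illustrative samples against the Low dimension:
# 2 satisfied, 2 tolerating, 1 frustrated -> (2 + 1) / 5 = 0.6
samples = [1.2, 1.8, 3.5, 6.0, 9.4]
print(apdex(samples, "low"))
```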

Scalability: High Session Activity with Low Abandonment (PAR)

• What is a PAR?
– Performance Archetype Ratio
– A scoring method to determine the resource requirements of a deployment based on a given system workload.

• Any component of the deployment can have a PAR score.

Scalability: High Session Activity with Low Abandonment (PAR)

[Chart: Resource Utilization (Y-axis) vs. Iterations (X-axis) for CPU, showing a Resource Utilization Threshold Line and the Optimal Workload point]

PAR Process: Step 1 Calibration

• Calibrate Workloads with User Abandonment
– Peak of Concurrency (POC): The virtual user workload at which response times are acceptable and the highest volume of virtual users are part of the scenario.
– Level of Concurrency (LOC): The virtual user workload at which response times are acceptable and the steadiest volume of virtual users are participating in the scenario.
– Average Concurrency: The average of the POC and LOC workload measurements combined.
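The averaging in Step 1 is trivial to sketch; the POC/LOC virtual-user counts below are made up for illustration:

```python
# Step 1 calibration outputs (illustrative virtual-user counts):
poc = 1200  # Peak of Concurrency: highest VU volume with acceptable response times
loc = 800   # Level of Concurrency: steadiest VU volume with acceptable response times

# Average Concurrency combines the two measured workloads.
average_concurrency = (poc + loc) / 2
print(average_concurrency)
```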

PAR Process: Step 2 App. Saturation

• Take workloads from the abandonment run and disable abandonment.
– Run based on Peak of Concurrency workload (Abandonment Disabled)
– Run based on Level of Concurrency workload (Abandonment Disabled)
– Run based on Average Concurrency workload (Abandonment Disabled)
• Example Metrics
– Response times consistently lower than ~5 seconds
– Application CPU saturation close to X > 90%, where X = CPU utilization + (1) Standard Deviation of the CPU Utilization
– Total Sessions
– Total Transactions
– Application Server Hits Per Second
– Database CPU saturation

• Strategies
– Clustering and Virtualization
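The CPU saturation metric above (X = utilization plus one standard deviation, saturated when X > 90%) can be sketched as follows; the utilization samples are illustrative:

```python
import statistics

# App-tier saturation check from the Example Metrics above:
#   X = mean CPU utilization + (1) standard deviation; saturated when X > 90%.
cpu_samples = [82.0, 88.0, 91.0, 94.0, 87.0, 90.0]  # illustrative % utilization

x = statistics.mean(cpu_samples) + statistics.stdev(cpu_samples)
app_saturated = x > 90.0
print(f"X = {x:.1f}%, saturated: {app_saturated}")
```

The same check applies at the database tier in Step 3, with the threshold lowered to 80%.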

PAR Process: Step 3 DB Saturation

• Multiply the workload from Step 2 across identical application servers.
– Typically want 90% CPU utilization and sub-5 second response times.
• Example Metrics
– Database CPU saturation close to X > 80%, where X = CPU utilization + (1) Standard Deviation of the CPU Utilization
– Memory Utilization
– Database Shadow Processes
– I/O operations per second
• Strategies
– Increase CPU speed and count
– Optimize storage configuration

PAR Process: Additional Steps

• Hypothesis and Proof
– An essential part of this process is determining theoretical performance.

– Understanding of linear, sub-linear or super-linear performance.

– Simulate to determine actual.

• PARs can be gathered for other peripherals such as Load-Balancers, Storage Sub-Systems, Memory, CPUs, etc…
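The linear/sub-linear/super-linear distinction can be made concrete by fitting a scaling exponent from throughput measurements; this is a sketch under assumed (made-up) numbers, not a measured result:

```python
import math

# Fit b in: throughput ~ a * workers**b via least squares on a log-log scale,
# then classify linear (b near 1), sub-linear (b < 1) or super-linear (b > 1).
workers =    [1, 2, 4, 8]
throughput = [100, 185, 330, 560]  # transactions/sec, hypothetical

xs = [math.log(w) for w in workers]
ys = [math.log(t) for t in throughput]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)

if abs(b - 1.0) < 0.05:
    verdict = "linear"
elif b < 1.0:
    verdict = "sub-linear"
else:
    verdict = "super-linear"
print(round(b, 2), verdict)
```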

User Experience: High Performance and Heavy Integration (Reference Architecture)

• Insert Visio Here

Investment and Relationships

• PerfEng Team (10 Team Members)
– Combined both teams as part of the merger and increased head count.

• Software Tools
– Mercury LoadRunner
– Quest Product Suite
– Homegrown Tools: Simulation, Log Parsing, Modeling and Sampling
– Hotsos Oracle Profiler

Investment and Relationships

• Performance Lab Sponsors
– Dell: Servers and Remote Lab
– Sun: Servers, Storage and Remote Lab
– Intel: Servers
– Coradiant: TrueSight Device
– Quest: All Software Products
– NetApp: Storage

Part 3: Process and Methodology

Process and Methodology

SPE Overview…

Assess Performance Risk

Assessing the performance risk at the outset of the project (During Requirements). Identify, qualify and mitigate: Rapid Cognition. Factors affecting risk: http://lightwave.blackboard.com/Engineering/1899

Identify Critical Use Cases

Identify use cases where the risk of performance goals not being met causes the system to fail or be less than successful. Rank use cases based on workload variation, execution paths, processing considerations and utility.

Select Key Performance Scenarios

Most frequently executed scenarios, or those that are critical to the perceived performance of the system. Each performance scenario corresponds to a workload characterization. Define execution models, behavior models, cognition models, data models and processing models.

Establish Performance Objectives

Specify the quantitative criteria for evaluating the performance characteristics of the system under development. Must specify objectives prior to any simulations or analysis.

Construct Performance Models

Modeling techniques for representing the software processing steps for the performance model. Sequence Diagramming, Markovian Probability Models and Discrete Simulation Models

Software Execution Model

Determination of software resource utilization to appropriately measure effect of software as it scales in usage. Identification of Performance Anti-Patterns targeted for refactoring. Method-R Analytics and Problem Solving via Decision Tree and Pattern Recognition.

System Execution Model

Determination of system resource requirements utilized by the software under a given workload. Used for sizing and capacity models. Method-R Analytics and Problem Solving via Decision Tree and Pattern Recognition.

Method-R: Requirements of a Good Methodology (Millsap, Cary)

• Predictive Capacity: A method must enable the analyst to predict the impact of a proposed remedy.

• Reliability: A method must identify the correct root cause of the problem, no matter what the root cause may be.

• Determinism: A method must guide the analyst through an unambiguous sequence of steps that always rely upon documented axioms, not experience or intuition.

Method-R: Requirements of a Good Methodology (Millsap, Cary)

• Finiteness: A method must have a well-defined terminating condition, such as proof of optimality.

• Practicality: A method must be usable in any reasonable operating condition. It would be unacceptable for a performance improvement method to rely upon tools that exist in some operating environments but not in others.

Method-R: Response Time Performance Improvement

• A practical way of thinking.
• Often, asking someone to be practical is in itself impractical.
– Select the user actions for which the business needs improved performance.

– Collect properly scoped diagnostic data that will allow you to identify the causes of a response time consumer while it performs sub-optimally.

– Execute the candidate optimization activity that will have the greatest net payoff.

– Suspend your improvement activities until something changes.

Part 4: Performance Lab ‘07

What projects are coming out of the PerfEng Lab in ‘07

• Blackboard Performance Sizing and Certification Program.

• Special Projects
– Virtualization: Xen, LDOMs/Containers and VMware

– Scalent Management Suite
– Monitoring and Management
– User Experience and Incident Management: Coradiant

– Storage Protocols: NFS, IP-SAN and FC-SAN

What is the BPSC?
• The BPSC is a benchmarking program designed to showcase the enterprise architecture, performance and scalability of the Blackboard Application Suite.
• The BPSC will help Blackboard customers make the appropriate purchasing decisions (hardware and software) to support their Blackboard implementation.
• The BPSC is a joint effort by Blackboard and members of the Blackboard Technology Family. This includes ISVs such as Microsoft, Oracle and Quest, as well as OEMs such as Dell, Sun and Coradiant.

Special Projects
• Virtualization
– VMware, Xen and LDOMs
• Management
– Scalent and Quest
• Monitoring
– Coradiant and Quest
• Storage Protocols
– FC/SAN, IP/SAN and NFS
• Scale-Up and Scale-Out Databases
– Oracle RAC
– 64-bit SQL Server

Questions?

Links/References
• Blackboard Academic Suite Hardware Sizing Guide (Behind the Blackboard)
• Performance and Capacity Planning Guidelines for the Blackboard Academic Suite (Behind the Blackboard)
• http://www.perfeng.com
• http://www.spec.org/sfs97r1/results/sfs97r1.html
• http://www.storageperformance.org
• http://www.coradiant.com
• http://www.quest.com
• http://www.bmc.com
• Menasce, Daniel. Performance by Design: Computer Capacity Planning By Example
• Menasce, Daniel. Scaling for E-Business: Technologies, Models, Performance, and Capacity Planning
• Fink, Jason. Linux Performance Tuning and Capacity Planning
• Deveriya, Anand. Network Administrators Survival Guide
• Cockcroft, Adrian. Capacity Planning for Internet Services
• http://www.blackboard.com/docs/r6/6_3/en_US/admin/bbas_performance_capacity.pdf
• http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnbda/html/bdadotnetarch081.asp
• http://developers.sun.com/solaris/articles/systemslowdowns.html
• http://www.oracle.com/technology/deploy/performance/index.html
• http://tpc.org/tpc_app/default.asp (TPC-App)
• http://tpc.org/tpcw/default.asp (TPC-W)
• http://java.sun.com/docs/performance/
• http://support.microsoft.com/kb/224587
• http://www.javaperformancetuning.com
• http://www.oraperf.com
• http://www.ixora.com.au
• http://www.hotsos.com
• http://perl.apache.org/docs/1.0/guide/performance.html
• Sherlog, Webalizer, WebTrends, Analog
• http://dir.yahoo.com/Computers_and_Internet/Software/Internet/World_Wide_Web/Servers/Log_Analysis_Tools/
• http://www.serverwatch.com/tutorials/article.php/3518061
• http://www-106.ibm.com/developerworks/rational/library/4250.html
• http://www.keynote.com/downloads/articles/tradesecrets.pdf
• Whalen, Edward. Oracle Database 10G: Linux Administration. ISBN: 0-07-223053-3
• Millsap, Cary. Optimizing Oracle Performance. ISBN: 0-596-00527-X
• DeLuca, Steve. Microsoft SQL Server 2000 Performance Tuning Technical Reference. ISBN: 0735612706
• McGehee, B. “SQL-Server Configuration Performance Checklist”, http://sql-server-performance.com/sql_server_performance_audit5.asp
• http://www.sql-server-performance.com/jc_sql_server_quantative_analysis1.asp

Merci (Thank you)
