TECHNICAL WHITE PAPER
BMC ProactiveNet Performance Management: Deep-Dive Application Diagnostics
TABLE OF CONTENTS
Introduction
Proactive Application Performance Management
Problem Isolation Challenges
Automated Root Cause Analysis with Deep-Dive Diagnostics
» Deep-Dive Application Diagnostics Components
» Painless Deployment
» Analysis: Starting with a Bird’s Eye View
» Taking the Quick Dive
Part of a Comprehensive Business Service Management Approach
Summary
INTRODUCTION
Business applications are the lifeblood of your organization. When they fail, your company stands to lose
revenue and reputation. As a result, you likely find yourself under immense pressure to instantly fix problems
and restore service, all while dealing with demanding management, employees, customers, and partners.
To effectively handle the potentially severe impact of application malfunctions, you must carefully and consistently
manage your applications so that, if and when they fail, you have the tools to:
» Detect a problem “before the phone rings”
» Assign the right priority to a problem, based on how it might impact the business
» Isolate the root cause of a problem to determine which team needs to be engaged to diagnose and resolve the problem
» Resolve the problem before users and services are impacted
Industry analysts estimate that 40 percent of downtime is typically caused by application failures. With
virtualized and cloud applications, they predict that this share will increase to 60 percent and 80
percent, respectively. In today’s dynamic IT environment, where each application is composed of hundreds of
moving parts deployed in virtual and cloud environments, and where dynamic business requirements dictate
constant change, early detection and root cause isolation are key challenges. Wrestling with the inherent
complexity of today’s distributed Web applications, most organizations find themselves in an “all hands on deck”
exercise when problems occur, disrupting process-based operations and wasting the time of expert IT staff.
Knowing that a problem exists is only the first step in effective application performance management. More
important is knowing about a problem before it affects users and services; and most important is knowing the
cause of a problem so that you can resolve it as quickly and efficiently as possible.
The BMC Proactive Application Performance Management solution empowers IT staff to proactively detect
problems, then quickly triage, prioritize, and isolate their root cause. It also automates the resolution of application
performance and availability issues in distributed, mainframe, virtualized, and cloud-based (custom and
pre-packaged) applications. As a result, organizations can:
» Improve availability by proactively identifying application performance problems and automating problem
isolation and resolution
» Improve service quality by minimizing the impact of application performance problems on business
processes through proper prioritization of issues, avoidance of service outages, and resolution of problems
before service levels are affected
» Reduce IT costs by optimizing application problem isolation and diagnosis processes across the organization
and enabling effective collaboration between Application Development and IT Operations teams
By applying predictive analytics to end-user transactions, BMC ProactiveNet Performance Management (a
core product within the BMC Proactive Application Performance Management solution) learns normal behavior
patterns, quickly detects when irregular end-user behavior occurs, and proactively triggers the automatic
capture of deep application diagnostics associated with degraded end-user transactions. This detailed
diagnostic information is immediately available to operators, thus speeding root cause analysis.
BMC ProactiveNet Performance Management - Application Diagnostics provides deep application diagnostics
collection to isolate problems in distributed Java EE, Windows, and .NET applications, helping you to:
» Reduce escalations to Level 3 support
» Accelerate application problem isolation and resolution (MTTR)
» Improve availability and performance of critical business applications
» Enable collaboration between Application Development and IT Operations teams
» Reduce IT costs through improved process efficiencies; eliminating finger-pointing and “war-rooms”
This paper presents the challenges of application problem isolation and focuses on the deep-dive diagnostic
capabilities of BMC ProactiveNet Performance Management – Application Diagnostics.
PROACTIVE APPLICATION PERFORMANCE MANAGEMENT
The BMC Proactive Application Performance Management solution learns the actual behavior of the end-user
experience and the supporting application infrastructure; detects subtle changes in behavior; and proactively
alerts on impending performance and availability issues at the earliest possible time. The solution also enables
rapid problem isolation and resolution by automating the analysis of learned behavior in conjunction with events
and change information to isolate the root cause and service impact of performance and availability issues.
At the core of the solution, BMC ProactiveNet Performance Management delivers the following application
performance management capabilities across mainframe, physical, virtual, and private cloud architectures:
» Real (passive) and synthetic (active) end-user behavior monitoring of complex Web, client/server, and
mainframe applications running on mainframe, physical, virtual, and cloud (private, public, and hybrid)
architectures. The solution even includes optional adapters for collecting performance data and events from
various third-party end-user experience monitoring solutions, including Keynote, Gomez, and HP Business
Availability Center (Topaz suite). The solution measures application availability and response times, and
performs transaction accuracy checking.
» SAP®, Oracle®, and Siebel application management functionality for application administrators provides
additional insight into the performance and availability of packaged applications.
» Application component (application, database, middleware) behavior monitoring provides additional insight
into the application tier.
» Deep-dive application diagnostics collection automatically gathers data from Java EE, Windows, and .NET
application servers to reveal which application component is causing a problem.¹
Because BMC ProactiveNet Performance Management takes the guesswork out of the equation, you can escalate
incidents to their appropriate owners and eliminate inefficient and frustrating “war room” situations.
The following additional capabilities are also available as part of BMC’s Proactive Application Performance
Management solution:
» Application discovery and dependency modeling via BMC Atrium Discovery and Dependency Mapping
» Transaction profile monitoring (including message middleware component health and availability) via BMC Middleware Management
» Mainframe application diagnostics (for CICS, DB2®, and IMS™) via BMC MainView Transaction Analyzer
» Mainframe application component health and availability monitoring via BMC MainView products
Together, these products deliver the industry’s only Proactive Application Performance Management solution,
combining early problem identification with rapid problem isolation and resolution.
PROBLEM ISOLATION CHALLENGES
With IT as an enabler of your organization’s business processes, you are most likely using Java EE, Windows,
or .NET application servers for backend business logic processing, integration of your enterprise applications,
and Web-based applications. These application environments, while providing numerous advantages, also add
several levels of complexity that make problem isolation a formidable challenge.
The multi-tier architecture of Java EE, Windows, and .NET-based applications relies on multiple networked
components, including client machines, load balancers, firewalls, Web servers, application servers, security
servers, transaction servers, and database servers. What’s more, the application server is itself a highly
componentized entity. Increasingly, applications are being virtualized and deployed in cloud computing
environments, adding further infrastructure components to manage.
¹ The BMC Proactive Application Performance Management solution supports in-depth data collection and analysis for the following: JBoss, Tomcat, Java, and .NET servers; Weblogic, WebSphere MQ, WebSphere Application Server, and WebSphere Message Broker; IBM DataPower XI50; TIBCO EMS and TIBCO RV; Sun GlassFish Enterprise Server; Oracle Database and Oracle WebLogic; CICS, DB2, IMS, z/OS, USS, zLinux and z/VM; and IP, VTAM, and DASD storage. However, the focus of this paper is on the BMC ProactiveNet Performance Management – Application Diagnostics component that supports Java and .NET servers.
Add to the inherent complexity of distributed and virtualized applications the frequent changes these
applications go through due to regular maintenance, fixes, and new business requirements, and it is easy to
see why proper application performance management is vital.
Running distributed applications composed of so many different moving parts means that multiple teams
touch the application, including the IT Operations staff who manage the servers, the DBAs who set up the
database, the security engineers who own firewalls and authentication servers, the mainframe system
administrators, the network administrators, and others.
For example, when a bank employee attempts to execute a transaction and experiences poor performance, it’s
unclear what is causing the problem and who needs to fix it. Is it a network hiccup? Insufficient application
server connection pools? An overloaded backend server? A bug in a Java EE component? Something else entirely?
When critical problems occur, the typical response for many IT organizations is to summon representatives
from all functional teams (both within and outside the organization), shut them in a big meeting room, and
let them “figure it out.” Industry analysts estimate that, on average, 10-14 people are involved when a single
application or service outage occurs. Subsequent problem analysis is based on vast amounts of log files,
memory dumps, end-user reports, performance monitoring statistics, and guesswork.
Needless to say, these problem isolation methods are extremely inefficient. With little data (or too much
irrelevant data) to go on, finger-pointing is common, and IT Operations staff often spend extensive amounts of time
proving their innocence on problems that have nothing to do with their domain. For example, some database
transactions may not be processed due to incorrect configuration of the application server or a bug in a Java
EE component; in that case, DBAs would waste their time sitting in the “war room.” Even worse, the lack of clear
visibility into application transaction execution means longer mean time to repair (MTTR), thereby increasing
the costs associated with application downtime.
AUTOMATED ROOT CAUSE ANALYSIS WITH DEEP-DIVE DIAGNOSTICS
To accelerate problem isolation and minimize business disruptions, you need fast, reliable root cause
information with detailed application transaction diagnostic data, every single time a problem occurs.
With the BMC Proactive Application Performance Management solution, application problems are proactively
detected through a powerful combination of end-user behavior monitoring, coupled with application
infrastructure behavior monitoring, real-time predictive root cause, and service impact analytics.
As part of the solution, BMC ProactiveNet Performance Management analyzes and learns the behavior of
real and synthetic end-user experiences, as well as the application infrastructure components, and alerts IT
Operations when degraded performance or availability issues occur. Based on recent and current trends in
end-user and application behavior, the solution can also alert IT Operations about a potential problem that is
likely to occur within the next few hours.
When a subtle change in end-user or application behavior is detected by the real-time predictive analytics
engine, BMC ProactiveNet Performance Management generates a predictive alert or an abnormality, and
proactively triggers the automatic capture of deep application diagnostics, which then can be automatically
associated with the degraded end-user transactions (see Figure 1). Because the detailed diagnostic data are
captured when the problem occurs, there is no need to recreate the problem.
Figure 1. The service model changes status when BMC ProactiveNet Performance Management detects subtle changes in normal end-user or application behavior, indicating current or potential degraded performance or failures.
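
The idea of learning normal behavior and triggering diagnostic capture on a deviation can be illustrated with a minimal sketch. This paper does not describe the product's actual analytics, so the example substitutes a plain z-score test against a learned baseline. The function names (is_abnormal, on_sample), the threshold of three standard deviations, and the capture_diagnostics callback are all hypothetical and exist only for illustration.

```python
from statistics import mean, stdev


def is_abnormal(history, latest, threshold=3.0):
    """Return True when `latest` deviates strongly from the learned baseline.

    `history` holds past response times (the learned "normal" behavior).
    A simple z-score stands in for whatever analytics a real product uses.
    """
    if len(history) < 2:
        return False  # not enough data to have learned a baseline yet
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return latest != baseline
    return abs(latest - baseline) / spread > threshold


def on_sample(history, latest, capture_diagnostics):
    """Check one new sample; trigger capture if it is abnormal.

    Abnormal samples are not added to the baseline, so a degradation
    does not gradually become the new "normal". Returns True when
    diagnostics capture was triggered.
    """
    if is_abnormal(history, latest):
        capture_diagnostics(latest)  # hypothetical hook for deep-dive capture
        return True
    history.append(latest)
    return False
```

In this sketch the capture happens at the moment the abnormal sample arrives, which mirrors the point made above: the diagnostic data is gathered while the problem is occurring, so the problem does not have to be reproduced later.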
To accelerate problem isolation, BMC ProactiveNet Performance Management provides on-demand root
cause analysis for every event. The real-time predictive analytics engine automates the analysis and
correlation of learned behavior, events/alerts, and change information (e.g., BMC BladeLogic or BMC Remedy
changes), ensuring that IT staff have enough information available to quickly isolate the most likely root cause(s)
of a problem and determine its impact on the business. This combination of early detection with automated
root cause and service impact analytics allows IT Operations to find application performance problems at the
earliest possible time and drive fast and efficient automated repair of problems.
This powerful combination of behavior learning, predictive root cause, service impact analytics, and
continuous deep-dive diagnostic collection enables IT Operations to quickly and efficiently detect, prioritize,
and isolate application problems and route them to the appropriate person or team for resolution. As a result,
you can avoid costly application and service outages.
DEEP-DIVE APPLICATION DIAGNOSTICS COMPONENTS
BMC ProactiveNet Performance Management – Application Diagnostics helps IT Operations and Application
Support staff to isolate performance and availability problems in the application tier of distributed applications
running in Java EE, Microsoft .NET, or COM/COM+ application environments.
By quickly determining where the root cause of a problem lies, the solution enables IT staff to route the
problem to the appropriate domain expert for rapid resolution. By eliminating the need to involve multiple
IT groups to diagnose a problem, the entire problem resolution process is expedited, service is restored
promptly, and end users are either unaware that a problem was averted or are simply satisfied that they are
productive again.
BMC ProactiveNet Performance Management - Application Diagnostics consists of the following main components:
» Agents. BMC ProactiveNet Performance Management - Application Diagnostics agents are lightweight
software agents deployed on production Java EE or .NET application servers. The primary role of the agents is
to continuously gather deep diagnostic data on application transaction performance, execution, and errors for
inclusion in application root cause analysis. These agents are based on the patented BMC AppSight “BlackBox”
technology, which automates the entire process of recording application execution and captures a synchronized
record of system events, performance metrics, configuration data, and code execution flow. The “BlackBox” can
record application execution at all levels of detail, either locally or at remote sites, without requiring any change
to the application or the application server environment. You have complete control of the recording session and
can even switch to a greater level of recording detail when problems arise, without having to restart your
application.
» Server. The BMC ProactiveNet Performance Management - Application Diagnostics server (i.e., BMC AppSight
Server) is a middle-tier component that connects agents to the BMC ProactiveNet Performance Management
server, providing access to captured data, which can be stored in a database and/or as XML files on the file
system.
» Console. Operators can view the recorded detailed diagnostics data from a BMC ProactiveNet Performance
Management event within the solution’s operations console. Recorded operations/transactions can be replayed
to facilitate rapid problem identification and diagnosis.
Figure 2 depicts the basic components and flow of diagnostic data being collected.
Figure 2. Deep-dive diagnostic data is collected continuously for inclusion in root cause analysis.
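
The general "black box" recording idea described for the agents (a continuous, bounded record whose level of detail can be raised at run time without a restart) can be sketched as a ring buffer. This is not the AppSight implementation, which this paper does not detail. The class name BlackBoxRecorder, the three detail levels, and the record fields are assumptions made purely for illustration.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical detail levels; a higher number means more verbose recording.
DETAIL_LEVELS = {"low": 1, "medium": 2, "high": 3}


@dataclass
class Record:
    timestamp: float
    kind: str     # e.g. "system_event", "metric", "config", "code_flow"
    level: str    # minimum detail level at which this record is kept
    payload: str


class BlackBoxRecorder:
    """Keeps only the most recent `capacity` records, like a flight recorder."""

    def __init__(self, capacity=1000, detail="low"):
        self._buffer = deque(maxlen=capacity)  # oldest entries drop off
        self._detail = DETAIL_LEVELS[detail]

    def set_detail(self, detail):
        """Change recording detail on the fly; no restart is needed."""
        self._detail = DETAIL_LEVELS[detail]

    def record(self, timestamp, kind, level, payload):
        """Store the record if the current detail level admits it."""
        if DETAIL_LEVELS[level] <= self._detail:
            self._buffer.append(Record(timestamp, kind, level, payload))
            return True
        return False

    def snapshot(self):
        """Return retained records ordered by time, ready for analysis."""
        return sorted(self._buffer, key=lambda r: r.timestamp)
```

A bounded buffer keeps the overhead of continuous recording predictable, while the adjustable detail level lets an operator capture code-flow records only once a problem is suspected.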
PAINLESS DEPLOYMENT
When business applications fail to perform, IT needs to act quickly to restore service. It is essential that any
solution used in the course of the problem isolation and resolution process is easy to deploy and use. After all,
IT staff need to spend their time finding the root cause(s) and fixing problems rather than expending effort to
manage their management tools.
BMC ProactiveNet Performance Management - Application Diagnostics requires no change to monitored
application environments. You do not need to modify Java EE application server startup scripts, run a special
version of the Java Virtual Machine (JVM) or Common Language Runtime (CLR), or change application code.
You can install BMC ProactiveNet Performance Management - Application Diagnostics agents through the
command line or by using any existing deployment tools. The Java EE version of the agent is packaged as
an EAR file and can be deployed using the Java EE application server administrator console. The
Windows/.NET version is packaged as a Windows service that can be easily deployed directly or through your standard
software distribution processes and tools.
After installing the agents, you can start gathering transaction execution data from your
application. BMC ProactiveNet Performance Management - Application Diagnostics, designed with special
focus on simplicity, comes with predefined configurations for monitoring all common distributed application
environments; hence no scripting or special customization is involved. The tool’s ease of use enables a high
level of flexibility, allowing users to choose between running agents continuously or deploying and running
them only when problems occur.
ANALYSIS: STARTING WITH A BIRD’S EYE VIEW
The location of a problem’s root cause is rarely known when analysis begins. Therefore, you have to start
by looking at the big picture and finding “suspect” tiers or components before drilling down. The Technology
Breakdown view shown in Figure 3 provides you with exactly that. It displays duration data on transaction
performance as recorded within the application server. Using this view, you can easily spot slow-performing
transaction categories and determine which tier may have caused the issue.
For example, consider a situation where an application support engineer is tasked with isolating the root
cause of a performance slowdown in an online Java EE-based trading application. The engineer may find that
a certain type of account verification transaction performs poorly when compared to other transactions or to
historical performance. Looking at the application transaction performance breakdown, the engineer sees
that the majority of time was spent on the database side.
Figure 3. Application diagnostics break down the application technology to quickly identify where the transaction is spending the most time.
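
The kind of per-tier breakdown described above can be approximated in a few lines. The sketch below is not the product's Technology Breakdown view; it only shows the underlying arithmetic of summing recorded durations per tier and comparing shares. The function names, the tier labels, and the input shape of (tier, duration in milliseconds) pairs are assumptions for illustration.

```python
from collections import defaultdict


def technology_breakdown(steps):
    """Return each tier's share of total recorded time.

    `steps` is an iterable of (tier, duration_ms) pairs, one per
    recorded step of a transaction category.
    """
    totals = defaultdict(float)
    for tier, duration_ms in steps:
        totals[tier] += duration_ms
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {tier: total / grand_total for tier, total in totals.items()}


def slowest_tier(steps):
    """Name the tier that consumed the largest share of time, if any."""
    shares = technology_breakdown(steps)
    return max(shares, key=shares.get) if shares else None


# Made-up timings for an account verification transaction.
example_steps = [("servlet", 20.0), ("ejb", 30.0), ("database", 150.0)]
suspect = slowest_tier(example_steps)
```

With the made-up timings above, the database tier accounts for three quarters of the recorded time, which is the sort of signal that would point the engineer in the trading example toward the database side.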
TAKING THE QUICK DIVE
While a high-level view is a good starting point for analysis, it rarely suffices for root cause isolation, as it
only tells part of the story. Before making the final determination as to where the root cause lies, you need
to investigate problematic transactions and understand their actual execution performance at a more
granular level.
Rather than execute a different tool to gather more detailed information or sift through long server
logs, BMC ProactiveNet Performance Management lets you drill down into the problematic transaction
invocations at the click of a button.
The solution’s Application Transaction Breakdown view presents actual transaction execution, including
the full transaction execution path (SQL queries, EJB calls, Servlets, JSPs, JMS, JCA, JTA, JNDI, ASP/Xs,
COM/COM+, and more) made in the context of the transaction. Performance data is displayed for each
of the transaction steps. Figure 4 illustrates some of the application components that might be displayed
when application performance degradation occurs.
Figure 4. The transaction breakdown listed in the invocation tree helps you pinpoint the component(s) within the transaction that are consuming the most time.
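
One common way to read an invocation tree like this is to compare each step's self time, that is, its own duration minus the time spent in the steps it calls. The sketch below illustrates that technique on a hand-built tree. The Invocation type, the component names, and the timings are hypothetical, and the paper does not state how the product itself ranks components.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Invocation:
    name: str
    duration_ms: float  # total time, including time spent in children
    children: List["Invocation"] = field(default_factory=list)


def self_time(node):
    """Time spent in this step itself, excluding its child steps."""
    return node.duration_ms - sum(c.duration_ms for c in node.children)


def hottest(node):
    """Return the step with the largest self time in the whole tree."""
    best = node
    for child in node.children:
        candidate = hottest(child)
        if self_time(candidate) > self_time(best):
            best = candidate
    return best


# Made-up tree for an account verification transaction.
tree = Invocation("AccountVerifyServlet", 1000.0, [
    Invocation("AccountEJB.verify", 950.0, [
        Invocation("JDBC executeQuery", 900.0),
    ]),
    Invocation("JSP render", 30.0),
])
slowest_step = hottest(tree)
```

Ranking by self time rather than total duration avoids blaming the top-level servlet, whose total is large only because it includes everything beneath it.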
If the support engineer zooms in on the account verification transaction from Figure 3 and further evaluates
the transaction breakdown, he or she notices that a certain type of JDBC call takes an exceptionally long time
to complete and throws an exception. BMC ProactiveNet Performance Management displays the full SQL
query that was sent to the database, enabling the engineer to realize that the transaction sends a request to an
external database on the company partner’s extranet. When the incident is escalated to Level 3, a simple
mouse-click allows the application developer to drill down further to the actual line-of-code details and parameter values
that existed at the time the degradation occurred (see Figure 5).
As a result, instead of consuming countless hours of numerous individuals’ time, the problem is quickly and
accurately isolated and escalated to the partner’s help desk team for resolution.
Figure 5. Deep-dive diagnostic drill-down from BMC ProactiveNet Performance Management into BMC AppSight shows parameter values, object states, and lines of code, thus enabling collaboration between application developers and IT Operations and facilitating rapid problem isolation and resolution.
PART OF A COMPREHENSIVE BUSINESS SERVICE MANAGEMENT APPROACH
Business Service Management (BSM) is a comprehensive and unified platform that simultaneously
optimizes IT costs, demonstrates transparency, increases business value, controls risk, and assures
quality of service. BSM simplifies, standardizes, and automates IT processes, so you can efficiently
manage business services throughout their lifecycle, across distributed, mainframe, virtual, and
cloud-based resources. With BSM, your organization has the trusted information it needs, so you can
prioritize work based on critical business services and orchestrate workflow across your IT management
processes and functions.
The BMC Proactive Application Performance Management solution follows IT Infrastructure Library®
(ITIL®) guidelines on problem investigation and diagnosis, and helps you achieve BSM through a unified
architecture that enables you to:
» Exceed service level commitments by focusing on what’s really important to the business
» Reduce application outages by solving issues before service levels are affected
» Improve first-time resolution and slash the time it takes to repair application problems by more than 75 percent with accurate root cause and diagnostic information
» Accelerate application problem resolution by eliminating the need to reproduce problems
» Drive business value by automating manual workflows and actions across multiple vendors, platforms, and sources
SUMMARY
BMC’s Proactive Application Performance Management solution empowers IT staff to proactively detect
problems, and then quickly and efficiently triage, prioritize, and isolate their root cause. It also automates the
resolution of application performance and availability issues in distributed, mainframe, virtualized, and
cloud-based applications. As a result, IT organizations can:
» Improve availability by proactively identifying application performance problems and automating
problem isolation and resolution
» Improve service quality by minimizing the impact of application performance problems on business
processes through proper prioritization of issues, avoidance of service outages, and resolution of
problems before service levels are affected
» Reduce IT costs by optimizing application problem isolation and diagnosis processes across the
organization and enabling effective collaboration between Application Development and IT Operations
teams
As a core product within this solution, BMC ProactiveNet Performance Management provides
deep-dive application diagnostics capabilities that isolate problems in distributed Java EE, Windows, and .NET
applications, helping you to:
» Reduce escalations to Level 3 support
» Accelerate application problem isolation and resolution (MTTR)
» Improve availability and performance of critical business applications
» Enable collaboration between Application Development and IT Operations teams
» Reduce IT costs through improved process efficiencies; eliminating finger-pointing and “war-rooms”
To learn more about BMC Proactive Application Performance Management (APM) and BMC ProactiveNet
Performance Management, please visit www.bmc.com/products/offering/bmC-ProactiveNet-Performance-management.html
BMC, BMC Software, and the BMC Software logo are the exclusive properties of BMC Software, Inc., are registered with the U.S. Patent and Trademark Office, and may be registered or pending registration in other countries. All other BMC trademarks, service marks, and logos may be registered or pending registration in the U.S. or in other countries. SAP R/3 is the trademark or registered trademark of SAP AG in Germany and in several other countries. Oracle is a registered trademark of Oracle Corporation. DB2 and IMS are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. IT Infrastructure Library® is a registered trademark of the Office of Government Commerce and is used here by BMC Software, Inc., under license from and with the permission of OGC. ITIL® is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office, and is used here by BMC Software, Inc., under license from and with the permission of OGC. All other trademarks or registered trademarks are the property of their respective owners. © Copyright 2008, 2009, 2010. BMC Software, Inc. All rights reserved.
Business runs on IT. IT runs on BMC Software.
Business thrives when IT runs smarter, faster and stronger. That’s why the most demanding IT
organizations in the world rely on BMC Software across distributed, mainframe, virtual and cloud
environments. Recognized as the leader in Business Service Management, BMC offers a comprehensive
approach and unified platform that helps IT organizations cut cost, reduce risk and drive business profit.
For the four fiscal quarters ended December 31, 2010, BMC revenue was approximately $2 billion.