Copyright © 2005 Micromuse Inc. All rights reserved.
Effective End-to-End Service Management in a Dynamic Environment
An Approach for Model-Based Management
Doug McClure
Sr. Manager, Service and Technology Monitoring
EarthLink
[email protected] 26, 2005
EarthLink Overview
~5.4M Customers
> ~3.8M Dialup (Premium ~2.7M, Value ~1.1M)
> ~1.5M Broadband (Cable, xDSL)
> ~140K Web Hosting (Unix, Windows)
Dial Access Coverage > 94% of US Population
> ~16K Local Dial Access Numbers
> ~315K Active Modem Ports (~30% ELNK, ~70% Outsourced)
> ~180 PoPs (18 Core Backbone PoPs, three data centers, two labs)
Broadband Coverage
> ~286 Markets with Broadband Offerings (DSL: 204, Cable: 82)
Large and Diverse Infrastructure
> ~1700 Network Elements, ~1600 Server Elements
> Thousands of Access and WAN Circuits
> 17K Tagged Assets in Network (excluding desktops)
From ISP to Total Communications Provider
[Timeline figure, 1994 to 2005]
> 1994: Premium Dial
> 1995: Unlimited Access
> 1999: Value Dial
> 2001, 2002: Blackberry, PDAs, WiFi, Wireless Cards
> 2003: Accelerator
> 2004: Converged Devices, Free VoIP
> 2005: Next Generation Broadband, VoIP, Muni-WiFi, Wireless Voice
> Broadband labels: aDSL, Cable, sDSL, Satellite, Point-to-point, Wholesale
> Service lines: Premium Dial-up Access, Value Dial-up Access, Broadband Access, Wireless Data Service
EarthLink’s Dynamic Environment
> One Network
> Many Partners
> Fast Follower
> Constant Change
> Emerging Technologies
> Unlimited Products and Services
Voice
[Diagram labels: High Speed Connection, High Speed Modem, Router, Analog Telephone Adapter (ATA), Standard Telephone]
[Diagram labels: One Phone, Detects WiFi Network, Detects Cell Network]
Source: EarthLink Product Group
Municipal Wi-Fi
[Network diagram. User and access labels: 802.11 Mesh, 802.11 Mesh Users, Secure Government Fixed, Standard Business, Premium Business, University/Distance Learning, Residential, Free Public Access Area, Non-Mission Critical Applications, Hot Spot User, ISP A Fixed User, ISP B Fixed User. EarthLink PoP labels: GbE links, Local ISP Peering Routers, Municipal Router, Capture Portal, RADIUS Proxy, DHCP Server, Server Load Balancer.]
> Wi-Fi Broadband (Open Access Wholesale ISP product)
> Digital Divide (a subsidized version of Wi-Fi Broadband)
> Occasional Use (Open Access Wholesale 24-hour pass)
> Municipal & Business Fixed Wi-Fi
> Parks & Public Spaces (free access in limited locations)
> Wi-Fi Broadband for Municipal Safety
> Wi-Fi Broadband for Municipal Workers
Source: EarthLink Product Group
Philly Muni-WiFi
• Built for 200k+ subscribers
• 135 square miles
• 4000 nodes (APs and gateways)
• 24 towers
• ~30 backhaul radio links
• ~6 fiber backhauls
Source: EarthLink Product Group
Complex Applications & Services
[Architecture diagram. Layers: Client Applications, Presentation Layer, Application Services Layer, Core Services Layer, Infrastructure Layer. Clients: any web browser, Palm client. Protocols: HTML, IMAP, POP3, SMTP. Components labeled S80 through S112, connected through API 1 through API 7. Other labels: Storage, Tickets, To Other Systems, Infrastructure Events.]
Source: EarthLink Product Group
EarthLink Approach for Managing Services
> Doing the basics really well: EVENTS MATTER
> Bank of America Quality Commercial
> “We don’t worry about managing the millions of events in our environment. We focus on the process of managing one event very accurately and reliably and repeating that process millions of times”
> A solid foundation enables anything to be built upon it
> Best of breed focus pays off over the long haul and weathers changing technologies
> Align everything to key business activities, services, products and customers
E2E Service Management: A Top Down Approach to a Bottom Up Problem
> Why are E2E service management and dashboards important?
> Provide real-time visibility into increasingly complex products, services and infrastructure
> Enable action and real-time decision making
> Reduce mean time to address and repair complex service problems
What’s Really Important?
>Customer Acquisition and Retention
> Registration, shopping cart abandonment, serviceability, churn
> Inventory, Fulfillment, Shipping, Logistics, Distribution
> Velocity, accuracy, double-shipping
>Billing / AR
> Delivery, processing, aging, collections, X-Factor
>Customer, Operations or Business Support?
>Visibility or Action?
Different Levels Different Needs
                 Technician / Engineer   Management / Director   Executive
Focus            Event                   Process                 Outcome
Response Mode    Reactive                Responsive              Proactive
Orientation      Product / Silo          Business                Political
Repertoire       Technology/Hardware     System/Service          Solution
Finance          Price                   Cost                    Value
Source: Unknown
Here?
> Traditional Business Intelligence
> Static, Pre-Canned, Scheduled
> Batch Oriented
> Static Consolidated Repository
> What happened
> Yesterday
> Last Week
> Last Month
> Where you were
Or Here?
> Real Time
> Action Oriented
> Dynamic, Adaptive, Ad-Hoc
> Event-Driven, Always On
> Cached, Dynamic Data
> Where You’re At
> Where You’re Heading
> What Lies Ahead?
> Provides a Sense of Urgency
> When do I need to make a decision?
Providing Visibility and Enabling Action
> Two schools of thought
> Top down or bottom up
> Build Dashboards – focusing on “Real-Time”
> Hourly or more frequent
> Avoid overlap with data warehouse or BI function
> Identify the main message for your dashboards for each audience and each level
> Choose 3-5 key messages, themes or topics to communicate
> Must be aggregated, correlated and presented in summarized views that prompt action
> Is the ship on the right path?
> Do I need to take some action to steer around the iceberg?
> How quickly do I need to take action?
A Solid Foundation for E2E Service Management
[Layered diagram, top to bottom]
> Business Management: Visibility, Impact, Revenue, Churn, Fulfillment, Compliance
> Service Management: Quality, Performance, Availability Measurement and Improvement
> Enterprise Components: Network, Server, Storage Admin, Ticketing, Monitoring, Asset/Inventory Management
> Service Components: Servers, Storage, Network, Backups, Firewalls, Applications, Services, Processes, Activities
> Additional label: Business Process
Source: Gartner, Modifications Doug McClure
Why the Right Events Matter
[Diagram: enrichment moves events from raw to enriched]
> Raw: Data Oriented, Lower Value, Element Aligned, Low Touch / High Effort, Managing Events, Longer MTTR
> Enriched: Information Oriented, Higher Value, Service Aligned, High Touch / Low Effort, Managing Services, Shorter MTTR
Source: Doug McClure
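As a rough illustration of what enrichment could look like, the sketch below attaches service context to a raw element event. It is not taken from the presentation: the element names, the `ELEMENT_TO_SERVICE` lookup, the field names and the 0 to 5 severity scale are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup: which service and service function each element supports.
ELEMENT_TO_SERVICE = {
    "radius-01": {"service": "Dial-up Access", "function": "Authentication"},
    "web-07": {"service": "Web Hosting", "function": "Order Entry / UI"},
}


@dataclass
class RawEvent:
    element: str   # the device or server that raised the event
    summary: str
    severity: int  # assumed scale: 0 (clear) to 5 (critical)


@dataclass
class EnrichedEvent:
    element: str
    summary: str
    severity: int
    service: Optional[str]   # None when the element is not in the lookup
    function: Optional[str]


def enrich(event: RawEvent) -> EnrichedEvent:
    """Copy the raw event and add service context when the element is known."""
    context = ELEMENT_TO_SERVICE.get(event.element, {})
    return EnrichedEvent(
        element=event.element,
        summary=event.summary,
        severity=event.severity,
        service=context.get("service"),
        function=context.get("function"),
    )
```

Calling `enrich(RawEvent("radius-01", "Auth latency high", 4))` would return an event tagged with the "Dial-up Access" service, so an operator can sort by affected service rather than by element.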
Why an Information Model is Needed
>Simplify and map complex infrastructure, application and technology data
>Ties disparate data sources in numerous silos together as they relate to the business
>Common data models enable real-time monitoring and action oriented dashboards
>EarthLink recognized this need and created an “EarthLink Information Model”
> Captures key components of service and infrastructure
> Establishes relationships with key business services, processes and activities, applications and customers
EarthLink Information Model
• DMTF CIM 2.9.x Based
[Model diagram. Labels: Managed Element, Core, Applications, Services, Network Object, Hardware, Storage, Packages, Operating System, Physical Location, Interface Mapping, Physical Connectivity, Notification Monitoring, Transactions]
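To make the idea of a shared model concrete, here is a minimal sketch of managed elements linked by dependency relationships. It is only an illustration under stated assumptions: the class, its fields and the example element names are hypothetical and do not reproduce the CIM schema or EarthLink's actual model.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List


@dataclass
class ManagedElement:
    name: str
    kind: str  # e.g. "Service", "Application", "Hardware", "Storage"
    depends_on: List[ManagedElement] = field(default_factory=list)


def all_dependencies(element: ManagedElement) -> List[str]:
    """Return the sorted names of everything the element depends on,
    directly or indirectly."""
    seen: List[str] = []
    stack = list(element.depends_on)
    while stack:
        current = stack.pop()
        if current.name in seen:
            continue  # already visited; also guards against cycles
        seen.append(current.name)
        stack.extend(current.depends_on)
    return sorted(seen)


# Hypothetical example: an email service built on one application and server.
storage = ManagedElement("san-1", "Storage")
operating_system = ManagedElement("os-1", "Operating System")
server = ManagedElement("server-1", "Hardware", [storage])
mail_app = ManagedElement("mail-app", "Application", [server, operating_system])
email_service = ManagedElement("Email", "Service", [mail_app])
```

With relationships recorded this way, `all_dependencies(email_service)` lists every element whose events could affect the service, which is the link an action-oriented dashboard needs.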
EarthLink Relationship Modeling
Service Modeling: Getting What’s Important
>Discovery Interviews - what’s important in each line of business and at each level of the organization
>Inventory Based Approach with Gap Analysis
>Don’t forget to consider non-traditional data sources
>Identify important metrics and indicators in key functional areas that prompt action
>Focus on service activities, processes and flows
>Align to key executive needs, objectives, pain points
>This becomes the foundation of the service model
Service Modeling: Getting What’s Important
[Diagram: a Service Model built from Service Function Models]
> Levels: What’s Important!; Key Functional Metric Roll-ups; Key Transactions, Processes, Flows; Service Component Data
> Example service: Partner Broadband API Registration
> Service functions: Order Entry / UI, Pre-Qual, B2B/EDI, Registration/Provisioning, Fulfillment, Service Activation, Billing
Each Service Function Model covers:
> Functional Infrastructure Status, Performance, Quality
> Functional Application Status, Performance, Quality
> Functional Transaction Status, Performance, Quality, Quantity
> Functional Service Status, Performance, Quality
> Functional Change Management Status, Quality, Quantity (Change Requests & Downtimes)
> Functional Break/Fix Ticket Status, Quantity
> Functional Process, Workflow, Business Rule, Policy Status, Performance, Quality, Quantity
The Service Model covers:
> E2E Process, Workflow, Business Rule, Policy Status, Performance, Quality, Quantity
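One simple way to picture the roll-up from service function models to an end-to-end service status is a "worst status wins" rule, sketched below. The rule itself, the three-level `Status` scale and the function and category names are assumptions for illustration; the slides do not say how EarthLink computes roll-ups.

```python
from enum import IntEnum
from typing import Dict


class Status(IntEnum):
    GOOD = 0
    MARGINAL = 1
    BAD = 2


def roll_up(statuses: Dict[str, Status]) -> Status:
    """Worst-of roll-up: the highest status value wins; no inputs means GOOD."""
    return max(statuses.values(), default=Status.GOOD)


def service_status(functions: Dict[str, Dict[str, Status]]) -> Dict[str, Status]:
    """Roll each service function up from its categories, then add an
    end-to-end entry rolled up from the per-function results."""
    per_function = {name: roll_up(cats) for name, cats in functions.items()}
    per_function["E2E"] = roll_up(per_function)
    return per_function
```

For example, if "Order Entry / UI" has a MARGINAL application status while "Billing" is fully GOOD, the end-to-end entry comes out MARGINAL, which points attention at the one function that needs it.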
E2E Service Message
Communicating Information over Data – Key to Linking IT to Business
[Diagram: data sources feed key metrics, which feed key message areas]
> Key Message Areas: Availability, Performance, Reliability, Quality
> Infrastructure data sources: Server Monitoring, Network Monitoring, Service Monitoring, Log Monitoring, Other Monitoring, Trouble Tickets, Known Errors, Change Requests, Maintenance Windows
> Activities, Flows, Processes key metrics: Overall Availability (processes, activities, flows), Transaction Time / SLO, Transactions/Hour, Sessions/Hour, Transactions/Machine, Account Creation Stats (Good/Bad), Transaction Errors
> E2E Service Transactions data sources: Synthetic Testing and Monitoring, Real User Monitoring, Other Monitoring
> Business, Partner, Customer key metrics, objectives, indicators: Sign Ups/Hour, Day, Week; Revenue, Churn; Bounty, Signups, Transaction Accuracy; Order Accuracy, Customer Satisfaction
What’s Your Organization Ready For?
“Sea of Red”
“Low Touch, High Effort”
“Event Storms”
“Simple Correlation”
“High Touch, Low Effort”
“Synthetic Transaction Monitoring”
“Real User Monitoring”
“E2E Service Correlation”
“Topology/Dependency Mapping”
“BAM BSM BINGO Monitoring”
Original Source: Gartner, Modifications Doug McClure
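As one hedged example of the "Simple Correlation" step, the sketch below collapses an event storm by counting repeats of the same (element, summary) pair instead of showing each one. Treating deduplication as the correlation technique is an assumption made for illustration, as are the event fields; the slide names the stage without describing how it is done.

```python
from typing import Dict, Iterable, Tuple

EventKey = Tuple[str, str]  # (element, summary); hypothetical event identity


def deduplicate(events: Iterable[EventKey]) -> Dict[EventKey, int]:
    """Collapse repeated events into one entry each, with an occurrence count.
    Entries keep the order in which each event was first seen."""
    counts: Dict[EventKey, int] = {}
    for key in events:
        counts[key] = counts.get(key, 0) + 1
    return counts
```

A storm of three identical "link down" events from one router plus a single "disk full" event would shrink to two rows, one of them carrying a count of 3.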
Recap
> Managing your network can be future proof
> The right foundation ensures success
> Do the basics really well – EVENTS MATTER
> Building the case for E2E Service Management
> How we’re applying this at EarthLink
> Modeling and why it’s important
> Applying this in your business environment
Please feel free to contact me to discuss this approach in detail or to receive a detailed overview of the process EarthLink followed during this initiative.
Supporting Material
Top Level Service
Key Service Functions
Service Model High Level
Service Model Low Level
Dashboard Mockups
[Dashboard mockup. Panels: Quality, Reliability, Performance, Availability. Other labels: Contextual Display Panel, Click Here, Display Here.]
Panel contents listed on the slide:
• Sign Ups/Hour, Day, Week
• Failures/Hour/Day/Week
• Overall Quality by E2E Service and Transaction
• By Partner, Service
• High Level Keynote, Compuware and Quest FTR Quality Metrics for Service/Service Transaction Testing
• Active Remedy Break/Fix Tickets
• Active Change or Downtime Activities
• Planned Change or Downtime Activities within 8, 12, 24, 48 hours
• Account creation metrics (good vs bad)
• Transaction error metrics
• Infrastructure Monitoring: Foglight, NerveCenter, ISM Events, Keynote/Compuware/FTR, Other Netcool Events (listed twice on the slide)
• Average/Max/Min Transaction Time
• Transaction time vs. SLO
• Partner Transaction Time
• Transactions/Hour, Sessions/Hour
• Transactions/Machine
• Overall Availability of processes, activities, APIs, flows, etc.
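The transaction-time items above (average/max/min and time vs. SLO) could be computed along the following lines. This is a minimal sketch, not the dashboard's actual logic: the function name, the use of seconds, and the "percentage of transactions within the SLO" measure are assumptions.

```python
from statistics import mean
from typing import Dict, List


def transaction_summary(times_s: List[float], slo_s: float) -> Dict[str, float]:
    """Summarize transaction times (in seconds) against an SLO threshold.
    A transaction counts as within the SLO when its time is <= slo_s."""
    if not times_s:
        raise ValueError("need at least one transaction time")
    within = sum(1 for t in times_s if t <= slo_s)
    return {
        "avg": mean(times_s),
        "max": max(times_s),
        "min": min(times_s),
        "pct_within_slo": 100.0 * within / len(times_s),
    }
```

A panel could then color itself from `pct_within_slo` (the thresholds would be a local choice), giving the viewer a single signal rather than a list of raw timings.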
Dashboard Mockups
[Process flow mockup. Each step carries a letter tag as shown on the slide.]
> Service functions: Order Entry / UI, Serviceability / Pre-Qual, B2B / EDI / Translation, Registration / Provisioning, Fulfillment, Billing
> Order Entry / UI components: WEB (A), APP (B), REG (C)
> Steps tagged A: Service Requested by Customer
> Steps tagged B: Order Created
> Steps tagged D: Welcome Letters Shipped, Equipment Shipped (DSL), Customer Billed
> Steps tagged E: Order Sent To Vendor, Order Accepted, FOC Assigned, No FOC Assigned, FOC Missed, Order Cancelled/Rejected, Customer Provisioned
> Steps tagged F: RADIUS Prov, Email/FTP WWW Prov
> Untagged step: Serviceability Query Run
Dashboard Drilldown Mockups
[Drill-down mockup. Panels: Performance, Quality, Reliability, Availability.]
> Drill Down into Metrics
> Drill Down on Server
> Server1 views: Active Events, Service Port Health, All Events, Server Health, App Health, Transaction Health, Remedy Tickets, Change Requests, Overall Service, Topology
> All metrics associated with the server element appear on one page (the same idea would apply to any widget drill down)