IT Operations Management Guide

Upload: nealkir

Post on 05-Apr-2018


  • 7/31/2019 IT Operations Management Guide


    # IT OPERATIONS MANAGEMENT

    tweet Book01

    Managing Your IT Infrastructure in the Age of Complexity

    By Peter Spielvogel, Jon Haworth, Sonja Hickey

    E-mail: [email protected]

    20660 Stevens Creek Blvd., Suite 210,

    Cupertino, CA 95014

    Copyright 2011 by Peter Spielvogel, Jon Haworth, Sonja Hickey

    All rights reserved. No part of this book shall be reproduced, stored in a retrieval system, or transmitted by any means (electronic, mechanical, photocopying, recording, or otherwise) without written permission from the publisher.

    Published by THiNKaha, a Happy About imprint
    20660 Stevens Creek Blvd., Suite 210, Cupertino, CA 95014
    http://thinkaha.com

    First Printing: April 2011

    Paperback ISBN: 978-1-61699-052-7 (1-61699-052-X)

    eBook ISBN: 978-1-61699-053-4 (1-61699-053-8)

    Place of Publication: Silicon Valley, California, USA

    Paperback Library of Congress Number: 2011926213

    The authors are donating all their royalties to the HP Foundation, which funds global disaster relief efforts. The publisher is matching their contribution dollar for dollar.

    Trademarks

    All terms mentioned in this book that are known to be trademarks or service marks have been appropriately capitalized. Neither Happy About, nor any of its imprints, can attest to the accuracy of this information. Use of a term in this book should not be regarded as affecting the validity of any trademark or service mark.

    Warning and Disclaimer

    Every effort has been made to make this book as complete and as accurate as possible. The information provided is on an "as is" basis. The author(s), publisher, and their agents assume no responsibility for errors or omissions. Nor do they assume liability or responsibility to any person or entity with respect to any loss or damages arising from the use of information contained herein.

    Advance Praise

    A fast, insightful read for the IT professional, presented in a unique format. A great distillation of IT best practices that will help focus IT goals and facilitate decision making. It's also especially useful for the non-IT manager who needs IT support and wants to understand and influence the decision making process.

    Thomas Cheng, President, pcAge, Inc.

    As organizations start to make the transition from traditional IT to the cloud, they will need to adapt their people, processes, and tools. And they will need some guidance that points them in the right direction. This book provides a concise roadmap that will help them along that journey.

    Kalyan Ramanathan, VP Marketing, Electric Cloud

    No single IT organization has all the answers to IT Operational Management because sometimes you don't know what questions to ask. This book allows you to start discussions to find the answers you are looking for. It is a great starting point for groups dedicated to process improvement and knowledge sharing.

    Henry Wojcik, Manager Operational Monitoring, CME Group

    It's all good common sense that we know we should be doing but don't get the time to do. This little book is a great reference and reminder.

    Mark Laird, Technical Consultant, Steria UK Ltd.

    Brilliant format! You'll read it more than once. A thought-provoking source of meaningful discussion topics. You should pass the book around your group to generate ideas and proactive collaboration. The resulting communication is the true value of the text.

    Mark Laughlin, Former CIO, The Guitar Center

    #IT Operations Management tweet Book01 offers a fast-paced, easy-to-read-and-skim resource for meaningfully getting started in planning, preparing, and executing on an IT Operations Bridge. It balances a look at people, process, and technology issues and is careful not to promote one-sided advice that can be applicable in some environments, but destructive in others.

    Dennis Drogseth, Vice President, Enterprise Management Associates, Inc.

    The key to successful delivery of Hybrid IT will be collaboration and agreement across organizations, groups, process, and technology, and as such, this view of gaining a common set of principles that can be used for agreement and collaborations, even if they are adjusted specific to the needs of that organization, will be invaluable.

    Mark Potts, HP Fellow & VP of Portfolio Strategy, HP Software

    In order to meet our customer SLAs, we use a monitoring plan that includes infrastructure, application landscape, incident management, and performance benchmarking. This book distills IT Operations Management to the core elements.

    Maryann Phillip, Director of Performance Management, Independence Blue Cross

    IT infrastructure management has never been so critical. All business verticals and IT organizations of all sizes will benefit from implementing the ideologies and processes discussed throughout this book: simple IT management techniques which are both enlightening and thought provoking for IT professionals at every level.

    Luigi Tiano, Enterprise Management Director, CT Consultants

    The book summarizes years of operations management experience into easy to digest concepts.

    Henry Yam, VP of Enterprise Management, Neuberger Berman

    Dedication

    To my family for their unconditional love and support. To my parents for their motivation to always pursue excellence.

    Peter

    To my late mother, who raised three sons single-handedly and did a fine job. And to my family who tolerate, encourage, and inspire me.

    Jon

    To my husband, Dan, and my children, Colin, Brianna, and Emma, who supported me throughout this project. And to my colleagues, Peter and Jon, who inspired and encouraged me to participate in the writing of this book.

    Sonja

    Acknowledgments

    Thanks to the HP Operations Center products and R&D teams for their tireless efforts.

    Thanks to our colleagues, who constantly challenge us to embrace new ideas.

    Thanks to our management, who enthusiastically supported this project.

    Thanks to our customers, who use their creativity to push the limits of IT management.

    Thanks to eagle eye Stephanie for her proofreading skills.

    Special thanks to Mitchell Levy and the Happy About team for publishing this book.

    Peter, Jon, Sonja

    Why Did We Write This Book?

    Managing IT infrastructure has always been challenging. Virtualization and cloud computing make this task even more difficult. In our daily work, we encounter organizations struggling with these issues.

    IT professionals want simple actionable ideas that will deliver gains in availability, performance, and efficiency. We wrote this book with that in mind. Good IT is good business.

    Peter Spielvogel, Jon Haworth, Sonja Hickey

    Blog: www.hp.com/go/ITOpsBlog

    Twitter: @HPITOps

    Email: [email protected]

    Contents

    Preface

    Section I: IT and the Business

    Section II: People

    Section III: Process

    Section IV: Technology

    Section V: Architecture

    Section VI: Performance Management

    Section VII: Virtualization

    Section VIII: Cloud Computing

    Section IX: Getting Started

    About the Authors

    Preface

    Our goal is not to make you an expert on managing IT infrastructure, but to make you and your peers think in a different way about some of your decisions. If our guidance starts a discussion in your team that results in you avoiding one big mistake, then we have achieved our goal.

    Section I: IT and the Business

    At the start of the dot-com era, people predicted that someday all businesses would become e-businesses. That time is now. Strong alignment between business and IT is more important than ever. The key to success is communication of shared goals in a common language.

    IT Operations tools are a business solution. They are not toys for IT. Justify investment based on business benefits, not benefits for IT.

    1

    Using an IT monitoring solution proactively can create a competitive advantage in the marketplace by improving quality of service.

    2

    Manage risk proactively. A major IT outage can hurt your company's reputation, bottom line, and even the stock price.

    3

    Application owners ask 3 questions: Is my app available? Does my app perform well? Is my app secure? You must provide the answers with data.

    4

    Prioritize IT response to incidents based on business impact. Your business users can tell you which processes are most important.

    5

    Different roles use different metrics: CIO (service level, cost of IT per app); NOC manager (availability, MTTR); app owner (revenue, risk).

    6

    Be transparent: show service-level metrics to business owners to instill confidence.

    7

    Your line of business owners care deeply about risk. They will fund investments in new tools if they can reduce downtime risk.

    8

    Cost/Minute of Downtime * Average Outage Time * Outages/Year = Cost justification for investing in a better monitoring solution.

    9
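
    The formula above can be turned into a quick calculation. The following is a minimal sketch, not a method prescribed by the book: the function name and every dollar and outage figure are hypothetical placeholders, so substitute numbers from your own incident records.

```python
def annual_downtime_cost(cost_per_minute, avg_outage_minutes, outages_per_year):
    """Cost/minute of downtime * average outage time * outages/year."""
    return cost_per_minute * avg_outage_minutes * outages_per_year


# Hypothetical inputs, for illustration only.
current = annual_downtime_cost(cost_per_minute=500.0, avg_outage_minutes=45.0, outages_per_year=6)

# Assumption: better monitoring shortens the average outage from 45 to 30 minutes.
improved = annual_downtime_cost(cost_per_minute=500.0, avg_outage_minutes=30.0, outages_per_year=6)

print(f"Current annual downtime cost: ${current:,.0f}")
print(f"Potential annual savings:     ${current - improved:,.0f}")
```

    Comparing the potential savings with the yearly cost of a tool gives a first-pass justification that business owners can follow.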

    Align monitoring with business criticality. Invest in top-tier functionality when needed; otherwise, optimize spending.

    10

    Measure the return on investment for any new IT Operations tools or initiatives you implement.

    11

    Monitor how end users perceive your applications. Ultimately, this is all that matters. Monitoring the infrastructure is not enough.

    12

    Social media is a great way to monitor your reputation; if your IT is broken, people will talk about it online (blogs, Twitter, etc.).

    13

    For ITIL[1] v3 Continual Service Improvement, leverage the data gathered by your IT monitoring tools to improve your business services!

    14

    1. ITIL = Information Technology Infrastructure Library, a collection of best practices for IT practitioners. http://www.itil-officialsite.com/

    User perception defines the performance of the IT group; reduce negative perceptions through proactive incident management.

    15

    In many companies, IT Operations is no longer a commodity. CXOs see IT as a strategic business investment.

    16

    Section II: People

    People are the core of any IT monitoring solution. Without a capable and motivated team in place, you will inevitably fail at keeping your infrastructure running and your business customers satisfied. Automating routine tasks keeps people engaged and focused on value-added activities.

    Focus your ops team on the important, not just the urgent.

    17

    Hire the best people, train them well, equip them with the best tools. They are the first line for incident detection and resolution.

    18

    The head of Infrastructure and Operations now plays a key, strategic role in the business. This is no longer a cost center.

    19

    ITIL Process Owners define key performance indicators (KPIs). They must bring in requirements from an IT monitoring perspective.

    20

    Centralize monitoring. Reduce duplication of effort. You don't need multiple teams watching consoles with flashing lights.

    21

    How many teams do you have managing first-level events? Any number greater than one is too many!

    22

    Experts love the instant gratification of fixing an issue, but it's not their primary focus. Let the Operations Bridge do its job!

    23

    If you know the root cause of an incident, only one group needs to respond. This should be your goal if you are serious about reducing cost.

    24

    Unload day-to-day event management from your subject matter experts. They should focus on their primary jobs, not firefighting.

    25

    What % of incidents turn into costly, disruptive escalations to your subject matter experts? Reducing this % generates cost savings.

    26

    Build a strong team. Do not allow your IT monitoring to become too dependent on one person. This is high risk if they leave.

    27

    Use an ITIL RACI matrix (responsible, accountable, consulted, informed) to define each role during event, incident, problem mgmt processes.

    28
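
    A RACI matrix is easy to keep as data and check automatically. The sketch below is only an illustration: the role names and their assignments are hypothetical, and the rule it checks (exactly one accountable role and at least one responsible role per process) is a common RACI convention that we assume here rather than something this book specifies.

```python
# Hypothetical roles and assignments; replace with your own organization's.
RACI = {
    "event management": {
        "Operations Bridge": "R",
        "Operations Manager": "A",
        "Subject Matter Expert": "C",
        "Service Desk": "I",
    },
    "incident management": {
        "Service Desk": "R",
        "Incident Manager": "A",
        "Operations Bridge": "C",
        "Application Owner": "I",
    },
    "problem management": {
        "Subject Matter Expert": "R",
        "Problem Manager": "A",
        "Operations Bridge": "C",
        "Service Desk": "I",
    },
}


def validate(matrix):
    """Return a list of problems; an empty list means every process is covered."""
    problems = []
    for process, roles in matrix.items():
        codes = list(roles.values())
        if codes.count("A") != 1:
            problems.append(
                f"{process}: expected exactly one accountable role, found {codes.count('A')}"
            )
        if "R" not in codes:
            problems.append(f"{process}: no responsible role assigned")
    return problems


print(validate(RACI) or "RACI matrix is consistent")
```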

    Cross-train and rotate team leaders into new domains or projects. This keeps them fresh, brings new ideas, and generates cross-pollination.

    29

    PARTICIPATE in (don't just join) an online or in-person community to share best practices with peers.

    30

    Experts are hard to find. Redirect manpower released from daily operational activities to strategic business projects.

    31

    ITIL Service Assets = Resources (tangibles) + Capabilities (intangibles). People's knowledge and expertise are vital assets.

    32

    Align the goals of all your Operations team. Focus on what matters: service availability, mean time to repair, cost of management.

    33

    Section III: Process

    Given the complexity of modern IT infrastructures, organizations need robust processes to maintain service levels at agreed-upon levels. While some larger companies may embrace the richness of ITIL v3, others can get by with documenting their own best practices and following them consistently.

    Consistency counts. A poorly performing infrastructure can be worse than one that is completely broken.

    34

    Monitor IT holistically: include network, storage, servers, applications, and end-user experience.

    35

    You must build in and fund monitoring with new applications. Adding it on will not work. This is a key part of the development process.

    36

    Automate everywhere. Start small, build confidence, and expand: monitor, collect, correlate, determine impact, analyze, and resolve.

    37

    The Operations Bridge should see ALL alerts and take first actions. Do not allow fragmented alerting paths to emerge.

    38

    The end goal is to focus as much of the day-to-day operations activity as possible into the lower cost levels of the IT organization.

    39

    Coordinate your Operations Bridge and Service Desk. Good collaboration will result in high customer satisfaction.

    40

    Track everything. # of incidents, # of escalations, time to repair, # of people required to fix, service availability, unscheduled downtime.

    41
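
    The metrics in the "track everything" tip can be rolled up from basic incident records. This is a minimal sketch under stated assumptions: the `Incident` fields and the sample data are hypothetical, and availability is computed as if every incident caused non-overlapping unscheduled downtime for its full repair time.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    minutes_to_repair: float  # unscheduled downtime attributed to this incident
    people_required: int      # number of people needed to fix it
    escalated: bool           # True if it went to a subject matter expert


def summarize(incidents, period_minutes):
    """Roll a list of incidents up into the tracked metrics."""
    count = len(incidents)
    downtime = sum(i.minutes_to_repair for i in incidents)
    people = sum(i.people_required for i in incidents)
    return {
        "incidents": count,
        "escalations": sum(1 for i in incidents if i.escalated),
        "mean_time_to_repair": downtime / count if count else 0.0,
        "mean_people_required": people / count if count else 0.0,
        "unscheduled_downtime": downtime,
        "availability_pct": 100.0 * (1 - downtime / period_minutes),
    }


# Hypothetical month (30 days = 43,200 minutes) with two incidents.
report = summarize(
    [Incident(30, 3, True), Incident(90, 1, False)],
    period_minutes=43_200,
)
for name, value in report.items():
    print(f"{name}: {value}")
```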

    Reuse your existing IT processes when possible, but make incremental changes as needed to drive efficiencies or adapt to new technologies.

    42

    Build agility into your processes with automatic responses to configuration changes and automated root cause analysis.

    43

    An average event costs $75 to manually process. Reducing this through automation, for example, generates measurable cost savings.[2]

    44

    2. A 2009 survey of two VIVIT user group meetings (n=100) found that it costs an average U.S. company $75 to manually process an event. We have further validated this number with several enterprise IT organizations. VIVIT is an independent HP user community. http://www.vivit-worldwide.org/

    Document your best practices for troubleshooting. Then automate them so everyone consistently performs as well as your best operators.

    45

    Correlating events requires rules. Either you build them, someone writes them for you, or you create them automatically.

    46

    Keep the Service Desk informed about incident status with automatic updates. They can assure customers that issues are being resolved.

    47

    Integrate tools and technologies such that relevant data is passed to the next incident owner during hand-off or escalation processes.

    48

    Use postmortem analysis to learn and understand the underlying causes of outages or IT problems, not to blame a person or team.

    49

    Development and operations must work together to build monitoring processes into the software creation process. Some call this DevOps.

    50

    End-user monitoring is important, but the initial consumer of alerts must be the Operations Bridge and NOT the application teams.

    51

    Address IT management during ITIL v3 Service Strategy phase to prevent unexpected costs during Service Design or Service Transition phases.

    52

    For Agile environments, monitoring the IT infrastructure must be part of the dev/test/QA/release process.

    53

    Automation is the key to consistency, lower costs, and higher service levels. ALWAYS look to automate IT processes.

    54

    Section IV: Technology

    Leading-edge technology, especially automation, can significantly reduce the cost of managing IT infrastructure. Embedding automated operations at each stage of the management process ensures consistent response to recurring problems and generally speeds the time to repair.

    Complexity = Cost

    55

    The Operations Bridge needs an up-to-date view of the managed environment so their tools can provide good guidance on what to do next.

    56

    Use visualization tools to view complex business services. This greatly helps troubleshooting.

    57

    Empower your Operations Bridge with tools and run-book automation so they can fix as much as possible. Let them DO, not just SEE.

    58

    Tailor discovery of the Operations Bridge view of the IT world to their needs: enough detail, rapidly updated when changes occur.

    59

    Relating the monitoring information to business services requires an automatically updated model with focused discovery to maintain it.

    60

    Use classes to organize monitoring by similar types of servers. For example, all database servers can use the same monitoring definitions.

    61
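
    Organizing monitoring by server class can be as simple as a lookup table. The sketch below is illustrative only: the class names and check names are hypothetical and do not come from any particular monitoring product.

```python
# Checks that apply to every server, whatever its class (hypothetical names).
BASE_CHECKS = ["cpu_utilization", "disk_space"]

# Extra checks defined once per class and shared by all servers in it.
CLASS_CHECKS = {
    "database": ["tablespace_usage", "replication_lag"],
    "web": ["http_response_time"],
}


def checks_for(server_class):
    """Return the base checks plus those defined for the server's class."""
    return BASE_CHECKS + CLASS_CHECKS.get(server_class, [])


for server, server_class in [("db-01", "database"), ("db-02", "database"), ("web-01", "web")]:
    print(server, checks_for(server_class))
```

    Adding a check to the `database` entry updates every database server at once, which is the maintenance saving the tip describes.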

    Good models drive efficient event correlation. But make sure you automate this process to keep up with changes in your environment.

    62

    Automate the interface between your event console and ticketing system. When monitoring notifies you of an incident, open a ticket.

    63
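
    The event-console-to-ticketing interface can be sketched in a few lines. Everything here is an assumption made for illustration: `TicketSystem` is a hypothetical in-memory stand-in for a real ticketing product, the event fields are invented, and the choice to open tickets only for "major" and "critical" events is one possible policy.

```python
class TicketSystem:
    """Hypothetical in-memory stand-in for a real ticketing system."""

    def __init__(self):
        self.tickets = {}

    def open_ticket(self, summary):
        ticket_id = len(self.tickets) + 1
        self.tickets[ticket_id] = summary
        return ticket_id


def handle_event(event, ticketing, open_tickets):
    """Open a ticket for an incident-level event, reusing any ticket already open."""
    if event["severity"] not in {"major", "critical"}:
        return None  # informational events do not become tickets
    key = (event["node"], event["category"])
    if key in open_tickets:
        return open_tickets[key]  # avoid duplicate tickets for the same incident
    ticket_id = ticketing.open_ticket(f'{event["node"]}: {event["message"]}')
    open_tickets[key] = ticket_id
    return ticket_id


ticketing = TicketSystem()
open_tickets = {}
event = {"node": "db-01", "category": "disk", "severity": "critical", "message": "disk full"}
print(handle_event(event, ticketing, open_tickets))  # new ticket
print(handle_event(event, ticketing, open_tickets))  # same ticket reused
```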

    Bad data is worse than no data. Make sure your monitoring is delivering accurate results. If something feels wrong, check it.

    64

    Use a configuration management database (CMDB) to strategically track relationships among configuration items.

    65

    In your CMDB, assign an owner to each configuration item. When a problem arises, you know whom to contact.

    66

    Automate whatever you can, but only after your processes are working smoothly.

    67

    Simulate user sessions to give early warning on infrastructure problems. A person will not see a 10% performance drop. Automated tools will.

    68

    Look for tools that deliver high value with low maintenance overhead. Otherwise, they will not scale to your environment.

    69

    It is impossible to anticipate every situation in your IT environment. Use intelligent automation to create and maintain correlation rules.

    70

    Modern monitoring should include a mobile component so your experts can resolve problems on the go, reducing MTTR.[3]

    71

    3. MTTR = mean time to repair

    Free is not free unless your time is worth nothing. Think carefully about the total cost of ownership for freeware or open source.

    72

    Avoid monitoring silos for special technologies. Soon the technology will be mainstream and it must be monitored as part of the whole.

    73

    Chasing symptom events is expensive; use technology to suppress them or correlate them so they are related to the underlying cause event.

    74

    If you are not in the business of building IT management software, then don't. Buy off-the-shelf tools with pre-integrated components.

    75
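
    To make the idea of relating symptom events to a cause event concrete, here is a minimal sketch of topology-based correlation. It is one simple technique, not the approach of any specific product: the dependency map, node names, and event fields are all hypothetical, and an event is treated as a symptom whenever something it depends on (directly or transitively) is also reporting an event.

```python
def upstream_nodes(node, depends_on):
    """Return every node that `node` depends on, directly or transitively."""
    seen = set()
    stack = list(depends_on.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on.get(current, ()))
    seen.discard(node)  # guard against dependency cycles
    return seen


def correlate(events, depends_on):
    """Split events into cause events and symptom events."""
    failing = {event["node"] for event in events}
    causes, symptoms = [], []
    for event in events:
        failing_upstream = upstream_nodes(event["node"], depends_on) & failing
        if failing_upstream:
            symptoms.append({**event, "caused_by": sorted(failing_upstream)})
        else:
            causes.append(event)
    return causes, symptoms


# Hypothetical service model: web server -> app server -> database -> storage.
depends_on = {"web-01": ["app-01"], "app-01": ["db-01"], "db-01": ["san-01"]}
events = [
    {"node": "web-01", "message": "HTTP 500 errors"},
    {"node": "app-01", "message": "connection pool exhausted"},
    {"node": "san-01", "message": "storage controller failure"},
]

causes, symptoms = correlate(events, depends_on)
print("Work on:", [c["node"] for c in causes])
print("Suppress:", [s["node"] for s in symptoms])
```

    In this example only the storage event is presented as a cause, so one group responds instead of three.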

    Section V: Architecture

    Choosing what, where, and how you monitor your IT infrastructure will determine how well you can meet service-level commitments. Good monitoring architecture can make or break the success of your IT management solution.

    Good IT monitoring delivers the right INFORMATION to the right PEOPLE in the right CONTEXT.

    76

    Rely on a common data model for collecting system and service-level metrics. This eliminates disagreements over whose data is correct.

    77

    Filter events at the point of collection to minimize traffic to the Operations Bridge. Correlate everywhere that you can.

    78

    Use agentless monitoring whenever possible to speed time to deployment. Use agents to get granular data, especially on system performance.

    79
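
    Filtering at the point of collection usually means dropping low-severity noise and repeated copies of the same event before anything is forwarded. The following is a minimal sketch of that idea; the severity scale, the event fields, and the five-minute duplicate window are assumptions chosen for the example.

```python
# Hypothetical severity scale, lowest to highest.
SEVERITY_RANK = {"info": 0, "warning": 1, "major": 2, "critical": 3}


def filter_events(events, min_severity="warning", dedup_window_seconds=300):
    """Return only the events worth forwarding to the Operations Bridge."""
    last_forwarded = {}  # (node, message) -> timestamp of the last forwarded copy
    forwarded = []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        if SEVERITY_RANK[event["severity"]] < SEVERITY_RANK[min_severity]:
            continue  # below the forwarding threshold
        key = (event["node"], event["message"])
        previous = last_forwarded.get(key)
        if previous is not None and event["timestamp"] - previous < dedup_window_seconds:
            continue  # duplicate inside the window
        last_forwarded[key] = event["timestamp"]
        forwarded.append(event)
    return forwarded


raw = [
    {"timestamp": 0, "node": "db-01", "severity": "info", "message": "backup started"},
    {"timestamp": 10, "node": "db-01", "severity": "critical", "message": "disk full"},
    {"timestamp": 70, "node": "db-01", "severity": "critical", "message": "disk full"},
    {"timestamp": 400, "node": "db-01", "severity": "critical", "message": "disk full"},
]
print(f"{len(filter_events(raw))} of {len(raw)} events forwarded")
```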

    If you use multiple databases, federate them to ensure a single version of the truth.

    80

    When considering monitoring options, treat virtual servers the same as physical servers unless there is a compelling reason not to do so.

    81

    Ensure that your monitoring has the reporting capabilities you need. It should provide information for all the roles that will use it.

    82

    Some companies send correlated events directly to their Service Desk.

    83

    Standardize wherever possible. This applies to platforms, configurations, databases, as well as monitoring systems.

    84

    Section VI: Performance Management

    Performance management extends beyond just monitoring. It includes disciplines such as reporting and capacity planning. The data sources for performance are the same as for monitoring, but the tools are different, as are the people that use them.

    The network always used to get blamed for performance problems; now virtualization gets blamed.

    85

    Performance management is a multi-tiered discipline. Retain data & tools to support each layer: alerting, reporting, diagnostics, planning.

    86

    Performance management must be business-service-centric, include physical and virtual environments, and deal with dynamic provisioning.

    87

    In dynamic environments (like vMotion), dynamic threshold-based monitoring (baselining) & dynamic correlation minimize maintenance effort.

    88
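
    Baselining replaces a fixed, hand-maintained threshold with one derived from recent measurements. The sketch below uses one simple approach, the mean plus a multiple of the standard deviation; this formula and the sample values are our assumptions for illustration, and real tools may use seasonal or percentile-based baselines instead.

```python
from statistics import mean, stdev


def dynamic_threshold(history, k=3.0):
    """Upper threshold = mean + k * sample standard deviation of recent samples."""
    if len(history) < 2:
        raise ValueError("need at least two samples to build a baseline")
    return mean(history) + k * stdev(history)


def is_anomalous(value, history, k=3.0):
    """True if `value` exceeds the baseline-derived threshold."""
    return value > dynamic_threshold(history, k)


# Hypothetical CPU utilization samples (%) from the recent past.
history = [40, 42, 38, 41, 39]
print(f"Threshold: {dynamic_threshold(history):.1f}%")
print("43% anomalous?", is_anomalous(43, history))
print("50% anomalous?", is_anomalous(50, history))
```

    Because the threshold is recomputed from each system's own history, nobody has to retune it by hand when a workload moves to a different host.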

    If you store performance data on a virtual guest, make sure you have an archive to access this information if the virtual guest disappears.

    89

    Consolidate cross-domain performance data and response time measures with appropriate granularity to support long-term reporting & planning.

    90

  • 7/31/2019 IT Operations Management Guide

    89/136

    #IT OPERATIONS MANAGEMENT tweet Book01 87

    Retain diagnostic performance data, at the collection point where possible, for a reasonable duration to support triage.

    91

    Use historical performance information as the foundation for capacity planning. You will be able to achieve better utilization.

    92
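
    As one way to picture tweet 92, the sketch below fits a straight line to historical utilization and estimates when the trend reaches a planning limit. It is an assumption-laden illustration: the function name, the 80% limit, and the sample data are hypothetical, and real capacity planning usually needs more than a linear trend.

```python
def months_until_threshold(monthly_util, threshold=80.0):
    """Estimate months from the last sample until utilization hits `threshold`.

    Fits an ordinary least-squares line to monthly utilization (%) values
    (at least two samples). Returns None if the trend is flat or falling.
    """
    n = len(monthly_util)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(monthly_util) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, monthly_util))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = y_mean - slope * x_mean
    crossing_month = (threshold - intercept) / slope
    # Clamp at zero if the fitted line is already at or above the threshold.
    return max(0.0, crossing_month - (n - 1))

# Hypothetical monthly peak-hour utilization (%) for one server cluster.
history = [50, 52, 54, 56, 58, 60]
print(months_until_threshold(history))  # prints 10.0
```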


    The solution to performance problems is not always more hardware. Understand the cause before you jump to conclusions.

    93


    Detect application availability and performance issues proactively. Fix them before your customers' experience degrades.

    94


    Section VII

    Virtualization

    Virtualization is one of the biggest technology disruptions in the past decade. It allows organizations to dramatically increase the utilization of their servers. But this can come at a great price if inefficient management practices erode the savings on hardware.


    Virtualization changes everything. Adjust your core monitoring processes accordingly.

    95


    Virtualization is dynamic & increases the need for accurate, timely discovery so that monitoring activities can keep up with the real world.

    96


    Virtual servers can cause multiple virtual guests to fail; they must be monitored as a business-critical resource.

    97

    Virtual guests are just servers with shared hardware. They can still disrupt a business service, so treat them like any other server.

    98


    Monitoring how virtual guests use server resources is important, but you still need to monitor inside the guests to see why issues occur.

    99


    Manage physical and virtual servers using the same tools and processes.

    100

    Correlating events across the virtual/physical boundary is difficult. Use an integrated solution to speed troubleshooting and cut MTTR.

    101


    Follow the VMs.⁴ As virtual machines move to a new server, make sure the monitoring goes with them. Automatically.

    102

    4. VM = virtual machine


    Virtualization helps most companies, not all. If you have a compelling reason to use physical servers only, do it.

    103

    Optimize workload placement on virtual machines using both historical usage patterns and planned demand.

    104


    Oversizing resources for a VM can degrade VM server performance. (Management overhead for larger VMs is higher.)

    105


    If you use averages to size and place your VMs, you will end up under-provisioning; if you use peaks, you will end up over-provisioning.

    106
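
    A common middle ground between the two extremes in tweet 106 is to size on a high percentile of observed demand. The sketch below is an illustration under stated assumptions: the 95th percentile, the nearest-rank method, and the sample data are choices made for this example, not a recommendation from the authors.

```python
def percentile(values, pct):
    """Nearest-rank percentile for an integer `pct` between 1 and 100."""
    ordered = sorted(values)
    # Integer ceiling of pct% of the sample count, with a floor of rank 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]

def sizing_summary(cpu_demand):
    """Compare three candidate sizing figures for one VM's CPU demand."""
    return {
        "average": sum(cpu_demand) / len(cpu_demand),
        "p95": percentile(cpu_demand, 95),
        "peak": max(cpu_demand),
    }

# Hypothetical vCPU demand samples with one short burst at the end.
demand = [2, 2, 3, 2, 3, 2, 2, 3, 2, 2, 3, 2, 2, 3, 2, 2, 3, 2, 4, 16]
print(sizing_summary(demand))
```

    With this data the average is about 3.1, the 95th percentile is 4, and the peak is 16. Sizing to the peak reserves four times the capacity that covers 95% of the samples.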

    Choose the best hypervisor for your needs. It may be more than one. Monitor all using a single, central console to speed troubleshooting.

    107


    Add value in virtualized environments by discovering underutilized systems. Shut them down and reclaim the capacity.

    108
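
    A simple way to shortlist the underutilized systems mentioned in tweet 108 is to filter on sustained low usage. The function name, the 5% CPU limit, the 30-day observation period, and the inventory data below are all hypothetical; a real review would also check memory, I/O, and business ownership before shutting anything down.

```python
def find_underutilized(vms, cpu_limit=5.0, min_days=30):
    """Return names of VMs whose daily average CPU (%) never reached
    `cpu_limit` over at least `min_days` of observations."""
    return [
        name
        for name, daily_cpu in vms.items()
        if len(daily_cpu) >= min_days and max(daily_cpu) < cpu_limit
    ]

# Hypothetical inventory: VM name -> list of daily average CPU (%).
inventory = {
    "web01": [35.0] * 30,   # busy, keep
    "test07": [1.2] * 30,   # idle for a month, candidate for reclaim
    "new42": [0.5] * 3,     # too little history to judge
}
print(find_underutilized(inventory))  # prints ['test07']
```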

    Virtual servers are the new mainframes. Additional disciplines such as capacity planning need to be employed to manage risk.

    109


    Changes are inevitable. You need a way to keep your service maps current. With virtualization, automation is the only way.

    110


    Virtual server monitoring has to be integrated into the overall monitoring solution to understand business service impact.

    111


    Section VIII

    Cloud Computing

    Many see cloud computing as the panacea for delivering and managing business services. But how do you monitor the availability and performance provided by your cloud vendor? And, if you are creating a private cloud offering, how do you manage the infrastructure?


    Cloud computing is changing the way in which IT is built, delivered, and consumed.

    112


    The application doesn't know it's running in the cloud. Monitor its availability and performance as you would in an on-premises environment.

    113

    Monitor cloud services based on how your users perceive them.

    114


    It does not need to be a major leap to build a private cloud environment; you can make incremental steps starting with what you have now.

    115


    If you are considering private cloud in the future, build in usage metering and billing today.

    116

    Data center sprawl? You can scale using someone else's capacity with public cloud.

    117


    Do worry about what is happening inside the black box (public cloud). Insist on some measurements.

    118


    Public cloud ≠ Virtualization. Google is the best-known example. Private clouds are usually highly virtualized.

    119


    Companies that offer cloud computing are basically service providers. Make sure they provide you with the service you are paying for.

    120


    Trust, but verify. Monitor service levels to ensure you are getting what you paid your cloud provider to deliver.

    121
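
    Verifying a provider's service level, as tweet 121 suggests, can start with something as simple as counting successful probes. This sketch assumes you already collect pass/fail results from your own synthetic checks; the function names, the 99.9% target, and the probe data are illustrative assumptions, not terms from any real contract.

```python
def availability_percent(probe_results):
    """Percentage of probes that succeeded (True means success)."""
    successes = sum(1 for ok in probe_results if ok)
    return 100.0 * successes / len(probe_results)

def meets_sla(probe_results, target_percent=99.9):
    """Compare measured availability with a contractual target."""
    return availability_percent(probe_results) >= target_percent

# Hypothetical month of probes against a cloud-hosted service:
# 997 successes and 3 failures out of 1,000 checks.
probes = [True] * 997 + [False] * 3
print(round(availability_percent(probes), 2), meets_sla(probes))
```

    Here the measured availability is 99.7%, which falls short of the assumed 99.9% target, so the check reports a miss that you can raise with the provider.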


    Section IX

    Getting Started

    If you are reading this book, you are on the right path to improving your IT Operations. Choose one or two ideas from this book that seem achievable, and get them done. Track the return on your investments. Rinse and repeat.


    Event and performance management only has value if it leads to action.

    122

    Continuing with your status quo monitoring will likely fail because of the significant structural changes happening in the IT industry.

    123


    Manage infrastructure holistically. Combine fault, performance, configuration, and IT process automation.

    124


    Integrate security alerts into your operations management console. Is the CPU spike caused by user demand or a hacker?

    125


    To start building an Operations Bridge, consolidate all alerts/events into one place to provide complete visibility of the IT environment.

    126


    Once all events are coming into one place, then deploy technologies to help refine the event stream to highlight causal (actionable) events.

    127
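
    One way to refine a consolidated event stream, in the spirit of tweet 127, is to suppress events on components whose upstream dependency is also failing. The sketch below assumes a simple single-parent dependency map; the function name, component names, and map structure are hypothetical and far simpler than a real topology-based correlation engine.

```python
def causal_events(failing, depends_on):
    """Return the failing components that have no failing dependency.

    `failing` is a list of component names with open events.
    `depends_on` maps a component to the single component it runs on.
    Components with a failing ancestor are treated as symptoms.
    """
    failing_set = set(failing)
    causal = []
    for component in failing:
        parent = depends_on.get(component)
        visited = set()  # guards against cycles in the dependency map
        is_symptom = False
        while parent is not None and parent not in visited:
            if parent in failing_set:
                is_symptom = True
                break
            visited.add(parent)
            parent = depends_on.get(parent)
        if not is_symptom:
            causal.append(component)
    return causal

# Hypothetical topology: an application on a guest on a virtual server.
topology = {"vm-a": "host-1", "vm-b": "host-1", "app-x": "vm-a"}
open_events = ["app-x", "vm-a", "vm-b", "host-1", "vm-z"]
print(causal_events(open_events, topology))  # prints ['host-1', 'vm-z']
```

    Five raw events collapse to two actionable ones: the failed host and an unrelated guest.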


    Don't allow political agendas or technical objections to undermine an Ops Bridge deployment. Other companies have done it successfully.

    128

    Integration is very resource intensive, especially as products evolve. Make your vendors do this for you. Keep them honest.

    129


    What is the true cost of free monitoring tools? Consider the labor cost of configuring & maintaining them. A 3-year time horizon is typical.

    130

    Rip & replace is costly & disruptive. Add a new monitoring solution as an overlay or manager of managers until ready to retire servers.

    131


    During the ITIL v3 Service Strategy phase, consider IT monitoring processes & tool selection from both financial and demand management aspects.

    132


    Monitoring must be included in the cost calculations of developing new applications. Add this into your project budget.

    133

    Operations Bridge staff must help create the IT monitoring solution; don't build in isolation and throw it over the wall.

    134


    Consider hosted monitoring (SaaS)⁵ as a way to get the functionality you need without installing any software or buying any hardware.

    135

    5. SaaS = Software as a Service


    If you can't measure it, you can't manage it. Establish key performance indicators and use the data to drive continuous improvement.

    136

    Embrace proven best practices to reduce risk.

    137


    If you want to know what to monitor, check your system log files. Set thresholds to alert you of issues before they become incidents.

    138

    Have a clear vision of your goal, but implement in manageable, measurable, incremental steps.

    139


    Consolidation is a process. Aim for a single central console and enjoy the incremental cost savings as you reduce duplication of effort.

    140


    About the Authors

    Peter Spielvogel leads the global Product Marketing team for the HP Operations Center (formerly OpenView Operations) product portfolio. Since starting his career twenty-five years ago developing software for financial services companies, he has held marketing, sales, and product management positions at Fortune 500 companies and several startups. Peter is ITIL v3 Foundation certified. He speaks internationally on IT Operations topics, including virtualization, automation, cloud computing, and consolidated operations. His education includes an MBA from the Tuck School of Business at Dartmouth and a BS in Engineering from Princeton University. He is based in Silicon Valley, California.

    Read his blog at www.hp.com/go/ITOpsBlog

    Follow him on Twitter @HPITOps

    Email Peter at [email protected]


    Jon Haworth leads Product Marketing for the Service and Operations Bridge within the HP Operations Center product portfolio. He has twenty-five years of experience working for HP across a variety of roles including consulting, pre-sales, and marketing. Jon has designed and implemented large-scale infrastructure management solutions for a number of Fortune 1000 enterprises. Jon is an early adopter and continued advocate for ITIL, having gained his ITIL v2 Service Manager certification in 1996. He speaks extensively throughout Europe and Asia on the advantages of consolidating IT management. Jon has a BS degree in Computer Science from Manchester University. He is based outside London in the UK.

    Read his blog at www.hp.com/go/ITOpsBlog

    Email Jon at [email protected]


    Sonja Hickey leads Product Marketing for the instrumentation product lines within the HP Operations Center product portfolio. She has twenty years of product marketing, product management, engineering, and consulting experience with privately-held, startup, and Fortune 500 companies. Sonja is ITIL v3 Foundation certified. She speaks frequently throughout the U.S. about IT management best practices. Sonja's education includes an MBA from the University of Chicago GSB and BS and MS degrees in Engineering from the University of Illinois at Urbana-Champaign. She is based near Chicago, Illinois.

    Read her blog at www.hp.com/go/ITOpsBlog

    Follow her on Twitter @HPITOps

    Email Sonja at [email protected]

    The authors are donating all their royalties to the HP Foundation, which funds global disaster relief efforts. The publisher is matching their contribution dollar for dollar.


    Other Books in the THiNKaha Series

    The THiNKaha book series is for thinking adults who lack the time or desire to read long books, but want to improve themselves with knowledge of the most up-to-date subjects. THiNKaha is a leader in timely, cutting-edge books and mobile applications from relevant experts that provide valuable information in a fun, Twitter-brief format for a fast-paced world.

    They are available online at http://thinkaha.com or at other online and physical bookstores.

    1. #BOOK TITLE tweet Book01: 140 Bite-Sized Ideas for Compelling Article, Book, and Event Titles by Roger C. Parker

    2. #COACHING tweet Book01: 140 Bite-Sized Insights On Making A Difference Through Executive Coaching by Sterling Lanier

    3. #CONTENT MARKETING tweet Book01: 140 Bite-Sized Ideas to Create and Market Compelling Content by Ambal Balakrishnan

    4. #CORPORATE CULTURE tweet Book01: 140 Bite-Sized Ideas to Help You Create a High Performing, Values Aligned Workplace that Employees LOVE by S. Chris Edmonds

    5. #CROWDSOURCING tweet Book01: 140 Bite-Sized Ideas to Tap into the Wisdom of the Crowd by Kiruba Shankar and Mitchell Levy

    6. #DEATHtweet Book01: A Well-Lived Life through 140 Perspectives on Death and Its Teachings by Timothy Tosta

    7. #DEATH tweet Book02: 140 Perspectives on Being a Supportive Witness to the End of Life by Timothy Tosta

    8. #DIVERSITYtweet Book01: Embracing the Growing Diversity in Our World by Deepika Bajaj

    9. #DREAMtweet Book01: Inspirational Nuggets of Wisdom from a Rock and Roll Guru to Help You Live Your Dreams by Joe Heuer

    10. #ENTRY LEVEL tweet Book02: Inspiration for New Professionals by Christine Ruff and Lori Ruff

    11. #ENTRYLEVELtweet Book01: Taking Your Career from Classroom to Cubicle by Heather R. Huhman

    12. #IT OPERATIONS MANAGEMENT tweet Book01: Managing Your IT Infrastructure in The Age of Complexity by Peter Spielvogel, Jon Haworth, Sonja Hickey

    13. #JOBSEARCHtweet Book01: 140 Job Search Nuggets for Managing Your Career and Landing Your Dream Job by Barbara Safani


    14. #LEADERSHIPtweet Book01: 140 Bite-Sized Ideas to Help You Become the Leader You Were Born to Be by Kevin Eikenberry

    15. #LEAN SIX SIGMA tweet Book01: Business Process Excellence for the Millennium by Dr. Shree R. Nanguneri

    16. #LEAN STARTUP tweet Book01: 140 Insights for Building a Lean Startup! by Seymour Duncker

    17. #MILLENNIALtweet Book01: 140 Bite-Sized Ideas for Managing the Millennials by Alexandra Levit

    18. #MOJOtweet: 140 Bite-Sized Ideas on How to Get and Keep Your Mojo by Marshall Goldsmith

    19. #OPEN TEXTBOOK tweet Book01: Driving the Awareness and Adoption of Open Textbooks by Sharyn Fitzpatrick

    20. #PARTNER tweet Book01: 140 Bite-Sized Ideas for Succeeding in Your Partnerships by Chaitra Vedullapalli

    21. #PRESENTATION tweet Book01: 140 Ways to Present with Impact by Wayne Turmel

    22. #PRIVACY tweet Book01: Addressing Privacy Concerns in the Day of Social Media by Lori Ruff

    23. #PROJECT MANAGEMENT tweet Book01: 140 Powerful Bite-Sized Insights on Managing Projects by Guy Ralfe and Himanshu Jhamb

    24. #QUALITYtweet Book01: 140 Bite-Sized Ideas to Deliver Quality in Every Project by Tanmay Vora

    25. #SOCIAL MEDIA PR tweet Book01: 140 Bite-Sized Ideas for Social Media Engagement by Janet Fouts

    26. #SOCIALMEDIA NONPROFIT tweet Book01: 140 Bite-Sized Ideas for Nonprofit Social Media Engagement by Janet Fouts with Beth Kanter

    27. #SPORTS tweet Book01: What I Learned from Coaches About Sports and Life by Ronnie Lott with Keith Potter

    28. #STANDARDS tweet Book01: 140 Bite-Sized Ideas for Winning the Industry Standards Game by Karen Bartleson

    29. #TEAMWORK tweet Book01: Lessons for Leading Organizational Teams to Success 140 Powerful Bite-Sized Insights on Lessons for Leading Teams to Success by Caroline G. Nicholl

    30. #THINKtweet Book01: Bite-Sized Lessons for a Fast Paced World by Rajesh Setty
