A blueprint for smarter storage management
Optimizing the storage environment with automation and analytics

IBM Global Technology Services Thought Leadership White Paper, October 2011



Contents

• Introduction
• The basics of optimal storage management
• Making smarter storage management a reality
• The bottom line: Labor and infrastructure savings
• IBM can help

Introduction

The skyrocketing demand for data storage capacity is a major driver of escalating IT expense. The costs associated with storage growth include not just additional or upgraded devices, but also data center floor space, electricity, HVAC and ongoing system management.

Storage capacity continues to grow exponentially, even though most organizations, due to over-provisioning and poor visibility into their storage environment, use no more than 30 to 40 percent of their available storage.

There are several factors contributing to this paradox. One is the perception that “storage is cheap,” which means storage requests are routinely honored without challenge. Another is that individuals responsible for storage performance or access to data are sensitive to performance issues and reluctant to risk degraded service for lack of high-tier storage.

Finally, a common contributor to the usage paradox is that, because the provisioning process for storage can be labor intensive and time consuming, application owners and other consumers of storage tend to over-request. The thinking seems to be that they can save lead time and paperwork by requesting more storage less frequently and letting it sit until needed.

Big data: The big challenge

Adding to the storage dilemma is the exponential growth in data that must be stored. Every day, 2.5 quintillion bytes of new data are generated. In fact, 90 percent of the data in the world today has been created in the last two years alone. This “big data” is being generated by billions of devices, from sensors used to gather climate information to GPS chips in smart phones, as well as posts to social media sites, digital pictures and videos posted online, Internet text and documents, medical records, and transaction records such as online purchases and cell phone call data.

Big data spans three dimensions: variety, velocity and volume.

Variety: Big data extends beyond structured data to include unstructured data of all varieties: text, audio, video, click streams, log files and more.

Velocity: Often time sensitive, big data must be used as it is streaming into the enterprise in order to maximize its value to the business.

Volume: Big data comes in one size: Huge. Enterprises are awash with data, easily amassing terabytes and even petabytes of information.


A word about tiering

Not all data is created equal: There are many types in a typical IT environment, and each type’s value fluctuates during its lifecycle. For example, email is highly critical initially, but its value often drops rapidly. Project data files may be less critical at any given moment but remain important longer.

Tiering—an underlying principle of information lifecycle management—is a storage networking method where data is stored on various types of media based on performance, availability and recovery requirements (see Figure 2). In general, newer data and data that must be accessed more frequently is stored on faster but more expensive storage media, while less critical data is stored on less expensive but slower media. Data intended for restoration in the event of data loss or corruption could be stored locally—for fast recovery—while data stored only for regulatory purposes could be archived to lower-cost media. A tiered storage infrastructure can consist of as few as two to as many as five or six tiers.
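The placement logic described above can be sketched as a small policy function. This is an illustrative sketch only: the `choose_tier` name, the thresholds and the tier numbering are invented for the example and are not taken from any IBM product.

```python
# Hypothetical tiering policy: map a dataset to a storage tier by age,
# access rate and purpose. Thresholds are illustrative, not prescriptive.
def choose_tier(age_days: int, reads_per_day: float, regulatory_only: bool) -> int:
    if regulatory_only:
        return 4            # kept only for compliance: archive to cheapest media
    if age_days <= 30 and reads_per_day >= 100:
        return 0            # newest, hottest data on solid state drives
    if reads_per_day >= 10:
        return 1            # frequently accessed data on fast disk
    if reads_per_day >= 1:
        return 2            # moderately active data on midrange disk
    return 3                # cold data on slow, inexpensive disk

print(choose_tier(age_days=5, reads_per_day=500, regulatory_only=False))   # 0
print(choose_tier(age_days=400, reads_per_day=0.1, regulatory_only=True))  # 4
```

In a real environment the inputs would come from monitoring data rather than being passed in by hand, but the principle is the same: recovery and performance requirements, not habit, decide the tier.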

The basics of optimal storage management

As the demands on IT continue to grow, organizations face significant challenges around storage growth, continued cost pressures and the complexity of available technologies. Best practices for storage optimization begin with the following principles:

• Store only what is needed and only for as long as it needs to be stored. To accomplish this, many organizations apply data reduction technologies such as data compression and de-duplication, along with demand management processes.

• Get more out of existing storage infrastructure through virtualization, thin provisioning, consolidation and proper monitoring.
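One of the data reduction technologies mentioned above, de-duplication, can be made concrete with a toy content-addressed store in which identical blocks are physically written only once. The `DedupStore` class and its interface are hypothetical, shown purely to illustrate the saving.

```python
import hashlib

class DedupStore:
    """Toy content-addressed block store: identical blocks are kept once."""
    def __init__(self):
        self.blocks = {}          # digest -> block bytes (one physical copy)
        self.logical_bytes = 0    # what applications believe they wrote

    def write(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.logical_bytes += len(data)
        self.blocks.setdefault(digest, data)   # store only the first copy
        return digest

    def physical_bytes(self) -> int:
        return sum(len(b) for b in self.blocks.values())

store = DedupStore()
for _ in range(10):                 # ten identical 1 KB blocks...
    store.write(b"x" * 1024)
print(store.logical_bytes)          # 10240 logical bytes
print(store.physical_bytes())       # 1024 physical bytes: 90 percent saved
```

Real de-duplication engines operate on variable-length chunks and handle hash collisions and reference counting, but the economics are as the sketch shows: redundant copies consume logical, not physical, capacity.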

[Figure 1, “The information explosion meets budget reality,” charts data growth from gigabytes toward zettabytes between 2000 and 2015 for instrumented, interconnected, intelligent systems, noting that:

• Information is doubling every 18 to 24 months
• Storage requirements are growing 20 to 40 percent per year
• Storage budgets rose only 1 to 5 percent in 2010
• Storage utilization remains below 50 percent]

Figure 1: Smarter systems (instrumented, interconnected, intelligent) are creating a data explosion. The digital universe is projected to grow from 1.8 zettabytes in 2011 to 72 zettabytes by 2015.

In the tiered model, cost, performance and availability all decrease from Tier 0 down to Tier 4:

Tier 0 (solid state drives only): Ultra-high performance. Meets QoS for high-end, mission-critical applications.

Tier 1: High performance and/or availability. Drives up utilization of high-end storage subsystems while still maintaining performance QoS objectives.

Tier 2: Medium performance and/or availability. Revenue-generating applications. Meets QoS for non-mission-critical applications.

Tier 3: Low performance and/or availability. Non-mission-critical applications.

Tier 4: Archival, long-term retention; backup.

For low-capacity requirements, smaller, less powerful devices may meet a tier definition.

Figure 2: Tiering, an underlying principle of information lifecycle management, is a storage networking method where data is stored on various types of media based on performance, availability and recovery requirements.


• Move data to the “right” place, and do so on an ongoing basis. Data often loses value over its lifecycle, sometimes quickly, creating opportunities to optimize by moving data from expensive disk to lower-cost disk (see Figure 3). Even active data may have a requirements pattern that changes periodically (for example, the data associated with some quarter-close applications may only have high performance requirements for a few weeks each quarter). The ability to move such data up and down the disk hierarchy as necessary allows for a storage hierarchy that better supports an organization’s data requirements.
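A periodic re-tiering pass of the kind described above might look like the following sketch. The `retier` function, its field names and its promotion/demotion thresholds are all invented for illustration; production tools would act on measured workload history rather than a single access-rate figure.

```python
# Hypothetical rebalancing pass: demote datasets whose access rate has
# dropped, promote those that have heated up. One tier level per pass.
def retier(datasets, hot_threshold=50.0, cold_threshold=1.0):
    """Return a list of (name, old_tier, new_tier) moves for one pass."""
    moves = []
    for d in datasets:
        tier = d["tier"]
        if d["reads_per_day"] >= hot_threshold and tier > 0:
            new_tier = tier - 1                    # promote toward Tier 0
        elif d["reads_per_day"] < cold_threshold and tier < 3:
            new_tier = tier + 1                    # demote toward Tier 3
        else:
            continue                               # placement already fits
        moves.append((d["name"], tier, new_tier))
    return moves

quarter_close = {"name": "qtr-close-db", "tier": 2, "reads_per_day": 400.0}
old_logs = {"name": "old-logs", "tier": 1, "reads_per_day": 0.2}
print(retier([quarter_close, old_logs]))
# [('qtr-close-db', 2, 1), ('old-logs', 1, 2)]
```

Run on a schedule, a pass like this captures the quarter-close pattern in the text: the database is promoted for the busy weeks and demoted again once access cools.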

[Figure 3 contrasts a prior storage pyramid with a new one across Tier 0 through Tier 3. Cost per gigabyte rises toward the top of the pyramid, and the rebalanced distribution is roughly 1 to 5 percent on Tier 0, 15 to 20 percent on Tier 1, 20 to 25 percent on Tier 2 and 50 to 60 percent on Tier 3.]

Figure 3: Most companies, due to over-provisioning and poor visibility into their storage environment, have too much data on Tier 1 storage. A more balanced distribution across all tiers can improve application performance and data availability and help lower storage portions of the IT budget.

Using these principles as a foundation for storage management practices, organizations can be prepared to take a smarter approach to storage management, one that addresses exploding storage growth and costs.

What follows is a discussion of IBM’s actionable approach to smarter storage management.

Smarter storage management:

• Reduces complexity while preserving storage infrastructure flexibility

• Governs both supply and demand to minimize custom solutions and reactive work, and drives this governance through service automation

• Uses analytics to infuse intelligence into tools and automation for workflow, migration and provisioning to achieve operational efficiency

• Frees up staff to focus on more critical projects

Making smarter storage management a reality

IBM’s smarter approach to storage management includes a set of tools, services, and software and hardware technology to help clients realize cost-saving opportunities in storage. By pulling together virtualization, a storage service catalog driven by business policy, workflow automation and IBM Research-developed solutions, IBM can help clients optimize the storage environment.

IBM’s smarter storage management approach comprises three essential elements for making it a reality within a storage environment:

Essential element #1: Create a responsive infrastructure across multiple vendor storage assets and tiers of storage. A responsive infrastructure reduces complexity and lowers overall costs of storage while preserving flexibility. Storage virtualization improves the utilization and efficiency of the storage hardware resources. Optimizing the environment through virtualization


generally results in fewer storage components that need to be managed, making it easier to monitor and protect critical data. A virtualization effort in the storage environment can decrease complexity and free up resources, as well as reduce costs.

In a virtualized storage environment, multiple storage devices function as a “virtual” single storage unit, making tasks such as provisioning or placement, application migration, tier migration, replication and archiving easier and faster. Storage assets that previously had no common interface can be used interchangeably. Among other things, that interface enables data to be moved to less expensive tiers of heterogeneous storage in the virtualized storage environment as the data ages, without interruption to the business.
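The single-pool abstraction can be sketched as a class that hides which physical array backs each volume. The `VirtualPool` class, the array names and the most-free-space placement heuristic are all hypothetical, standing in for what a real virtualization layer does across vendor hardware.

```python
class VirtualPool:
    """Toy storage virtualization layer: several backend arrays appear
    as one pool, and callers never learn which device backs a volume."""
    def __init__(self, arrays):
        self.arrays = dict(arrays)   # array name -> free capacity in GB
        self.volumes = {}            # volume name -> backing array

    def provision(self, name: str, size_gb: int) -> str:
        # simple heuristic: place on the array with the most free space
        backing = max(self.arrays, key=self.arrays.get)
        if self.arrays[backing] < size_gb:
            raise RuntimeError("pool exhausted")
        self.arrays[backing] -= size_gb
        self.volumes[name] = backing
        return backing

# Two arrays from different vendors presented as one pool.
pool = VirtualPool({"vendorA-array": 500, "vendorB-array": 800})
print(pool.provision("app-vol-1", 300))   # vendorB-array
```

Because the caller only ever sees the pool, migrating a volume between arrays (or vendors) becomes an internal operation, which is what makes the non-disruptive tier moves described in the text possible.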

Getting more utilization from the existing storage assets is one of the best ways to manage rapid information growth and big data more effectively. Improving the utilization of existing storage is preferable to adding physical storage devices, because in addition to saving the hardware costs, this approach doesn’t require more floor space and associated data center and energy costs. IBM storage virtualization can improve utilization by up to 30 percent across both IBM and non-IBM storage, while improving administrator productivity.

Over the next decade:

• The number of servers (virtual and physical) worldwide will grow by a factor of 10.

• The amount of information managed by enterprise data centers will grow by a factor of 50.

• The number of files the data center will have to deal with will grow by a factor of 75 or more.

Meanwhile, the number of IT professionals in the world will grow by a factor of less than 1.5.¹

IBM’s approach to virtualization is:

• Vendor neutral: It provides integration and a single point of control for more than 120 multi-vendor storage systems.

• Reliable: It uses standards and repeatable processes with the latest IBM virtualization technology and products.

• Proven to deliver cost savings: In one example, a virtualization effort for an IBM client reduced annual energy consumption by 3,565,320 kilowatt hours; recovered 1,700 square feet of floor space; produced annual energy savings of US$320,878; and reduced annual maintenance costs by 57 percent.

Essential element #2: Standardize storage usage and process holistically for all data. Improving the way storage is used begins with changing the behavior around how it is requested. Standardization minimizes reactive work and custom solutions and lays the foundation for workflow automation. Standardization, implemented through a storage service catalog, establishes a set of standards that reduces manual intervention and decision making (which often results in data being placed on a higher tier of storage than needed). Standardization includes policies for correct size, initial class of service and management over time. It also addresses the storage request process, streamlining it and continuously driving these standards and policies into the request and provisioning process.

IBM’s patent-pending intelligent storage service catalog (ISSC) promotes more efficient storage allocation and governance (“supply and demand”) by establishing standards for storage consumption that can be used to optimize provisioning, backup, replication and archiving.

Simplifying the request by asking the right business-oriented questions up front in the process, and using established standards for the rest, helps drive more cost-effective data management over time. End users are no longer asked to specify the many parameters associated with each storage request: array, disk size and type, RAID configuration, short-stroked drives and so on.


Instead of asking the storage user to define the specific requirements, the questions are structured to drive the conversation toward purpose: What are you trying to accomplish? What kind of data are you creating? This is easy for the requesters, because they know if they are creating, for instance, emails, video files, images, user documents, transactional databases or development code. They require storage for the purpose of housing data. That’s important, because each client has unique business requirements for each type of data; but all of those data types are relatively standard across any business.

By defining and codifying business value and requirements for data once, then mapping these requirements to infrastructure, data types can be used over and over to properly request storage and manage storage demand.
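The define-once, reuse-everywhere idea can be sketched as a catalog mapping business data types to placement policies. The `CATALOG` entries, tiers, retention periods and the `request_storage` helper are invented for illustration and do not reflect the contents of IBM’s actual intelligent storage service catalog.

```python
# Hypothetical service-catalog entries: each business data type is mapped
# once to a tier and policy set, then reused for every subsequent request.
CATALOG = {
    "email":          {"tier": 1, "retention_days": 365,  "replicate": True},
    "video":          {"tier": 2, "retention_days": 730,  "replicate": False},
    "transaction-db": {"tier": 0, "retention_days": 2555, "replicate": True},
    "dev-code":       {"tier": 2, "retention_days": 1095, "replicate": False},
}

def request_storage(data_type: str, size_gb: int) -> dict:
    """Turn a purpose-driven request into a full provisioning spec."""
    policy = CATALOG[data_type]        # defined once, applied automatically
    return {"size_gb": size_gb, **policy}

print(request_storage("email", 200))
# {'size_gb': 200, 'tier': 1, 'retention_days': 365, 'replicate': True}
```

The requester supplies only the data type and quantity; tier, retention and replication fall out of policy, which is precisely how the catalog removes per-request decision making.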

The intelligent storage service catalog:

• Optimizes and simplifies storage requests to help reduce over-provisioning and the need for highly skilled personnel

• Enables more consistent storage processes and governance by defining standards and policies

• Offers policy-based storage management to enable and enhance information lifecycle management

• Lays a foundation for automation

The concept of defining policies and parameters once, then reusing them for all projects instead of redefining them at the beginning of every request for service, is at the core of IBM’s smarter storage management. Utilization of existing storage has been shown to increase as much as 50 percent with correct sizing and placement. In addition, provisioning is more efficient and delivers requested storage in less time than traditional processes, reducing or eliminating the tendency to over-provision.

Because adding storage is perceived to be relatively affordable as part of an overall IT budget, the tendency of vendors has been to propose Tier 1 solutions even when they are not specifically necessary. In these cases, migration to lower tiers (rebalancing) is sometimes difficult. Use of a storage service catalog helps reduce or eliminate over-provisioning at the Tier 1 level by using preset standards that permit the use of less expensive tiers during the request stage.

Figure 4 illustrates this approach as used in a smarter storage environment. During an ISSC engagement, data is categorized into specific types and dropped into the correct “bucket.” Policies and other criteria are used to size and place the data at the correct tier. Later in the lifecycle of the data, it might be archived, saved to tape or deleted from storage entirely as its importance declines with age.

Figure 4: Predetermined policies for the management of data are key to the ILM concept and can increase storage utilization up to 50 percent.

[Figure 4 depicts policies, analytics and automation used first to size and place data onto the correct tiers (Tier 1, Tier 2, Tier 3, archive and tape) at request time, and again to move data as its value changes.]


Developing the intelligent storage service catalog

To develop the intelligent storage service catalog, IBM works with the client’s storage managers, architects and subject matter experts to:

• Replace manual allocation decisions with standardized policies. The IBM team develops and enables the use of predefined requirements and architectures with respect to storage demand and data management. The goal is to make it possible for the customer to “define once, execute repeatedly.”

• Break down and evaluate the value of the data types to the business. A representative set of applications is broken down into holistic data types. The team captures business requirements or key performance indicators for these data types. Looking at applications holistically through common types of data, and defining requirements formally once, are unique to the IBM approach.

• Define a matching catalog of services and technologies. The client works with the IBM team to design business request logic to change the way storage is requested, making it purpose driven.

• Simplify user requests so that type and quantity of data are the only inputs provided by the requester. This framework allows the IBM team to standardize storage provisioning and build a business valuation of data into data management from inception until disposal.

Essential element #3: Intelligently automate data movement and decision making across the data center. Automation, with respect to storage management, can occur in several ways. A data type-based request results in streamlined and purpose-driven storage request workflow automation. Intelligent automation assesses the existing environment to ensure the storage environment grows in a workload-optimized manner. This allows a business to achieve operational efficiency and to reduce labor cost and risk. IBM is focused on all of these aspects of automation.

For intelligent automation, IBM’s capabilities leverage IBM Research-driven analytics for policy creation and automation. Beginning with the data types defined in the catalog, user input is simplified. At this point, the IBM team has determined the correct size and tier on which to place the data.

In the next step, and this is another area in which IBM capabilities differ from other storage management models, the intelligent storage placement manager (ISPM) uses historical and current performance data to automatically configure and provision on the most cost- or performance-effective storage devices. ISPM initially analyzes the disk pools of a respective tier and intelligently and automatically provisions storage on the most effective storage devices. ISPM can provision, de-provision, create volumes and virtual disks, and perform host mapping.
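The selection step can be illustrated with a generic placement function in the spirit of what the text describes: among pools of the requested tier, pick the cheapest one whose measured headroom still meets the workload. The `place` function, the pool records and every field name are invented for the example; this is not ISPM’s actual logic or interface.

```python
# Hypothetical analytics-driven placement: filter pools by tier, free
# capacity and measured IOPS headroom, then prefer the cheapest survivor.
def place(pools, tier, size_gb, iops_needed):
    candidates = [
        p for p in pools
        if p["tier"] == tier
        and p["free_gb"] >= size_gb
        and p["iops_headroom"] >= iops_needed
    ]
    if not candidates:
        raise RuntimeError("no pool satisfies the request")
    # cheapest pool that still meets the performance requirement wins
    return min(candidates, key=lambda p: p["cost_per_gb"])["name"]

pools = [
    {"name": "pool-a", "tier": 1, "free_gb": 900, "iops_headroom": 5000, "cost_per_gb": 0.9},
    {"name": "pool-b", "tier": 1, "free_gb": 400, "iops_headroom": 8000, "cost_per_gb": 0.6},
]
print(place(pools, tier=1, size_gb=300, iops_needed=4000))   # pool-b
```

The point of feeding historical performance data into placement is visible even in this sketch: without the headroom check, the cheapest pool would always win regardless of whether it could sustain the workload.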

Another IBM Research-developed technology, the intelligent storage tier manager (ISTM), recommends the best migration targets and windows over time as a function of workload and access requirements. ISTM analyzes and balances performance and cost throughout the life of data. For instance, highly accessed data is “cached” on high performance storage when needed, while rarely accessed or old data is archived on cheaper storage as it loses value to the business.

These patent-pending analytic tools create an environment that is responsive to the business value of the data. Data placement is workload optimized not only when the data is new but throughout its lifecycle, according to the business rules associated with the data.


The bottom line: Labor and infrastructure savings

The implementation costs for a storage project vary widely depending on the amount of automation involved at the coordination and execution layers. In IBM’s own experience with both internal projects and client deployments, standardization alone can reduce the effort by 50 percent. Automation can reduce execution hours by as much as 90 percent.

In a traditional storage implementation scenario, the storage architects and the requestors (typically application architects) meet several times to capture all of the business and storage requirements for a given project. By contrast, under smarter storage management, storage requirements, services and technologies are pre-defined, allowing the application owners to simply select data types from the catalog; the rest is driven to standardized solutions downstream.

1. Create a responsive, business-oriented infrastructure.
Goal: Lower overall cost of storage while preserving flexibility.
How:
• Implement a virtualized multi-tier infrastructure.
• Deploy thin provisioning and de-duplication.

2. Standardize storage usage and process.
Goal: Minimize reactive work and custom solutions while freeing up highly skilled resources.
How:
• Standardize the storage request process to reduce planning and delivery time.
• Standardize and operationalize provisioning by data type.
• Correctly size and place data (on the right tier) from the start.

3. Automate data movement and decision making.
Goal: Achieve operational efficiency, and reduce labor cost and risk through intelligent automation.
How:
• Automate storage request workflow.
• Automate storage provisioning and workload analysis.
• Automate tier movement within a storage array.
• Automate policy-driven tier movement across arrays.

Figure 5: The dashed line around the infrastructure arrow (left) and standardization arrow (center) indicates that these steps can happen either simultaneously or serially as dictated by the needs of the organization. Together they lay the foundation for intelligent automation, as indicated in the arrow to the right.


In a recently conducted two-stage pilot for a client, IBM used research-developed tools to automatically rebalance five terabytes of data based on administrator policies. As a result, a two-to-three-day process was reduced to two to three hours. In stage two, IBM automatically moved 57 terabytes of data overnight with no failures. This right-tiering initiative currently produces US$21,000 per month in savings by using lower-tiered storage instead of higher tiers. Based on this client’s enterprise total storage volume of 600 terabytes, the savings could extend to US$2.6 million a year at full implementation.

IBM can help

IBM has a long history in information management. Today, IBM continues to be a leader in information lifecycle management, offering comprehensive solutions that drive business results and encompass complementary hardware, software and services. IBM storage systems, with enterprise-class disk and tape storage tiers, offer best-in-class virtualization and drive increased return on investment. IBM Research is developing significant new ILM tools that will give IBM an end-to-end approach (hardware, software and services solutions) unequaled in the marketplace. Additionally, IBM has patent-pending ILM accelerators that deliver proven, repeatable techniques for optimizing storage and information management environments. Finally, IBM software has the capacity to enable end-to-end management featuring advanced virtualization, orchestration, automation and robust information management capabilities.

In 2009, the National Football League (NFL), the largest professional American football league in the world, approached IBM to help reduce its overall IT infrastructure expense while enhancing storage capabilities.

The IBM storage and data services team provided the intelligent storage service catalog (ISSC) solution. The IBM team performed a business impact analysis, built the request logic behind the process and then helped to deliver a storage catalog framework that streamlines the request process from NFL departments and teams. Once integrated into a service request tool, the ISSC can save many hours of time for the IT organization when responding to storage requests and will also enable the IT department to recover its storage costs through an automated charge-back system. The standard practices implemented by the IBM team can be adhered to regardless of the storage vendor.

The storage catalog service and the cost-recovery system, designed by IBM, will allow the league’s management team to better understand and monitor the volume and worth of the storage services provided to the league’s departments and customers. The cost-recovery tool quantifies the expense of the storage services provided by the NFL’s IT department, allocating those expenses back to the groups that use storage. The league’s catalog application will be integrated into a service request management tool and is being used as a model for additional cost-recovery initiatives.
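A cost-recovery calculation of the kind described can be sketched in a few lines: charge each consuming group for the capacity it uses, at a rate that depends on the tier. The `monthly_chargeback` function and the per-tier rates are invented for illustration, not drawn from the NFL engagement or any IBM tool.

```python
# Illustrative chargeback rates per GB per month by tier (invented values).
RATE_PER_GB_MONTH = {0: 1.25, 1: 0.5, 2: 0.25, 3: 0.125}

def monthly_chargeback(usage):
    """usage: list of (group, tier, gb_used) -> {group: monthly charge}."""
    bills = {}
    for group, tier, gb in usage:
        bills[group] = bills.get(group, 0.0) + gb * RATE_PER_GB_MONTH[tier]
    return bills

usage = [("marketing", 1, 500), ("marketing", 3, 2000), ("legal", 0, 100)]
print(monthly_chargeback(usage))
# {'marketing': 500.0, 'legal': 125.0}
```

Tier-sensitive rates are what make chargeback an incentive mechanism: in the example, marketing holds four times legal’s capacity yet pays only four times as much, because most of its data sits on the cheapest tier.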

As an added benefit, IBM elevated the role of IT within the league by providing tangible benefits, cost savings opportunities and enhanced user services, as well as the charge-back system for storage usage.


Sprint—owner and operator of two wireless communications networks and an internet backbone—wanted to develop an information lifecycle framework that included categorization and retention capabilities. In doing so, Sprint needed a methodology and roadmap for information storage and deployment—in short, a system that would help it meet regulatory, legal and business standards of information retention. Information accessibility and availability were important, as were the framework’s alignment with overall business operations and strategic goals.

Sprint’s existing hardware and software technologies were insufficient for the job. Outdated hardware limited the company’s ability to store data efficiently. Software in use proved inadequate to manage the storage.

Sprint partnered with IBM to develop a solution. The IBM team leveraged best-of-breed practices and methodologies available through IBM Information Lifecycle Management (ILM) services. In doing so, the consulting capabilities of ILM Lifecycle Management—Integrated ILM Services and Information Lifecycle Management—Archiving and Retention Services proved valuable.

IBM helped Sprint identify potential storage efficiency improvements, proposing a storage architecture built around a set of service classes and storage tiers to be enabled by recommended key technologies and tools. IBM consultants developed a data and information lifecycle framework that includes the categorization of information and the time period for retention. IBM also built a Data-Information-Functionality-Usability matrix for network data and information; architected a high-level storage infrastructure; provided a methodology and roadmap for information retention and data lifecycle management; and generated a business case that highlighted the potential financial impact of implementing IBM’s recommendations.

Partnering with IBM, Sprint has begun revamping its information lifecycle architecture to meet regulatory standards and business goals. In doing so, the company hopes to realize an ROI of up to 117 percent.


IBM is of course experiencing the same big data challenges as its customers, with similar increased demands on existing storage infrastructure. Several initiatives are currently in progress to bring costs into better alignment with the total IT spend. First, an archiving initiative was undertaken to address Tier 1 storage growth of 30 percent per year. A highly scalable and reliable file system service was developed to enable archiving to the lowest tier, resulting in estimated savings of US$2.1 million a year. Second, consolidation of backup and block storage has helped reduce costs, increase utilization, and cut provisioning time from months to days, for estimated annual savings of US$50 million.

Future initiatives include enablement of self-service provision-ing for the storage cloud and automated policy-based tiering for email and other data.

For more information

To learn more about how IBM can help you derive maximum business value through storage optimization, please contact your IBM marketing representative or IBM Business Partner, or visit the following website: ibm.com/services


© Copyright IBM Corporation 2011

IBM Global Services New Orchard Road Armonk, NY 10589 U.S.A.

Produced in the United States of America October 2011

IBM, the IBM logo, and ibm.com are trademarks of International Business Machines Corporation in the United States, other countries or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the web at “Copyright and trademark information” at ibm.com/legal/copytrade.shtml

Other company, product or service names may be trademarks or service marks of others.

References in this publication to IBM products and services do not imply that IBM intends to make them available in all countries in which IBM operates.

1 IDC Digital Universe Study, sponsored by EMC, June 2011

Please Recycle

SDW03023-USEN-01