
Business white paper

Unlock Big Data
HP Big Data Infrastructure Consulting


Table of contents

Executive summary

The importance of Big Data

Big Data and the potential for big changes

Big Data and your infrastructure

A strategy that covers both

HP HAVEn Platform: extracting value from Big Data

Conclusion

HP Technology Services Consulting—a perfect fit for Big Data


A strategy to unlock Big Data

Unlocking the potential of Big Data requires a well-crafted solutions strategy based on the merits of both analytics and infrastructure. We can help you devise the right strategies to seize the big opportunities with Big Data.

Executive summary

To unlock the potential of Big Data, you need a strategy that considers more than the value of analytics; it’s also about your infrastructure. When you bring your business and analytics plans together with your infrastructure blueprint, you’re in a better position to seize Big Data opportunities. You can start your Big Data planning with either your business or technology teams. If IT is driving your Big Data solution, let that be your starting point and work the business scenarios into your strategy. If one or more business units are driving Big Data initiatives, bring IT into the solution development process to help ensure that the technical strategy aligns with your business-driven strategy.

The importance of Big Data

Despite recent headlines and the introduction of cool new technologies, Big Data challenges are not new. Nearly a half century ago, Cold War tensions forced the U.S. Air Force to use electronic means to capture and locate potentially hostile surface-to-air missile sites. Specially equipped planes were tasked to fly through suspect regions, collecting and recording data at subsecond intervals up to a maximum tape capacity of 10 MB. The tapes were then processed by machines with only 64K–128K of RAM, a job that took from two to six hours. Although painfully slow by today’s computing standards, the surveillance data helped keep planes out of harm’s way and saved the lives of countless crew members.

Today, data and information are still at the core of any organization. The dramatic increase in the volume, velocity, variety, and vulnerability of information is transforming businesses and governments. The Big Data challenge has reached enormous scale as about 2.5 quintillion bytes of data are generated around the world each day. Fortunately, advances in technology are making it possible not only to process massive amounts of data, but also to handle data in structured, semi-structured, and unstructured formats. This capability is allowing organizations to expand their businesses, create more efficient operations, get deeper insights into customer requirements, and even strengthen military intelligence. But, like any other major IT initiative you tackle, harnessing Big Data takes time, planning, and a winning strategy.

According to research commissioned on behalf of HP, nearly 60 percent of companies surveyed will spend at least 10 percent of their innovation budget on Big Data this year.1 The study also found, however, that more than one in three organizations has failed with a Big Data initiative. HP’s enhanced portfolio delivers the necessary services and solutions to facilitate the successful implementation of these initiatives, and to enable enterprises to handle the growing volume, variety, velocity, and vulnerability of data that can cause these initiatives to fail.

1 “Big Data and Cloud,” Coleman Parkes Research, Ltd., May 2013.


Figure 1. Big Data ROI formula

ROI* = Data value / Total cost

Where:

Intelligent information insight = volume & variety × depth of analytics × # of users

Data value = intelligent information insight / time to value

Total cost = CAPEX + OPEX

* HP has services that can help you determine your return on investment (ROI).


Big Data and the potential for big changes

As previously mentioned, volume, variety, and velocity are among the ingredients you need to form a successful Big Data solution. You need the ability to process massive quantities of different types of data, and the data processing must be lightning fast. But there’s also a fourth ingredient that’s vitally important: value. The value of Big Data is realized only when harvested data offers deep insights into your business and decisions can be made using this data. So many organizations are advancing Big Data initiatives because they see the potential for big changes in their organizations.

To reap Big Data’s rewards, the data must be analyzed. This involves the discovery and recognition of data patterns or trends and taking action on those patterns. Talented professionals mine the data, determine patterns, and apply “what-if” type calculations. This could give a business manager, for example, predictive actionable insights that might be useful for making better, more informed decisions.

Another example of making use of patterns and insights can be drawn from the healthcare industry. A doctor might want to monitor the cell phone usage patterns of a patient with irregular heartbeats to determine if there’s a correlation, or perhaps even cause and effect, between the patient’s usage patterns and the onset of irregular heartbeats.

Similar monitoring has been ongoing in universities to predict the likely success of a student based on their grades and use of libraries and labs. The universities can then remind students of learning opportunities and offer them options such as tutoring.

Internet searches can reveal numerous opportunities that have been uncovered by processing massive amounts of data. The important part is determining the value you can gain based on your use cases and Big Data solutions.

Use cases are essential to your Big Data planning

You have the right technology and the right people in place; now it’s time to come up with the right plan. Your plan specifies what data will be analyzed and how it will be applied to your organization. In short, you need to define a use case that delivers the highest return on your Big Data technology investment. Your Big Data ROI formula may look like the one illustrated in figure 1.
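To make figure 1 concrete, here is a minimal sketch in Python with purely hypothetical inputs. The scores, user count, and costs are illustrative assumptions, and the result is a relative score rather than a financial ROI figure.

```python
# Purely hypothetical inputs for the figure 1 ROI formula.
volume_and_variety = 8        # relative score for data volume and variety
depth_of_analytics = 5        # relative score for depth of analytics
num_users = 200               # number of users consuming the insights

intelligent_insight = volume_and_variety * depth_of_analytics * num_users

time_to_value = 6             # months from data capture to usable insight
data_value = intelligent_insight / time_to_value

capex = 400_000               # hypothetical capital expenditure
opex = 150_000                # hypothetical operating expenditure
total_cost = capex + opex

roi = data_value / total_cost
print(f"Relative ROI score: {roi:.6f}")
```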


Once you’ve been able to capture, process, and explore massive amounts of data, it’s possible to uncover new marketing opportunities and even new businesses. However, in other instances, Big Data’s real value may not be so immediate. For Big Data to deliver on its potential, your plan of attack requires that you fully understand the role your infrastructure components play in your solution.


Figure 2. HP HAVEn Big Data platform map

[Figure: layered platform map. Big Data applications (HP Operational Analytics, HP Autonomy Promote) sit on the Big Data platform (HP HAVEn, HP partners, HP ConvergedSystem, SAP HANA, Microsoft PDW, IDOL), which runs on Big Data infrastructure: network (HP FlexFabric), storage (HP StoreOnce, HP StoreAll, HP StoreEver), and servers (HP ProLiant DL/BL/SL/ML, HP Moonshot). The stack serves users via cloud and mobile, wrapped by Big Data services, Big Data policy and procedure, and management tools such as HP Service Automation, HP CMU, and HP Data Protector.]


Big Data and your infrastructure

The importance that data analysis plays in harnessing Big Data is well known, but less understood is the importance of your infrastructure in successfully extracting that analytic value. When thinking about an ideal technical platform, the following components should be integrated into your existing data center, whether premises-based or in the cloud:

• Servers

• Network

• Storage

• Management

• Software (addressed later in this paper)


Appropriate access to the platform must be provided, and existing or new policies, such as compliance, must be applied to the data that resides in the platform. As with other technology-based solutions, a Big Data technology solution must behave like any other tenant in your data center. A right-sized infrastructure offers the scalability, performance, agility, and speed necessary to deliver Big Data services, and its overall importance to your planning and strategy can’t be overlooked. The HP HAVEn Big Data platform is shown in figure 2.

Similarly, technical innovation within each component also has an impact on how you address your infrastructure to support a Big Data solution. Let’s look at how each infrastructure component impacts your Big Data implementation.

“In 2012, the cost ratio of delivering Big Data services, compared to using software to manage and analyze Big Data, was almost 15 to 1.”—“Top 10 Technology Trends Impacting Information Infrastructure,” Gartner, 2013


Servers

Multiple server technologies can be applied to a Big Data solution. The choice comes down to speed of processing, volume of data, price, and, potentially, data center standards. It’s important to note that “speed” is relative and is often replaced by the term “real time.” Unfortunately, “real time” is also a relative term. If you process a report or query in two hours today, and tomorrow you can complete the task in 10 minutes, then 10 minutes becomes your “real-time” standard of measurement. For other applications and use cases, “real time” may actually mean seconds or subsecond processing. These definitions become important when evaluating and selecting your Big Data technologies, including the role that servers play.

Server strategies for Big Data are generally swayed by the software used to process the data. The server architecture needs to handle both the volume and the velocity required to process the data. Typical Big Data deployments reach multiple terabytes (TB), petabytes (PB), or even larger. There are three popular approaches today: scale-out, scale-up, and virtualization.

Scale-out: Scale-out technologies, such as the HP Vertica Analytics Platform, Hadoop, and Microsoft® Parallel Data Warehouse (PDW), are designed to distribute compute power across numerous servers, each typically having local storage. Increasing capacity is as simple as adding more nodes to the environment. The use of local storage means that each server typically processes data locally, rather than accessing data on remote storage.2 Local storage is also less expensive than shared storage technologies, and scale-out technologies typically, as in the case of Hadoop, distribute data across multiple servers to ensure availability and redundancy.

The topic of “commodity-class” servers often comes up when talking about scale-out architectures, specifically Hadoop. Keep in mind that Hadoop originated in the large-provider/open source space, with development at companies like Google™ and Yahoo, where scale-out architectures such as Hadoop run on thousands of servers. Losing one server from a 1,000-node cluster has far less impact on the overall performance of the cluster than an enterprise losing a single node from a 20- to 40-node cluster. The point here is that commodity-class servers may or may not be appropriate for an enterprise implementation of scale-out Big Data technology.

Another factor to keep in mind with scale-out computing is the floor space and power consumption required. Emerging compute technologies such as systems-on-a-chip (SoCs), as supplied by HP through its Moonshot servers, are aimed specifically at addressing these requirements and will likely play a part in the Big Data discussion in the coming months and years.

2 Hadoop, as an example, can access remote data if necessary. However, under normal conditions it will work against locally stored data.


Scale-up: Scale-up technologies like the SAP HANA in-memory database are powered by a smaller number of servers with massive amounts of local RAM. These systems can handle both transaction processing and real-time analytics. The cost of these systems, as well as the ability to add capacity, should be considered with scale-up solutions. HP Vertica is both a scale-up and a scale-out solution.

Virtualization: Virtualizing servers is typically done for portability and ease of deployment. Because Big Data technologies make heavy use of server processing and disk I/O, virtualization is generally not seen as beneficial to them. However, if virtualization is part of your data center strategy or makes sense for your organization’s Big Data solution, you should feel free to include it. In fact, even in organizations that deploy Big Data technology on physical servers, development and test systems are normally implemented on virtual machines, so virtualization will likely play a role in your Big Data solution.

Network

A Big Data solution can saturate your network. An extreme example: one communications service provider (CSP) receives and interprets 40 billion records a day, and estimates that by the end of 2014 that number will be 100 billion records a day. IDC predicts that by 2020, digital data will grow by a factor of 300.3 So, while your organization may not be a CSP, your network will most likely need to handle much more traffic than it does today.

Big Data solutions collect, transmit, store, and process more data than you handle today. The increase in data results in higher data center interconnect traffic, because more data moves between storage systems as well as between data centers. If the data processing platform is a large Hadoop or HP Vertica cluster, data will be moved to a centralized location, which may also increase traffic. Results from processing this data may be pushed back to traditional analytics systems located in a different data center. Further, most enterprises will require some form of disaster recovery, which creates data replication from one data center to another. Gigabit-per-second bandwidth is required to meet these needs.
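For a rough sense of the scale involved, the following back-of-the-envelope sketch estimates the sustained ingest rate for the CSP example above; the 500-byte average record size is an assumption for illustration.

```python
# Back-of-the-envelope network sizing for the CSP example in the text.
# The average record size (500 bytes) is a hypothetical assumption.
records_per_day = 40e9
bytes_per_record = 500
seconds_per_day = 86_400

bytes_per_day = records_per_day * bytes_per_record
sustained_gbps = bytes_per_day * 8 / seconds_per_day / 1e9

print(f"Ingest volume: {bytes_per_day / 1e12:.0f} TB/day")
print(f"Sustained ingest rate: {sustained_gbps:.2f} Gb/s")

# At the projected 100 billion records/day, the same math gives ~4.6 Gb/s,
# before adding replication, shuffle, or disaster recovery traffic.
```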

Further adding to network impact is the fact that Big Data traffic is normally machine to machine and is latency sensitive.

Even within a Hadoop cluster that spans multiple racks, the impact on the network is apparent. Traffic is generated during MapReduce processing as well as in support of the built-in replication and failover of the Hadoop Distributed File System (HDFS). Isolated networks are required for the cluster, along with network redundancy within the server, within the rack, and between racks. Data ingress/egress points are a network consideration as well.
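The rack-aware replica placement mentioned above is driven by an administrator-supplied topology script that Hadoop invokes through its topology script configuration. Below is a minimal sketch of such a script; the subnet-to-rack mapping and rack names are hypothetical.

```python
#!/usr/bin/env python
# Minimal Hadoop rack-topology script. Hadoop passes data node IPs as
# command-line arguments and expects one rack path per argument on
# stdout. The subnet-to-rack mapping below is hypothetical.
import sys

RACK_BY_SUBNET = {
    "10.1.1": "/dc1/rack1",   # servers in rack 1
    "10.1.2": "/dc1/rack2",   # servers in rack 2
}

for host in sys.argv[1:]:
    subnet = host.rsplit(".", 1)[0]   # strip the last octet of an IP
    print(RACK_BY_SUBNET.get(subnet, "/default-rack"))
```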

Many Big Data technologies are being shipped in appliance-like bundles consisting of multiple servers/racks and software. While these bundled solutions may address the internal networking requirements of the solution, you still must consider how the bundle will be integrated into the corporate network. Further, network security, the increase in endpoints, and the move to IPv6 are also considerations for the network.

3 “IDC: The Digital Universe,” IDC, December 2012.


Storage

Storage within a Big Data solution will likely incorporate multiple technologies. A Big Data solution is normally not a rip-and-replace solution. Newer Big Data–specific technologies like the HP Vertica Analytics Platform, Hadoop, and HP Autonomy software can sit side by side with existing data warehouse and analytics systems. Those existing systems may make use of SAN-based storage. Big Data–specific technologies can also make use of SAN-based storage, and, as in the case of virtualization, SAN-based storage may make perfect sense for your specific Big Data needs. Other technologies such as HP Autonomy software may benefit from NAS-based storage such as HP StoreAll, which offers massive scalability and fast queries.

Direct attached storage (DAS) has become increasingly popular for Big Data solutions, especially with the increasing use of Hadoop. With Hadoop, redundancy is built in at the level of the entire compute/DAS node: if a disk or node fails, the workload fails over to another node, and the level of redundancy is configurable. As mentioned earlier, another advantage is cost. Interestingly, HP is integrating its Vertica and Autonomy solutions with Hadoop to allow those technologies to consume and process data held in Hadoop’s inexpensive storage environment. HP Vertica was the first analytics platform to offer a connector to HDFS, and now has two connectors for Hadoop, including a bidirectional Pig connector. Other vendors, such as Microsoft, are doing the same.

Management

As with any other functional tenant of a data center, plans need to be in place to manage and monitor your Big Data solution. This can be expanded further into measuring service-level agreements (SLAs) and even addressing service management. For the scope of this document, we will discuss the following management topics: administration and operations, and security and compliance.

Administration and operations: The introduction of new Big Data technologies requires a new set of skills to administer those systems. People will need to be trained in the new technologies. Service operations guides, or runbooks, containing processes and procedures will need to be created, including troubleshooting tips and escalation and support information. Automation scripts may also need to be developed.

Backup and recovery (BuR) is another aspect of operating a system that needs to be addressed with Big Data technologies. One consideration is whether backup is appropriate for such huge datasets. In fact, it’s not the actual backup of the data that is the concern; it’s the restoration of that data. How long will it take to restore multiple petabytes of data? It may make more sense to replicate data to another data center, rather than perform backups, for disaster recovery scenarios. Then again, some organizations are required to perform backups. Note that standard backup tools may not work out of the box with some technologies like Hadoop. Organization-specific strategies for backup may need to be created.
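A quick worked example shows why restore time dominates the backup question; the dataset size and sustained throughput below are hypothetical.

```python
# Rough restore-time estimate for a multi-petabyte dataset. The dataset
# size and sustained throughput figures are hypothetical.
dataset_pb = 2                   # petabytes to restore
throughput_gbps = 10             # sustained end-to-end restore rate, Gb/s

dataset_bits = dataset_pb * 1e15 * 8
seconds = dataset_bits / (throughput_gbps * 1e9)

print(f"Restoring {dataset_pb} PB at {throughput_gbps} Gb/s takes "
      f"{seconds / 86_400:.1f} days")
# ~18.5 days -- one reason replicating to a second data center often
# beats traditional backup/restore at Big Data scale.
```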


Security and compliance: You certainly have in place the ability to provide the appropriate levels of protection for your existing data, systems, and services. When you move to a Big Data solution, you need to review each of these areas and make sure they are updated and still appropriate. For example:

• The Computer Security Incident Response Team (CSIRT) needs to understand Big Data, and how it should respond to security incidents in this new environment.

• Data privacy requirements may be very different. Even if your Big Data solution is based on existing data sources, the act of aggregating this data into a single repository where connections can be discovered may have significant data privacy implications.

• The security technology that you use for your existing data may not be appropriate for the volume, velocity, variety, and vulnerability of your Big Data. New tools and technology may be required for many different aspects of information security.

• eDiscovery may require specialized tools and processes to enable you to interrogate your Big Data and respond correctly to litigation. eDiscovery must also be supported by your policies for retention, archival, and destruction of Big Data.

• Security controls include not just the technical controls such as encryption, but also administrative controls such as your security policy and physical controls such as the controlled offsite storage of tape media. All of these controls will need to be reviewed and updated to support the unique threats that apply to Big Data.

A strategy that covers both

It’s important to understand how and where you can gain value from Big Data, and what impact Big Data technologies will have on your infrastructure. Your strategy needs to include input from people focused on your business as well as those with an IT focus. Strategy development can be initiated by either team, and HP has a portfolio of services that cover both.

With the “HP Big Data IT Transformation Experience Workshop,” we help you devise an IT strategy that improves Big Data services and value. We identify a specific roadmap and actionable steps based on the Big Data functionalities that are critical for you. The workshop encompasses security, management, operations, and standards models for Big Data, and delivers IT leadership while integrating the Big Data initiative with your business.

Conversely, if the Big Data conversation begins with the business, HP offers the HP Big Data Discovery Experience, a suite of services that can help your organization accelerate Big Data opportunities so that you can take advantage of actionable insights to drive new business innovation. These services are delivered via tested and proven processes, methodologies, reference architectures, and tools in a secure HP private cloud environment. Using the HP HAVEn Big Data platform, we deploy specific Big Data reference architectures that incorporate the HP Vertica Analytics Platform, HP Autonomy software, and Hadoop technologies.

This approach takes you from the high-level concepts of Big Data to the reality of harnessing information to create value. Ultimately, we help you combine the content from both HP services to provide your enterprise with a single Big Data strategy.


Figure 3. HP HAVEn Platform overview

[Figure: data sources (social media, IT/OT, images, audio, video, transactional data, mobile, search engine, email, texts, and documents) feed the HAVEn engines. Hadoop/HDFS catalogs massive volumes of distributed data; Autonomy IDOL processes and indexes all information; Vertica analyzes at extreme scale in real time; Enterprise Security collects and unifies machine data; and “n” applications power HP software plus your apps.]


HP HAVEn Platform: extracting value from Big Data

The marketplace for Big Data solutions incorporates numerous technologies. Each technology offers a specialized function or use case in which it works best. For example, Hadoop has emerged as a de facto technology and appears to be a common component in customers’ Big Data solutions. However, disparate approaches, tools, and frameworks can make it difficult for organizations to come up with the best direction to help ensure that they’re capturing and analyzing the right data. To address this issue, integrated, comprehensive Big Data platforms encompassing hardware, software, and services are emerging. These platforms bring together everything a company needs to extract value from Big Data. The HP HAVEn Platform is one such platform, incorporating Hadoop as well as HP-specific technologies; it is illustrated in figure 3.


As figure 3 shows, HAVEn stands for Hadoop, Autonomy, Vertica, Enterprise Security, and any “n” number of applications (such as HP Operational Analytics). HP HAVEn is not a single product, but a platform that consists of multiple components. It’s an ecosystem of products (hardware and software), partners, resellers, and services that surround the platform. In this section, however, we will focus on the HAVEn “engines”: Hadoop, HP Autonomy software, the HP Vertica Analytics Platform, and HP ArcSight Logger. It’s important to note that these components fit together like Lego building pieces to enable enterprises to process 100 percent of their data—structured, unstructured, and semi-structured.


Hadoop

Hadoop has become a common platform in enterprise Big Data solutions, and there are a number of reasons for this. One reason is that Hadoop addresses gaps that exist in today’s business intelligence (BI)/analytics infrastructure. These gaps include capacity and scale, discarded and unused data, and archiving data.

Analysis: Human-generated data such as email, IM/TXT, documents, and audio is difficult to handle in traditional systems.

Capacity and scale

Many traditional systems are already overloaded or near capacity. Clickstream data, log files, and sensor-generated data are examples of volume/velocity challenges that add to the load on such systems, assuming these data types can be handled in the first place. Additionally, scaling traditional systems is an expensive proposition, and at some point it may become cost prohibitive to load all data and data types, even if a system has the capability to process them.

Discarded and unused data

Databases in a traditional BI/analytics environment are shared among business units, and a schema is defined for those systems. Changing the schema to handle new data types can be a slow process. Technically, schema changes are not difficult; however, reaching agreement across business units and slotting the change into existing business processes can take weeks. Even with a fast schema change, source data can be dropped as it enters the BI/analytics system and thus never become part of the analytics process.


Archiving data

In many enterprise organizations, data is archived, pushed to tape, and housed in a remote location; many businesses are required to do so by regulatory compliance and corporate governance. Once data has been archived, it is typically removed from production systems to offload the resources the data requires. But this process also eliminates the potential to easily examine that older data for trends or value.

Hadoop addresses all of these gaps, and more. It’s relatively easy to dump data from any source into Hadoop without having to define the schema up front. Once the data is in Hadoop, you can use MapReduce or existing Hadoop-aware tools to perform extract, transform, and load (ETL) or extract, load, and transform (ELT), loading the appropriate bits of data into existing traditional BI systems. Even better, exploratory analytics can be performed against the data residing in Hadoop using MapReduce and, once value is seen, that data can be pushed to an application such as the HP Vertica Analytics Platform for stylized, interactive analytics. The remaining data that would otherwise have been discarded during ETL can also be processed via MapReduce. Further, old data residing in traditional BI/analytics systems can be pushed to Hadoop (instead of being archived), where it can be further analyzed or, if necessary, (re)loaded into another BI system.
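As a concrete illustration of this schema-on-read ETL pattern, here is a minimal Hadoop Streaming mapper sketch in Python; the tab-delimited field layout and the status filter are hypothetical.

```python
#!/usr/bin/env python
# Hadoop Streaming mapper: schema-on-read ETL over raw lines already
# dumped into HDFS. Only the records worth loading into a downstream
# BI system are emitted. Field layout and filter are hypothetical.
import sys

for line in sys.stdin:
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 4:
        continue                     # skip malformed records
    timestamp, user_id, action, status = fields[:4]
    if status != "200":              # keep only failed requests, say
        print(f"{timestamp}\t{user_id}\t{action}\t{status}")
```

Run under Hadoop Streaming, the same script scales across the cluster; its filtered output can then be bulk-loaded into a traditional BI system or pushed to HP Vertica for interactive analytics.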

Similarly, data doesn’t need to die a premature death in Hadoop as it does when it is archived to tape. Hadoop systems are relatively inexpensive and have extreme scale-out capability. It’s simple enough to just leave data in Hadoop and have it serve as a “data lake.” Once in the data lake, analytics can always be performed against it because Hadoop naturally combines storage and compute.

Finally, given the natural scale-out of both data and compute in a Hadoop cluster, common tasks that sometimes burden existing BI/analytics systems, such as reporting, are a natural fit for Hadoop (assuming the data exists in Hadoop). Customers and vendors alike are seeing vast improvements with Hadoop workloads for existing reports that take 20 minutes or more to generate.


HP Autonomy software

At the heart of the HP Autonomy infrastructure software lies the Intelligent Data Operating Layer (IDOL) Server, which contains mathematics and computer science breakthroughs protected by 170 patents. The IDOL Server collects indexed data from numerous sources through its more than 400 connectors and stores it in its proprietary structure, optimized for fast data processing and retrieval.

The IDOL Server automates the process of recognizing, categorizing, and retrieving concepts and meaning in unstructured human information, which falls into two categories:

• Unstructured text data: includes content in blogs, news feeds, documents, and social media interactions

• Unstructured rich media: includes photos, videos, sound files, and forms of information that do not include text beyond simple metadata

The IDOL Server forms a conceptual and contextual understanding of all content in an enterprise, automatically analyzing any piece of information from over 1,000 different content formats. The IDOL Server can perform over 500 operations on digital content. These functions are available to build rich analytics applications for meaningful exploration of human data and are organized into four categories to help guide the application builder:

• Inquire: “Search your data” functions

• Investigate: “Analyze your data” functions

• Interact: “Personalize your data” functions

• Improve: “Enhance your data” functions
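IDOL Server functions are invoked over an HTTP interface. The sketch below issues an illustrative “Inquire”-style query; the host, port, and parameter names are assumptions for illustration, not a definitive IDOL API reference.

```python
# Illustrative IDOL Server query over its HTTP interface. Host, port,
# and parameter names are assumptions for illustration only.
import urllib.parse
import urllib.request

IDOL_SERVER = "http://idol.example.com:9000"

params = urllib.parse.urlencode({
    "Text": "late-stage contract disputes",  # conceptual, not keyword, query
    "MaxResults": 10,
})

url = f"{IDOL_SERVER}/action=Query&{params}"
with urllib.request.urlopen(url) as resp:
    print(resp.read().decode()[:500])        # XML response, truncated
```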

HP Autonomy software has a deep portfolio of applications designed for your specific data discovery needs. These include, for example, a Marketing Performance Suite of applications (for customer engagement and marketing optimization) and a Legal and Compliance Performance Suite (for litigation preparation and compliance).


HP Vertica Analytics Platform

The HP Vertica Analytics Platform solves real-world Big Data challenges and is purpose built for organizations of all sizes to monetize data at hyperspeed, with the massive scale needed to differentiate in today’s competitive economic climate. Compared to traditional databases and data warehouses, the HP Vertica Analytics Platform drives down the cost of capturing, storing, and analyzing data. And it produces answers 50 to 1,000 times faster, enabling the iterative, conversational analytics approach that Big Data demands.

The HP Vertica Analytics Platform offers these core features—all delivered at an overall lower total cost of ownership:

• Blazing fast analytics—gain insights into your data in near real time by running queries 50 to 1,000 times faster than legacy database and data warehouse solutions

• Massive scalability—infinitely and easily scale your solution by adding an unlimited number of industry-standard servers

• Open architecture—protect your hardware and software investment with built-in support for Hadoop, the R language, and leading BI/ETL tools

• Easy setup and administration—get your analytics initiatives to market quickly with a low cost of administration and maintenance

• Optimized data storage—benefit from a patented columnar compression that allows you to store 10 to 30 times more data per server than traditional databases

The HP Vertica Analytics Platform is truly built for analytics with technology born of the modern age. It is not a back-end legacy database, nor does it merely store your data. The HP Vertica Analytics Platform enables you to probe your data and ultimately find the answers you need to monetize Big Data.

Some of the benefits of the HP Vertica Analytics Platform include that it:

• Compresses data to reduce storage costs and speed access by up to 90 percent

• Stores data by columns rather than rows and caches data in memory to make analytic queries 50 to 1,000 times faster

• Features massively parallel processing (MPP) to spread huge data volumes over any hardware, including low-cost commodity servers

• Uses data replication, failover, and recovery to achieve automatic high availability

• Includes a pre-packaged in-database analytics library to handle complex analytics as well as a robust development framework and support for the R statistical programming language to enable analysts to create user-defined analytics inside the database

• Dynamically integrates with and complements Hadoop to move and analyze large sets of structured, semi-structured, and unstructured data back and forth between systems for data exploration and fast data analytics

• Supports simplified Amazon EC2 cloud deployments in addition to its Hadoop integration
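As a sketch of the kind of interactive analytics described above, the following uses the open-source vertica-python driver; the connection details, table, and query are hypothetical.

```python
# Interactive analytic query against HP Vertica using the open-source
# vertica-python driver. Connection details and schema are hypothetical.
import vertica_python

conn_info = {
    "host": "vertica.example.com",
    "port": 5433,
    "user": "analyst",
    "password": "example-password",
    "database": "bigdata",
}

with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    # Columnar storage means only the referenced columns are scanned.
    cur.execute(
        "SELECT region, COUNT(*) AS events"
        " FROM clickstream"
        " WHERE event_date >= '2013-01-01'"
        " GROUP BY region"
        " ORDER BY events DESC LIMIT 10"
    )
    for region, events in cur.fetchall():
        print(region, events)
```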


HP ArcSight Logger

The HP HAVEn Platform brings an entirely new level of enterprise security, allowing you to see not just whether a breach will occur but also when it is likely to occur. The HP HAVEn Platform enables you to unify data in various formats from various sources into a simple common format, allowing members of your team to:

• Search for compliance information and create reports, charts, and dashboards

• Perform quick forensic investigations

• Search through millions of events in seconds to quickly troubleshoot security concerns

HP ArcSight Logger is an integral part of the HP HAVEn Platform that unifies searching, reporting, alerting, and analysis across any type of enterprise log and machine data. It is unique in its ability to collect, analyze, and store massive amounts of machine data generated by modern networks. HP ArcSight Logger supports multiple deployment models, including appliance, software, virtual machine, and cloud, in both Windows and Linux environments.

With ArcSight Logger, you can:

• Collect: You can gather any data from any device in any format from over 300 distinct log-generating sources.

• Enrich: While the data is being collected, you can filter and parse it with rich metadata, which helps to unify the machine data.

• Search: As the machine data is enriched during collection, you can search millions of events using text-based keywords, with no need for obscure commands or domain expertise.

• Store: The unified data can be stored efficiently, with a high compression ratio of up to 10:1, eliminating the need for additional database administrators.

• Analyze anything: The rich content built into ArcSight Logger helps you perform complex searches and create comprehensive drill-down reports. In addition, you can rely on real-time alerts to use machine data for IT security, governance, risk and compliance (GRC); IT operations; security information and event management (SIEM) solutions; and log analytics.


Conclusion

Big Data is a major business and IT initiative that, when harnessed, can provide tremendous opportunities for both private and public sector organizations. Unlocking the potential of Big Data requires careful planning and a well-crafted strategy that must include what you want to achieve with analytics and the type of infrastructure needed to support all your Big Data efforts. HP has a rich portfolio of services and solutions, and the HP HAVEn Platform can help you unlock the potential of Big Data. Our HP Big Data Infrastructure Consulting Service demonstrates how you can reduce implementation and integration risk, ramp up your Big Data skills, and accelerate adoption and time to value as you work toward achieving your business objectives.

HP Technology Services Consulting—a perfect fit for Big Data

We help make IT departments relevant to the business by giving them the infrastructure to extract more value from Big Data. HP Technology Services comprises a hub of highly qualified and experienced consultants who can bring comprehensive transformation to a converged infrastructure for Big Data. The service offerings integrate all HP Big Data solutions, from strategy, design, and implementation to protection and compliance.

HP is uniquely qualified to help your IT infrastructure get the most value from all your data. Our consultant teams include experts who can help you craft a comprehensive Big Data strategy. And our field-proven, integrated approach is the perfect answer to transforming your IT infrastructure for Big Data, from servers and clusters to storage and networking. Our security expertise is second to none.

HP consultants can take the lead or offer coaching, doing whatever it takes to make your Big Data initiative a success. That might mean conducting assessments and stakeholder interviews, or holding facilitated workshops to uncover your vision, goals, needs, and requirements. Or it might mean leveraging best practices, including field-proven HP reference architectures and experience. We can make design and configuration recommendations, identify risks, and propose a roadmap that helps you reach your goal.

With the expertise of our worldwide army of consultants on tap, you can count on a comprehensive and consistent Big Data implementation—on any scale.

Learn more at hp.com/services/BigData

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

Google is a trademark of Google Inc. Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation.

4AA4-9445ENW, December 2013