1 © Copyright 2011 EMC Corporation. All rights reserved.

Big Data Analytics


Priority Discussion Topics

• What are the most compelling business drivers behind big data analytics?

• Do you have or expect to have data scientists on your staff, and what will be their charter?

• What are the different product, technology and architectural components that need to be considered?

• What process challenges for data collection, data cleansing and data quality concern you most?


It’s a Whole New Big Data World …


More than just data volume, big data analytics must also consider data velocity, variety, and complexity

• Volume: data volumes approaching multiple petabytes

• Velocity: data being generated and ingested for analysis in real-time

• Variety: tabular, documents, e-mail, metering, network, video, image, audio

• Complexity: different standards, domain rules, and storage formats per data type

[Figure: Volume, Velocity, Variety, and Complexity across transactional data, documents, smart grid, images, audio, video, and text]

• New insights on customers, products, and operations

• Contextual and location-aware delivery to any device

Source: Gartner, March 2011


Big data analytics provides potential for more timely, complete, actionable business insights

“Over the last 25 years, companies have been focused on leveraging maybe 5% of the information available to them… In order to compete well, companies are looking to dip into the rest of the 95% that can make them better than anyone else.”

Source: Forrester Research Inc.

| Today’s Situation | Big Data Analytics Ramifications |
|---|---|
| Less than 10% of available enterprise data | Vast majority of available data, including external sources |
| “Rearview mirror” reports, dashboards, and analysis | “Forward looking” predictions with recommendations |
| Weeks, months, or even quarters old | Real-time or near real-time |
| Incomplete, inaccurate, and disjointed data | Correlated, high confidence, governed data |
| Architectures and methods that take 6 to 18 months to exploit | Vastly accelerated time to market |


What are the most compelling business drivers behind big data analytics (i.e., what gets your business stakeholders excited)?


Do you have or expect to have data scientists on your staff? Will they be in the business or in IT? What will be their charter? How will you measure their effectiveness?


Successful organizations continuously uncover and publish new insights about the business

[Figure: five-step cycle around a Strategic Business Initiative]

1) Business: Defines mandate and requirements

2) IT: Acquires and integrates data

3) Data Scientists: Build and refine analytic models

4) IT: Publishes new insights

5) Business: Consumes insights and measures effectiveness

Data scientist (GigaOM): Obtain, scrub, explore, model, and interpret data, blending hacking, statistics, and machine learning, with a good understanding of the business processes and goals


What are the different product, technology, and architectural components that need to be considered in a big data analytics project?


EMC Big Data Analytics Reference Architecture

[Figure: reference architecture with five stages, left to right]

• Data Input (Data Sources): Multimedia, Web/Social, ERP, CRM, POS, Mobile, Documents, Machine

• Integration: Data Quality, MDM, ETL

• Data Stores and Access: Enterprise Data Warehouse; Data Marts (BU 1, BU 2, BU 3); Federated Data Warehouse; SQL Stores; Map-Reduce; NoSQL Stores (Key Values, Documents, Other NoSQL); Hadoop (HDFS, Map-Reduce, Ecosystem*)

• Data Analysis: BI as a Service, Statistics, Data Mining, Operations Research, Neural Nets, Genetic Algorithms, OLAP

• Presentation & Delivery: Alerts, Reports, Dashboards, Spreadsheets, Mobile, Data Visualization

Other figure labels: Structured data sources; LOB data; Traditional data integration; Traditional data warehousing; Big data analytics ramifications

*Hadoop Ecosystem includes: Hive, Pig, Mahout, HBase, ZooKeeper, Oozie, Sqoop, Avro


What process challenges for data collection, data cleansing, and data quality concern you most with respect to big data and advanced analytics?


EMC IT use case of performance and security event management

Data Volume, Velocity, Variety, and Complexity

Challenges

• High volume of event data

• Numerous data types across thousands of collection points

– 12 MB/collection point per hour

• Information siloed and difficult to aggregate and correlate

• Manually intensive ad hoc analytics
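To give a feel for the scale implied by 12 MB per collection point per hour, the arithmetic can be sketched as below. The slide says only "thousands" of collection points, so the counts used here are hypothetical, and the function name `daily_volume_gb` is illustrative rather than anything from the use case.

```python
# Rate stated on the slide: 12 MB per collection point per hour.
MB_PER_POINT_PER_HOUR = 12


def daily_volume_gb(collection_points: int,
                    mb_per_point_per_hour: float = MB_PER_POINT_PER_HOUR) -> float:
    """Estimate event data collected per day, in GB (1 GB = 1024 MB)."""
    return collection_points * mb_per_point_per_hour * 24 / 1024


# Hypothetical collection-point counts; the source does not give an exact figure.
for points in (1_000, 5_000, 10_000):
    print(f"{points:>6} collection points: {daily_volume_gb(points):8.1f} GB/day")
```

Under these assumed counts, the daily volume runs from a few hundred gigabytes to a few terabytes, which is consistent with the "high volume of event data" challenge.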

Approach

• Created fast aggregation capabilities with Hadoop and a single data framework with the Greenplum database

• Mapped GRC model to control management layer

• Leveraged modern, integrated and interrelated analytic tools for correlation of events

• Implemented real-time data loading and analysis at high frequency
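The kind of aggregation Hadoop performs can be illustrated with a minimal, single-process map-reduce sketch. This is not EMC IT's implementation: the event fields (`source`, `type`) and the function names are assumptions made for illustration, and a real Hadoop job would distribute the map, shuffle, and reduce phases across a cluster.

```python
from collections import defaultdict


def map_event(event):
    """Map phase: emit ((collection point, event type), 1) for each event."""
    yield (event["source"], event["type"]), 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce phase: sum the values for one key."""
    return key, sum(values)


def aggregate(events):
    """Count events per (collection point, event type)."""
    pairs = (pair for event in events for pair in map_event(event))
    return dict(reduce_counts(key, values) for key, values in shuffle(pairs).items())


# Hypothetical events; the record shape is an assumption.
events = [
    {"source": "fw-01", "type": "login_failure"},
    {"source": "fw-01", "type": "login_failure"},
    {"source": "fw-01", "type": "port_scan"},
    {"source": "db-07", "type": "login_failure"},
]

for key, count in sorted(aggregate(events).items()):
    print(key, count)
```

Keying on both collection point and event type is one possible choice; counts keyed this way can then be correlated across sources, which is the step the slide attributes to the integrated analytic tools.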

Benefits

• Framework for single management of controls

• Faster investigation of incidents

• Automated and aggregated analysis

• Security embedded in virtual infrastructure


THANK YOU