Big Data …Big Opportunities ? ……Big Hype ? (or just a Big Mess ?) Data challenges and IBM views Dr. Matthew Ganis IBM Senior Technical Staff Member CIO Social Media Analytics Chief Architect Member, IBM Academy of Technology [email protected] @mattganis (twitter)

Upload: sol

Post on 05-Jan-2016

TRANSCRIPT

Page 1

Big Data … Big Opportunities? … Big Hype? (or just a Big Mess?)

Data challenges and IBM views

Dr. Matthew Ganis
IBM Senior Technical Staff Member
CIO Social Media Analytics Chief Architect
Member, IBM Academy of Technology
[email protected]
@mattganis (twitter)

Page 2

The term “Big Data” is pervasive, but it still provokes a bit of confusion.

So what is it?

Big Data has been used to convey all sorts of concepts, including huge quantities of data, social media analytics, next-generation data management capabilities, real-time data, and much more.

Page 3

That means we create about 1.8 zettabytes of information every two years.

Page 4

Extracting insight from an immense volume, variety and velocity of data, in context, beyond what was previously possible.

Page 5

Information is at the Center of a New Wave of Opportunity…

2009: 800,000 petabytes
2020: 35 zettabytes
44x as much data and content over the coming decade
80% of the world’s data is unstructured

… And Organizations Need Deeper Insights

1 in 3 business leaders frequently make decisions based on information they don’t trust, or don’t have
1 in 2 business leaders say they don’t have access to the information they need to do their jobs
83% of CIOs cited “Business intelligence and analytics” as part of their visionary plans to enhance competitiveness
60% of CEOs need to do a better job capturing and understanding information rapidly in order to make swift business decisions
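
The 44x figure follows from the 2009 and 2020 volumes quoted on this slide. A quick check, assuming decimal (SI) units in which 1 zettabyte equals 1,000,000 petabytes; the variable names are illustrative only:

```python
# Assumption: decimal (SI) units, so 1 zettabyte = 10**6 petabytes.
PETABYTES_PER_ZETTABYTE = 10**6

data_2009_pb = 800_000                        # 2009 volume quoted on the slide
data_2020_pb = 35 * PETABYTES_PER_ZETTABYTE   # 2020 volume quoted on the slide

growth = data_2020_pb / data_2009_pb
print(f"Growth factor: {growth:.2f}x")  # 43.75x, which the slide rounds to 44x
```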

Page 6

Structured vs. Unstructured

Structured data is information with a high degree of organization, so that it fits seamlessly into a relational database and is readily searchable by simple, straightforward search algorithms or other search operations. Unstructured data is essentially the opposite.

The lack of structure makes compiling it a time- and energy-consuming task.
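
The difference can be sketched in a few lines of Python. This is only an illustration: the table, its columns, the sample note, and the regular expression are all invented here, and the regular expression is a deliberately naive stand-in for real text analytics.

```python
import re
import sqlite3

# Structured: rows with a fixed schema can be queried directly.
# The table, columns, and values are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("alice", "laptop", 1200.0), ("bob", "phone", 650.0), ("alice", "phone", 700.0)],
)
structured_total = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE customer = ?", ("alice",)
).fetchone()[0]
conn.close()

# Unstructured: the same facts buried in free text must be extracted first.
notes = "Alice called about the laptop she bought for $1200. Later alice ordered a phone ($700)."
amounts = [float(m) for m in re.findall(r"\$(\d+(?:\.\d+)?)", notes)]
unstructured_total = sum(amounts)

print(structured_total, unstructured_total)
```

The query needs no knowledge of how the text was written, while the extraction step only works because every amount in this particular note happens to belong to the same customer.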

Page 7

The Challenge: Bring Together a Large Volume and Variety of Data to Find New Insights

Identify criminals and threats from disparate video, audio, and data feeds

Make risk decisions based on real-time transactional data

Predict weather patterns to plan optimal wind turbine usage, and optimize capital expenditure on asset placement

Detect life-threatening conditions at hospitals in time to intervene

Multi-channel customer sentiment and experience analysis

Page 8

Where we want to go

Page 9

Merging the Traditional and Big Data Approaches

Structured vs. Exploratory

Traditional Approach: Structured & Repeatable Analysis
Business Users: Determine what question to ask
IT: Structures the data to answer that question
Examples: Monthly sales reports, Profitability analysis, Customer surveys

Big Data Approach: Iterative & Exploratory Analysis
IT: Delivers a platform to enable creative discovery
Business Users: Explore what questions could be asked
Examples: Brand sentiment, Product strategy, Maximum asset utilization

Page 10

Where is all this data coming from ?

Page 11

Where is all this data coming from ?

Page 12

The Internet of Things (IoT) is a scenario in which objects, animals or people are provided with unique identifiers and the ability to automatically transfer data over a network without requiring human-to-human or human-to-computer interaction.
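
A minimal sketch of what such an object might emit, assuming a hypothetical temperature sensor. The field names, the reading, and the choice of JSON are illustrative assumptions, and the actual network transfer is left out:

```python
import json
import uuid
from datetime import datetime, timezone

# Hypothetical device: the fields and the sensor value are illustrative only.
DEVICE_ID = str(uuid.uuid4())  # the "unique identifier" assigned to the object

def build_reading(temperature_c):
    """Package one sensor reading as JSON, ready to send with no human involved."""
    return json.dumps({
        "device_id": DEVICE_ID,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "temperature_c": temperature_c,
    })

message = build_reading(21.5)
print(message)
```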

Page 13

Where is all this data coming from ?

Page 14
Page 15

Approximately 2.7 billion users on the Internet today

Page 16

Social Media as Big Data

Page 17
Page 18

What are we running?

Who is talking about us?
Male / Female / Student / Professional / Retired / Customers?

What do they “feel”?
Positive / Negative Sentiment / Angry / Annoyed?

Where are they talking?

Who are they influencing?
Who’s listening to them?

Page 19
Page 20
Page 21
Page 22
Page 23

When customers are talking about us or about our products, we want to know where those conversations are happening so we can:

• Interact with interested customers
• Get in front of any issues

Page 24
Page 25
Page 26
Page 27

Numerous studies show that consumers see word-of-mouth and personal recommendations as far more credible than newspaper and television advertisements. While such mass advertisements are still necessary because of their powerful reach, these findings show that companies need to increase their focus on more personalized approaches. Clearly, it is incredibly difficult, maybe even impossible, for most companies to deal directly with the countless potential consumers. This is where influencers come in…

Page 28

What makes someone influential?

The number of tweets they make?
The number of times people mention them?
The number of followers they have?
How often they are retweeted?
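
Why the choice of metric matters can be seen in a small sketch. The accounts and counts below are invented purely for illustration; the point is only that each candidate metric can crown a different "most influential" account.

```python
# Hypothetical accounts and counts, invented purely for illustration.
accounts = {
    "@news_bot":  {"tweets": 9000, "mentions": 40,  "followers": 15000,   "retweets": 120},
    "@analyst":   {"tweets": 300,  "mentions": 900, "followers": 4000,    "retweets": 2500},
    "@celebrity": {"tweets": 150,  "mentions": 700, "followers": 2000000, "retweets": 1800},
}

def top_by(metric):
    """Return the account with the highest value for the given metric."""
    return max(accounts, key=lambda name: accounts[name][metric])

for metric in ("tweets", "mentions", "followers", "retweets"):
    print(f"Most influential by {metric}: {top_by(metric)}")
```

With this made-up data, three different accounts come out on top depending on which question is asked, so any definition of influence has to be chosen deliberately.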

Page 29
Page 30

We were asked to look at why a particular product launch wasn’t performing as expected. We pulled all the “chatter” about it and found:

Page 31

But there were people talking about it…

Page 32

Some things to think about…..

Page 33

Where is all this data coming from?

Vast amounts of data are, and will be, generated from sources ranging from financial transactions, medical records, mobile phones, and social media to the Internet of Things. But there are questions that need to be asked to understand data’s meaningful use:

• How will data be managed?
• How will data be shared?

Some thoughts about “data as a service”

• Establishment of standards, governance, and guidelines (e.g., open architectures)
• Creation of industry-specific data exchanges (e.g., healthcare data exchanges, environment data exchanges)
• Creation of cross-industry data exchanges (e.g., healthcare data exchanges seamlessly interacting with environmental data exchanges)

Page 34

Enterprise Integration

Trusted Information & Governance: Companies need to govern what comes in, and the insights that come out

Data Management: Insights from Big Data must be incorporated into the warehouse

[Diagram labels: Traditional Sources, New Sources, Data Warehouse, Big Data Platform, Enterprise Integration]

Page 35

Poor data quality

Dirty data
Missing values
Inadequate data size
Poor representation in data sampling
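
Each of these four problems can be tested for before any analysis starts. The sketch below is one possible way to do so; the records, the valid age range, and both thresholds are assumptions chosen for illustration, not recommended values.

```python
from collections import Counter

# Hypothetical survey records; None marks a missing value.
records = [
    {"age": 34,   "segment": "customer"},
    {"age": None, "segment": "customer"},
    {"age": -5,   "segment": "customer"},   # dirty: impossible age
    {"age": 51,   "segment": "student"},
]

MIN_ROWS = 100            # assumed minimum acceptable sample size
MIN_SEGMENT_SHARE = 0.30  # assumed minimum share for each segment

# Missing values: the field is absent.
missing = sum(1 for r in records if r["age"] is None)
# Dirty data: the field is present but outside a plausible range.
dirty = sum(1 for r in records if r["age"] is not None and not 0 <= r["age"] <= 120)
# Inadequate data size: too few rows to draw conclusions from.
too_small = len(records) < MIN_ROWS
# Poor representation: some segment makes up too little of the sample.
segment_counts = Counter(r["segment"] for r in records)
shares = {seg: n / len(records) for seg, n in segment_counts.items()}
under_represented = [seg for seg, share in shares.items() if share < MIN_SEGMENT_SHARE]

print(missing, dirty, too_small, under_represented)
```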

Page 36

Data variety: trying to accommodate data that comes from different sources and in a variety of different forms (images, geo data, text, social, numeric, etc.).

How do we link them together?
Is there a common taxonomy or way to organize it?
Is there a “signal” in one source of data that points to another?

Page 37

Dealing with huge datasets, or 'Big Data,' that require distributed approaches.

Page 38

Who is influential ?

How do we define influence ?

Page 39

Thank you for your attention

Page 40

Where is all this data coming from ?

Page 41

The Big Data Opportunity

Extracting insight from an immense volume, variety and velocity of data, in context, beyond what was previously possible.

Variety: Manage the complexity of multiple relational and non-relational data types and schemas

Velocity: Streaming data and large volume data movement

Volume: Scale from terabytes to zettabytes (1B TBs)

Page 42

Big Data: why is it possible now?

Traditional approach: Data to Function
[Diagram: User request, Application server, Query Data, Database server (Data), return Data, process Data, Send result]

Big Data approach: Function to Data
[Diagram: User request, Master node, Send Function to process on Data, Data nodes (Data), Query & process Data, Send Consolidate result]

Traditional approach
• Application server and Database server are separate
• Data can be on multiple servers
• Analysis Program can run on multiple Application servers
• Network is still in the middle
• Data have to go through the network

Big Data approach
• Analysis Program runs where the data are: on the Data Nodes
• Only the Analysis Program has to go through the network
• Analysis Program needs to be MapReduce aware
• Highly scalable: 1000s of nodes, petabytes and more
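
The "function to data" idea can be imitated on a single machine. In this sketch, plain Python lists stand in for data nodes and a word count stands in for the analysis program; the node contents and function names are invented for illustration, and no real cluster framework is involved.

```python
from collections import Counter
from functools import reduce

# Each list stands in for the data held locally on one data node (made-up text).
data_nodes = [
    ["big data big hype", "big mess"],
    ["data challenges", "big opportunities"],
    ["data data data"],
]

def map_word_counts(local_lines):
    """The 'function' shipped to a node: count words in that node's own data."""
    counts = Counter()
    for line in local_lines:
        counts.update(line.split())
    return counts

def consolidate(left, right):
    """Reduce step on the master node: merge two partial results."""
    return left + right

# Only the function and the small partial results would cross the network;
# the raw lines never leave their node.
partials = [map_word_counts(node) for node in data_nodes]
total = reduce(consolidate, partials, Counter())

print(total.most_common(2))
```

Because each node returns only a small summary, adding more nodes adds processing capacity without moving more raw data, which is the scalability argument made on this slide.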

Page 43

What Big Data Is Not

It is not a replacement for your Database strategy

It is not a replacement for your Warehouse strategy

It is not a solution by itself, it needs jobs/applications to drive value
