Optimizing Business by Unleashing Big Data in the Enterprise (Technion Computer Engineering Conference, May 2013) Aya Soffer Director, Big Data Analytics IBM Research – Haifa


Page 1: Optimizing Business by Unleashing Big Data in the Enterprisetce.webee.eedev.technion.ac.il/wp-content/uploads/sites/8/2015/04/Aya... · Optimizing Business by Unleashing Big Data

Optimizing Business by Unleashing Big Data in the Enterprise (Technion Computer Engineering Conference, May 2013)

Aya Soffer, Director, Big Data Analytics, IBM Research – Haifa

Page 2:

Agenda

• Big data in the Enterprise, why now

• Big Data Research in IBM

• “Deep Dive” – Customer Analyst

Page 3:

Big Data in the Press and on Business Leaders' Minds

From the Press

“A report by the World Economic Forum in Davos, Switzerland, declared data a new class of economic asset, like currency or gold.”

“Companies are being inundated with data, from information on customer-buying habits to supply-chain efficiency. But many managers struggle to make sense of the numbers.”

“Data is the new oil.” Clive Humby

Big Data Is Hollywood's New Rising Star - Hollywood has discovered Big Data’s talents to determine how to distribute and promote movies

From Our Surveys

1 in 3 business leaders make critical decisions without the information they need

53% of business leaders say they don’t have access to the information they need to do their jobs

Organizations leveraging analytics are 2.2x more likely to outperform their industry peers

Page 4:

Big Data is Big, Fast, and Diverse

• Cost efficiently processing the growing Volume: 50x growth from 2010 to 2020, to 35 ZB

• Responding to the increasing Velocity: 30 Billion RFID sensors and counting

• Collectively analyzing the broadening Variety: 80% of the world's data is unstructured

In Order to Realize New Opportunities, Companies are Thinking Beyond Traditional Sources of Data

• Transactional and Application Data: Volume, Structured, Throughput

• Machine Data: Velocity, Semi-structured, Ingestion

• Social Data: Variety, Highly unstructured, Veracity

• Enterprise Content: Variety, Highly unstructured, Volume

Page 5:

Big Data Success Stories are emerging – Combining Organizational Data with Public sources

• Retailer reduces time to run queries by 80% to optimize inventory

• Stock Exchange cuts queries from 26 hours to 2 minutes on 2 PB

• Government cuts security analysis from hours to 70 milliseconds

• Utility avoids power failures by analyzing 10 PB of data in minutes

• Telco analyses streaming network data to reduce hardware costs by 90%

• Hospital analyses streaming vitals to detect illness 24 hours earlier

Page 6:

© 2011 IBM Corporation

An Example - Retail Data Asset Landscape

Finance Inventory Suppliers Shipments Orders

Sales Stores Products Employees Customers

eCommerce Marketing Social Mobile Third Party

Video

Page 7:

Technologies like Hadoop and commercial Big Data platforms make it possible to cost-effectively analyze all available data

IBM Big Data Platform (diagram labels): Visualize and Experiment; Predict; Integrate and Govern; Hadoop System; Stream Computing; Data Warehouse; Analyze Real-time; Search and Discover

Log analytics and event monitoring; Enterprise knowledge management; Contact centers; Customer acquisition and retention; Digital Marketing Effectiveness; Decision Analytics and Operations

From http://www.ebizq.net/blogs/enterprise/

Page 8:

Agenda

• Big data in the Enterprise, why now

• Big Data Research in IBM

• “Deep Dive” – Customer Analyst

Page 9:

Big Data Research in IBM (Not an exhaustive list)

• New varieties of data

– Text / Social Media

– Networks

– Multimedia

– Machine Data / Sensors

• Visual Analytics

– Navigating through data

– Visualizing and interacting with analytics

• Big Data Performance

– In memory

– HW acceleration (FPGA)

– Benchmarks

– New Architectures

• Information Integration

– Integrating Enterprise and public data

– Linking data / context

– Entity Extraction and integration

• Industry Applications

– Healthcare

– Telco

– Retail & Marketing

– Smarter Workforce

– Energy

– Water / Agriculture

– Public Safety

– …..

Blue – work in Haifa

Page 10:

Building an Environment for Analyzing Data

We are creating a “plug-and-play” environment for exploring massive data

– A collaborative and exploration-focused user experience

– Rich collection of analytics and tools for analysis

– Powerful infrastructure for data management and analytics

– Pre-integrated data sets to provide context

– Expertise in all aspects of the process

Lets the domain expert focus on their strengths; we handle the data challenges

Advanced Discovery Lab (diagram labels)

Analysis types: Big Data Analysis; Traditional Data Analysis

Layers: Application Layer: Models, Analytics, Applications; User Services: Visualization, Reporting, Collaboration; Data Preparation & Ingestion; Data and Analytic Services & Tools: Libraries, Catalogs; Data Management; Systems Infrastructure

Expertise: Domain Informatics Researchers; Human-Computer Interaction; Analytics and Mathematical Sci.; Data scientists; Information Mgmt Researchers; Information retrieval Researchers; Computer systems Researchers; IT operations support

Projects: Customer Care; Telco Monetization; Personalized Medicine; Other Projects

Data Sets

Page 11:

Smarter Workforce Analytics

Expertise: Expertise Location; Expertise Building; Engaging Experts

Retention: Predictive and Social Analytics for identifying employee retention propensity

Social Pulse: Derive insight into employees' sentiment from social media, to refine policies, focus communications & drive culture

Customers, Partners, Employees

Smarter Workforce Analytics applications leverage the Enterprise Social Graph, which combines transactional, social and business data, to perform analysis such as: influence, social proximity, reputation, impact, expertise and more

Page 12:

Agenda

• Big data in the Enterprise, why now

• Big Data Research in IBM

• “Deep Dive” – Customer Analyst

Page 13:

Customer Analyst – analyzes customer behavior and digital traces to build rich customer profiles

Input:

– Documents people read

– Things people write and respond to in social media

– Searches people do on the web

– Transactional data

– Organizational data (e.g., product catalogs, demographics)

Analysis:

– Evolving personal interests and preferences

– Life events

– Topical Influence

– Local/Global communities to which users belong

Applications:

– Customer segmentation

– Marketing promotions and advertisement

– Product recommendations

– Churn prediction

– Demand prediction

– ….

Page 14:

© 2009 IBM Corporation

Customer Analyst High Level View

Architecture diagram labels:

Layers: Industry Solutions Layer; Personalization Layer; Social Analytics Layer; Content Analytics & Discovery Layer; Accelerators Layer; Data Layer

Industry Solutions Layer: Retail, CP, HC, Marketing

Data Collection: GNIP, T4J, SFC, SyndicationHub, ETL

Infrastructure: IBM BigInsights, IBM Streams

Data sources: Wikipedia Index, ODP Index, Blogs, Facebook, Twitter, Browsing, Mobile, Transactions

Other labels: Fusion, Telco, Communities, Influencers, Hadoop, SDA, SMA, BoardReader, BigIndex, Wiki, Categorizer, User Profiler, Data Parsers, Metrics, Life Event Detection, Sentiments, URL Analyzer, Targeting, Recommendation, FB API

Page 15:

Detecting Interests and Taste based on Mobile Data Usage

URL/App Analysis: for each user, report the most meaningful interests to describe her profile.

Large-scale analysis to map pages to a clear and well-defined taxonomy

Diagram labels: Mobile Gateway Logs; Consume browsing activity on mobile devices; Data Cleansing; Update users' profiles

Userid      Category           Strength
012013a474  Sports/Football    22
012013a474  Shopping/Vehicles  15
012013a474  Sports/Swim        14

Applications: Microsegmentation; Tiered pricing plans; Promotions; Churn

Page 16:

URLs are transformed into concepts

Steps: URL Parsing (Types) -> Concepts (categories) Selection -> Concepts Aggregation (Top-k concepts per user)

Example documents:

{docid: d1, wwpokec.azet.sk}

{docid: d2, http://news.yahoo.com/recall-news-215006441.htm}

{docid: d3, www.youtube.com}

Example concepts:

ODP: Business/Marketing_and_Advertising/News_and_Media

WIKIPEDIA: Product recalls

Userid      Category           Strength
012013a474  Sports/Football    22
012013a474  Shopping/Vehicles  15
012013a474  Sports/Swim        14
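
The URL-to-concept flow on this slide (parse the URL, select a category, aggregate the top-k categories per user) might be sketched as follows. This is only an illustration under stated assumptions: the `HOST_TO_CATEGORY` table is a hypothetical stand-in for the ODP/Wikipedia mapping, the function names are invented here, and "strength" is assumed to be a simple visit count, which the slides do not confirm.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse

# Hypothetical host-to-category table standing in for the ODP/Wikipedia mapping.
HOST_TO_CATEGORY = {
    "news.yahoo.com": "News_and_Media",
    "www.youtube.com": "Arts/Video",
    "sports.example.com": "Sports/Football",
}

def host_of(url: str) -> str:
    """Return the lowercase host, tolerating URLs logged without a scheme."""
    if "://" not in url:
        url = "http://" + url
    return urlparse(url).netloc.lower()

def top_k_concepts(visits, k=3):
    """visits: iterable of (userid, url) pairs.
    Returns {userid: [(category, strength), ...]} with the k strongest categories,
    where strength is assumed to be the number of visits in that category."""
    counts = defaultdict(Counter)
    for userid, url in visits:
        category = HOST_TO_CATEGORY.get(host_of(url))
        if category is not None:  # skip URLs the taxonomy does not cover
            counts[userid][category] += 1
    return {user: c.most_common(k) for user, c in counts.items()}

visits = [
    ("012013a474", "http://sports.example.com/match/1"),
    ("012013a474", "sports.example.com/match/2"),
    ("012013a474", "www.youtube.com"),
    ("012013a474", "unknown.example.org"),
]
profiles = top_k_concepts(visits, k=2)
print(profiles)
```

A production pipeline would replace the lookup table with a classifier or taxonomy index; the aggregation step would stay much the same.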

Page 17:

Demographic Analysis Example

Top-level browsing behaviour does not vary widely by age group

25-34 year olds concentrate a higher proportion of their browsing in the “top categories”

Analysing only the top 100 browsing categories, it is possible to identify clear preferences by Male and Female customers

Top ten categories remain the same for Men and Women, though the ordering varies slightly

Those categories for which there are significant differences between men and women:

Male            Female
News & Media    Online Shopping
Sports          Health & Medicine
Football        Cinemas
Autotrader      Personal Finance
Adult Content
Mobile Gaming

Page 18:

There is a correlation between browsing diversity and churn propensity in Prepay customers

Each MSISDN in Consumer PrePay has been allocated a Churn Percentile score

Comparing each percentile group’s top categories shows that Churn Percentile seems to be correlated positively with increasing variety of categories browsed

Further findings indicated a higher propensity to churn for heavy users of social media sites and for soccer fans (the reason: a competitor proposing SMS updates with match scores)

[Chart: Browsing Diversity Index (y-axis, 0 to 0.3) by Churn Propensity Percentile (x-axis, 90-100th down to 0-10th)]
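
The comparison on this slide could be reproduced along these lines. The slides do not define the Browsing Diversity Index, so this sketch assumes the Gini-Simpson index (one minus the sum of squared category shares); the function names and the sample data are illustrative only.

```python
from collections import Counter, defaultdict

def diversity_index(categories):
    """Gini-Simpson index: 1 - sum(p_i^2) over category shares.
    This definition is an assumption; the slides do not specify one."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

def mean_diversity_by_decile(users):
    """users: iterable of (churn_percentile in 0..100, [category per page view]).
    Returns {decile: mean diversity index}, with decile 9 the highest churn risk."""
    buckets = defaultdict(list)
    for percentile, categories in users:
        decile = min(int(percentile) // 10, 9)  # percentile 100 joins the top bucket
        buckets[decile].append(diversity_index(categories))
    return {d: sum(v) / len(v) for d, v in sorted(buckets.items())}

users = [
    (95, ["Sports", "News", "Shopping", "Music"]),
    (92, ["Sports", "News"]),
    (5, ["Sports", "Sports", "Sports", "Sports"]),
]
by_decile = mean_diversity_by_decile(users)
print(by_decile)
```

Plotting `by_decile` against the decile number would give a chart of the same shape as the one on the slide, though the absolute values depend on the index chosen.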

Page 19:

Example: Localized community analysis for marketing

Understanding the marketing potential of particular locations

Understanding the potential of viral marketing

Identifying promising community types and targeting marketing to them

Lowering marketing costs by targeting earned media

Extended community of people that talk about some subject

Page 20:

Geographical Analytics – How it works

[Figure labels: Location 1, Location 2, Location 3]

• GPS Geotagging (<5% of tweets)

• Even if explicit in profile – disambiguation might be needed:

– E.g., “Springfield” by itself can refer to 30 different cities in the USA.

• Techniques used

– Rule-based

  • E.g., “I live in ..”, “let's meet at ..”

– Machine learning (supervised):

  • Statistical methods: find the most characteristic terms of people that report they live in some location.

  • E.g., “The Strip”, “Bellagio fountains”, “Fremont St.”… -> Las Vegas

– Based on Social Network

  • i.e., learn the location of people based on the locations of their friends
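
Two of the techniques listed above, the rule-based pattern and the social-network fallback, can be combined in a few lines. This is a minimal sketch, not the method used in the project: the regular expression, the function names, and the majority-vote rule over friends' locations are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical pattern modelled on the slide's "I live in .." example:
# capture one or more capitalized words following the phrase.
LIVE_IN = re.compile(r"\bI live in ([A-Z][a-z]+(?: [A-Z][a-z]+)*)")

def location_from_text(messages):
    """Rule-based step: return the first place name that follows 'I live in'."""
    for text in messages:
        match = LIVE_IN.search(text)
        if match:
            return match.group(1)
    return None

def location_from_friends(friend_locations):
    """Social-network step: the most common known location among friends."""
    known = [loc for loc in friend_locations if loc]
    if not known:
        return None
    return Counter(known).most_common(1)[0][0]

def infer_location(messages, friend_locations):
    """Prefer an explicit statement; otherwise fall back to friends' locations."""
    return location_from_text(messages) or location_from_friends(friend_locations)

print(infer_location(["I live in Las Vegas and love it"], []))
print(infer_location(["nice weather"], ["Haifa", None, "Haifa", "Tel Aviv"]))
```

Neither step resolves ambiguous names such as “Springfield”; that would need a gazetteer plus extra context, which is outside this sketch.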

Page 21:

Community Analytics - How it works:

How we build the communities:

– Build social graph based on the data flow in the social media. For example, in Twitter, using the @Reply tag.

– Extend the connections with friends, followers, following, etc.

– Then use clustering-based approach

Features of a community:

– Content (profile + messages of participants): what do participants talk about? which topics? how much?

– Topological (structure, level of activity): how is the community organized? how fast do messages spread? how many people are connected?

– Role of participants (structure, level of activity): are there community leaders? influencers? advocates? connectors to other communities?

– Type (e.g. religious congregation, school, teen friends, reading club, yoga): what is the type of the community? is it possible to market to all communities of the same type?

– Dynamics: what makes communities grow/shrink? how to influence a community? Which features have commercial significance? Which features can be acted upon?
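
The first and last build steps above (a social graph from @replies, then grouping) might look like the sketch below. The slides do not say which clustering approach is used, so connected components stand in for it here as the simplest possible grouping; the function names and the sample tweets are illustrative assumptions.

```python
import re
from collections import defaultdict

MENTION = re.compile(r"@(\w+)")

def build_reply_graph(tweets):
    """tweets: iterable of (author, text). Adds an undirected edge per @mention."""
    graph = defaultdict(set)
    for author, text in tweets:
        graph[author]  # touch the key so isolated authors still appear as nodes
        for target in MENTION.findall(text):
            if target != author:
                graph[author].add(target)
                graph[target].add(author)
    return graph

def communities(graph):
    """Group users into connected components using an iterative depth-first search.
    A stand-in for the unspecified clustering step, not a replacement for it."""
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        groups.append(group)
    return groups

tweets = [("ann", "@bob nice"), ("bob", "@cat hi"), ("dan", "@eve yo"), ("fay", "hello")]
groups = communities(build_reply_graph(tweets))
print(groups)
```

On real data a single giant component is likely, which is why a proper clustering or community-detection method is needed once friend and follower edges are added.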

Page 22:

Using same technology with events vs. people - managing Natural Disasters

Event 1 – 10:10 – river water surging (from accumulation of tweets)

Event 2 – 11:15 – fast moving water (from accumulation of mobile messages)

Event 3 – 11:15 – flood, major road blocked (from accumulation of tweets and mobile messages)

Event 4 – 12:30 – flood (from accumulation of tweets and mobile messages)

Event 5 – 12:30 – traffic accident (from accumulation of mobile messages)
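
Detecting an event "from accumulation" of messages can be approximated with a sliding time window and a count threshold. The sketch below assumes each message has already been tagged with a topic, and the window length and threshold are arbitrary illustrative values; the slides do not describe the actual detection logic.

```python
from collections import defaultdict, deque

def detect_events(messages, window_minutes=30, threshold=3):
    """messages: (minute_of_day, topic) pairs sorted by time.
    Reports (minute, topic) the first time a topic accumulates `threshold`
    messages within the sliding window."""
    recent = defaultdict(deque)
    reported = set()
    events = []
    for minute, topic in messages:
        window = recent[topic]
        window.append(minute)
        # Drop messages that have fallen out of the time window.
        while window and minute - window[0] > window_minutes:
            window.popleft()
        if len(window) >= threshold and topic not in reported:
            reported.add(topic)
            events.append((minute, topic))
    return events

messages = [
    (600, "river water surging"), (605, "river water surging"),
    (610, "river water surging"), (640, "flood"),
    (675, "flood"), (750, "flood"),
]
events = detect_events(messages)
print(events)
```

Here three reports within ten minutes raise an event at minute 610 (10:10), while the three "flood" messages are too spread out to trigger one.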

Page 23:

Summary

• The wealth of available data affords many opportunities

– To do better science, improve our world, make more money

• Getting insight (let alone foresight) from data is still too hard

– Must handle the 4V’s of data

– Requires multiple skills – data science, systems, computer science, math

– Requires data and tooling

– Requires significant computing resources

• There is a lot of exciting research being done in this space

– User experience, Data and semantics, Analytics and modeling, Application of Big Data, Systems research

• For more information, see the recently released IBM Journal of Research and Development, Issue 3/4, May – July 2013, Massive-Scale Analytics, Guest Editor Aya Soffer

– http://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=6517282

Page 24:

IBM Research

Thank You

תודה (Hebrew: Toda)

Merci (French), Grazie (Italian), Gracias (Spanish), Obrigado (Portuguese), Danke (German), Kiitos (Finnish)