interesting ways big data is used today

Interesting ways Big Data is used today Daniel Sarbe May 2015, Big Data Romanian Tour - Timisoara

Upload: daniel-sarbe

Post on 15-Aug-2015


Category:

Software



TRANSCRIPT

Page 1: Interesting ways Big Data is used today

Interesting ways Big Data is used today

Daniel Sarbe May 2015, Big Data Romanian Tour - Timisoara

Page 2: Interesting ways Big Data is used today

Agenda

1. Source of (Big) Data
2. Why now?
3. Interesting patterns of using BigData
4. BigData – Big Opportunities

Page 3: Interesting ways Big Data is used today

“There is a big data revolution.

But it is not the quantity of data that is revolutionary. The big data revolution is that now we can do something with the data.”

Gary King, professor at Harvard University

Page 4: Interesting ways Big Data is used today

“In God we Trust, all others bring data”

William Edwards Deming - American statistician

“If we have data, let’s look at data. If all we have are opinions, let’s go with mine.”

Jim Barksdale, former Netscape CEO

Page 5: Interesting ways Big Data is used today

Source of (Big) Data

Page 6: Interesting ways Big Data is used today

The 3+1Vs of Big Data

Page 8: Interesting ways Big Data is used today

Just how Big is the Big Data market?

Big Data Market Forecast, 2011-2020 (in $ billion)

Source: Wikibon 2015

Page 9: Interesting ways Big Data is used today

Interest in BigData

Page 10: Interesting ways Big Data is used today

Why now?

1) So much data is generated that we cannot store and analyze it with conventional tools

Page 11: Interesting ways Big Data is used today

Big Data in Aviation Industry

Page 12: Interesting ways Big Data is used today

2) Companies realized the potential and started to invest money

Page 14: Interesting ways Big Data is used today

Companies' sources of data for analysis

Source: captricity.com

Page 15: Interesting ways Big Data is used today

Data Mining & Machine Learning

• Data Mining - The process of discovering meaningful correlations, patterns and trends by sifting through large amounts of data

• Machine Learning is the study of computer algorithms that improve automatically through experience

▫ Supervised machine learning - The program is “trained” on a pre-defined set of “training examples”, which then facilitate its ability to reach an accurate conclusion when given new data.

▫ Unsupervised machine learning - The program is given a bunch of data and must find patterns and relationships therein.
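The difference between the two learning styles can be illustrated with a minimal sketch. It is not taken from the slides: the function names, the nearest-centroid classifier, the one-dimensional two-cluster k-means and all data values are assumptions chosen for brevity, and the clustering helper assumes at least two distinct input values.

```python
from statistics import mean

def train_nearest_centroid(examples):
    """Supervised: learn one centroid per label from (value, label) pairs."""
    by_label = {}
    for value, label in examples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}

def predict(centroids, value):
    """Assign the label whose learned centroid is closest to the new value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def two_means(values, iterations=10):
    """Unsupervised: split unlabeled values into two clusters (1-D k-means, k=2).

    Assumes the input holds at least two distinct values.
    """
    low, high = min(values), max(values)
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = mean(near_low), mean(near_high)
    return sorted(near_low), sorted(near_high)

# Hypothetical training examples: the labels are supplied up front.
labeled = [(1.0, "small"), (1.5, "small"), (8.0, "large"), (9.0, "large")]
model = train_nearest_centroid(labeled)
label_for_new_point = predict(model, 7.2)

# Hypothetical unlabeled data: the program has to find the groups itself.
clusters = two_means([1.0, 1.2, 0.8, 9.1, 8.7, 9.5])
```

In the supervised half the labels are given with the training data; in the unsupervised half the grouping emerges only from the values themselves.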

Page 16: Interesting ways Big Data is used today

BigData use-cases

Source: IBM

Page 17: Interesting ways Big Data is used today

Analytics Maturity

Page 18: Interesting ways Big Data is used today

Predictive Analysis

• Predictive analytics is an area of data mining that deals with extracting information from data and using it to predict trends and behavior patterns. 

• The accuracy and usability of results will depend greatly on the level of data analysis and the quality of assumptions

Page 19: Interesting ways Big Data is used today

BigData used for predictions – 2012 US Election

The 2012 Election: A Big Win for Big Data

• Statistician Nate Silver gave Barack Obama over a 90 percent chance of victory in the Electoral College.

• The name of his model and blog, FiveThirtyEight (538), comes from the number of electors in the US Electoral College.

• In 2008 (John McCain vs Barack Obama), his mathematical model correctly called 49 out of 50 states, missing only Indiana (which went to Obama by about 1 percentage point).

• In 2012, Silver's model correctly predicted 50 out of 50 states.

• He incorporated hundreds of state-level polls into his analysis, along with economic variables, demographics, electoral outcomes, historical polls and party registration figures.

• While some analysts might cherry-pick data sources according to whether they were qualitatively "reliable" or "unbiased", Silver incorporated them all and looked at trends over time instead.
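The idea of using every available poll rather than a hand-picked few can be sketched as follows. This is not Silver's actual model, which is far more elaborate; the state names, poll margins, elector counts and function names are all hypothetical.

```python
from statistics import mean

def aggregate_polls(polls):
    """Average every poll margin per state instead of picking a favorite poll.

    `polls` maps a state to a list of margins in percentage points
    (positive = candidate X leads, negative = candidate Y leads).
    """
    return {state: mean(margins) for state, margins in polls.items()}

def project_electors(avg_margins, electors):
    """Give each state's electors to whoever leads the averaged margin.

    A margin of exactly zero is assigned to Y here, purely to keep the sketch short.
    """
    totals = {"X": 0, "Y": 0}
    for state, margin in avg_margins.items():
        winner = "X" if margin > 0 else "Y"
        totals[winner] += electors[state]
    return totals

# Hypothetical polls and elector counts, for illustration only.
polls = {
    "State A": [3.0, 1.0, -1.0],
    "State B": [-4.0, -2.0],
    "State C": [0.5, 2.5, 1.5, -0.5],
}
electors = {"State A": 10, "State B": 6, "State C": 4}

averages = aggregate_polls(polls)
projection = project_electors(averages, electors)
```

Note that State A contains one poll favoring Y; averaging keeps that poll in play instead of discarding it as an outlier.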

Page 20: Interesting ways Big Data is used today

BigData used for predictions - 2014 Sochi Winter Olympics

• “Canada will enjoy their best Olympics ever, while the U.S. and host Russia will struggle.”

Page 21: Interesting ways Big Data is used today

BigData used for predictions - 2014 Sochi Winter Olympics

• The analysts used publicly available data on all Winter Olympic Games from 1924 forward

• The model's inputs are Gross Domestic Product (GDP), year, whether the country is communist, whether the country is the host, the country's population, and its historical performance and medal counts in previous Olympics.

• All variables are given the same weight in the model

• The medal count prediction is based on a linear regression model

• The algorithm is based on historical data, and doesn’t necessarily reflect more current information such as emerging stars, recent funding boosts, and an unexpectedly large addition of new events to the program.

• “Based on the above mentioned data and analysis, the analysts predict that Canadian athletes will grab the most medals and the United States will finish seventh. Germany, Norway, Austria, China and Russia will rank second to sixth respectively.”
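A linear regression of the kind mentioned above can be sketched with a single input. The analysts' model uses several inputs (GDP, population, host status and so on); this simplified version, its function name and its medal counts are assumptions made only to show the fitting step.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one input variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: medals at the previous Games vs. medals at the following Games.
previous_medals = [10, 20, 30, 40]
next_medals = [12, 22, 32, 42]

intercept, slope = fit_line(previous_medals, next_medals)

# Predicted medal count for a country that won 25 medals last time.
predicted = intercept + slope * 25
```

Because the fit relies only on past results, it shares the limitation noted on the slide: it cannot see emerging stars or newly added events.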

Page 23: Interesting ways Big Data is used today

Big Data used in other sports

Germany Uses Big Data to Crush Brazil in World Cup Semifinal

• Forget about Moneyball - Germany has now used serious Big Data to win a World Cup match.

• Soccer, a more fluid game, was thought to be less amenable than baseball to Big Data's wiles.

• According to assistant coach Hansi Flick, team managers combed through years of research about the Brazilian team compiled by students at Cologne's Sports University, looking for any advantage to be gained over the Brazilian team.

• The compiled information included a detailed analysis of all Brazil's players--their favorite moves, how they deal with high pressure scenarios, their reactions when fouled, and even how they sprint for the ball.

Page 24: Interesting ways Big Data is used today

3) Lower cost of cloud/hardware and maturity of software solutions (Hadoop ecosystem)

Page 25: Interesting ways Big Data is used today

Cost per GB of disk

Page 26: Interesting ways Big Data is used today

Hadoop – The platform for BigData

• Hadoop has become a very stable, mature (and faster) platform

Page 27: Interesting ways Big Data is used today

Hadoop 1.0 to 2.0

Page 28: Interesting ways Big Data is used today

Hadoop myths debunked

Hadoop isn’t enterprise ready

Hadoop isn’t stable, clusters go down

You lose data on HDFS

Data cannot be shared across the organization

Hadoop is not secured

NameNodes do not scale

Software upgrades are rare

Hadoop use cases are limited

I need expensive servers to get more

Hadoop is so dead

Source: Sumeet Singh - Yahoo

Page 29: Interesting ways Big Data is used today

Cost of BigData vs Traditional DBs

Page 30: Interesting ways Big Data is used today

Hadoop Providers

◦ 1. Cloudera - $4B market value - 1,000+ paying customers

◦ 2. Hortonworks - $1B market value - 800+ paying customers

◦ 3. MapR - $1B market value - 700+ paying customers

Page 32: Interesting ways Big Data is used today

Open Data Platform

The Open Data Platform Initiative (ODP) is a shared industry effort focused on promoting and advancing the state of Apache Hadoop® and Big Data technologies for the enterprise.

Page 34: Interesting ways Big Data is used today

Other (interesting) Big Data real use-cases

Page 35: Interesting ways Big Data is used today

Netflix

Netflix collects a lot of data to understand how its users behave and what their preferences are

• It collects metrics including what people watch, when they watch, where they watch, what devices they use, ratings, searches, when users pause or stop watching, etc.

• Netflix made the House of Cards decision by identifying that subscribers who watched the original British version of House of Cards were very likely to watch movies starring Kevin Spacey or directed by David Fincher

• Netflix made ten different versions of the trailer for House of Cards geared towards different audiences

▫ Fans of Kevin Spacey watched trailers that were focused on him while people who liked female-oriented movies saw trailers that highlighted the women in the show.
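The audience-overlap observation behind the House of Cards decision can be expressed as a simple share: of the viewers of one title, what fraction also watched another? The sketch below is an illustration only, not Netflix's method; the function name, user IDs and placeholder titles are hypothetical.

```python
def share_also_watching(histories, seed_title, other_title):
    """Fraction of seed_title viewers who also watched other_title."""
    seed_viewers = [titles for titles in histories.values() if seed_title in titles]
    if not seed_viewers:
        return 0.0
    both = sum(1 for titles in seed_viewers if other_title in titles)
    return both / len(seed_viewers)

# Hypothetical viewing histories: user -> set of watched titles.
histories = {
    "user1": {"House of Cards (UK)", "Kevin Spacey film"},
    "user2": {"House of Cards (UK)", "Kevin Spacey film", "David Fincher film"},
    "user3": {"House of Cards (UK)"},
    "user4": {"Kevin Spacey film"},
    "user5": {"Cooking show"},
}

overlap = share_also_watching(histories, "House of Cards (UK)", "Kevin Spacey film")
```

A high share for one pair of titles, compared with the same share across the whole subscriber base, is the kind of signal the slide describes.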

Page 36: Interesting ways Big Data is used today

Verizon

• 103.3 million wireless customers, 6.2 million Internet users and 5.3 million TV subscribers.

• Data collected:

▫ Calls (e.g. to order flowers) or pages accessed

▫ Locations in city + roaming

▫ Home + mobile web pages + television

• Formed a Precision Marketing division – e.g. event attendance information

▫ Did migrating from iPhone 5 to iPhone 6 result in a data plan increase or not?

▫ Some customers migrated from Android to iPhone and used 3x-5x more data

Notes:

• Customers can choose not to participate in the program by going to their privacy choices page on MyVerizon or by calling 866-211-0874

• Verizon’s business and government customers are not part of the Precision program

Page 37: Interesting ways Big Data is used today

The Perfect Milk - Digital Cow - The internet of cows

• Embedded sensors in cow stomachs

• If a cow is sick, the sensor will let a veterinarian know while there is time to treat the disease

• Sensor to detect the presence of E. coli bacteria

• Vital Herd, a Texas-based start-up - e-Pill - collects information about the animal: breathing rate, heart rate, temperature, rumination time, rumen acidity and estrogen levels

Page 38: Interesting ways Big Data is used today

The City of Las Vegas

• archaic records and inaccurate information

• took advantage of smart data to develop a living model of its utilities network

• aggregate data from various sources into a single real-time 3D model created with Autodesk technology for both above and below ground utilities

Page 39: Interesting ways Big Data is used today

Google Flu Trends

Page 40: Interesting ways Big Data is used today

BigData – Big Opportunities

• Big data means big IT job opportunities -- for the right people

Page 41: Interesting ways Big Data is used today

Big Opportunities

• Gartner predicted in 2013 that by 2015, Big Data demand will generate 4.4 million jobs in the IT Industry all around the world.

• 1.9 million IT jobs will be created just in the U.S. That is how Big Data directly affects the IT Industry.

• Only one third of these jobs will be filled, due to a lack of skilled candidates

What is needed?

• A Curious Mind Is Key - The most important qualifications for these positions aren't academic degrees, certifications, job experience or titles. Rather, they seem to be soft skills: a curious mind, the ability to communicate with nontechnical people, a persistent -- even stubborn -- character and a strong creative bent.

• The CIA is hiring data scientists: “We are looking for curious, creative individuals interested in serving their country through the field of data science.”

Page 47: Interesting ways Big Data is used today

“I keep saying that the sexy job in the next 10 years will be statisticians, and I’m not kidding.”  

Hal Varian, chief economist at Google

“Without big data, you are blind and deaf and in the middle of a freeway.”

Geoffrey Moore, author and consultant

Page 48: Interesting ways Big Data is used today

Thank you!

Twitter: @danielsarbe