Doing more with data: a mini-lecture

Upload: finalist-open-it-oplossingen

Post on 12-Aug-2015

TRANSCRIPT

Big Data

“Big Data is a vague term, used loosely, if often, these days. But put simply, the catchall phrase means three things. First, it is a bundle of technologies. Second, it is a potential revolution in measurement. And third, it is a point of view, or philosophy, about how decisions will be – and perhaps should be – made in the future”

— Steve Lohr

http://bits.blogs.nytimes.com/2013/06/19/sizing-up-big-data-broadening-beyond-the-internet

… a bundle of technologies …

Logs: distributed processing

Searches: predictive analytics

http://www.forbes.com/sites/stevensalzberg/2014/03/23/why-google-flu-is-a-failure/

Open data: data.overheid.nl, opendata.cbs.nl

http://www.rtlnieuws.nl/nieuws/binnenland/bekijk-de-cito-score-van-jouw-school

What is open data?

• The data is public;

• It is not subject to copyright or other third-party rights;

• The data was paid for from public funds made available for carrying out the public task in question;

• The data preferably conforms to ‘open standards’ (no barriers to use by ICT users or ICT providers);

• Open data is preferably machine-readable, so that search engines can find information in documents.
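
Machine-readability is what lets a program consume open data directly, without manual re-keying. A minimal sketch in Python, assuming a small hypothetical CSV extract; the column names and figures are invented for illustration and do not come from data.overheid.nl or opendata.cbs.nl:

```python
import csv
import io

# Hypothetical CSV extract; municipalities and numbers are made up for illustration.
RAW = """municipality,year,population
A,2014,1000
B,2014,2500
C,2014,500
"""


def total_population(raw_csv: str) -> int:
    """Sum the 'population' column of a CSV document."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    return sum(int(row["population"]) for row in reader)


print(total_population(RAW))
```

The same few lines would not work on a scanned PDF or a screenshot of a table, which is the practical point of the last bullet.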

Data services: web APIs

http://www.apiacademy.co/lessons/api-strategy/what-api

https://www.iminds.be/en/succeed-with-digital-research/city-of-things

Location and movement: GPS, triangulation, event stream processing
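
One basic building block of event stream processing is windowed aggregation: events are grouped into fixed time windows as they arrive. A rough sketch in plain Python, assuming events arrive in timestamp order; the zone names and timestamps are hypothetical:

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

Event = Tuple[int, str]  # (timestamp in seconds, hypothetical zone name)


def tumbling_window_counts(events: Iterable[Event], window: int) -> Iterator[Tuple[int, Counter]]:
    """Yield (window_start, counts per zone) for each window that contains events.

    Assumes events arrive ordered by timestamp.
    """
    current_start = None
    counts: Counter = Counter()
    for ts, zone in events:
        start = ts - ts % window  # start of the window this event falls into
        if current_start is not None and start != current_start:
            yield current_start, counts  # previous window is complete
            counts = Counter()
        current_start = start
        counts[zone] += 1
    if current_start is not None:
        yield current_start, counts  # flush the last open window


# Hypothetical stream of position events.
stream = [(0, "station"), (12, "station"), (59, "square"), (61, "square"), (130, "station")]
for start, counts in tumbling_window_counts(stream, window=60):
    print(start, dict(counts))
```

Because the function is a generator, each window is emitted as soon as the next one starts, without holding the whole stream in memory.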

“Paperwork” (e.g. patient records): natural language processing, text mining

… a revolution in measurement …

n = ?: population, sample, or individual

n = N: population
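
One common reading of n = N is that the analysis covers the whole population instead of estimating from a sample. A small sketch under that assumption, using a synthetic population (not real data) to contrast the exact figure with a sample estimate:

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of N measurements (synthetic, not real data).
population = [random.gauss(50.0, 10.0) for _ in range(10_000)]


def sample_estimate(values, n):
    """Estimate the mean from a simple random sample of size n."""
    return statistics.mean(random.sample(values, n))


exact = statistics.mean(population)          # n = N: no sampling error
estimate = sample_estimate(population, 100)  # n much smaller than N: sampling error

print(f"n = N   mean: {exact:.2f}")
print(f"n = 100 mean: {estimate:.2f} (error {abs(estimate - exact):.2f})")
```

Covering the whole population removes sampling error only; measurement error and bias in how the data was collected remain.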

http://newsroom.cumc.columbia.edu/blog/2015/06/08/data-scientists-find-connections-between-birth-month-and-health/

n = 1: individual

https://gigaom.com/2012/11/20/how-aetna-is-using-big-data-to-improve-patient-health/

Treatments, prescriptions, conditions, claims, screenings

Personalized interventions

Patient safety, patient care, patient engagement

HealthKit, ResearchKit

Personal genomics, clinical trials, quantified self

Patient safety, patient care, patient engagement

… how decisions will/should be made …

http://tweakers.net/nieuws/103172/belastingdienst-reorganiseert-maar-neemt-1500-mensen-aan-voor-data-analyse.html

@matthew_benham

http://nos.nl/artikel/2037026-midtjylland-verovert-eerste-deense-titel.html

(Big) Data Apps:

• Reports • Dashboards • Infographics

• Benchmarks • Triages • Segmentations • Simulations

Apps, Big Data, Science

(Big) Data Apps

recommender system

facial recognition

sentiment analysis

market segmentation

Recommender system

http://www.psfk.com/2015/06/personal-shopper-app-personal-assistance-apps-mona-shopping-assistant.html

Facial recognition

https://fortune.com/2015/06/23/facebook-facial-recognition/

Sentiment analysis

http://onlinejournalismblog.com/2015/06/23/classifying-positive-and-negative-quotes-trooclick-offers-an-alternative-to-article-based-journalism/

Customer segmentation

(Big) Data Apps

talent management

resume screening

fraud detection

early warning system

Resume screening

http://www.bizjournals.com/washington/morning_call/2015/06/ceb-acquires-software-company-focused-on-using-big.html

Talent management

http://www.e4talent.com/comparison-off-midfielders-and-due-diligence.html

Early warning system

https://fifa-ews.com/en/

Fraud detection

http://www.marketwatch.com/story/feedzai-and-azul-systems-deploy-real-time-fraud-detection-solution-at-global-leader-in-payment-technology-2015-06-24

(Big) Data Apps

https://www.kaggle.com/wiki/DataScienceUseCases

… etc.

Data Science

(1) Perform Exploratory Data Analysis

(2) Set up Data Processing Pipeline(s)

(3) Build and Deploy Data Apps

Collect, Store, Process, Explore, Query, (Big) Data Apps

http://shop.oreilly.com/product/0636920028529.do

Explore

http://insidebigdata.com/2014/12/21/big-data-humor-rabbit-duck-conundrum/

Collect

logs, open data, web APIs, event streams

ETL, “Big Data”, other

Process

classification, clustering, dimensionality reduction, regression

Batch processing: Spark Core

Stream processing: Spark Streaming

Graph processing: GraphX

Machine learning: MLlib
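
To make one of the techniques above concrete, here is a minimal clustering sketch: plain k-means (Lloyd's algorithm) in pure Python rather than MLlib, so that it stays self-contained. The points and the initial centroids are hypothetical, and the caller supplies the starting centroids to keep the result deterministic:

```python
from typing import List, Tuple

Point = Tuple[float, float]


def kmeans(points: List[Point], centroids: List[Point],
           iterations: int = 10) -> Tuple[List[Point], List[int]]:
    """Plain k-means (Lloyd's algorithm) with caller-supplied initial centroids."""
    labels: List[int] = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2 + (p[1] - centroids[k][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append((sum(x for x, _ in members) / len(members),
                                      sum(y for _, y in members) / len(members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Hypothetical 2-D points forming two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, labels = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(labels)
print(centroids)
```

On data that no longer fits on one machine, the assignment and update steps are the parts a distributed engine would parallelize.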

Store

relational database, distributed file system, columnar store, graph database, triple store, document database, append-only database

“tables”, “files”, “sparse matrices”, “relations”, “nested structures”, “facts”
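
The quoted data shapes can be illustrated by writing one record three ways. This is a sketch using only the Python standard library; the patient record is hypothetical, and SQLite and JSON stand in for a relational database and a document database:

```python
import json
import sqlite3

# "tables": a row in a relational database (in-memory SQLite).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
db.execute("INSERT INTO patient VALUES (1, 'Alice', 'Utrecht')")
row = db.execute("SELECT id, name, city FROM patient").fetchone()

# "nested structures": the same record as a document, with a nested list.
document = json.dumps({"id": 1, "name": "Alice", "city": "Utrecht",
                       "prescriptions": [{"drug": "X", "dose_mg": 10}]})

# "facts": the same record as (entity, attribute, value) triples.
facts = [(1, "name", "Alice"), (1, "city", "Utrecht"), (1, "prescription", "X")]

print(row)
print(json.loads(document)["prescriptions"][0]["drug"])
print([value for entity, attr, value in facts if attr == "city"])
```

The table needs a second table to hold the prescriptions, the document nests them inline, and the triples add them as extra facts; that trade-off is a large part of choosing a store.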

Query

SQL, HiveQL, Java API, Cypher, MapReduce, Datalog
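
To show how two of these query styles differ, the sketch below computes the same per-patient total twice: once declaratively in SQL (via in-memory SQLite) and once as explicit map, shuffle and reduce steps in plain Python. The claims data is hypothetical, and the MapReduce part only imitates the programming model on one machine; it does not use Hadoop:

```python
import sqlite3
from collections import defaultdict

# Hypothetical claims data: (patient, amount).
claims = [("a", 10), ("b", 5), ("a", 7), ("c", 3), ("b", 1)]

# Declarative: SQL on an in-memory SQLite table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE claims (patient TEXT, amount INTEGER)")
db.executemany("INSERT INTO claims VALUES (?, ?)", claims)
sql_totals = dict(db.execute(
    "SELECT patient, SUM(amount) FROM claims GROUP BY patient"))


# MapReduce style: map to (key, value) pairs, shuffle by key, reduce each group.
def map_phase(records):
    for patient, amount in records:
        yield patient, amount


def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}


mr_totals = reduce_phase(shuffle(map_phase(claims)))

print(sql_totals)
print(mr_totals)
```

Both routes give the same totals; the SQL version says what is wanted, while the MapReduce version spells out how the work is split, which is what makes it straightforward to distribute.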

(Big) Data Apps

Dashboards

http://www.tableau.com/gartner-magic-quadrant-2015

Doing more with data

More data applications

recommender system

facial recognition

sentiment analysis

market segmentation

talent management

resume screening

fraud detection

early warning system

More datasets

logs, open data, web APIs, event streams

More data mining

classification, clustering, dimensionality reduction, regression

More databases

More data skills

https://beta.oreilly.com/ideas/analyzing-the-analyzers

More collaboration

https://beta.oreilly.com/ideas/analyzing-the-analyzers

End