Data Science & Data Products at Neue Zürcher Zeitung

Data Science & Data Products at Neue Zürcher Zeitung. René Pfitzner, Lead Data Scientist. 19th Swiss Big Data User Group Meeting, Zürich, 23rd January 2017.

Upload: rene-pfitzner

Post on 12-Apr-2017


TRANSCRIPT

Page 1: Data Science & Data Products at Neue Zürcher Zeitung

Data Science & Data Products at Neue Zürcher Zeitung

René Pfitzner, Lead Data Scientist

19th Swiss Big Data User Group Meeting

Zürich, 23rd January 2017

Page 2: Data Science & Data Products at Neue Zürcher Zeitung

Outline

I. Introduction: NZZ, media challenges, trends

II. Data Science @NZZ: Goals, principles, approaches

III. Data Products @NZZ: Our “stack” & insights & demo

IV. NZZ Companion: Individual news fueled by data science

Page 3: Data Science & Data Products at Neue Zürcher Zeitung

I. Introduction: Myself, NZZ, media challenges, trends

Page 4: Data Science & Data Products at Neue Zürcher Zeitung

Self-Intro

● Lead Data Scientist at NZZ

● media innovation

● algorithmic approaches for news media

● background in statistical physics

● Python, Scala, Spark, R

www.renepfitzner.net

@RenePfitznerZH

Page 5: Data Science & Data Products at Neue Zürcher Zeitung
Page 6: Data Science & Data Products at Neue Zürcher Zeitung

Newspaper Revenue: the reality

US newspaper advertising revenue, corrected for inflation

Data: Newspaper Association of America

Graphics: https://commons.wikimedia.org/wiki/File:Naa_newspaper_ad_revenue.svg

Page 8: Data Science & Data Products at Neue Zürcher Zeitung

Is this something to worry about?

Well, it should be!

Media = Fourth Estate!

Page 10: Data Science & Data Products at Neue Zürcher Zeitung

II. Data Science @NZZ: Goals, principles, approaches

Page 11: Data Science & Data Products at Neue Zürcher Zeitung

Data Science: Goals

Data Science at NZZ

● Decision Making

● Data Products

● Marketing Optimization

Page 12: Data Science & Data Products at Neue Zürcher Zeitung

Data Science: Data Products

Attempt at a definition:

A data product is a digital product that provides some benefit to a downstream consuming application, incorporating data and data-based methods (e.g. ML).

Page 13: Data Science & Data Products at Neue Zürcher Zeitung

Data Science: Data Products

What good is Data Science if you cannot put it into production?

Page 15: Data Science & Data Products at Neue Zürcher Zeitung

Data Science: Data Products

Data Product: provision & integration?

Page 16: Data Science & Data Products at Neue Zürcher Zeitung

III. Data Products @NZZ: Our “stack” & insights & demo

Page 17: Data Science & Data Products at Neue Zürcher Zeitung

Data Products: Our stack

REST APIs

Page 18: Data Science & Data Products at Neue Zürcher Zeitung

Data Products: What is Spark?

● “General engine for fast big data processing”

● it’s more: a parallel computing framework

● “Hadoop on steroids” → in-memory!

Page 19: Data Science & Data Products at Neue Zürcher Zeitung

Data Products: How and where?

REST APIs

- on-premise / hosted
- gcloud (dataproc)

- gcloud
- in parts dockerized
- kubernetes

- gcloud & hosted
- dockerized; kubernetes
- microservice approach
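The slides do not show how the REST APIs are implemented. As a rough illustration of the microservice idea only, here is a minimal sketch of a recommendation endpoint using Python's standard library. The route `/recommendations/<article_id>`, the `RECOMMENDATIONS` table, and all function names are assumptions made for this example, not NZZ's actual API.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical precomputed recommendations (for example, written by a batch job).
RECOMMENDATIONS = {"a1": ["a2", "a3"]}


def recommendations_payload(article_id):
    """Build the JSON-serialisable response for one article id."""
    return {
        "article": article_id,
        "recommendations": RECOMMENDATIONS.get(article_id, []),
    }


class RecommendationHandler(BaseHTTPRequestHandler):
    """Serves GET /recommendations/<article_id> (an assumed route)."""

    def do_GET(self):
        prefix = "/recommendations/"
        if not self.path.startswith(prefix):
            self.send_error(404, "unknown endpoint")
            return
        article_id = self.path[len(prefix):]
        body = json.dumps(recommendations_payload(article_id)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8080):
    """Create the HTTP server without starting it."""
    return HTTPServer(("", port), RecommendationHandler)
```

Calling `make_server().serve_forever()` would start the service. A small stateless process like this is the kind of unit that fits in a Docker image and can be scheduled by Kubernetes.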

Page 20: Data Science & Data Products at Neue Zürcher Zeitung

Data Products: Article Recomm

- recommendations based on current article

- mixed with advertisement

- article click rate x3

- ad conversion rate x3

Page 21: Data Science & Data Products at Neue Zürcher Zeitung

Data Products: Article Recomm

Network-based

- weighted co-reading net
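One way to read "weighted co-reading net": articles are nodes, and an edge's weight counts the readers who read both articles. The sketch below follows that reading in plain Python. The session data, function names, and the tie-breaking rule are assumptions for illustration; the talk does not describe the production implementation.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_coreading_net(sessions):
    """Return {article: Counter({other_article: weight})}.

    The weight of an edge is the number of readers who read both articles.
    """
    net = defaultdict(Counter)
    for articles in sessions:
        # Count each unordered pair of distinct articles once per reader.
        for a, b in combinations(sorted(set(articles)), 2):
            net[a][b] += 1
            net[b][a] += 1
    return net


def recommend(net, current_article, k=3):
    """Return up to k articles most often co-read with current_article."""
    neighbours = net.get(current_article, Counter())
    # Sort by descending weight, then by article id for a deterministic order.
    ranked = sorted(neighbours.items(), key=lambda item: (-item[1], item[0]))
    return [article for article, _ in ranked[:k]]


# Hypothetical reading histories: one list of article ids per reader.
sessions = [
    ["a1", "a2", "a3"],
    ["a1", "a2"],
    ["a2", "a3"],
    ["a1", "a4"],
]
net = build_coreading_net(sessions)
top = recommend(net, "a1", k=2)  # ["a2", "a3"]
```

Here "a2" ranks first for "a1" because two readers read both; "a3" and "a4" tie with one shared reader each, and the tie is broken by id.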

Trending articles

- clicks

- click trend
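The slide names only two signals, clicks and click trend. A minimal sketch of combining them, assuming clicks are counted in two consecutive time windows and the trend is the relative change between them (the scoring formula and all names here are hypothetical):

```python
def trending(click_counts, k=2):
    """Rank articles by click trend, breaking ties by total clicks.

    click_counts maps article id -> (clicks in previous window,
    clicks in current window).
    """
    def sort_key(item):
        article, (previous, current) = item
        # Relative growth; the +1 avoids division by zero for new articles.
        trend = (current - previous) / (previous + 1)
        return (-trend, -(previous + current), article)

    ranked = sorted(click_counts.items(), key=sort_key)
    return [article for article, _ in ranked[:k]]


# Hypothetical click counts per article for two consecutive windows.
clicks = {
    "a1": (100, 110),
    "a2": (10, 40),
    "a3": (0, 5),
    "a4": (50, 20),
}
hot = trending(clicks)  # ["a3", "a2"]
```

With this scoring a brand-new article with few clicks can outrank a popular but flat one, which is why absolute clicks are kept as a secondary signal.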

Topic detection

- word2vec
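The slide mentions word2vec for topic detection without further detail. One common way to use word embeddings for this, shown here as an assumption rather than as the method used at NZZ, is to average the word vectors of an article and compare articles by cosine similarity. The tiny hand-written vectors below stand in for a trained word2vec model.

```python
import math

# Hypothetical 3-dimensional word vectors; a trained word2vec model
# would supply real, higher-dimensional ones.
WORD_VECTORS = {
    "election": [0.9, 0.1, 0.0],
    "parliament": [0.8, 0.2, 0.1],
    "goal": [0.0, 0.9, 0.2],
    "match": [0.1, 0.8, 0.1],
}


def article_vector(words):
    """Average the vectors of all known words; None if no word is known."""
    known = [WORD_VECTORS[w] for w in words if w in WORD_VECTORS]
    if not known:
        return None
    dims = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dims)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


politics = article_vector(["election", "parliament"])
sport = article_vector(["goal", "match"])
other_politics = article_vector(["parliament", "vote"])  # "vote" is unknown and ignored

same_topic = cosine(politics, other_politics)
different_topic = cosine(politics, sport)
```

In this toy data `same_topic` is larger than `different_topic`, so the two politics articles would be grouped together.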

Page 22: Data Science & Data Products at Neue Zürcher Zeitung

Data Products: Learnings?

● Spark is great for general purpose

● … easily maintained

● … go fast from dev to prod

● Scala forces you to think more & structure better

● cons: development notebooks

● more technical? Talk to me later ...

Page 23: Data Science & Data Products at Neue Zürcher Zeitung

IV. NZZ Companion: Individual news fueled by data science

Page 24: Data Science & Data Products at Neue Zürcher Zeitung

NZZ News Companion: Facts

● changing news consumption behavior

● the vast majority of article clicks originate from the start page

● highly volatile start page

data-enhanced content delivery

Page 25: Data Science & Data Products at Neue Zürcher Zeitung

NZZ News Companion: Prototype

Page 26: Data Science & Data Products at Neue Zürcher Zeitung

NZZ News Companion: DNI

https://www.digitalnewsinitiative.com/

Page 27: Data Science & Data Products at Neue Zürcher Zeitung

Be a news-innovation beta tester!

[email protected]

Page 28: Data Science & Data Products at Neue Zürcher Zeitung

Be a news-innovation beta tester!

[email protected]

[email protected]

@RenePfitznerZH

www.renepfitzner.net