Organising for Data Success - Sveriges Riksbank · Spotify - data processing, music data modelling...

Organising for Data Success Lars Albertsson Data Architect, Schibsted Media Group


Post on 17-Jul-2020


TRANSCRIPT

Page 1: Data Success Organising for - Sveriges Riksbank · Spotify - data processing, music data modelling Schibsted Media Group - data architect. ... User content Professional content Ads

Organising for Data Success

Lars Albertsson
Data Architect, Schibsted Media Group

Page 2:

Bio
● SICS - test and debug technology for distributed systems
● Sun - high-end server verification
● Google - Hangouts, engineering productivity
● Recorded Future - data ingestion, data quality
● Cinnober - stock exchange engines
● Spotify - data processing, music data modelling
● Schibsted Media Group - data architect

Page 3:

Path to profit

Page 4:

Big data path to profit

Page 5:

You start out simple

User behaviour

Page 6:

Things get complex

User content

Professional content

Ads

User behaviour

Systems Ads

System diagnostics

Recommendations

Data-based features

Curated content

Pushing

Business intelligence

Experiments

Exploration

Page 7:

Presentation objectives

Page 8:

Conway’s law

“Organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations.”

Better organise to match desired design, then.

Page 9:

Startup mode

Collection -> Ingestion -> Cold store -> Batch process -> Analytics

Page 10:

Big corp future

Ingestion

Collection

Ingestion

Cold store

Batch process (Real-time process)

Analytics

Reports

Pushing

Features

Legacy DBs

User hidden - agility

User visible - robustness

Page 11:

Data is gold

Don’t drop it - make it one team’s focus

Reliable path source -> cold store

Minimal complexity

Human & machine fault tolerance

Collection -> Ingestion -> Cold store -> Batch process -> Analytics
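
The "reliable path source -> cold store" with "minimal complexity" can be sketched as a handoff that stores every payload exactly as received and leaves parsing to later batch jobs. This is only an illustrative sketch, not the setup described in the talk: the function name `hand_off`, the `events.log` file name, and the date-partitioned directory layout are all assumptions.

```python
import base64
import json
from datetime import datetime
from pathlib import Path


def hand_off(raw_payload: bytes, cold_store_dir: Path, received_at: datetime) -> Path:
    """Append one payload, unparsed, to the cold-store file for its arrival date.

    Hypothetical layout: <cold_store_dir>/<YYYY-MM-DD>/events.log
    """
    partition = cold_store_dir / received_at.strftime("%Y-%m-%d")
    partition.mkdir(parents=True, exist_ok=True)
    target = partition / "events.log"

    # Base64 keeps the exact bytes, so malformed input is stored, not dropped.
    record = {
        "received_at": received_at.isoformat(),
        "payload": base64.b64encode(raw_payload).decode("ascii"),
    }
    # Append-only write: existing records are never modified.
    with target.open("a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return target
```

Because the handoff never interprets the payload, a broken producer or a schema change cannot make it lose data; interpretation errors surface later in batch processing, where they can be fixed and re-run.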

Page 12:

Data pipelines

Form teams that are driven by business cases & need

Forward-oriented -> filters implicitly applied

Beware of: duplication, tech chaos/autonomy, privacy loss

Page 13:

Data platform, pipeline chains

Common data infrastructure

Productivity, privacy, end-to-end agility, complexity

Beware: producer-consumer disconnect

Page 14:

Example case: Spotify

~50M active users, 5-10 TB/day, 20 PB

100-200 people touch data daily

Autonomous team and tech culture

Stabilising data platform

+ Business-driven pipes, enabled teams

- Productivity, end-to-end agility, privacy, stability, duplication, security

Morning coffee

=

Page 15:

Example case: Schibsted Prod & Tech

10-200M users, 5+ TB/day, 0-1 PB

Blocket, Aftonbladet, Leboncoin, Finn, VG, ...

Grew 1-100 people in 1 year, 20 touch data

Big corp culture, governance

Fast-forwarded to platform stage, reverted to autonomy

+ Privacy, security, modern high-level components

- Productivity, stability, forward-driven, dependent teams

Page 16:

Survival utilities, technology

Heed ecosystem direction

Follow leaders

Twitter, LinkedIn, Facebook, AirBnB, Netflix

Technology has no overlap with yesterday’s

Keep up

Page 17:

Survival utilities, ingestion

Data owners should export data

Difficult, needs attention

Pull database/API from Hadoop/Spark = DDoS

Quickly hand off incoming data to reliable storage

Measure loss and latency
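
"Measure loss and latency" could look roughly like the sketch below, which compares the event IDs a producer reports as sent with what actually landed in storage. It assumes every event carries a unique ID plus producer-side and storage-side timestamps; the `Event` fields and the `ingestion_report` function are hypothetical names, not something from the talk.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Event:
    event_id: str
    created_at: float  # seconds since epoch, stamped by the producer
    stored_at: float   # seconds since epoch, stamped on write to storage


def ingestion_report(sent_ids: set[str], stored: list[Event]) -> dict[str, float]:
    """Compare what producers sent with what reached reliable storage."""
    stored_ids = {event.event_id for event in stored}
    lost = sent_ids - stored_ids
    # Latency is only meaningful for events the producer claims to have sent.
    latencies = [
        event.stored_at - event.created_at
        for event in stored
        if event.event_id in sent_ids
    ]
    return {
        "loss_ratio": len(lost) / len(sent_ids) if sent_ids else 0.0,
        "median_latency_s": median(latencies) if latencies else 0.0,
        "max_latency_s": max(latencies, default=0.0),
    }
```

Run per hour or per day, a report like this turns "we think ingestion works" into numbers that the team owning ingestion can alert on.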

Page 18:

Survival utilities, workflow

Productive workflow from day one

Upstream easily breaks downstream

No off-the-shelf tools

Privacy strategy from day one

Data spreads like weed

Expect machine and human error

Capability to rebuild from cold store
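
The "capability to rebuild from cold store" can be illustrated with a derived dataset computed as a pure function of raw events, so that recovering from an error is just a re-run. This is a toy sketch under that assumption; the event fields and the `build_daily_counts` job are made up for illustration.

```python
from collections import Counter
from typing import Iterable


def build_daily_counts(raw_events: Iterable[dict]) -> dict[str, int]:
    """Derive per-day event counts purely from raw events (no hidden state)."""
    counts: Counter = Counter()
    for event in raw_events:
        counts[event["date"]] += 1
    return dict(counts)


# Cold store: raw events, appended once and never edited in place.
cold_store = [
    {"date": "2016-01-01", "user": "a"},
    {"date": "2016-01-01", "user": "b"},
    {"date": "2016-01-02", "user": "a"},
]

derived = build_daily_counts(cold_store)

# Simulate a human error that corrupts the derived dataset.
derived["2016-01-01"] = 0

# Recovery is a re-run of the same job over the untouched cold store.
rebuilt = build_daily_counts(cold_store)
```

The design choice is that only the cold store is precious; everything downstream is disposable and can be regenerated after a bug fix.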

Page 19:

Parting words

1. Keep things simple

2. Don’t drop data

3. Focus on productive developer workflows

4. Choose right components

Open source is safer

Avoid rolling your own

Page 20:

Bonus slides

Page 21:

Personae - important characteristics

Architect
- Technology updated
- Holistic: productivity, privacy
- Identify and facilitate governance

Backend developer
- Simplicity oriented
- Engineering practices obsessed
- Adapt to data world

Product owner
- Trace business value to upstream design
- Find most ROI through difficult questions

Manager
- Explain what and why
- Facilitate process to determine how
- Enable, enable, enable

Devops
- Always increase automation
- Enable, don’t control

Data scientist
- Capable programmer
- Product oriented

Page 22:

Cloud or not?

+ Operations
+ Security
+ Responsive scaling
- Development workflows
- Privacy
- Vendor lock-in