rapid data analytics @ netflix

Post on 20-Jan-2017

Rapid Data Analytics @ Netflix

Jason Flittner, Senior BI Engineer

Chris Stephens, Senior Data Engineer

Monisha Kanoth, Senior Data Architect

What We Do

DEA @ Netflix: Content Analytics

Global Expansion & Content Spend

● Freedom & Responsibility

● Highly Aligned, Loosely Coupled

● Context, not Control

Culture + Technology

Courage, Judgement, Honesty, Communication, Curiosity, Passion, Innovation, Impact, Selflessness

[Diagram: platform layers Storage, Compute, Tools, BI. Recoverable labels: AWS S3, Parquet FF, (Hadoop clusters)]

Deploy Fast, Fix Faster

● Improve & Iterate vs Perfect

● Have a Rollback Plan Ready

Develop Business Logic not ETL

● Think in Patterns

The Path of Least Resistance is the Right Path

● Make Smart Engineering Tradeoffs

The Clock starts Ticking when you Deploy

● Every Data Pipeline comes with an Expiration Date

● Deprecate and Prune

No Man’s Land is Expensive

● Ownership

Be a Noob

● User Groups

What You Could Do in your Data Warehouse

Let everyone drop tables in production

Cost / Benefit

Conscientious people make mistakes, but not very often

Data warehouse is not an operational system

What happens if a table is accidentally dropped?

● Do you have backups?

● How quickly can you restore a table?

Is the benefit of restricting access worth the tax on every data / analytical product your team produces?

We have some protection

In Hive, all tables are external tables pointing to S3 locations.

ETL writes a new “batch” of data then updates the metastore.

s3://[bucket]/hive/schema.db/table/batchid=1459364911

ALTER TABLE table SET LOCATION [path to new batch ID];

DROP TABLE does not delete any data.
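The batch-swap pattern above can be sketched in a few lines of Python. This is a minimal illustration, not the team's actual tooling: the function names, the bucket name in the usage note, and the choice of an epoch-second batch ID (inferred from the shape of the example path) are all assumptions. The sketch only builds the new location and the `ALTER TABLE` statement; writing the data and running the statement against Hive are left out.

```python
import time


def batch_location(bucket: str, schema: str, table: str, batch_id: int) -> str:
    """Build the S3 prefix for one immutable batch of a table."""
    return f"s3://{bucket}/hive/{schema}.db/{table}/batchid={batch_id}"


def swap_statement(schema: str, table: str, location: str) -> str:
    """Return the HiveQL that repoints the external table at a batch."""
    return f"ALTER TABLE {schema}.{table} SET LOCATION '{location}';"


def publish_batch(bucket: str, schema: str, table: str, batch_id=None):
    """Return (new location, metastore statement) for a fresh batch."""
    # Assumption: batch IDs are epoch seconds, as the example path suggests.
    if batch_id is None:
        batch_id = int(time.time())
    location = batch_location(bucket, schema, table, batch_id)
    # A real job would write the data files under `location` here,
    # and only then update the metastore with the statement below.
    return location, swap_statement(schema, table, location)
```

Calling `publish_batch("my-bucket", "schema", "table", 1459364911)` yields the path shown above (with the hypothetical bucket name filled in) and the matching statement. Because earlier batches stay in S3 untouched, rolling back amounts to pointing the table at a previous batch ID.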

In our MPP databases, we have a procedure for upgrading and downgrading our privileges.

CALL admin.UpgradePrivileges('me')

Lasts for several hours. Usage is logged.
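The idea of a time-limited, logged privilege upgrade can be modeled as follows. This is an in-memory sketch for illustration only; it is not how `admin.UpgradePrivileges` is implemented, and the class name, the 4-hour default (standing in for "several hours"), and the log format are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class PrivilegeManager:
    # Assumption: "several hours" is modeled as a 4-hour window.
    ttl: timedelta = timedelta(hours=4)
    grants: dict = field(default_factory=dict)      # user -> expiry time
    audit_log: list = field(default_factory=list)   # (time, user, action)

    def upgrade(self, user: str, now: datetime) -> datetime:
        """Grant elevated privileges until now + ttl and log the request."""
        expires = now + self.ttl
        self.grants[user] = expires
        self.audit_log.append((now, user, "upgrade"))
        return expires

    def is_elevated(self, user: str, now: datetime) -> bool:
        """True only while the user's most recent grant has not expired."""
        expires = self.grants.get(user)
        return expires is not None and now < expires
```

The design point is that elevation expires on its own, so nobody has to remember to downgrade, and every upgrade leaves a record.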

Accidents? Restore from backups. Or reload from Hive.

When other teams are ready to move to production ...

We’re done. And moving on to the next thing.

You can trust your people to work the same way.

Don’t have an “on call” (Use a “first responder” instead)

Everyone on the team takes a shift: both BI and data engineers (even managers every once in a while!)

First Responder = the first one to respond

● handles most common failures (restarting jobs)

● reaches out directly to ETL owner if escalation is required

● handles communication surrounding ETL delays

Goal is to protect the team’s time and focus

How we do this

● visually define what needs attention and what doesn’t

○ “above the line” vs “below the line”

● email alerts for “above the line” jobs that take longer than normal

● playbook for fixing common stuff

○ the more complete your entries are, the less you get called!
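One way to decide which jobs "take longer than normal" is to compare the current runtime with a baseline from past runs. The sketch below is an assumption-laden illustration, not the team's alerting system: the job dictionary fields, the median baseline, and the 1.5x tolerance are all hypothetical choices.

```python
from statistics import median


def is_running_long(history_minutes, current_minutes, tolerance=1.5):
    """True if the current runtime exceeds tolerance x the median past runtime."""
    if not history_minutes:
        return False  # no baseline yet, so there is nothing to compare against
    return current_minutes > tolerance * median(history_minutes)


def jobs_to_alert(jobs):
    """Return names of above-the-line jobs that are running longer than normal."""
    return [
        job["name"]
        for job in jobs
        if job["above_the_line"] and is_running_long(job["history"], job["current"])
    ]
```

Below-the-line jobs never appear in the result, however slow they are, which is what keeps the alert volume low enough to be taken seriously.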

Have a very clear sense of what is urgent, and what isn’t

Treating every failure like it’s urgent bleeds your team of the time they need to do work

Build your processes so they can be ignored for 3 days

● don’t load data if it’s incomplete

● reprocess fact data for several days instead of picking up the latest

Gives you the freedom to judge whether a failure is worth an interruption
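The two rules above can be sketched as a pair of small helpers. This is a minimal illustration under stated assumptions: the 3-day default window mirrors the "ignored for 3 days" goal, and the partition-based completeness check is a hypothetical stand-in for whatever signal a real pipeline uses.

```python
from datetime import date, timedelta


def reprocess_window(run_date: date, days: int = 3):
    """Dates to rebuild on each run: the trailing `days` days, oldest first."""
    return [run_date - timedelta(days=offset) for offset in range(days - 1, -1, -1)]


def safe_to_load(expected_partitions, arrived_partitions):
    """Only load when every expected upstream partition has arrived."""
    return set(expected_partitions) <= set(arrived_partitions)
```

Because each run rebuilds the whole trailing window, a job that fails on one day is repaired by the next successful run without manual backfill, as long as the outage is shorter than the window.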

Everybody owns ETL (when they need to)

BI engineer needs data structured a certain way for a report

Many environments:

● Ask a data engineer to build them a table

Our environment:

● Let them schedule a Hive script and adjust as necessary

We focus on centers of excellence, not role boundaries

More Examples:

● our BI engineers use Python to automate tasks

● our data engineers have Tableau licenses, and use them for quick visualizations and report deployments

For small tasks, this helps us avoid the overhead of interruption and knowledge transfer

What You Could Do on the Front-end

[Diagram: platform layers Storage, Compute, Data Interface, and Data Access, Analytics and Visualization. Recoverable labels: AWS S3, Parquet FF, (Hadoop clusters)]

Do Not Limit Yourself to Conventional Tools

○ Tableau - Data Visualization and Dashboards

○ MicroStrategy - Dynamic SQL and Metadata

○ Python or Custom Reporting - Emails

Give your BI Engineers Superpowers (like this guy)

○ Provide a data platform

○ BI + Data Engineering

○ Context not Requirements

○ Be early adopters

Simple is Often Best

Dismantle your Data Warehouse Team

○ Integrate with the business

○ Data Engineering and Data Science teams

○ Open and honest communication

Fast is better than perfect

○ Build, iterate… repeat

○ How to handle adhocs

○ Freedom - make the right call

○ Responsibility - Ownership

Encourage Hacking

Questions?

Want to chill with us!? jobs.netflix.com
