Cloud Austin Hadoop Automation Lightning Talk 2014.11.18

Post on 02-Jul-2015

Category:

Data & Analytics

DESCRIPTION

Hadoop ETL Automation - How to get to the fun part of big data in the shortest amount of time.

TRANSCRIPT

dataFundamentals

Hadoop Automation in 15 Minutes

Or how to get to the fun stuff before your boss pulls the plug.

ETL is not the Fun Stuff in Big Data

❖ Analytics

❖ Machine Learning

❖ Spark

❖ [even just Building APIs]

But you can’t do the fun stuff until your corporate data is in place to work against. It’s a chicken-and-egg problem.

Quick! Before your boss turns off the spigot!

❖ Automate your ETL processes.

❖ Automate your server instances.

What kind of code to automate?

❖ Clean code. Super clean.

❖ Well designed code.

Other pitfalls?

❖ NIH, Not Invented Here

How to get the fun tasks?

❖ 2 week P.O.C.

❖ Your sample data

Code, Content, Contacts

❖ This Slide Deck: http://www.slideshare.net/petecarapetyan/cloud-austin-hadoop-automationlightingtalk141118

❖ or just remember slideshare.net/datafundamentals

❖ YouTube - 11-minute slide-less version of the code demo - https://www.youtube.com/playlist?list=PLO_T9AjxEaYeByfqBqHVCmg4GbLFkYCJe

❖ Dev Code

❖ Carrie (ruby UI and generator) https://github.com/datafundamentals/df_ui_carrie

❖ Avro from delimited https://bitbucket.org/datafundamentals/avro_from_delimited

❖ Camel-Avro https://bitbucket.org/datafundamentals/camel-avro-etl

❖ Ops Code - cookbook recipes

❖ https://github.com/datafundamentals

❖ Contact

❖ Pete: pete@datafundamentals.com, Twitter @appwritercom

❖ Jeff: jeff@datafundamentals.com, Twitter @devopsjeff

❖ Site: datafundamentals.com

Be careful! It’s a competitive world out there!