big data in the enterprise: when to use what?

55
Big Data in the Enterprise. When to Use What? Jesus Rodriguez, Tellago, KidoZen, Inc

Upload: jesusmrv

Post on 19-Jan-2015

1.059 views

Category:

Technology


4 download

DESCRIPTION

 

TRANSCRIPT

Page 1: Big data in the enterprise: When to use what?

Big Data in the Enterprise. When to

Use What?Jesus Rodriguez, Tellago, KidoZen, Inc

Page 2: Big data in the enterprise: When to use what?

Agenda

• Big Data principles• The Hadoop ecosystem• Other big data technologies

Page 3: Big data in the enterprise: When to use what?

About Me

• Co-Founder Tellago, Inc• Co-Founder KidoZen, Inc• Microsoft MVP• Architect Advisor• Investor• Speaker, Author• http://jrodthoughts.com • http://weblogs.asp.net/gsusx • http://kidozen.com

Page 4: Big data in the enterprise: When to use what?

About Tellago• Application development firm focused on big enterprise trends (launched

2008)• Enterprise mobility, cloud computing, augmented reality, modern BI &

big data• Advisor to software companies such as Microsoft or Oracle• American Business Awards(2011) “Best Overall Company of the Year < 100”• American Business Awards(2012) Silver: “Best Computer Services Company

of the Year < 100”, Silver: Best Computer Services Executive of the Year • Inc 500 (114) & other industry awards

Page 5: Big data in the enterprise: When to use what?

Some Housekeeping Rules

• Tellago Technology Updates focused on modern enterprise software trends

• Real world stories • No sales pitch• Leverage GTW to ask questions

Page 6: Big data in the enterprise: When to use what?

We Love Data!

Page 7: Big data in the enterprise: When to use what?

Where all Started?

Page 8: Big data in the enterprise: When to use what?

CAP Theorem

Page 9: Big data in the enterprise: When to use what?

Big Data Opportunity

Page 10: Big data in the enterprise: When to use what?

The Landscape

Page 11: Big data in the enterprise: When to use what?

Or a Bit More Crowded

Page 12: Big data in the enterprise: When to use what?

Or Worse

Page 13: Big data in the enterprise: When to use what?

Hadoop Led the Way

Page 14: Big data in the enterprise: When to use what?

Hadoop Design Principles

• System Shall Manage and Heal Itself• Performance Shall Scale Linearly • Compute Shall Move to Data• Simple Core, Modular and Extensible

Page 15: Big data in the enterprise: When to use what?

The Solution: HDFS + Map Reduce

Page 16: Big data in the enterprise: When to use what?

Mapping

Page 17: Big data in the enterprise: When to use what?

Hadoop Ecosystem

HDFS(Hadoop Distributed File System)

HBase (key-value store)

MapReduce (Job Scheduling/Execution System)

Pig (Data Flow) Hive (SQL)

BI ReportingETL Tools

Avr

o (S

eri

aliz

atio

n)

Zo

oke

ep

r (C

oo

rdin

atio

n)

Sqoop

RDBMS

(Streaming/Pipes APIs)

Page 18: Big data in the enterprise: When to use what?

Reducing

Page 19: Big data in the enterprise: When to use what?

HDFS

Page 20: Big data in the enterprise: When to use what?

Map-Reduce

Page 21: Big data in the enterprise: When to use what?

Relational vs. Hadoop

Page 22: Big data in the enterprise: When to use what?

The Hadoop Ecosystem

Page 23: Big data in the enterprise: When to use what?

WebHDFS

Page 24: Big data in the enterprise: When to use what?

Sqoop

Page 25: Big data in the enterprise: When to use what?

Flume

Page 26: Big data in the enterprise: When to use what?

HBase

Page 27: Big data in the enterprise: When to use what?

Pig

Page 28: Big data in the enterprise: When to use what?

HCatalog

Page 29: Big data in the enterprise: When to use what?

Hive

Page 30: Big data in the enterprise: When to use what?

Ambari

Page 31: Big data in the enterprise: When to use what?

Oozie

Page 32: Big data in the enterprise: When to use what?

Zookeeper

Page 33: Big data in the enterprise: When to use what?

Hadoop Enterprise Architecture

Page 34: Big data in the enterprise: When to use what?

Hadoop is not a silver bullet...

Page 35: Big data in the enterprise: When to use what?

Some Challenges

• Hadoop doesn’t power big data applications• Not a transactional datastore. Slosh back and forth via ETL

• Processing latency• Non-incremental, must re-slurp entire dataset every pass

• Ad-Hoc queries• Bare metal interface, data import

• Graphs• Only a handful of graph problems amenable to MR

Page 36: Big data in the enterprise: When to use what?

Beyond Hadoop

• Percolator(incremental processing)http://research.google.com/pubs/pub36726.html • Dremel(ad-hoc analysis queries)http://research.google.com/pubs/pub36632.html • Pregel (Big graphs)http://dl.acm.org/citation.cfm?id=1807184

Page 37: Big data in the enterprise: When to use what?

Important Big Data Technologies in the Enterprise

Page 38: Big data in the enterprise: When to use what?

Real Time Analytics

Page 39: Big data in the enterprise: When to use what?

Real Time Analytics

• Storm• Hstreaming• StreamBase• IBM Streams• Microsoft StreamInsight

Page 40: Big data in the enterprise: When to use what?

MPP: Massively Parallel Processing

Page 41: Big data in the enterprise: When to use what?

MPP Columnar Stores

• Oracle Exadata• IBM Netezza• Teradata• EMC Greenplum• HP Vertica• ParAccel• Microsoft SQL Server PDW

Page 42: Big data in the enterprise: When to use what?

Oracle & Big Data

Page 43: Big data in the enterprise: When to use what?

Microsoft & Big Data

Page 44: Big data in the enterprise: When to use what?

NoSQL DBs

Page 45: Big data in the enterprise: When to use what?

NoSQL DBs

Page 46: Big data in the enterprise: When to use what?

NewSQL DBs

Page 47: Big data in the enterprise: When to use what?

New SQL / Cloud DB

• VoltDB• NimbusDB• SimpleDB• NuoDB• Clustrix• Totutek

Page 48: Big data in the enterprise: When to use what?

Traditional BI Suites

Page 49: Big data in the enterprise: When to use what?

New SQL / Cloud DB

• Hadoop Support In:• Microsoft SSIS• Informatica Datastage• Talend• Pentaho• Microstrategy , SaaS• Tableau, Qlikview

Page 50: Big data in the enterprise: When to use what?

Big Data & Cloud

Page 51: Big data in the enterprise: When to use what?

Big Data & Cloud

• Hadoop distributions (AWS, Microsoft HDInsight, Cloud Foundry)• Data marketplaces (Factual, Infochimps)• Data visualization (WibiData)• NOSQL as a Service (MongoHQ)

Page 52: Big data in the enterprise: When to use what?

If you are interested on evaluating Big Data in your organization

Page 53: Big data in the enterprise: When to use what?

Tellago Big Data Strategy Session

• 1 day strategy session• Start with a real world scenario• Explore various big data technology vendors• Present a potential technology roadmap• Free

• Emails us at [email protected]

Page 54: Big data in the enterprise: When to use what?

Summary

• The big data ecosystem is super crowded• Hadoop distributions are leading the way in the enterprise• Complementary technologies include:

• NOSQL• New SQL• MPP• Data Visualization

Page 55: Big data in the enterprise: When to use what?

[email protected]

http://www.tellagostudios.com http://jrodthoughts.com

http://twitter.com/#!/jrodthoughtshttp://weblogs.asp.net/gsusx