Concepts on Hadoop

Hadoop Chris Sharkey today @shark2900

Upload: christopher-sharkey

Post on 13-Apr-2017


Category:

Technology



TRANSCRIPT

Page 1: Concepts on Hadoop

Hadoop
Chris Sharkey
today @shark2900

Page 2: Concepts on Hadoop
Page 3: Concepts on Hadoop
Page 4: Concepts on Hadoop

Why use Hadoop?
• Uses commodity hardware
• Stores petabytes of data reliably
• Allows for huge distributed computations
• Open source project and ecosystem

Page 5: Concepts on Hadoop

Core concepts

Page 6: Concepts on Hadoop

Single Computer

MapReduce

HDFS

Task Tracker / Data Node

or

Page 7: Concepts on Hadoop

Cluster

Task Tracker / Data Node

Job Tracker

Task Tracker / Data Node

Task Tracker / Data Node

Name Node
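The cluster layout above can be made concrete with a small single-process simulation. This is a minimal sketch in plain Python, not Hadoop API code: the input is split into blocks (standing in for HDFS blocks held by Data Nodes), a map task runs on each block (the Task Tracker role), and the outputs are shuffled and reduced. The function names `split_into_blocks`, `map_block`, `shuffle`, and `reduce_counts` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def split_into_blocks(lines, block_size):
    """Split input lines into fixed-size blocks (a stand-in for HDFS blocks)."""
    return [lines[i:i + block_size] for i in range(0, len(lines), block_size)]


def map_block(block):
    """Map task: emit (word, 1) for every word in one block."""
    return [(word.lower(), 1) for line in block for word in line.split()]


def shuffle(mapped_outputs):
    """Group emitted values by key across the outputs of all map tasks."""
    grouped = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce task: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines, block_size=2):
    blocks = split_into_blocks(lines, block_size)
    # On a real cluster each block would be processed by a separate task,
    # in parallel; here the map tasks simply run one after another.
    mapped = [map_block(block) for block in blocks]
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    text = ["Hadoop stores data", "Hadoop processes data", "data is big"]
    print(word_count(text))
```

The point of the split is that each map task only needs its own block, so the work can be moved to wherever the data already lives.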

Page 8: Concepts on Hadoop

The ecosystem

Page 9: Concepts on Hadoop

Pig
• Programming language
• High-level abstraction for MapReduce
• ‘Compiler for MapReduce’

HDFS

MapReduce

Pig
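The ‘compiler for MapReduce’ idea can be illustrated with a toy example. The sketch below is plain Python, not Pig Latin and not how Pig is implemented: it takes a high-level description ("filter the records, then group and count") and produces a map function and a reduce function from it. The names `compile_pipeline` and `run` are hypothetical.

```python
from collections import defaultdict


def compile_pipeline(keep, key_of):
    """Turn a 'filter, then group and count' description into map/reduce functions.

    keep:   predicate deciding which records survive the filter step
    key_of: function extracting the grouping key from a record
    """
    def mapper(record):
        # Filter and key extraction both happen on the map side.
        if keep(record):
            yield key_of(record), 1

    def reducer(key, values):
        # Counting is a sum over the 1s emitted by the mapper.
        return key, sum(values)

    return mapper, reducer


def run(records, mapper, reducer):
    """Tiny in-memory driver: map every record, group by key, then reduce."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    logs = [
        {"level": "ERROR", "host": "a"},
        {"level": "INFO", "host": "a"},
        {"level": "ERROR", "host": "b"},
        {"level": "ERROR", "host": "a"},
    ]
    mapper, reducer = compile_pipeline(
        keep=lambda r: r["level"] == "ERROR",
        key_of=lambda r: r["host"],
    )
    print(run(logs, mapper, reducer))
```

The user only states what to filter and what to group by; the map and reduce functions are generated rather than written by hand.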

Page 10: Concepts on Hadoop

Hive
• SQL-like interface
• Querying & RD functionality
• Familiar to traditional business intelligence operations

HDFS

MapReduce

Hive
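To show why a SQL-like interface sits naturally on top of MapReduce, the sketch below computes the same aggregation twice: once as a SQL `GROUP BY` query and once as hand-written map, shuffle, and reduce steps. It uses Python's built-in `sqlite3` module purely as a stand-in; it is not HiveQL and does not talk to Hive. The table `page_views` and its columns are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (page, views)
ROWS = [("home", 3), ("docs", 5), ("home", 2), ("blog", 1), ("docs", 4)]


def sql_totals(rows):
    """Aggregate with a SQL query (sqlite3 stands in for a SQL-like engine)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE page_views (page TEXT, views INTEGER)")
    conn.executemany("INSERT INTO page_views VALUES (?, ?)", rows)
    cursor = conn.execute(
        "SELECT page, SUM(views) FROM page_views GROUP BY page"
    )
    result = dict(cursor.fetchall())
    conn.close()
    return result


def mapreduce_totals(rows):
    """The same aggregation written as explicit map, shuffle, and reduce steps."""
    grouped = defaultdict(list)
    for page, views in rows:          # map: emit (page, views)
        grouped[page].append(views)   # shuffle: collect values per key
    return {page: sum(values) for page, values in grouped.items()}  # reduce


if __name__ == "__main__":
    print(sql_totals(ROWS))
    print(mapreduce_totals(ROWS))
```

The `GROUP BY` column plays the role of the map output key, and the aggregate function plays the role of the reducer.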

Page 11: Concepts on Hadoop

HBase
• NoSQL database
• Built on top of HDFS
• Real-time updating and access

HDFS

MapReduce

HBase
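The "real-time updating and access" point rests on a data model of rows addressed by key, with versioned cells in each column. The sketch below is a toy in-memory model of that idea in plain Python; it is not the HBase client API, and the class name `ToyTable`, its methods, and the logical clock are assumptions made for illustration.

```python
import itertools
from collections import defaultdict


class ToyTable:
    """In-memory toy: row key -> 'family:qualifier' column -> versioned values."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(list))
        self._clock = itertools.count(1)  # logical timestamps, not wall-clock time

    def put(self, row_key, column, value):
        """Write a new version of a cell and return its timestamp."""
        timestamp = next(self._clock)
        self._rows[row_key][column].append((timestamp, value))
        return timestamp

    def get(self, row_key, column):
        """Return the most recent value of a cell, or None if it does not exist."""
        versions = self._rows.get(row_key, {}).get(column)
        if not versions:
            return None
        # Timestamps are unique, so max() picks the newest version.
        return max(versions)[1]


if __name__ == "__main__":
    table = ToyTable()
    table.put("user1", "info:email", "old@example.com")
    table.put("user1", "info:email", "new@example.com")
    print(table.get("user1", "info:email"))
```

Reads and writes address a single row key directly, which is what makes random access cheap compared with scanning whole files.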

Page 12: Concepts on Hadoop

ZooKeeper
• Coordination services for multi-server architectures
• Distributed application management with added reliability

HDFS

MapReduce

HBase

ZooKeeper
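One common use of a coordination service is leader election: every server registers itself under a sequence number, the lowest number is the leader, and when that server disappears the next lowest takes over. The sketch below simulates that idea in a single process in plain Python. It is not the ZooKeeper client API, and `ToyCoordinator` with its `join`, `leave`, and `leader` methods is hypothetical.

```python
import itertools


class ToyCoordinator:
    """Single-process toy of sequence-number-based leader election."""

    def __init__(self):
        self._sequence = itertools.count()
        self._members = {}  # sequence number -> server name

    def join(self, server):
        """Register a server and return the sequence number it was given."""
        number = next(self._sequence)
        self._members[number] = server
        return number

    def leave(self, number):
        """Remove a server, e.g. because it crashed or its session ended."""
        self._members.pop(number, None)

    def leader(self):
        """The member with the lowest sequence number leads; None if empty."""
        if not self._members:
            return None
        return self._members[min(self._members)]


if __name__ == "__main__":
    coordinator = ToyCoordinator()
    first = coordinator.join("server-a")
    coordinator.join("server-b")
    print(coordinator.leader())
    coordinator.leave(first)  # the current leader goes away
    print(coordinator.leader())
```

Because the rule for picking a leader is deterministic, every surviving server reaches the same answer without negotiating with the others.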

Page 13: Concepts on Hadoop

Combined System

HDFS

MapReduce

HBase

ZooKeeper

Hive

Pig

Page 14: Concepts on Hadoop
Page 15: Concepts on Hadoop

Summary
• Powerful system for ‘big data’
• Commodity hardware
• Redundant and reliable
• Ecosystem affords modularity
• Ecosystem affords relevance
• Distributed analytics: learn from petabytes of data

Page 16: Concepts on Hadoop

Questions?

Page 17: Concepts on Hadoop

Thank you