Big Data & Hadoop
Post on 14-Apr-2017
BIG DATA AND HADOOP
By Ram and Raghavendra
BIG DATA
What is big data?
Big data is data that comes in very large volumes of files, in forms such as videos, audio, pictures, and documents.
SOURCES OF BIG DATA
• Facebook
• Twitter
• OneDrive
• Yahoo
• Media, government, Flipkart, etc.
Types of big data
1. Structured data
2. Unstructured data
Structured data:
Data of a single, similar type, where all files belong to the same category. Examples:
1. Text files: text file 1, text file 2, ...
2. Pictures: picture 1, picture 2, ...
Unstructured data:
A combination of different types of data, for example videos, audio, pictures, and documents.
Three Characteristics of Big Data (the 3 Vs)
• Volume: data quantity
• Velocity: data speed
• Variety: data types
ABOUT BIG DATA
• Every day we create 2.5 quintillion bytes of data
• 90% of the data in the world has been created in the last two years
• Facebook generates 500+ terabytes of data per day
Difficulties with big data
Big data is too difficult to manage with standard database management systems such as DBMS and RDBMS. The difficult tasks include:
1. Analysis
2. Capture
3. Curation
4. Search
5. Sharing
6. Storage
7. Transfer
8. Visualization and information privacy
WHAT IS HADOOP?
• Hadoop is a framework of tools
• It is open source (Apache)
Objective:
Hadoop supports running applications on big data.
Challenges Hadoop has to handle:
• Volume
• Velocity
• Variety
Traditional Approach
• Enterprise approach: big data is processed by a single powerful computer
• Processing limit: only so much data can be processed by one powerful computer
Breaking the Data
• Big data is broken into pieces
• The computation is moved to the data
• The results from the pieces are combined into one result
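The idea on this slide can be sketched in a few lines of Python. This is a toy, single-process illustration and not Hadoop code: the function names `split_into_pieces`, `process_piece`, and `combine` are made up for the example, and the per-piece computation (a sum) is only an assumed stand-in for real work.

```python
from typing import List


def split_into_pieces(data: List[int], piece_size: int) -> List[List[int]]:
    """Break the data set into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def process_piece(piece: List[int]) -> int:
    """The computation that is sent to each piece; here, an assumed sum."""
    return sum(piece)


def combine(partial_results: List[int]) -> int:
    """Merge the per-piece results into the final answer."""
    return sum(partial_results)


big_data = list(range(1, 101))             # stand-in for a large data set
pieces = split_into_pieces(big_data, 25)   # 4 pieces of 25 numbers each
partials = [process_piece(p) for p in pieces]
total = combine(partials)
```

In a real cluster each piece would live on a different machine and `process_piece` would run where the piece is stored; only the small partial results would travel over the network.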
ARCHITECTURE
• MAP REDUCE
• FILE SYSTEM (HDFS)
• PROJECTS
DISTRIBUTED MODEL
1. The machines are low-cost computers
2. Hadoop works on Linux-based machines
[Diagram: four Linux machines]
TASK TRACKER AND DATA NODES
[Diagram: four slave machines, each running a Task Tracker and a Data Node]
[Diagram: the same four slaves plus a master machine, which runs the Job Tracker and the Name Node alongside its own Task Tracker and Data Node]
COMPONENTS
• MAP REDUCE
• FILE SYSTEM (HDFS)
Map Reduce
[Diagram: master and four slaves, with the Job Tracker on the master and the Task Trackers labelled as the Map Reduce layer]
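To make the Map Reduce layer more concrete, here is a minimal sketch of the map, shuffle, and reduce steps in plain Python. Word counting is a conventional illustration and does not come from the slides; the function names are hypothetical, and everything runs in one process, whereas in Hadoop the Job Tracker would hand the map and reduce tasks to the Task Trackers.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(line: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all values that were emitted for the same key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(lines: List[str]) -> Dict[str, int]:
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data", "Big Data and Hadoop"])
```

Because each call to `map_phase` looks at only one line and each call to `reduce_phase` looks at only one key, both steps can be spread over many machines without changing the result.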
HDFS
[Diagram: master and four slaves, with the Name Node on the master and the Data Nodes labelled as the HDFS layer]
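The split between Name Node and Data Nodes can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the HDFS API: the classes `DataNode` and `NameNode`, the 4-byte block size, and the round-robin placement are all invented for the example. Real HDFS blocks are tens of megabytes, and clients stream data to the Data Nodes directly; the toy routes it through one method for brevity.

```python
from typing import Dict, List

BLOCK_SIZE = 4  # bytes; toy value chosen so a short string spans several blocks


class DataNode:
    """Stores the actual block contents."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: Dict[str, bytes] = {}


class NameNode:
    """Tracks which blocks make up a file and which Data Node holds each one."""

    def __init__(self, data_nodes: List[DataNode]) -> None:
        self.data_nodes = data_nodes
        self.file_table: Dict[str, List[str]] = {}  # file name -> block ids
        self.block_table: Dict[str, str] = {}       # block id -> Data Node name

    def write(self, filename: str, content: bytes) -> None:
        block_ids = []
        for index, start in enumerate(range(0, len(content), BLOCK_SIZE)):
            block_id = f"{filename}#{index}"
            # Assumed placement policy: simple round robin over the Data Nodes.
            node = self.data_nodes[index % len(self.data_nodes)]
            node.blocks[block_id] = content[start:start + BLOCK_SIZE]
            self.block_table[block_id] = node.name
            block_ids.append(block_id)
        self.file_table[filename] = block_ids

    def read(self, filename: str) -> bytes:
        nodes = {node.name: node for node in self.data_nodes}
        return b"".join(
            nodes[self.block_table[block_id]].blocks[block_id]
            for block_id in self.file_table[filename]
        )


cluster = [DataNode(f"slave{i}") for i in range(1, 5)]
name_node = NameNode(cluster)
name_node.write("notes.txt", b"big data and hadoop")
```

The point of the design is that the master only needs the two small tables, while the bulk of the data stays spread across the slaves.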
Batch processing
[Diagram: an application queue feeding the Job Tracker on the master, which runs the queued work as batch processing on the four slaves]
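The application queue in the diagram can be sketched as a simple first-in, first-out queue. The `JobTracker` class below is a hypothetical toy, not Hadoop's class of the same name, and first-in, first-out ordering is an assumption of this sketch; the slides only show that applications wait in a queue before the Job Tracker runs them.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Job = Tuple[str, Callable[[], int]]  # (job name, work to perform)


class JobTracker:
    """Toy Job Tracker that runs queued jobs one batch job at a time."""

    def __init__(self) -> None:
        self.queue: Deque[Job] = deque()
        self.finished: List[Tuple[str, int]] = []

    def submit(self, name: str, work: Callable[[], int]) -> None:
        """Add an application to the end of the queue."""
        self.queue.append((name, work))

    def run_all(self) -> None:
        """Run jobs in submission order until the queue is empty."""
        while self.queue:
            name, work = self.queue.popleft()
            self.finished.append((name, work()))


tracker = JobTracker()
tracker.submit("count-logs", lambda: len(["a", "b", "c"]))
tracker.submit("sum-sales", lambda: sum([10, 20, 30]))
tracker.run_all()
```

Nothing here is interactive: a job is submitted, waits its turn, and its result is collected afterwards, which is what makes the model batch processing.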
Job Tracker
[Diagram: master and four slaves, showing the Job Tracker on the master]
FAULT TOLERANCE FOR DATA NODE
[Diagram: master and four slaves; fault tolerance for Data Nodes is provided by the HDFS layer]
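HDFS tolerates the loss of a Data Node by keeping several copies of each block on different machines. The sketch below illustrates that idea with hypothetical helper functions. The replication factor of 3 matches the HDFS default but is not stated in the slides, and the "next nodes in the list" placement is an assumption chosen for simplicity.

```python
from typing import Dict, List, Set

REPLICATION = 3  # HDFS default replication factor; the slides do not state one


def place_replicas(block_ids: List[str], nodes: List[str],
                   replication: int = REPLICATION) -> Dict[str, List[str]]:
    """Assign each block to `replication` distinct nodes (needs replication <= len(nodes))."""
    placement: Dict[str, List[str]] = {}
    for i, block_id in enumerate(block_ids):
        placement[block_id] = [nodes[(i + r) % len(nodes)] for r in range(replication)]
    return placement


def readable_blocks(placement: Dict[str, List[str]], failed: Set[str]) -> Dict[str, str]:
    """For each block, pick the first replica that sits on a node that is still alive."""
    result: Dict[str, str] = {}
    for block_id, holders in placement.items():
        alive = [node for node in holders if node not in failed]
        if alive:
            result[block_id] = alive[0]
    return result


nodes = ["slave1", "slave2", "slave3", "slave4"]
placement = place_replicas(["b0", "b1", "b2", "b3"], nodes)
survivors = readable_blocks(placement, failed={"slave2"})
```

With one node down, every block is still readable from another replica, so the failure is invisible to the application.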
FAULT TOLERANCE FOR PROCESSING
[Diagram: master and four slaves; fault tolerance for processing is provided by the Map Reduce layer]
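On the processing side, the usual idea is that a task lost with a failed Task Tracker is handed to another one. The following is a minimal sketch of that reassignment, assuming a made-up `TaskTrackerDown` exception and a `healthy` table in place of the heartbeats a real Job Tracker would rely on.

```python
from typing import Callable, Dict, List, Tuple


class TaskTrackerDown(Exception):
    """Raised by the toy Task Tracker when its machine has failed."""


def run_on_tracker(tracker: str, healthy: Dict[str, bool],
                   task: Callable[[List[int]], int], data: List[int]) -> int:
    """Run the task on one Task Tracker, or fail if that machine is down."""
    if not healthy[tracker]:
        raise TaskTrackerDown(tracker)
    return task(data)


def job_tracker_run(task: Callable[[List[int]], int], data: List[int],
                    trackers: List[str], healthy: Dict[str, bool]) -> Tuple[str, int]:
    """Hand the task to each Task Tracker in turn until one completes it."""
    for tracker in trackers:
        try:
            return tracker, run_on_tracker(tracker, healthy, task, data)
        except TaskTrackerDown:
            continue  # reassign the same task to the next Task Tracker
    raise RuntimeError("no healthy Task Tracker left")


healthy = {"slave1": False, "slave2": True, "slave3": True}
ran_on, result = job_tracker_run(sum, [1, 2, 3], ["slave1", "slave2", "slave3"], healthy)
```

The job still finishes with the correct result; it simply ran on a different machine than first planned.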
Master Backup
[Diagram: master and four slaves; the master's tables are backed up]
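The slide does not say which tables are backed up or how. As a rough illustration only, the sketch below assumes the tables are the master's file-to-block metadata and that a backup is a serialized snapshot; the function names and the JSON format are choices made for this example, not Hadoop's mechanism.

```python
import json
from typing import Dict, List


def backup_tables(file_table: Dict[str, List[str]]) -> str:
    """Serialize the master's table into a snapshot that can be stored elsewhere."""
    return json.dumps(file_table, sort_keys=True)


def restore_tables(snapshot: str) -> Dict[str, List[str]]:
    """Rebuild the table from a previously saved snapshot."""
    return json.loads(snapshot)


master_tables = {"notes.txt": ["notes.txt#0", "notes.txt#1"]}
snapshot = backup_tables(master_tables)
master_tables.clear()  # simulate losing the master's in-memory tables
recovered = restore_tables(snapshot)
```

Since the master is the only machine that knows where every block lives, keeping a copy of its tables is what allows the cluster to recover from a master failure.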
Easy programming
Programmers do not have to worry about:
1. Where the file is located
2. How to manage failures
3. How to break computations into pieces
4. Scalability
Name
• The name was given by Doug Cutting
• Created by Doug Cutting and Mike Cafarella (Yahoo) in 2005
• Yahoo donated Hadoop to Apache in 2006
Usage Areas
• Social media
• Retail
• Financial services
• Searching tools
• Government
• Intelligence
Companies
• Yahoo
• Facebook
• Amazon
• eBay
• American Airlines
• The New York Times
• Chevron
• IBM
• Federal Reserve Board
Future outlook
Yahoo
By 2015, 50% of enterprise data will be processed by Hadoop
Thank you