hadoop 101 (v1) (20150730)
Hadoop 101: Big Data Technology
What is Big Data?
Big Data is ...
- A technology that is capable of handling:
  - massive and complex data (petabytes+)
  - streams of data in (near) real time
  - extremely large infrastructure
What is Hadoop?
- Hadoop is:
  - scalable.
  - a “Framework”.
  - not a drop-in replacement for RDBMS.
  - great for pipelining massive amounts of data to achieve the end result.
Hadoop Fun Facts
- Hadoop was created by Doug Cutting and Mike Cafarella. Cutting, who was working at Yahoo! at the time, named it after his son’s toy elephant.
- Yahoo! has the single largest Hadoop cluster in the world (4,500 nodes), according to the Apache Hadoop website.
- Yes, there is a Hadoop GPU Framework available!
Hadoop Core Components
Hadoop 1.x
- HDFS (storage)
  - NameNode
  - DataNode
  - Secondary NameNode*
- MapReduce (processing)
  - JobTracker
  - TaskTrackers
  - JobHistoryServer
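The MapReduce processing model listed above can be illustrated with a word count. This is only a minimal in-memory sketch in plain Python, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and in a real cluster the map and reduce steps would run as distributed tasks over data stored in HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key (done by the framework in Hadoop)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data is big", "hadoop handles big data"]
    print(word_count(sample))
```

The point of the split is that each map call only needs one line and each reduce call only needs one key, so both steps can be spread across many machines.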
Hadoop Core Components (Details)
Hadoop 2.x
- HDFS (storage)
  - NameNode
  - DataNode
  - Secondary NameNode*
- YARN (processing)
  - ResourceManager
  - ApplicationMaster
  - NodeManager
  - JobHistoryServer
Hadoop Compatible Components (1)
- Manipulating/Querying Data:
  - Apache Hive (SQL-like query)
  - Cloudera Impala (SQL-like query)
  - Apache Pig (scripting-based query)
  - MapReduce (library)
- Key-Value Storage:
  - HBase
  - Cassandra
Hadoop Compatible Components (2)
- Message Queueing:
  - Kafka (similar to RabbitMQ, pub-sub, etc.)
- Advanced Processing:
  - Spark (up to 100x faster than MapReduce for in-memory workloads, per the Spark project's own claim)
- Scheduler/Workflow:
  - Oozie (similar to crontab)
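To make the pub-sub idea behind the Message Queueing entry concrete, here is a small in-memory sketch. It is not Kafka's client API: the `TinyLog` class and its `publish` and `poll` methods are hypothetical names. It only assumes the general model of an append-only log per topic, with each consumer tracking its own read position.

```python
from collections import defaultdict


class TinyLog:
    """Hypothetical in-memory stand-in for a topic-based message log."""

    def __init__(self):
        self._topics = defaultdict(list)   # topic -> append-only list of messages
        self._offsets = defaultdict(int)   # (consumer, topic) -> next index to read

    def publish(self, topic, message):
        """Append a message to the end of the topic's log."""
        self._topics[topic].append(message)

    def poll(self, consumer, topic):
        """Return messages this consumer has not read yet and advance its offset."""
        log = self._topics[topic]
        start = self._offsets[(consumer, topic)]
        self._offsets[(consumer, topic)] = len(log)
        return log[start:]


if __name__ == "__main__":
    log = TinyLog()
    log.publish("clicks", "user=1 page=/home")
    log.publish("clicks", "user=2 page=/cart")
    print(log.poll("analytics", "clicks"))  # both messages
    print(log.poll("analytics", "clicks"))  # nothing new
    print(log.poll("archiver", "clicks"))   # both messages again, independent offset
```

Because messages stay in the log after being read, several independent consumers can each process the full stream at their own pace.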
Hadoop Compatible Components (3)
- Data Export/Import:
  - Flume (stream: text files/logs to HDFS)
  - Sqoop (RDBMS to HDFS or vice versa)
and many more.. :)
Most Popular Hadoop Distributions
source: datanami.com
Real Example of Using Hadoop* (1)
Real Example of Using Hadoop* (2)
Real Example of Using Hadoop* (3)
(near) Real Time Analytics
QA Session
Join our LinkedIn Group
Big Data Indonesia
https://www.linkedin.com/grp/home?gid=6970225
Hadoop 101
Thank You
# EOF
Unless stated, all images used in these slides belong to their respective owners.