![Page 1: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/1.jpg)
Hadoop: the Big Answer to the Big Question of the Big Data
eleks DevTalks #1
by Victor Haydin
![Page 2: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/2.jpg)
Gordon Moore
![Page 3: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/3.jpg)
| | 1975 | 2012 |
|---|---|---|
| Cost of 1 TB storage | $208 000 000 | $110 |
| Cost of 1 GFLOPS computing facility | $62 000 000 | $1.50 |
| Number of network hosts | 57 | > 1 000 000 000 |
| World’s data amount | ~130 GB | ~2.9 ZB |
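
To put the figures above in perspective, the ratios between 1975 and 2012 can be computed directly. This is a minimal sketch that uses only the numbers quoted on this slide and assumes decimal prefixes (1 GB = 10^9 B, 1 ZB = 10^21 B); the variable names are illustrative.

```python
# Ratios between the 1975 and 2012 figures quoted on the slide.
GB = 10**9   # bytes in a gigabyte (decimal prefix, an assumption)
ZB = 10**21  # bytes in a zettabyte (decimal prefix)

storage_cost_drop = 208_000_000 / 110   # cost of 1 TB of storage
compute_cost_drop = 62_000_000 / 1.50   # cost of 1 GFLOPS of computing
data_growth = (2.9 * ZB) / (130 * GB)   # world's data amount

print(f"Storage became about {storage_cost_drop:,.0f} times cheaper")
print(f"Computing became about {compute_cost_drop:,.0f} times cheaper")
print(f"The world's data grew about {data_growth:.2e} times")
```

Under these assumptions, the amount of data grew by several orders of magnitude more than the cost of storing or processing it fell.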
![Page 4: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/4.jpg)
1 ZB = 1 000 000 000 000 000 000 000 B (10^21 B)
![Page 5: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/5.jpg)
![Page 6: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/6.jpg)
![Page 7: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/7.jpg)
![Page 8: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/8.jpg)
Commodity Hardware
![Page 9: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/9.jpg)
![Page 10: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/10.jpg)
![Page 11: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/11.jpg)
Wikipedia: “Apache Hadoop is a software framework that supports data-intensive distributed applications”
![Page 12: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/12.jpg)
Main Contributors
![Page 13: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/13.jpg)
![Page 14: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/14.jpg)
HDFS: Hadoop Distributed File System
Hardware Failure
Streaming Data Access
Large Data Sets
Simple Coherency Model (write-once)
Portability
![Page 15: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/15.jpg)
![Page 16: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/16.jpg)
Moving Computation is cheaper than moving Data
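
A rough back-of-the-envelope estimate illustrates the point. All numbers below are assumptions chosen for illustration, not figures from the talk: a 1 TB data set, a 10 MB job package, and a 1 Gbit/s network link with no protocol overhead.

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Idealized time to push size_bytes over a link, ignoring overhead."""
    return size_bytes * 8 / link_bits_per_second

LINK_SPEED = 1e9    # assumed: 1 Gbit/s network link
DATA_SET_SIZE = 1e12  # assumed: 1 TB of input data
JOB_SIZE = 10e6     # assumed: 10 MB of packaged job code

data_time = transfer_seconds(DATA_SET_SIZE, LINK_SPEED)
job_time = transfer_seconds(JOB_SIZE, LINK_SPEED)

print(f"Moving the data: {data_time:.0f} s")
print(f"Moving the code: {job_time:.2f} s")
print(f"Ratio: {data_time / job_time:.0f}x")
```

With these assumed sizes, shipping the code to the node that already holds the data is several orders of magnitude faster than shipping the data to the code.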
![Page 17: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/17.jpg)
MapReduce
![Page 18: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/18.jpg)
Map(k1, v1) → list(k2, v2)

    void map(string key, string value):
        for each word w in value:
            yield return KeyValuePair(w, 1);

Reduce(k2, list(v2)) → list(v3)

    void reduce(string key, int[] values):
        int sum = 0;
        for each pc in values:
            sum += pc;
        return KeyValuePair(key, sum);
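
The word-count example on this slide can be tried out without a cluster. The following is a minimal single-process sketch in Python, not the Hadoop API: the names `map_words`, `reduce_counts` and `run_job` are hypothetical, and the in-memory grouping step stands in for the shuffle that a real framework performs between the map and reduce phases.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(key: str, value: str) -> Iterator[Tuple[str, int]]:
    """Map(k1, v1) -> list(k2, v2): emit (word, 1) for every word in the text."""
    for word in value.split():
        yield word, 1


def reduce_counts(key: str, values: Iterable[int]) -> Tuple[str, int]:
    """Reduce(k2, list(v2)) -> v3: sum all counts emitted for one word."""
    return key, sum(values)


def run_job(documents: Dict[str, str]) -> Dict[str, int]:
    """Run map, group by key, then reduce, all in one process."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    # Map phase: every document is processed independently.
    for name, text in documents.items():
        for word, count in map_words(name, text):
            # Stand-in for the shuffle: collect values that share a key.
            grouped[word].append(count)
    # Reduce phase: every key is processed independently.
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    docs = {"a.txt": "big data big answer", "b.txt": "big question"}
    print(run_job(docs))
```

Because each map call sees only one document and each reduce call sees only one key, both phases can be spread across many machines, which is what a MapReduce framework automates.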
![Page 19: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/19.jpg)
![Page 20: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/20.jpg)
![Page 21: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/21.jpg)
Demo
![Page 22: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/22.jpg)
Ecosystem
ZooKeeper
![Page 23: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/23.jpg)
3K+ nodes, 36+ PB
45K nodes, 180-200 PB
![Page 24: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/24.jpg)
vs
powered by
![Page 25: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/25.jpg)
Future
Core:
• HDFS: high-availability and scalability
• MapReduce: modularity and alternative ways to perform queries
Ecosystem development:
• Apache Bigtop: consolidation project
• HBase, Hive, Pig, ZooKeeper, Avro, Sqoop: stabilizing, interoperability
• Incubator: Flume, Oozie, Whirr
![Page 26: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/26.jpg)
Demo
![Page 27: Hadoop: the Big Answer to the Big Question of the Big Data](https://reader031.vdocuments.net/reader031/viewer/2022013110/546f6a8faf79595c698b475b/html5/thumbnails/27.jpg)
Q&A