Open-Source Frameworks and Big-Data Processing
Linux and Ubuntu 14.10 Release Conf 1
Big-Data Processing Utilizing an Open-Source Technology Stack
By
Amir Sedighi
http://www.linkedin.com/in/amirsedighi
@amirsedighi
Data Explosion
● Big data stems from the fact that everything we do increasingly leaves a digital trace that we (or others) can gather, use, and analyze.
– Data Providers
  ● Business Companies
  ● People
Volume, Velocity, Variety
● “There were 5 exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days, and the pace is increasing.” (Eric Schmidt)
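To put the quoted figure in perspective, here is a rough back-of-the-envelope conversion to a per-second rate. It takes the quote's numbers at face value and assumes decimal units (1 exabyte = 10^18 bytes); the variable names are illustrative only.

```python
# Back-of-the-envelope conversion of the quoted rate.
# Assumption: decimal units, 1 EB = 10**18 bytes and 1 TB = 10**12 bytes.
EXABYTE = 10**18
TERABYTE = 10**12

bytes_created = 5 * EXABYTE          # "5 exabytes"
period_seconds = 2 * 24 * 60 * 60    # "every 2 days" = 172,800 seconds

bytes_per_second = bytes_created / period_seconds
terabytes_per_second = bytes_per_second / TERABYTE

print(f"{terabytes_per_second:.1f} TB/s")  # prints 28.9 TB/s
```

Under these assumptions the quote corresponds to roughly 29 terabytes of new data per second, which is the Volume and Velocity side of the three Vs.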
Big-Data Processing
How to provide a Big-Data processing platform using commodity machines?
Vertical or Horizontal?
Scale Up vs Scale Out
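Scaling out means spreading data and work across many commodity machines instead of buying one larger machine. A minimal sketch of one common way to do this, hash partitioning, is shown below. The function names, the key format, and the choice of MD5 as a stable hash are assumptions made for illustration; they are not taken from the talk or from any specific framework.

```python
import hashlib
from collections import defaultdict


def node_for(key: str, num_nodes: int) -> int:
    """Map a key to a node index using a stable hash (hypothetical scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


def partition(keys, num_nodes):
    """Group keys by the node that would own them."""
    shards = defaultdict(list)
    for key in keys:
        shards[node_for(key, num_nodes)].append(key)
    return dict(shards)


# Hypothetical workload: 1000 user records spread over 4, then 8 nodes.
keys = [f"user-{i}" for i in range(1000)]
four_nodes = partition(keys, 4)
eight_nodes = partition(keys, 8)

for node, owned in sorted(eight_nodes.items()):
    print(f"node {node}: {len(owned)} keys")
```

Adding nodes shrinks each node's share of the data, which is the point of scaling out. Note that plain modulo hashing moves most keys when the node count changes; distributed stores often use consistent hashing to limit that movement.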
Big-Data Processing Open-Source Technology Stack
Map-Reduce
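The Map-Reduce model can be illustrated with the classic word-count example. The sketch below runs in a single Python process and only mimics the three phases (map, shuffle, reduce); it does not use the Hadoop API, and the function names are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data big clusters", "data everywhere"])
print(counts)  # {'big': 2, 'data': 2, 'clusters': 1, 'everywhere': 1}
```

In a real cluster the map calls run in parallel on the machines that hold the input blocks, and the shuffle moves data over the network so that each reducer sees every value for its keys.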
Hadoop Framework
Apache Hadoop Main Projects
Data Stores
● Key-Value
● Graph
● Columnar
● Document Store
● In-Memory
Data Transfer
● Apache Flume
● Apache Sqoop
Search
● Elasticsearch
● Apache Solr
Messaging and Queuing
● Apache Kafka
● ZeroMQ
Log Management
● ELK (Elasticsearch, Logstash, Kibana)
● Logstash
● Fluentd
Stream Processing
● Apache Storm
● Apache Samza
● Apache Spark (Spark Streaming)
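Stream processors work on unbounded sequences of events rather than on finished files, and a typical operation is aggregating events per time window. The sketch below shows a tumbling-window count in plain Python. It does not use the API of any of the frameworks listed above; the event format (timestamp in seconds, event name) and the function name are assumptions for illustration, and events are assumed to arrive in timestamp order.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Yield (window_start, Counter) for each non-overlapping time window.

    Assumes `events` is an iterable of (timestamp, key) pairs sorted by timestamp.
    """
    current_start = None
    counts = Counter()
    for timestamp, key in events:
        window_start = timestamp - (timestamp % window_seconds)
        if current_start is None:
            current_start = window_start
        elif window_start != current_start:
            # The previous window is complete: emit it and start a new one.
            yield current_start, counts
            current_start, counts = window_start, Counter()
        counts[key] += 1
    if current_start is not None:
        yield current_start, counts


# Hypothetical event stream: (seconds since start, event name).
events = [(0, "login"), (3, "click"), (4, "click"),
          (11, "login"), (12, "click"), (25, "click")]
results = list(tumbling_window_counts(events, 10))

for start, window in results:
    print(start, dict(window))
```

Because the function is a generator, each window is emitted as soon as the first event of the next window arrives, without waiting for the stream to end.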
Machine Learning
● Apache Mahout
● MLlib (Apache Spark)
● GraphX (Apache Spark, graph processing)
Questions?