Olivia Klose | Technical Evangelist, Microsoft
@oliviaklose
blogs.technet.com/oliviaklose
Meet Olivia | @oliviaklose
• Microsoft Technical Evangelist
– Focus: Big Data, Hadoop, Hive, etc.
• Machine Learning
– Computer science with mathematics at the University of Cambridge, TU München, and IIT Bombay
– Medical imaging
– Nuclear medicine clinic in Munich
• IT experience in large enterprises
Agenda
Module Content
1 Intro & Big Data Buzzwords
- Big Data, Hadoop, MapReduce, HDInsight
2 Big Data scenario: Twitter analysis
3 Manage: extracting and storing data
- Windows Azure Blob Storage, Windows Azure SQL Database, VM
4 Analyze: analyzing data
- HDInsight, Hive
5 Insights: gaining insights from data
- ODBC driver, PowerPivot & PowerView
Module 1
Intro & Big Data Buzzwords
• Big Data
• Hadoop
• MapReduce
• HDInsight
What is Big Data?
Module 1 – Intro & Big Data Buzzwords
The Large Hadron Collider
(particle accelerator at CERN)
produces 15 PB/year
http://home.web.cern.ch/about/computing
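To put the 15 PB/year figure in perspective, here is a rough back-of-the-envelope conversion into an average daily and per-second rate. It is only a sketch: it assumes decimal units (1 PB = 10^15 bytes), a 365-day year, and an evenly spread data flow, none of which the slide states. The helper name `average_rate` is made up for this example.

```python
# Back-of-the-envelope conversion of the slide's figure (15 PB/year).
# Assumptions (not from the slide): decimal units, 365-day year,
# and data produced at a constant rate.
PETABYTE = 10**15
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def average_rate(bytes_per_year: float) -> dict:
    """Return the average data rate implied by a yearly volume."""
    return {
        "TB_per_day": bytes_per_year / 365 / 10**12,
        "MB_per_second": bytes_per_year / SECONDS_PER_YEAR / 10**6,
    }


lhc = average_rate(15 * PETABYTE)
print(f"{lhc['TB_per_day']:.1f} TB/day, {lhc['MB_per_second']:.0f} MB/s")
# prints "41.1 TB/day, 476 MB/s"
```

Under these assumptions the collider produces roughly 41 TB every day, more than a single machine could comfortably store or process.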
But what if I don't own a
Large Hadron Collider…
Maybe data from…
Large factory
Vehicle fleet
Smart Grids
Green electricity
Stock exchange
Host Protocols
Data centers
Server farm
Google Analytics
…
“Big data is a term describing
the storage and analysis of
large and/or complex data sets
using a series of techniques
including, but not limited to:
NoSQL, MapReduce and machine learning.”
http://www.technologyreview.com/view/519851/the-big-data-conundrum-how-to-define-it/
arxiv.org/abs/1309.5821
“Big data is high-volume,
high-velocity and/or
high-variety information assets
that require new forms of
processing to enable
enhanced decision making,
insight discovery and
process optimization.”
Gartner's Definition of Big Data
Laney, Douglas. The Importance of “Big Data”: A Definition. Gartner. Retrieved 21 June 2012.
The 3 Vs
Volume: MB, GB, TB, PB
Velocity: batch, periodic, real time
Variety: table, database, unstructured, web
Big Data, Gesellschaft für Informatik, 2013, http://www.gi.de/service/informatiklexikon/detailansicht/article/big-data.html
In my own words…
Big Data comprises
large, unstructured
volumes of data from
a variety of data sources
that are generated and analyzed
in a very short time.
What is Hadoop?
Module 1 – Intro & Big Data Buzzwords
History
Timeline: 2002, 2004, 2006
Nutch
GFS → NDFS
MapReduce → Nutch MapReduce
Hadoop
Doug Cutting | New York Times, 16 March 2009,
http://www.nytimes.com/imagepages/2009/03/16/business/17cloud.2.inline.ready.html
Hadoop components
MapReduce
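As a rough illustration of the MapReduce programming model, the following sketch counts words in plain Python, with no Hadoop involved. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample lines are made up for this example. A real Hadoop job would run the map and reduce steps in parallel across the nodes of a cluster, whereas this runs everything in a single process.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce one after another in a single process."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input; in Hadoop this would be file blocks stored in HDFS.
sample = ["big data hadoop", "hadoop mapreduce", "big data"]
counts = word_count(sample)
print(counts)
# prints {'big': 2, 'data': 2, 'hadoop': 2, 'mapreduce': 1}
```

Each map call sees only one line and each reduce call sees only one key, so neither step depends on the rest of the data. This independence is what lets a framework spread the work over many machines.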
What is HDInsight?
Module 1 – Intro & Big Data Buzzwords
HDInsight
HDInsight / Hadoop architecture
Distributed Storage (HDFS)
Distributed Processing (MapReduce)
Legend
Red = Core Hadoop
Blue = Data processing
Green = Packages
Dark blue = Microsoft integration points and value adds
Orange = Data Movement
©2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, Office, Azure, System Center, Dynamics and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.