(Windays 13) Microsoft Big Data Platform

Microsoft Big Data Platform. Luka Lovošević, Marko Tošić, MICROSOFT HRVATSKA

Upload: luka-lovosevic

Post on 20-Jun-2015



DESCRIPTION

Microsoft Big Data Platform, Big Data, Cloud, Azure, Hadoop, HDInsight, Mahout

TRANSCRIPT

Page 1: (Windays 13) Microsoft Big Data Platform

Microsoft Big Data Platform

Luka Lovošević, Marko Tošić

MICROSOFT HRVATSKA

Page 2: (Windays 13) Microsoft Big Data Platform

Please mute your phones

Page 3: (Windays 13) Microsoft Big Data Platform

Contents
• Introduction to Big Data
• Overview of the Microsoft platform
• Hadoop
• Demo

Page 4: (Windays 13) Microsoft Big Data Platform

What is Big Data?

Page 5: (Windays 13) Microsoft Big Data Platform

MICROSOFT CONFIDENTIAL – INTERNAL ONLY

Page 6: (Windays 13) Microsoft Big Data Platform

What is Big Data?

Data that matters to you, but that you cannot process with traditional tools.

VOLUME (amount)

VARIETY (structure)

VELOCITY (speed)

Page 7: (Windays 13) Microsoft Big Data Platform

Data sources

Telematics Text

Smart-Grid Sensor

Time and Place RFID

Telemetry Social Networks

Page 8: (Windays 13) Microsoft Big Data Platform

What is Big Data?

Advanced analytics

Real-time data

Social media analytics

How can I improve my business depending on weather conditions or gossip from social networks, …?

What is being said about my product on social networks?

How can I better spot trends and react to them?

Page 9: (Windays 13) Microsoft Big Data Platform

Big Data algorithms
• Mining Social-Network Graphs
• Finding Similar Items
• Mining Data Streams
• Frequent Item Sets
• Advertising on the Web
• Link Analysis
• Recommendation Systems
• Clustering

Page 10: (Windays 13) Microsoft Big Data Platform

Microsoft Big Data Platform

Page 11: (Windays 13) Microsoft Big Data Platform

Microsoft Big Data Platform
• SQL Server StreamInsight
• Hadoop – HDInsight (Windows or Azure)
• SQL Server 2012 Parallel Data Warehouse
• Self-service BI tools

Page 12: (Windays 13) Microsoft Big Data Platform

Microsoft Big Data Platform

[Diagram: SQL Server, PDW, HDInsight and StreamInsight positioned along three axes]

Axes: Volume, Variety, Velocity

Scale labels: small / big, fk/pk / k/v, pull / push

Page 13: (Windays 13) Microsoft Big Data Platform

A bit more about Hadoop…

Page 14: (Windays 13) Microsoft Big Data Platform

What is Hadoop?
• A platform for processing large amounts of data.
• Apache, open source.
• Based on Google's GFS and the MapReduce algorithm.
• Highly scalable and distributed.
• Inexpensive hardware.

[Timeline: 2004, 2006, 2008, 2010, 2012, 2013. Milestone labels: Yahoo!, Apache project, Enterprise Hadoop]

Page 15: (Windays 13) Microsoft Big Data Platform

Hadoop architecture

Page 16: (Windays 13) Microsoft Big Data Platform

MapReduce (i)

[Diagram: Files and four Server nodes]

Page 17: (Windays 13) Microsoft Big Data Platform

MapReduce (ii)

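// Map and reduce functions in JavaScript (word count)

var map = function (key, value, context) {
    // Split the input line on every non-letter character.
    var words = value.split(/[^a-zA-Z]/);
    for (var i = 0; i < words.length; i++) {
        // Skip the empty strings produced by consecutive separators.
        if (words[i] !== "") {
            context.write(words[i].toLowerCase(), 1);
        }
    }
};

var reduce = function (key, values, context) {
    // Add up all the counts emitted for this word.
    var sum = 0;
    while (values.hasNext()) {
        sum += parseInt(values.next(), 10);
    }
    context.write(key, sum);
};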

[Diagram: Code and four Server nodes]

Page 18: (Windays 13) Microsoft Big Data Platform

MapReduce example
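The word-count functions from the MapReduce (ii) slide can be tried out without a cluster. The sketch below is only an illustration of the map, group-by-key, reduce flow, not how Hadoop or HDInsight actually schedules work. The helpers `makeIterator` and `runLocally` are hypothetical names introduced here, and the `context.write` / `hasNext()` / `next()` shapes are assumed from the slide's code.

```javascript
// Word-count map function in the style of the slide's code.
function map(key, value, context) {
  const words = value.split(/[^a-zA-Z]/);
  for (const word of words) {
    if (word !== "") {
      context.write(word.toLowerCase(), 1);
    }
  }
}

// Word-count reduce function: sums the counts emitted for one key.
function reduce(key, values, context) {
  let sum = 0;
  while (values.hasNext()) {
    sum += parseInt(values.next(), 10);
  }
  context.write(key, sum);
}

// Hypothetical helper: wraps an array in the hasNext()/next() shape
// that the reduce function expects.
function makeIterator(items) {
  let index = 0;
  return {
    hasNext: () => index < items.length,
    next: () => items[index++],
  };
}

// Hypothetical single-process driver: map every line, group the emitted
// values by key (the "shuffle" step), then reduce each group.
function runLocally(lines) {
  const groups = new Map();
  const mapContext = {
    write(key, value) {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    },
  };
  lines.forEach((line, lineNumber) => map(lineNumber, line, mapContext));

  const results = {};
  const reduceContext = {
    write(key, value) {
      results[key] = value;
    },
  };
  for (const [key, values] of groups) {
    reduce(key, makeIterator(values), reduceContext);
  }
  return results;
}

const counts = runLocally(["Big Data is big", "Hadoop processes big data"]);
console.log(counts);
```

On a real cluster the map calls run on the servers that hold the input files and the grouping happens over the network. Here everything runs in one process so the data flow is easy to follow.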

Page 19: (Windays 13) Microsoft Big Data Platform

HDInsight

Hadoop

• Programming in .NET
• Security, HA & management
• Virtualization support
• Integration with Microsoft BI tools
• Same experience on-premises and in the cloud

Hadoop for Windows Server
Hadoop for Windows Azure

Page 20: (Windays 13) Microsoft Big Data Platform

Technology around HDInsight

Page 21: (Windays 13) Microsoft Big Data Platform

Mahout

A library of scalable machine learning algorithms based on MapReduce. Runs on Hadoop infrastructure.

Usage scenarios:
• Recommendation mining
• Clustering
• Classification

Page 22: (Windays 13) Microsoft Big Data Platform

Demo

Mahout song recommendation
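To give a feel for what recommendation mining computes, here is a minimal in-memory sketch of one common approach, item co-occurrence counting. It is not Mahout's API and not necessarily what the demo ran. The listening data, the song names, and the functions `buildCooccurrence` and `recommend` are all made up for illustration.

```javascript
// Toy listening history: user -> songs they played (made-up data).
const listens = {
  ana: ["songA", "songB", "songC"],
  ivan: ["songA", "songB"],
  marko: ["songB", "songC", "songD"],
};

// Count how often each ordered pair of distinct songs appears
// in the same user's history.
function buildCooccurrence(history) {
  const counts = new Map();
  for (const songs of Object.values(history)) {
    for (const a of songs) {
      for (const b of songs) {
        if (a === b) continue;
        if (!counts.has(a)) counts.set(a, new Map());
        const row = counts.get(a);
        row.set(b, (row.get(b) || 0) + 1);
      }
    }
  }
  return counts;
}

// Score songs the user has not played by summing their co-occurrence
// counts with the songs the user has played, then return the top N.
function recommend(user, history, topN) {
  const cooccurrence = buildCooccurrence(history);
  const seen = new Set(history[user]);
  const scores = new Map();
  for (const song of seen) {
    const row = cooccurrence.get(song) || new Map();
    for (const [other, count] of row) {
      if (seen.has(other)) continue;
      scores.set(other, (scores.get(other) || 0) + count);
    }
  }
  return [...scores.entries()]
    .sort((x, y) => y[1] - x[1] || x[0].localeCompare(y[0]))
    .slice(0, topN)
    .map(([song]) => song);
}

console.log(recommend("ivan", listens, 2));
```

The pair counting is the part that benefits from MapReduce at scale: each user's history can be mapped to song pairs independently, and the counts per pair can be summed in the reduce step.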

Page 23: (Windays 13) Microsoft Big Data Platform

Questions and answers