Big Data Ingestion and Storage · download.microsoft.com/download/0/f/1/0f1b141a-9c69-4bea... · 2016/04/19...
Big Data Ingestion and Storage
Darwin Schweitzer
Senior Program Manager
Business is being transformed by three trends
Intelligence, Cloud, Big Data
Stay ahead of the curve with Cortana Intelligence Suite
[Diagram labels: Business apps; Custom apps; Sensors and devices; People; Automated systems; Data; Intelligence; Cortana Intelligence; Action; Apps]
Easily turn data into intelligent action
Data Sources: Apps; Sensors and devices
Information Management: Event Hubs; Data Catalog; Data Factory
Big Data Stores: SQL Data Warehouse; Data Lake Store
Machine Learning and Analytics: HDInsight (Hadoop and Spark); Stream Analytics; Data Lake Analytics; Machine Learning
Intelligence: Cortana; Bot Framework; Cognitive Services
Dashboards & Visualizations: Power BI
Action: People; Automated Systems; Apps; Web; Mobile; Bots
Stage labels: Data; Intelligence; Action
Big Data Ingestion
Data Sources: Apps; Sensors and devices
Information Management: Event Hubs; Data Factory
Machine Learning and Analytics: Stream Analytics
Compose and orchestrate data services at scale
[Diagram labels: DATA SOURCES; INGEST; SQL]
• Create, schedule, orchestrate, and manage data pipelines
• Visualize data lineage
• Connect to on-premises and cloud data sources
• Monitor data pipeline health
• Automate cloud resource management
• Move relational data for Hadoop processing
• Transform with Hive, Pig, or custom code
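The orchestration idea in the bullets above (activities that run in dependency order) can be sketched in plain Python. This is only an illustration of the concept, not Data Factory's own pipeline format or API: the activity names and the `PIPELINE` dictionary are hypothetical, and the ordering uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each activity maps to the set of activities it depends on.
PIPELINE = {
    "copy_sql_to_blob": set(),
    "hive_transform": {"copy_sql_to_blob"},
    "load_warehouse": {"hive_transform"},
    "refresh_report": {"load_warehouse"},
}


def run_order(pipeline):
    """Return activity names in an order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


def run(pipeline, actions):
    """Call each activity's function in dependency order; return the names run."""
    log = []
    for name in run_order(pipeline):
        actions[name]()  # an exception here stops the run, like a failed activity
        log.append(name)
    return log
```

Calling `run_order(PIPELINE)` lists the copy step first and the report refresh last; a circular dependency raises `graphlib.CycleError` instead of running anything.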
Information Management: Event Hubs; Data Factory
Ingest events from websites, apps and devices at cloud scale
• Log millions of events per second in near real time
• Connect devices using flexible authorization and throttling
• Use time-based event buffering
• Get a managed service with elastic scale
• Reach a broad set of platforms using native client libraries
• Pluggable adapters for other cloud services
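The slide does not define "time-based event buffering". One plausible reading is that events are retained for a fixed period so that consumers can read them at their own pace. The sketch below illustrates that reading only; `RetentionBuffer` and its methods are hypothetical names, not part of any Event Hubs client library, and timestamps are passed in explicitly to keep the example deterministic.

```python
from collections import deque


class RetentionBuffer:
    """Keep events for a fixed retention period (assumed reading of the slide)."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._events = deque()  # (timestamp, payload) pairs, oldest first

    def append(self, timestamp, payload):
        """Store one event; timestamps are assumed to arrive in increasing order."""
        self._events.append((timestamp, payload))

    def expire(self, now):
        """Drop events older than the retention period; return how many were dropped."""
        dropped = 0
        while self._events and now - self._events[0][0] > self.retention_seconds:
            self._events.popleft()
            dropped += 1
        return dropped

    def read_since(self, timestamp):
        """Return payloads of retained events at or after the given timestamp."""
        return [payload for t, payload in self._events if t >= timestamp]
```

A slow consumer can call `read_since` with the timestamp of the last event it handled, as long as that event is still inside the retention period.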
[Diagram labels: Data sources (Apps; Sensors and devices); Azure API Management; Backend Services; Event Hubs; SQL Database; Machine Learning; HDInsight; Storage; Power BI; Stream Analytics]
Information Management: Event Hubs; Data Factory
Big Data Stores
Data Sources: Apps; Sensors and devices
Information Management: Event Hubs; Data Catalog; Data Factory
Big Data Stores: SQL Data Warehouse; Data Lake Store
A hyper-scale repository for big data analytics workloads
• A Hadoop Distributed File System for the cloud
• No fixed limits on file size
• No fixed limits on account size
• Unstructured and structured data in their native format
• Massive throughput to increase analytic performance
• High durability, availability, and reliability
• Azure Active Directory access control
[Diagram labels: LOB Applications; Social; Devices; Clickstream; Sensors; Video; Web; Relational; HDInsight; ADL Analytics; Machine Learning; Spark; R; ADL Store]
Big Data Stores: SQL Data Warehouse; Data Lake Store
Elastic data warehouse as a service with enterprise-class features
• Petabyte scale with massively parallel processing
• Independent scaling of compute and storage—in seconds
• Transact-SQL queries across relational and non-relational data
• Full enterprise-class SQL Server experience
• Works seamlessly with Power BI, Machine Learning, HDInsight, and Data Factory
[Diagram labels: Power BI; App Service; SQL Database; SQL Data Warehouse; Machine Learning; Hadoop; Intelligent App]
Big Data Stores: SQL Data Warehouse; Data Lake Store
[Diagram labels: SaaS; Azure; Public Cloud; Office 365]
Example of Cortana Intelligence Suite in action
Stages: Data Sources, Ingest, Prepare, Analyze, Publish, Consume
[Diagram labels: Sensors and devices; Enterprise data sources; Event Hubs; Azure Blob storage; Stream Analytics; HDInsight; Machine Learning; SQL Data Warehouse; Diagnostic; Streaming; Power BI; Cortana; Business apps]
Data Factory: Move data, orchestrate, schedule and monitor
Data Catalog: Register, annotate, understand, discover data sets
Demo: Azure SQL Data Warehouse
Machine Learning and Analytics
Data Sources: Apps; Sensors and devices
Information Management: Event Hubs; Data Catalog; Data Factory
Big Data Stores: SQL Data Warehouse; Data Lake Store
Machine Learning and Analytics: HDInsight (Hadoop and Spark); Stream Analytics; Data Lake Analytics; Machine Learning
Stage labels: Data; Intelligence
Big data analytics made easy
• Analyze data of any kind and size
• Develop faster, debug and optimize smarter
• Interactively explore patterns in your data
• No learning curve—use U-SQL, Spark, Hive, HBase and Storm
• Managed and supported with an enterprise-grade SLA
• Dynamically scales to match your business priorities
• Enterprise-grade security with Azure Active Directory
• Built on YARN, designed for the cloud
[Diagram labels: Data Lake Analytics; SQL DW; SQL DB; Storage Blobs; Data Lake Store; SQL DB in a VM]
Machine Learning and Analytics: HDInsight (Hadoop and Spark); Stream Analytics; Data Lake Analytics; Machine Learning
Comprehensive set of managed Apache big data projects
• Scale to petabytes on demand
• Process unstructured and semi-structured data
• Develop in Java, .NET, and more
• Skip buying and maintaining hardware
• Deploy in Windows or Linux
• Spin up an Apache Hadoop cluster in minutes
• Visualize your Hadoop data in Excel
• Easily integrate on-premises Hadoop clusters
Core Engine
Batch: MapReduce
Script: Pig
SQL: Hive
NoSQL: HBase
Streaming: Storm
In-Memory: Spark
Machine Learning and Analytics: HDInsight (Hadoop and Spark); Stream Analytics; Data Lake Analytics; Machine Learning
https://blogs.technet.microsoft.com/machinelearning/2016/03/29/microsoft-makes-big-data-analytics-easier-in-the-cloud/
Microsoft Azure Data Lake
[Diagram labels: Analytics Service; U-SQL; HDInsight; YARN; Store; HDFS]
Demo: Azure Data Lake
If you would like Azure Data Lake Preview access
Send email to [email protected] with your:
• Name
• Azure Email Account
• Azure SubscriptionID
Example: Darwin Schweitzer, [email protected], bcb1d5d2-e6ea-492d-b9c7-xxxxxxxxxxxx
To use in the HOL or during the Hackathon
https://caqs.azure.net
For tomorrow's session: Power BI with Big Data Stores
Homework
• Go to https://caqs.azure.net/#gallery/datasciencevm
• Sign in with your Azure Subscription account
• Accept the Terms of Use for your Azure Subscription (Configure Programmatic Deployment)
• Click the Continue button to provision the Data Science VM
• Fill in parameters and click Create
• Connect to the VM and log in (How-To Guide to the Data Science Virtual Machine)
CAQS Project Naming recommendation
Pattern Id (first two characters) + your DOB (next 6 digits, mmddyy) + random letter a-z + random 2-digit number between 00 and 99
Example: ds100364a12
• ds: Pattern Id (Data Science VM)
• 100364: Date of birth, mmddyy
• a: Random letter a-z
• 12: Random 2-digit number 00-99
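The naming pattern above can be generated with a few lines of Python. This is a minimal sketch: the function name `caqs_project_name` and its parameters are made up for illustration, and it assumes the pattern id is always two characters, as in the `ds` example.

```python
import random
import string
from datetime import date


def caqs_project_name(pattern_id, dob, rng=random):
    """Build a name as: pattern id + DOB (mmddyy) + random letter + 2-digit number."""
    if len(pattern_id) != 2:
        raise ValueError("pattern id must be two characters, e.g. 'ds'")
    letter = rng.choice(string.ascii_lowercase)  # random letter a-z
    number = rng.randint(0, 99)                  # random number 00-99, inclusive
    return f"{pattern_id}{dob.strftime('%m%d%y')}{letter}{number:02d}"


# Same pattern id and date of birth as the slide's example; the suffix is random.
print(caqs_project_name("ds", date(1964, 10, 3)))
```

Passing a seeded `random.Random` instance as `rng` makes the suffix reproducible, which can help when a deployment has to be re-run under the same name.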
Machine Learning and Analytics: HDInsight (Hadoop and Spark); Stream Analytics; Data Lake Analytics; Machine Learning
Real-time stream processing in the cloud
• Perform real-time analytics for your Internet of Things solutions
• Stream millions of events per second
• Get mission-critical reliability and performance with predictable results
• Create real-time dashboards and alerts over data from devices and applications
• Correlate across multiple streams of data
• Use familiar SQL-based language for rapid development
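A common pattern in stream processing is to aggregate events over fixed, non-overlapping time windows (often called tumbling windows). The sketch below shows that idea in plain Python rather than in the service's SQL-based query language. The event shape `(timestamp, device_id)` and the function name are assumptions for illustration.

```python
from collections import Counter


def tumbling_counts(events, window_seconds):
    """Count events per (window_start, device_id) over fixed, non-overlapping windows."""
    counts = Counter()
    for timestamp, device_id in events:
        # Every timestamp falls into exactly one window, identified by its start time.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, device_id)] += 1
    return dict(counts)


# Hypothetical device events: (seconds since start, device id)
events = [(1, "a"), (4, "a"), (9, "b"), (11, "a")]
print(tumbling_counts(events, 10))
```

Because each event lands in exactly one window, the counts can feed a dashboard or an alert rule without double counting.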
[Diagram labels: inputs: Event Hubs; Blob Storage. Processing: Stream Analytics. Outputs: SQL database; Event Hubs; Power BI; Blob Storage; Table Storage]
Keep a pulse on your business with live, interactive dashboards
[Diagram labels: Event Hubs; Stream Analytics; Machine Learning; Storage; SQL database; HDInsight; Power BI]
Dashboards & Visualizations: Power BI
• Analytics for everyone, even non-data experts
• Your whole business on one dashboard
• Create stunning, interactive reports
• Drive consistent analysis across your organization
• Embed visuals in your applications
• Get real-time alerts when things change
DataSnowman
A cloud-scale HDFS store designed for parallel processing workloads
Accessible to all HDFS-compliant analytics applications and tools
• No limits to scale
• Intelligent data storage
• Enterprise-grade security
Questions or Comments?
Thank you
[email protected]