marko.hotti@microsoft… · Gartner disclaims all warranties, expressed or implied, with respect...
* Disclaimer: Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors
with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner
disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
GARTNER MAGIC QUADRANT DW & BI
• Business Intelligence and Analytics Platforms
• Data Warehouse Database Management Systems
The Traditional Data Warehouse
Breaking Points of The Traditional Data Warehouse
Introducing The Modern Data Warehouse
Data Sources
Business Intelligence
Microsoft Hadoop Vision
Insights to all users by activating new types of data
Limitations: Performance and Scale today
• Diminishing performance
• Existing Tables (Partitions)
• Rowstore
• Diminishing scale as requirements grow
• Non-optimal performance for many DW queries
• Scale UP
SQL Server 2012 Parallel Data Warehouse (PDW)
Insights on any data of any size
• Built For Big Data
• Next-generation Performance At Scale
• Manageable Costs
• Appliance Simplicity: HW + SW
• Query Performance
• Scale Out MPP versus Scale Up SMP
• “Big Data” Integration
• Updateable xVelocity Columnstore
What is Parallel Data Warehouse?
• Shared-nothing parallel database system
» Massively parallel processing (MPP)
» A “Control” server that accepts user queries, generates a plan, and distributes operations in parallel to compute nodes
» Multiple “Compute” servers running SQL Server
» A “Management” server for administering the system
» A “Data Movement Service” that facilitates parallel SQL operations
• Delivered as an appliance
» Balanced and pre-configured software and industry standard hardware from Dell or HP
» Single Call Support
» Fastest Time to Market
» Scales from 2 to 56 Nodes
HP Example
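The control/compute split described above follows a scatter-gather pattern. The sketch below is a toy illustration in Python, not PDW code: the node data, `compute_node_step`, and `control_node_query` are hypothetical names, and threads stand in for separate compute servers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands for the rows held by one compute node.
COMPUTE_NODES = [
    [("store_a", 10), ("store_b", 5)],
    [("store_a", 7), ("store_c", 3)],
    [("store_b", 8), ("store_c", 1)],
]


def compute_node_step(rows):
    """Work done locally on one compute node: a partial SUM per store."""
    partial = {}
    for store, qty in rows:
        partial[store] = partial.get(store, 0) + qty
    return partial


def control_node_query(nodes):
    """Control-node role: fan the step out in parallel, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(compute_node_step, nodes))
    totals = {}
    for partial in partials:
        for store, qty in partial.items():
            totals[store] = totals.get(store, 0) + qty
    return totals


totals = control_node_query(COMPUTE_NODES)
```

Each node only touches its own rows; the only shared step is the final merge, which is the sense in which the design is "shared nothing".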
Key Design Elements
• Modular Design
• High Density
• Leverage latest Microsoft software features
» Windows Server 2012 Storage Spaces
» Windows Server 2012 Hyper-V
» SQL Server 2012 xVelocity ColumnStore
HP Example
Ultra Shared Nothing architecture: Distribution
Larger Fact Table is Hash Distributed Across All Compute Nodes
• Time Dim (TD): Date Dim ID, Calendar Year, Calendar Qtr, Calendar Mo, Calendar Day
• Store Dim (SD): Store Dim ID, Store Name, Store Mgr, Store Size
• Product Dim (PD): Prod Dim ID, Prod Category, Prod Sub Cat, Prod Desc
• Mktg Campaign Dim (MD): Mktg Camp ID, Camp Name, Camp Mgr, Camp Start, Camp End
• Sales Facts (SF): Date Dim ID, Store Dim ID, Prod Dim ID, Mktg Camp ID, Qty Sold, Dollars Sold
[Figure: five groups, each showing TD, SD, PD, MD and SF, labeled 01-08, 09-16, 17-24, 25-32 and 33-n]
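The idea of hash-distributing the large fact table while keeping a copy of each small dimension table on every node can be sketched as follows. This is an illustration only: the node count, the CRC32 hash, and the sample rows are assumptions, and PDW's actual hash function and distribution layout are not described in the slides.

```python
import zlib

NUM_NODES = 4  # assumed node count, chosen only for the example


def node_for(key, num_nodes=NUM_NODES):
    """Map a distribution-key value to a node with a stable hash (CRC32 here)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes


def distribute(fact_rows, key_index, num_nodes=NUM_NODES):
    """Split fact rows across nodes by hashing one column."""
    nodes = [[] for _ in range(num_nodes)]
    for row in fact_rows:
        nodes[node_for(row[key_index], num_nodes)].append(row)
    return nodes


# Hypothetical Sales Facts rows: (store_dim_id, qty_sold)
sales_facts = [(store_id, qty) for store_id in range(1, 9) for qty in (1, 2, 3)]
# Small dimension table, replicated: every node gets the full copy.
store_dim = {store_id: f"Store {store_id}" for store_id in range(1, 9)}

nodes = distribute(sales_facts, key_index=0)

# Each node joins its slice of facts with its local copy of the dimension,
# so no rows need to move between nodes for this join.
local_joins = [
    [(store_dim[store_id], qty) for store_id, qty in node_rows]
    for node_rows in nodes
]
```

Because the hash is deterministic, all fact rows with the same key value land on the same node, which is what lets a join or aggregation on that key run without cross-node traffic.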
In-Memory Columnstore in PDW V2 & SQL Server 2014
• xVelocity in-memory columnstore in PDW: columnstore index as the primary data store in a scale-out MPP data warehouse (PDW V2 appliance)
• Updateable clustered columnstore index (CCI)
» Support for bulk load and insert/update/delete
• Extended data types: decimal/numeric for all precision and scale
• Query processing enhancements for more batch mode processing (for example, outer/semi/antisemi joins, UNION ALL, scalar aggregation)
Customer benefits
• Outstanding query performance from in-memory columnstore index
» 600 GB per hour for a single 12-core server
• Significant hardware cost savings due to high compression
» 4–15x compression ratio
• Improved productivity through updateable index
• Ships in PDW V2 appliance and SQL Server 2014
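Why column-oriented storage compresses well can be shown with a small sketch. The run-length encoding below is a generic technique picked for illustration; it is not a description of the xVelocity format, and the sample rows and function names are hypothetical.

```python
def to_columns(rows, names):
    """Pivot row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def run_length_encode(values):
    """Collapse consecutive repeats into [value, count] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


def run_length_decode(runs):
    """Expand [value, count] pairs back into the original list."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# Hypothetical fact rows: (calendar_year, prod_category, qty_sold)
rows = [(2012, "Bikes", 3), (2012, "Bikes", 1), (2012, "Helmets", 2),
        (2013, "Helmets", 5), (2013, "Helmets", 4), (2013, "Helmets", 4)]
columns = to_columns(rows, ["calendar_year", "prod_category", "qty_sold"])
encoded = {name: run_length_encode(values) for name, values in columns.items()}
```

Low-cardinality columns such as `calendar_year` collapse to a handful of runs, while a high-variance column such as `qty_sold` barely shrinks, so the achieved ratio depends on the data.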
Introducing PolyBase
Fundamental breakthrough in data processing
Single Query; Structured and Unstructured
• Query and join Hadoop tables with Relational Tables
• Use Standard SQL language
• SELECT, FROM, WHERE
Benefits
• Existing SQL Skillset
• No IT Intervention
• Save Time and Costs
• Analyze All Data Types
[Figure labels: SQL; SQL Server 2012 PDW, Powered by PolyBase; Database; HDFS (Hadoop)]
External Tables
» An external table is PDW’s representation of data residing in HDFS
» The “table” (metadata) lives in the context of a SQL Server database
» The actual table data resides in HDFS
CREATE EXTERNAL TABLE table_name ({<column_definition>} [,...n ])
{WITH (LOCATION = '<URI>', [FORMAT_OPTIONS = (<VALUES>)])}
[;]
» LOCATION: required to indicate location of Hadoop cluster
» FORMAT_OPTIONS: optional format options associated with parsing of data from HDFS (e.g. field delimiters & reject-related thresholds)
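What a field delimiter and a reject threshold mean when parsing delimited text can be sketched in a few lines. This is a conceptual illustration in Python, not how PDW parses HDFS data; `parse_delimited`, `reject_limit`, and the sample lines are hypothetical.

```python
class RejectLimitExceeded(Exception):
    """Raised when more rows fail to parse than the caller allows."""


def parse_delimited(lines, column_types, field_terminator="|", reject_limit=0):
    """Parse text lines into typed rows, skipping bad rows up to reject_limit."""
    rows, rejected = [], 0
    for line in lines:
        fields = line.rstrip("\n").split(field_terminator)
        try:
            if len(fields) != len(column_types):
                raise ValueError("wrong number of fields")
            rows.append(tuple(cast(f) for cast, f in zip(column_types, fields)))
        except ValueError:
            # A row that does not fit the declared columns counts as a reject.
            rejected += 1
            if rejected > reject_limit:
                raise RejectLimitExceeded(f"{rejected} rows rejected")
    return rows, rejected


sample = [
    "/home|2013-05-01|10",
    "/cart|2013-05-01|not-a-number",
    "/checkout|2013-05-02|4",
]
rows, rejected = parse_delimited(sample, (str, str, int), reject_limit=1)
```

With `reject_limit=1` the malformed middle line is skipped and counted; with the default of 0 the same input raises `RejectLimitExceeded`.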
Native Query Across Hadoop and PDW
Parallel Data Import from HDFS into PDW
• Persistently storing data from HDFS in PDW tables
• Fully parallelized via CREATE TABLE AS SELECT (CTAS) with external tables as source table and PDW tables (either distributed or replicated) as destination
CREATE TABLE ClickStream_PDW WITH (DISTRIBUTION = HASH(url))
AS SELECT url, event_date, user_IP FROM ClickStream;
• Retrieval of data in HDFS “on-the-fly”
[Figure labels: Unstructured data in Hadoop (Sensor & RFID, Web Apps, Social Apps, Mobile Apps); Parallel HDFS Reads; HDFS bridge; DMS Reader 1 … DMS Reader N; Parallel Importing; External Table; CTAS Results; Enhanced PDW query engine; Structured data in PDW (Traditional DW applications)]
Native Query Across Hadoop and PDW
Parallel Data Export from PDW into HDFS
• Fully parallelized via CREATE EXTERNAL TABLE AS SELECT (CETAS) with external tables as destination table and PDW tables as source
• ‘Round-trip of data’ possible by first importing data from HDFS, joining it with relational data, and then exporting the results back to HDFS
CREATE EXTERNAL TABLE ClickStream (url, event_date, user_IP)
WITH (LOCATION = 'hdfs://MyHadoop:5000/users/outputDir', FORMAT_OPTIONS
(FIELD_TERMINATOR = '|')) AS SELECT url, event_date, user_IP FROM ClickStream_PDW
[Figure labels: Structured data in PDW (Traditional DW applications); Enhanced PDW query engine; CETAS Results; External Table; Parallel Reading; DMS Writer 1 … DMS Writer N; HDFS bridge; Parallel HDFS Writes; HDFS data nodes; Unstructured data (Sensor & RFID, Web Apps, Social Apps, Mobile Apps)]
PDW V2.0 Management Dashboard
Microsoft Business Intelligence Platform