sql server parallel data warehouse: supporting large scale analytics

SQL Server Parallel Data Warehouse:

Supporting Large Scale AnalyticsJosé Blakeley, Software Architect

Database Systems Group, Microsoft Corporation

SQL Server PDW Overview

3/18/2011JHU DIR March 20112

Workload Types

3/18/2011JHU DIR March 2011

Online Transaction Processing (OLTP) Balanced read-update ratio (60%-40%) Fine-grained inserts and updates High transaction throughput e.g., 10s K/s Usually very short transactions e.g., 1-3 tables Sometimes multi-step e.g., financial Relatively small data sizes e.g., few TBs

Data Warehousing and Business Analysis (DW) Read-mostly (90%-10%) Few updates in place, high-volume bulk inserts Concurrent query throughput e.g., 10s K / hr Per query response time < 2 s Snowflake, star schemas are common e.g., 5-10 tables Complex queries (filter, join, group-by, aggregation) Very large data sizes e.g., 10s TB - PB

Day-to-day business

Analysis over historical data

3

SQL Server Parallel Data Warehouse Shared-nothing, distributed, parallel DBMS

Built-in data and query partitioning Provides single system view over a cluster of SQL

Servers Appliance concept

Software + hardware solution Choice of hardware vendors (e.g., HP, Dell, NEC)

Optimized for DW workloads Bulk loads (1.2 – 2.0 TB/hr) Sequential scans (700 TB in 3hr)

Scale from 10 Terabytes to Petabytes 1 rack manages ~40 TB 1 PB will need ~25 racks

3/18/2011JHU DIR March 20114

Hardware Architecture


Compute Nodes

Du

al

Infi

nib

an

d

Control Nodes

Active /

Passive

Spare Compute Node Du

al Fib

er

Ch

an

nel

Client Drivers

(ODBC, OLE-DB,

ADO.NET)

ETL Load Interface

Corporate Backup Solution

Data Center Monitoring

2 Rack Appliance2 Rack Appliance

5

Software Architecture


Compute NodesCompute Nodes

Compute Node

Query Tool

MS BI(AS, RS)

Control Node

3rd Party Tools

DWSQL

Landing Zone Node

Internet

Explorer

SQL Server

DW Authenticati

on

DW Configurati

on

DW Schema

TempDB

SQL Server

User Data

Data Movement Service



PDW Engine

IISAdmin

Console

Data Access (OLEDB, ODBC, ADO.NET,

JDBC)

6

Key Software Functionality


PDW Engine Provides single system image SQL compilation Global metadata and

appliance configuration Global query optimization and

plan generation Global query execution

coordination Global transaction

coordination Authentication and

authorization Supportability (HW and SW

status info via DMVs) Data Movement Service

Data movement across the appliance

Distributed query execution operators

Parallel Loader Runs from the Landing Zone

SSIS or command line tool

Parallel Database Copy High performance data export

Enables Hub-Spoke scenarios

Parallel Backup/Restore Backup files stored on Backup

Nodes

Backup files may be archived into external device/system

7

Query Processing SQL statement compilation

Parsing, validation, optimization Builds an MPP execution plan

A sequence of discrete parallel QE “steps” Steps involve SQL queries to be executed by SQL

Server at each compute node As well as data movement steps

Executes the plan Coordinates workflow among steps Assembles the result set

Returns result set to client

3/18/2011JHU DIR March 20118

3/18/2011JHU DIR March 201104/21/23

18,000,048,306 rows

4,500,000,000 rows

450,000,000 rows

600,000,000 rows

Example DW Schema

30,000,000 rows

25 rows

5 rows

2,400,000,000 rows

SELECT TOP 10L_ORDERKEY,SUM(L_EXTENDEDPRICE*(1-

L_DISCOUNT)) AS REVENUE,O_ORDERDATE,O_SHIPPRIORITY

FROM CUSTOMER,ORDERS,LINEITEM

WHERE C_MKTSEGMENT = 'BUILDING' AND C_CUSTKEY = O_CUSTKEY AND L_ORDERKEY = O_ORDERKEY AND O_ORDERDATE < ‘2010-03-05' AND L_SHIPDATE > ‘2010-03-05'GROUP BY L_ORDERKEY,

O_ORDERDATE,O_SHIPPRIORITY

ORDER BY REVENUE DESC,O_ORDERDATE

9

Example – Schema TPCH


------------------------------------------------------------------------ Customer Table-- distributed on c_custkey----------------------------------------------------------------------CREATE TABLE customer ( c_custkey bigint, c_name varchar(25), c_address varchar(40), c_nationkey integer, c_phone char(15), c_acctbal decimal(15,2), c_mktsegment char(10), c_comment varchar(117)) WITH (distribution=hash(c_custkey)) ;------------------------------------------------------------------------ Orders Table----------------------------------------------------------------------CREATE TABLE orders ( o_orderkey bigint, o_custkey bigint, o_orderstatus char(1), o_totalprice decimal(15,2), o_orderdate date, o_orderpriority char(15), o_clerk char(15), o_shippriority integer, o_comment varchar(79))WITH (distribution=hash(o_orderkey)) ;

------------------------------------------------------------------------ LineItem Table-- distributed on l_orderkey----------------------------------------------------------------------CREATE TABLE lineitem ( l_orderkey bigint, l_partkey bigint, l_suppkey bigint, l_linenumber bigint, l_quantity decimal(15,2), l_extendedprice decimal(15,2), l_discount decimal(15,2), l_tax decimal(15,2), l_returnflag char(1), l_linestatus char(1), l_shipdate date, l_commitdate date, l_receiptdate date, l_shipinstruct char(25), l_shipmode char(10), l_comment varchar(44))WITH (distribution=hash(l_orderkey)) ;

10

Example - Query


SELECT TOP 10 L_ORDERKEY,SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS REVENUE,O_ORDERDATE,O_SHIPPRIORITY

FROM CUSTOMER, ORDERS, LINEITEMWHERE C_MKTSEGMENT = 'BUILDING' AND C_CUSTKEY = O_CUSTKEY AND L_ORDERKEY = O_ORDERKEY AND O_ORDERDATE < ‘2010-03-05' AND L_SHIPDATE > ‘2010-03-05'GROUP BY L_ORDERKEY,

O_ORDERDATE,O_SHIPPRIORITY

ORDER BY REVENUE DESC,O_ORDERDATE

Ten largest “building” orders shipped since March 5, 2010

11

Example – Execution Plan


-------------------------------- Step 1: create temp table at control node------------------------------CREATE TABLE [tempdb].[dbo].[Q_[TEMP_ID_664]]( [l_orderkey] BIGINT, [REVENUE] DECIMAL(38, 4), [o_orderdate] DATE, [o_shippriority] INTEGER );-------------------------------- Step 2: create temp tables at all compute nodes------------------------------CREATE TABLE [tempdb].[dbo].[Q_[TEMP_ID_665]_[PARTITION_ID]]( [l_orderkey] BIGINT, [l_extendedprice] DECIMAL(15, 2), [l_discount] DECIMAL(15, 2), [o_orderdate] DATE, [o_shippriority] INTEGER, [o_custkey] BIGINT, [o_orderkey] BIGINT ) WITH ( DISTRIBUTION = HASH([o_custkey]) );

--------------------------------- Step 3: SHUFFLE_MOVE--------------------------------SELECT [l_orderkey], [l_extendedprice], [l_discount],[o_orderdate], [o_shippriority], [o_custkey], [o_orderkey]FROM [dwsys].[dbo].[orders] JOIN [dwsys].[dbo].[lineitem]ON ([l_orderkey] = [o_orderkey]) WHERE ([o_orderdate] < ‘2010-03-05' AND [o_orderdate] >= ‘2010-09-15 00:00:00.000')INTO Q_[TEMP_ID_665]_[PARTITION_ID]SHUFFLE ON (o_custkey);

-------------------------------- Step 4: PARTITION_MOVE------------------------------SELECT [l_orderkey],sum(([l_extendedprice] * (1 - [l_discount]))) AS REVENUE,[o_orderdate], [o_shippriority]FROM [dwsys].[dbo].[customer] JOIN tempdb.Q_[TEMP_ID_665]_[PARTITION_ID] ON ([c_custkey] = [o_custkey]) WHERE [c_mktsegment] = 'BUILDING'GROUP BY [l_orderkey], [o_orderdate], [o_shippriority]INTO Q_[TEMP_ID_664];-------------------------------- Step 5: Drop temp tables at all compute nodes------------------------------DROP TABLE tempdb.Q_[TEMP_ID_665]_[PARTITION_ID];--------------------------------- Step 6: RETURN result to client--------------------------------SELECT TOP 10 [l_orderkey], sum([REVENUE]) AS REVENUE,[o_orderdate], [o_shippriority]FROM tempdb.Q_[TEMP_ID_664]GROUP BY [l_orderkey], [o_orderdate], [o_shippriority]ORDER BY [REVENUE] DESC, [o_orderdate] ;--------------------------------- Step 7: Drop temp table at control node--------------------------------DROP TABLE tempdb.Q_[TEMP_ID_664];

12

Microsoft Column-store Technology

VertiPaq and VertiScanIn-memory BI (IMBI)Slides by Amir Netz

JHU DIR March 2011 3/18/201115

In-Memory BI Technology

Developed by SQL Analysis Services (OLAP) team

Column-based storage and processing Only touch the columns needed for the query

Compression (VertiPaq) Columnar data is more compressible than row

data Fast in-memory processing (VertiScan)

Filter, grouping, aggregation, sorting

JHU DIR March 2011 3/18/201116

How VertiPaq Compression WorksRead Raw

DataRead Raw

Data

Dictionary EncodingDictionary Encoding

Value Encoding

Value Encoding

Bit PackingBit PackingRun Length Encoding

(RLE)

Run Length Encoding

(RLE)

Phase I: Encoding

Phase II: Compression

Convert to uniform representation (Integer Vectors)

Encoding is per column

Minimize storage space

Compression is per 8M row segments

2x – 10x size reduction2x – 10x size reduction 1x – 2x size reduction1x – 2x size reduction

2x – 4x size reduction2x – 4x size reduction~100x size reduction~100x size reduction

Compression AnalysisCompression Analysis

5%-25% of data5%-25% of data75%-95% of data75%-95% of data

Organize by Columns

Organize by Columns

VertiPaq CompressionVertiPaq Compression

3/18/2011

17

Hybrid RLE

JHU DIR March 2011

3/18/2011JHU DIR March 201118

Region (dbo) (Read-only)RegionId

AreaId

RegionCode

RegionName

TECSPURSL00 (dbo) (Read-only)FTxlatOrgId

CTxlatOrgId

TRCreditedSubsidiaryId

BillingCurrencyID

ProductId

DataSourceId

ManagementReportingId

RecordTypeId

SalesDateId

BillingMonthSalesDateID

ActualQuantityCnt

ActualLicenseCnt

SecondaryLicenseCnt

AdjustedLicenseCnt

ActualRevenueAmt

AdjustedGrossRevenueAmt

JointVentureRevenueAmt

LicenseTransactionItemId

GeographyId

JointVentureActualQuantityCnt

J ointVentureActualLicenseCnt

JointVentureSecondaryLicenseCnt

BillDocTypeID

SalesOfficeID

ReasonID

BillingStatusID

TopParentProductID

ParentProductID

BundleFlagID

ExtractListID

ServiceRevenueAmt

ServiceActualQuantityCnt

ServiceAdjustedRevenueAmt

RevenueCommitmentMultiplier

LicenseCommitmentMultiplier

CommitmentPeriod

DataSourcePlanningFlag

AdvisorID

OrderStatusID

SalesDate (dbo) (Read-only)SalesDateId

FiscalWeekID

FiscalMonthID

FiscalQuarterID

FiscalYearID

FiscalQuarterName

FiscalMonthName

FiscalYearName

CalendarMonthName

FWBeginDate

FWEndDate

CalendarDate

CalendarDateName

MonthRelativeID

FMBeginDate

FMEndDate

TransactionDataPopulated

FiscalSemesterID

FiscalSemesterName

AggregatedSalesDateID

UpperGeography (dbo) (Read-only)CreditedSubsidiaryID

CreditedSubRegionID

CreditedRegionID

CreditedAreaID

CreditedBigAreaID

FiscalYear (dbo) (Read-only)FiscalYearId

FiscalYearName

04/21/23

Star Join Schema

34 rows

436,892,631 rows

41 rows

13,517 rows

118 rows

SELECT FE0.RegionName, FL0.FiscalYearName, SUM (A.ActualRevenueAmt)FROM TECSPURSL00 A JOIN SalesDate L ON A.SalesDateID = L.SalesDateID JOIN UpperGeography UGON A.TRCreditedSubsidiaryId = UG.CreditedSubsidiaryIDJOIN Region FE0 ON UG.CreditedRegionID = FE0.RegionID JOIN FiscalYear FL0 ON L.FiscalYearID = FL0.FiscalYearID GROUP BY FE0.RegionName, FL0.FiscalYearName

APPX – 1TB

Column-Store on APPX

3/18/2011JHU DIR March 201119

Response time < 2s common Smaller variance in response time more

predictable query performance

THANKS!

3/18/2011JHU DIR March 201120

sql server parallel data warehouse: supporting large scale analytics

Documents