Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Tuning and Optimizing U-SQL queries for maximum performance. Michael Rys, Principal Program Manager, Microsoft. @MikeDoesBigData. AD-315-M, AD-400-M

Upload: michael-rys

Post on 16-Apr-2017


TRANSCRIPT

Page 1: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Tuning and Optimizing U-SQL queries for maximum performance

Michael Rys, Principal Program Manager, Microsoft

@MikeDoesBigData

AD-315-M, AD-400-M

Page 2: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Please silence cell phones

Page 3: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

MICROSOFT CONFIDENTIAL – INTERNAL ONLY

Session Objectives And Takeaways

Session Objective(s):
• Understand the U-SQL query execution
• Be able to understand and improve U-SQL job performance/cost
• Be able to understand and improve the U-SQL query plan
• Know how to write more efficient U-SQL scripts

Key Takeaways:
• U-SQL is designed for scale-out
• U-SQL provides scalable execution of user code
• U-SQL has a tool set that can help you analyze and improve your scalability, cost and performance

Page 4: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Agenda: U-SQL Query Execution and Performance Tuning

• Job Execution Experience and Investigations
  • Query Execution
  • Stage Graph
  • Dryad crash course
  • Job Metrics
  • Resource Planning

• Job Performance Analysis
  • Analyze the critical path
  • Heat Map
  • Critical Path
  • Data Skew

• Tuning / Optimizations
  • Cost Optimizations
  • Data Partitioning
  • Partition Elimination
  • Predicate Pushing
  • Column Pruning
  • Some Data Hints
  • UDOs can be evil
  • INSERT optimizations

Page 5: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

U-SQL Query Execution

Page 6: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Overall U-SQL Batch Job Execution Lifetime

[Diagram. Labels: Author; Front-End Service; USQL Compiler Service & USQL Catalog; Compiler, Optimizer; Plan; Optimized Plan; Job Scheduler & Queue; Job Manager; Vertex Scheduling on containers; Vertex Execution (vertexes running in YARN containers, U-SQL Runtime); Local Storage; Data Lake Store; Consume]

Page 7: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Expression-flow Programming Style

• Automatic "in-lining" of U-SQL expressions: the whole script leads to a single execution model.
• Execution plan that is optimized out-of-the-box and without user intervention.
• Per-job and user-driven level of parallelization.
• Detailed visibility into execution steps, for debugging.
• Heatmap-like functionality to identify performance bottlenecks.

Page 8: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

U-SQL Compilation Process

[Diagram. Labels: U-SQL Metadata Service; Compiler & Optimizer; Compilation output (in job folder): C#, C++, Algebra, other files (system files, deployed resources), managed dll, unmanaged dll; Deployed to Vertices]

Page 9: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Job Scheduling and Execution

Page 10: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Analyzing a job

Page 11: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Parallelism: 1000 ADLAUs

Work composed of 12K vertices

1 ADLAU currently maps to a VM with 2 cores and 6 GB of memory

Page 12: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

U-SQL Query Execution Physical plans vs. Dryad stage graph…

Page 13: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Stage Details

[Screenshot callouts: 252 pieces of work; AVG vertex execution time; 4.3 billion rows; Data read & written; Super Vertex = Stage]

Page 14: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

U-SQL Query Execution: Redefinition of big-data…

Page 15: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

U-SQL Query Execution: Redefinition of big-data…

Page 16: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

U-SQL Performance Analysis: Analyze the critical path, heat maps, playback, and runtime metrics on every vertex…

Page 17: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Tuning U-SQL Jobs

Page 18: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Tuning for Cost Efficiency

Page 19: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Dips down to 1 active vertex at these times

Page 20: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Smallest estimated time when given 2425 ADLAUs:

1410 seconds = 23.5 minutes

Page 21: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Model with 100 ADLAUs:

8709 seconds ≈ 145.2 minutes
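The trade-off between the two models can be made concrete with a little arithmetic. The Python sketch below uses only the ADLAU counts and runtimes quoted on the slides; the assumption (not stated in the deck) is that cost scales with ADLAUs reserved multiplied by elapsed seconds.

```python
# Figures quoted on the slides: (ADLAUs, estimated runtime in seconds).
models = {"max_parallelism": (2425, 1410), "modest": (100, 8709)}

def au_seconds(adlaus: int, seconds: int) -> int:
    """Reserved capacity, assuming every ADLAU is held for the whole run."""
    return adlaus * seconds

fast = au_seconds(*models["max_parallelism"])
slow = au_seconds(*models["modest"])

# How much faster the big allocation is, and how much more capacity it reserves.
speedup = models["modest"][1] / models["max_parallelism"][1]
cost_ratio = fast / slow

print(f"speedup: {speedup:.1f}x, reserved capacity: {cost_ratio:.1f}x")
```

Under that assumption, the larger allocation finishes about 6.2 times sooner but reserves about 3.9 times as many ADLAU-seconds, because the job cannot keep all 2425 ADLAUs busy for its whole duration.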

Page 22: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Tuning for Performance – Data Partitioning during processing and storage

Page 23: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)


Data Storage
• Files
• Tables

Files
• Unstructured data in files
• Files are split into 250MB extents
• 4 extents per vertex -> 1GB per vertex
• Different file content formats:
  • Splittable formats are parallelizable:
    • row-oriented (CSV etc.)
    • where data does not span extents
  • Non-splittable formats cannot be parallelized:
    • XML/JSON
    • Have to be processed in a single-vertex extractor with atomicFileProcessing=true
• Use File Sets to provide semantic partition pruning

Tables
• Clustered Index (row-oriented) storage
• Vertical and horizontal partitioning
• Statistics for the optimizer (CREATE STATISTICS)
• Native scalar value serialization
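As a rough illustration of what the extent sizes imply for parallelism, the Python sketch below estimates how many vertices would read a single file. The 250MB extent size and the 4-extents-per-vertex figure come from the slide; the function name and the rounding behaviour are assumptions of this sketch, not the scheduler's actual logic.

```python
import math

EXTENT_MB = 250          # extent size from the slide
EXTENTS_PER_VERTEX = 4   # so roughly 1 GB per vertex

def extract_vertices(file_size_mb: float, splittable: bool = True) -> int:
    """Estimate how many vertices read one file."""
    if not splittable:
        # e.g. XML/JSON with atomicFileProcessing=true: one vertex reads it all
        return 1
    extents = math.ceil(file_size_mb / EXTENT_MB)
    return max(1, math.ceil(extents / EXTENTS_PER_VERTEX))

print(extract_vertices(10_000))                    # splittable 10 GB file
print(extract_vertices(10_000, splittable=False))  # same file, non-splittable
```

A 10 GB CSV file would be spread over about 10 vertices in this model, while the same volume of JSON in one file is confined to a single vertex.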

Page 24: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Querying unstructured data

Page 25: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

U-SQL Optimizations
Partition Elimination – Unstructured Files

// Unstructured files (24 hours of daily log impressions)
@Impressions =
    EXTRACT ClientId int,
            Market string,
            OS string,
            ...
    // Before: FROM @"wasb://ads@wcentralus/2015/10/30/{*}.nif"
    //         (extracts all files in the folder)
    FROM @"wasb://ads@wcentralus/2015/10/30/{Market}_{*}.nif"
    ;

// …

// Filter by Market
@US =
    SELECT *
    FROM @Impressions
    WHERE Market == "en"
    ;

Callouts:
• With the unnamed {*} pattern, the EXTRACT reads all files in the folder, and the post filter pays the I/O cost to drop most of the data.
• With the named {Market} virtual column, PE pushes the predicate to the EXTRACT, and the EXTRACT now only reads "en" files!

Files in ../2015/10/30/: en_10.0.nif, en_8.1.nif, de_10.0.nif, jp_7.0.nif, de_8.1.nif

Partition Elimination
• Even with unstructured files!
• Leverage virtual columns (named)
  • Avoid unnamed {*}
• WHERE predicates on named virtual columns
  • That binds the PE range at compilation time
  • Named virtual columns without predicate = warning
• Design directories/files with PE in mind
  • Design for elimination early in the tree, not in the leaves
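To make the effect of partition elimination tangible, here is a small Python model of it. The file names are the ones shown on the slide; the regular expression is only a rough stand-in for the {Market}_{*}.nif file set pattern, and `files_to_read` is a hypothetical helper, not part of U-SQL.

```python
import re

FILES = ["en_10.0.nif", "en_8.1.nif", "de_10.0.nif", "jp_7.0.nif", "de_8.1.nif"]

# Rough equivalent of the file set pattern {Market}_{*}.nif
PATTERN = re.compile(r"^(?P<Market>[^_]+)_.*\.nif$")

def files_to_read(files, market=None):
    """Return the files an extractor would have to open.

    With a predicate on the virtual column, non-matching files are
    eliminated before any I/O happens.
    """
    selected = []
    for name in files:
        m = PATTERN.match(name)
        if m is None:
            continue
        if market is None or m.group("Market") == market:
            selected.append(name)
    return selected

print(files_to_read(FILES))        # no predicate: every file is read
print(files_to_read(FILES, "en"))  # predicate on Market: only the en_ files
```

Without a predicate all five files are opened; with `Market == "en"` only two are, which is the saving the slide describes.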

Page 26: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Tuning for Performance – Structured Data

Page 27: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

How many clicks per domain?

@rows =
    SELECT Domain,
           SUM(Clicks) AS TotalClicks
    FROM @ClickData
    GROUP BY Domain;

Page 28: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

[Diagram: two execution plans for the clicks-per-domain query.
Reading from a file (extents 1, 2 and 3 each contain rows for CNN, FB and WH): stages Read, Partial Agg, Partition, Full Agg, Write. Marked "Expensive!".
Reading from a U-SQL table distributed by Domain (extent 1: FB, extent 2: WH, extent 3: CNN): stages Read, Full Agg, Write.]
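The file-based plan can be sketched in Python as a two-phase aggregation. The click counts are made-up illustration data; only the domain names and the stage names come from the slide.

```python
from collections import Counter

# Three extents of an undistributed file; every domain appears in every extent.
extents = [
    [("CNN", 1), ("FB", 2), ("WH", 3)],
    [("CNN", 4), ("FB", 5), ("WH", 6)],
    [("CNN", 7), ("FB", 8), ("WH", 9)],
]

def partial_agg(rows):
    """Per-extent pre-aggregation: one partial sum per domain."""
    acc = Counter()
    for domain, clicks in rows:
        acc[domain] += clicks
    return acc

def full_agg(partials):
    """Final aggregation, after partial sums have been repartitioned by domain."""
    total = Counter()
    for p in partials:
        total.update(p)  # Counter.update adds counts
    return dict(total)

total_clicks = full_agg(partial_agg(e) for e in extents)
print(total_clicks)
```

When the table is already distributed by Domain, each extent holds all rows of its domains, so the partial aggregate is already the final one and the repartitioning step between the two functions disappears.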

Page 29: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Scaling out with Distributions

Page 30: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Data Partitioning
Tables

Table Partitioning and Distribution
• Fine grained (horizontal) partitioning/distribution
  • Distributes within a partition (together with clustering) to keep same data values close
  • Choose for: join alignment, partition size, filter selectivity, partition elimination
• Coarse grained (vertical) partitioning
  • Based on partition keys
  • Partition is addressable in language
  • Query predicates will allow partition pruning
  • Choose for: data life cycle management, partition elimination

Distribution Scheme    When to use?
HASH(keys)             Automatic hash for fast item lookup
DIRECT HASH(id)        Exact control of hash bucket value
RANGE(keys)            Keeps ranges together
ROUND ROBIN            To get equal distribution (if others give skew)
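The four schemes differ only in how a row is mapped to a distribution bucket. The Python sketch below shows one plausible mapping for each; CRC32 stands in for whatever hash function the engine really uses, and the modulo in the DIRECT HASH case is an assumption of this sketch.

```python
import bisect
import zlib

def hash_bucket(key: str, n: int) -> int:
    """HASH(keys): bucket chosen by a hash of the key (CRC32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % n

def direct_hash_bucket(bucket_id: int, n: int) -> int:
    """DIRECT HASH(id): the caller supplies the bucket value itself."""
    return bucket_id % n

def range_bucket(key, split_points) -> int:
    """RANGE(keys): keys between two split points land in the same bucket."""
    return bisect.bisect_right(split_points, key)

def round_robin_bucket(row_index: int, n: int) -> int:
    """ROUND ROBIN: rows are dealt out in turn, ignoring their values."""
    return row_index % n

print(range_bucket("m", ["h", "p"]))                  # falls between "h" and "p"
print([round_robin_bucket(i, 3) for i in range(6)])   # cycles through buckets
```

Only ROUND ROBIN is guaranteed to give equally sized buckets, which is why the table recommends it when the other schemes produce skew; the price is that equal key values no longer sit together.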

Page 31: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Partitions, Distributions and Clusters

TABLE T ( id … , C … , date DateTime, … ,
          INDEX i CLUSTERED (id, C)
          PARTITIONED BY (date)
          DISTRIBUTED BY HASH(id) INTO 4)

[Diagram. LOGICAL: PARTITION (@date1), PARTITION (@date2), PARTITION (@date3); each partition is split into hash distributions (HASH DISTRIBUTION 1 to 4), and each distribution holds clustered values (C1 to C10). PHYSICAL: /catalog/…/tables/Guid(T)/ containing Guid(T.p1).ss, Guid(T.p2).ss, Guid(T.p3).ss]

Page 32: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Benefits of Clustered Index in Distribution

Benefits
• Design for most frequent/costly queries
• Manage data skew in distribution bucket
• Provide locality of same data values
• Provide seeks and range scans for query predicates (index lookup)

A clustered index in tables is mandatory; choose it according to the desired benefits.

Pro Tip: Distribution keys should be a prefix of the clustered index keys, especially for RANGE distribution.

The optimizer will then make use of global ordering: if you make the RANGE distribution key a prefix of the index key, U-SQL will repartition on demand to align any UNION ALLed or JOINed tables or partitions!

Split points of table distribution partitions are chosen independently, so any partitioned table can do UNION ALL in this manner if the data is to be processed subsequently on the distribution key.

Page 33: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Benefits of Distribution in Tables

Benefits
• Design for most frequent/costly queries
• Manage data skew in partition/table
• Manage parallelism in querying (by number of distributions)
• Manage minimizing data movement in joins
• Provide distribution seeks and range scans for query predicates (distribution bucket elimination)

Distribution in tables is mandatory; choose it according to the desired benefits.

Page 34: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Benefits of Partitioned Tables

Benefits
• Partitions are addressable
• Enables finer-grained data lifecycle management at partition level
• Manage parallelism in querying by number of partitions
• Query predicates provide partition elimination
  • Predicate has to be constant-foldable

Use partitioned tables for
• Managing large amounts of incrementally growing structured data
• Queries with strong locality predicates
  • point in time, for specific market etc.
• Managing windows of data
  • provide data for last x months for processing

Page 35: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Partitioned tables

Use partitioned tables for querying parts of large amounts of incrementally growing structured data.

Get partition elimination optimizations with the right query predicates.

Creating a partitioned table

CREATE TABLE vehiclesP(vehicle_id int, event_date DateTime, lat float, long float
    , INDEX idx CLUSTERED (vehicle_id ASC)
      PARTITIONED BY (event_date)
      DISTRIBUTED BY HASH (vehicle_id) INTO 4);

Creating partitions

DECLARE @pdate1 DateTime = new DateTime(2014, 9, 14, 00,00,00,00,DateTimeKind.Utc);
DECLARE @pdate2 DateTime = new DateTime(2014, 9, 15, 00,00,00,00,DateTimeKind.Utc);
ALTER TABLE vehiclesP ADD PARTITION (@pdate1), PARTITION (@pdate2);

Loading data into partitions dynamically

DECLARE @date1 DateTime = DateTime.Parse("2014-09-14");
DECLARE @date2 DateTime = DateTime.Parse("2014-09-16");
INSERT INTO vehiclesP ON INTEGRITY VIOLATION IGNORE
SELECT vehicle_id, event_date, lat, long
FROM @data
WHERE event_date >= @date1 AND event_date <= @date2;

• Filters and inserts clean data only, ignoring "dirty" data

Loading data into partitions statically

ALTER TABLE vehiclesP ADD PARTITION (@pdate1), PARTITION (@baddate);

INSERT INTO vehiclesP ON INTEGRITY VIOLATION MOVE TO @baddate
SELECT vehicle_id, event_date, lat, long
FROM @data
WHERE event_date >= @date1 AND event_date <= @date2;

• Filters and inserts clean data only, putting "dirty" data into a special partition
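The difference between the two ON INTEGRITY VIOLATION options can be modelled in a few lines of Python. This is only a sketch of the routing behaviour described on the slide; `insert_rows` and its parameters are hypothetical names, and a "violation" here simply means a row whose event_date has no matching partition.

```python
from datetime import date

def insert_rows(rows, partitions, on_violation="IGNORE", bad_partition=None):
    """Route rows to partitions keyed by event_date.

    Rows whose date has no partition are dropped (IGNORE) or
    redirected to bad_partition (MOVE TO).
    """
    table = {p: [] for p in partitions}
    if on_violation == "MOVE TO":
        table.setdefault(bad_partition, [])
    for row in rows:
        key = row["event_date"]
        if key in partitions:
            table[key].append(row)
        elif on_violation == "MOVE TO":
            table[bad_partition].append(row)
        # IGNORE: the row is silently skipped
    return table

p1, p2, bad = date(2014, 9, 14), date(2014, 9, 15), date(1900, 1, 1)
rows = [
    {"vehicle_id": 1, "event_date": p1},
    {"vehicle_id": 2, "event_date": date(2014, 9, 16)},  # no partition for this date
]
ignored = insert_rows(rows, [p1, p2])
moved = insert_rows(rows, [p1, p2], on_violation="MOVE TO", bad_partition=bad)
```

With IGNORE the second row is lost; with MOVE TO it lands in the designated partition, where it can be inspected later.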

Page 36: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

U-SQL Optimizations
Distributions – Minimize (re)partitions

@Impressions =
    SELECT *
    FROM searchDM.SML.PageView(@start, @end) AS PageView
    OPTION(SKEWFACTOR(Query)=0.5)
    ;

// Q1(A,B)
@Sessions =
    SELECT ClientId,
           Query,
           SUM(PageClicks) AS Clicks
    FROM @Impressions
    GROUP BY Query, ClientId
    ;

// Q2(B)
@Display =
    SELECT *
    FROM @Sessions
         INNER JOIN @Campaigns
         ON @Sessions.Query == @Campaigns.Query
    ;

Callouts:
• Q1: input must be distributed on (Query) or (ClientId) or (Query, ClientId)
• Q2: input must be distributed on (Query)
• The optimizer wants to distribute only once, but Query could be skewed

Data Distribution
• Re-distributing is very expensive
• Many U-SQL operators can handle multiple distribution choices
• Optimizer bases decision upon estimations

Wrong statistics may result in worse query performance

Page 37: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

U-SQL Optimizations
Distribution – Cardinality

// Unstructured (24 hours of daily log impressions)
@Huge =
    EXTRACT ClientId int,
            ...
    FROM @"wasb://ads@wcentralus/2015/10/30/{*}.nif"
    ;

// Small subset (i.e. ForgetMe opt-out)
@Small =
    SELECT *
    FROM @Huge
    WHERE Bing.ForgetMe(x,y,z)
    OPTION(ROWCOUNT=500)
    ;

// Result (not enough info to determine simple broadcast join)
@Remove =
    SELECT *
    FROM Bing.Sessions
         INNER JOIN @Small
         ON Sessions.Client == @Small.Client
    ;

Callouts:
• Broadcast JOIN, right? Without the hint, the optimizer has no stats telling it that @Small is small...
• With OPTION(ROWCOUNT=500), broadcast is now a candidate.

Wrong statistics may result in worse query performance => CREATE STATISTICS
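Why the row-count hint matters can be seen from a toy model of a broadcast join. In the Python sketch below the small input is turned into a lookup table that every vertex could hold in memory, so the large input never needs to be repartitioned. `BROADCAST_ROW_LIMIT` and `choose_join` are hypothetical: the real optimizer's thresholds and cost model are not described in the deck.

```python
def broadcast_join(big_rows, small_rows, key):
    """Hash join where the small side is copied to every vertex.

    No repartitioning of big_rows is needed.
    """
    lookup = {}
    for row in small_rows:
        lookup.setdefault(row[key], []).append(row)
    for row in big_rows:
        for match in lookup.get(row[key], []):
            yield {**row, **match}

BROADCAST_ROW_LIMIT = 100_000  # hypothetical threshold, not a documented value

def choose_join(estimated_small_rows):
    """Pick a strategy from the row estimate (None = no estimate available)."""
    if estimated_small_rows is not None and estimated_small_rows <= BROADCAST_ROW_LIMIT:
        return "broadcast"
    return "repartition"

print(choose_join(None))  # no statistics or hint
print(choose_join(500))   # estimate supplied, as with OPTION(ROWCOUNT=500)
```

The point of the sketch is the `None` case: without an estimate, a cautious planner has to assume the input could be large and fall back to the more expensive strategy.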

Page 38: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Tuning for Performance – Handling Data Skew

Page 39: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)


What is Data Skew?
• Some data points are much more common than others
• Data may be distributed such that all rows that match a certain key go to a single vertex
• Result: imbalanced execution, vertex time-out

[Chart: Population by State, y-axis from 0 to 40,000,000]

Page 40: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)


Low Distinctiveness Keys
• Keys with small selectivity can lead to large vertices even without skew

@rows =
    SELECT Gender,
           AGG<MyAgg>(…) AS Result
    FROM @HugeInput
    GROUP BY Gender;

[Diagram: @HugeInput is split into Gender==Male (Vertex 0) and Gender==Female (Vertex 1)]

Page 41: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Why is this a problem?
• Vertexes have a 5 hour runtime limit!
• Your UDO or join may excessively allocate memory.
• Your memory usage may not be obvious due to garbage collection.

Page 42: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)


Addressing Data Skew/Low Distinctiveness
• Improve data partition sizes:
  • Find more fine-grained keys, e.g. states and congressional districts or ZIP codes
  • If no fine-grained keys can be found or they are too fine-grained: use ROUND ROBIN distribution
• Write queries that can handle data skew:
  • Use filters that prune skew out early
  • Use Data Hints to identify skew and "low distinctness" in keys:
    • SKEWFACTOR(columns) = x
      provides a hint that the given columns have a skew factor x between 0 (no skew) and 1 (very heavy skew)
    • DISTINCTVALUE(columns) = n
      lets you specify how many distinct values the given columns have (n>1)
  • Implement aggregation/reducer recursively if possible
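A quick way to see why a skewed or low-distinctness key hurts is to measure how much of the data ends up on the busiest vertex. The Python sketch below does that for hash and round-robin placement; Python's built-in `hash` is only a stand-in for the engine's hash function, and the state codes are illustrative data.

```python
from collections import Counter

def largest_share(keys, n_vertices, scheme="hash"):
    """Fraction of all rows that lands on the busiest vertex."""
    load = Counter()
    for i, key in enumerate(keys):
        if scheme == "hash":
            vertex = hash(key) % n_vertices  # all rows of one key go together
        else:
            vertex = i % n_vertices          # round robin ignores the key
        load[vertex] += 1
    return max(load.values()) / len(keys)

# 90% of the rows share one key value.
keys = ["CA"] * 90 + ["WY"] * 10

print(largest_share(keys, 4, "hash"))         # at least 0.9
print(largest_share(keys, 4, "round_robin"))  # 0.25
```

However many vertices are added, hash placement cannot push the busiest vertex below the share of the most common key, whereas round robin spreads the rows evenly.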

Page 43: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

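Non-Recursive vs Recursive SUM

[Diagram.
Non-recursive: a single vertex (Vertex 1) reads 1 2 3 4 5 6 7 8 and produces 36.
Recursive: Vertex 1, Vertex 2 and Vertex 3 each sum part of 1 2 3 4 5 6 7 8, producing the partial sums 6, 15 and 15; Vertex 4 sums those to 36.]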
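The two shapes of the SUM can be written down directly. In this Python sketch, `fan_in` (how many inputs one vertex combines) is an assumed parameter chosen so that the first stage reproduces the partial sums on the slide.

```python
def non_recursive_sum(values):
    """One vertex sees every row."""
    return sum(values)

def recursive_sum(values, fan_in=3):
    """Sum in stages: each vertex combines at most fan_in inputs,
    and the partial results feed the next stage."""
    level = list(values)
    while len(level) > 1:
        level = [sum(level[i:i + fan_in]) for i in range(0, len(level), fan_in)]
    return level[0] if level else 0

data = [1, 2, 3, 4, 5, 6, 7, 8]
print(non_recursive_sum(data), recursive_sum(data))
```

With `fan_in=3` the first stage yields 6, 15 and 15 (1+2+3, 4+5+6, 7+8) and the second stage yields 36, the same answer as the single-vertex version but with no vertex handling more than three inputs.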

Page 44: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

U-SQL Partitioning during Processing: Data Skew

Page 45: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

U-SQL Partitioning
Data Skew – Recursive Reducer

// Metrics per domain
@Metric =
    REDUCE @Impressions ON UrlDomain
    USING new Bing.TopNReducer(count:10)
    ;

// …

Inherent data skew: www.bing.com is brought to a single vertex

[SqlUserDefinedReducer(IsRecursive = true)]
public class TopNReducer : IReducer
{
    public override IEnumerable<IRow> Reduce(IRowset input, IUpdatableRow output)
    {
        // Compute TOP(N) per group
        // …
    }
}

Recursive
• Allows multi-stage aggregation trees
• Requires same schema (input => output)
• Requires associativity:
  • R(x, y) = R( R(x), R(y) )
• Default = non-recursive
• User code has to honor recursive semantics
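The associativity requirement R(x, y) = R( R(x), R(y) ) is easy to check for a top-N reducer in a small Python model. This is not the Bing.TopNReducer from the slide, just a sketch of the same idea operating on plain numbers.

```python
import heapq

def top_n(rows, n=10):
    """Reducer body: keep the n largest values. The output has the same
    shape as the input, so it can be fed back into top_n."""
    return heapq.nlargest(n, rows)

def recursive_top_n(chunks, n=10):
    """Apply top_n per chunk (one vertex each), then once more over the
    partial results."""
    partials = [top_n(chunk, n) for chunk in chunks]
    merged = [row for part in partials for row in part]
    return top_n(merged, n)

data = list(range(100))
chunks = [data[i::4] for i in range(4)]  # four vertices, 25 rows each

print(top_n(data, 3))
print(recursive_top_n(chunks, 3))
```

Both calls return the same three values, so the reducer is safe to mark as recursive. A reducer that, for example, emitted an average without carrying the row count would fail this check, which is what "user code has to honor recursive semantics" warns about.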

Page 46: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Tuning for Performance – User Defined Operators

Page 47: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

U-SQL Optimizations
Predicate pushing – UDO pass-through columns

// Bing impressions
@Impressions =
    SELECT *
    FROM searchDM.SML.PageView(@start, @end) AS PageView
    ;

// Compute sessions
@Sessions =
    REDUCE @Impressions ON Client, Market
    READONLY Market
    USING new Bing.SessionReducer(range : 30)
    ;

// Users metrics
@Metrics =
    SELECT *
    FROM @Sessions
    WHERE Market == "en-us"
    ;

// …

Page 49: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

U-SQL Optimizations
Predicate pushing – UDO row level processors

// Bing impressions
@Impressions =
    SELECT *
    FROM searchDM.SML.PageView(@start, @end) AS PageView
    ;

// Compute page views
@Impressions =
    PROCESS @Impressions
    READONLY Market
    PRODUCE Client, Market, Header string
    USING new Bing.HtmlProcessor()
    ;

@Sessions =
    REDUCE @Impressions ON Client, Market
    READONLY Market
    USING new Bing.SessionReducer(range : 30)
    ;

// Users metrics
@Metrics =
    SELECT *
    FROM @Sessions
    WHERE Market == "en-us"
    ;

public abstract class IProcessor : IUserDefinedOperator
{
    /// <summary/>
    public abstract IRow Process(IRow input, IUpdatableRow output);
}

public abstract class IReducer : IUserDefinedOperator
{
    /// <summary/>
    public abstract IEnumerable<IRow> Reduce(IRowset input, IUpdatableRow output);
}

Page 50: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

U-SQL Optimizations
Predicate pushing – relational vs. C# semantics

// Bing impressions
@Impressions =
    SELECT Client, Market, Html
    FROM searchDM.SML.PageView(@start, @end) AS PageView
    ;

// Compute page views
@Impressions =
    PROCESS @Impressions
    PRODUCE Client, Market, Header string
    USING new Bing.HtmlProcessor()
    ;

// Users metrics
@Metrics =
    SELECT *
    FROM @Sessions
    // C# semantics: && evaluates left to right and short-circuits
    WHERE Market == "en-us" && Header.Contains("microsoft.com")
    // Relational semantics: with AND the optimizer may reorder and push the predicates
    // WHERE Market == "en-us" AND Header.Contains("microsoft.com")
    ;

Page 51: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

U-SQL Optimizations
Column Pruning and dependencies

// Bing impressions
@Impressions =
    SELECT *
    FROM searchDM.SML.PageView(@start, @end) AS PageView
    ;

// Compute page views
@Impressions =
    PROCESS @Impressions
    PRODUCE *
    REQUIRED ClientId, HtmlContent(Header, Footer)
    USING new Bing.HtmlProcessor()
    ;

// Users metrics
@Metrics =
    SELECT ClientId, Market, Header
    FROM @Sessions
    WHERE Market == "en-us"
    ;

Column Pruning
• Minimize I/O (data shuffling)
• Minimize CPU (complex processing, html)
• Requires dependency knowledge:
  • R(D*) = Input ( Output )
• Default: no pruning
• User code has to honor reduced columns

[Diagram: the input has about 1000 columns (A B C D E F G H I J K … M … 1000); only the columns C, H and M flow through each step]
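The dependency knowledge that REQUIRED conveys can be modelled as a mapping from output columns to the input columns they need. The Python sketch below prunes a row accordingly; `prune_columns` and the dependency dictionary are illustrative constructs, with column names borrowed from the slide.

```python
def prune_columns(row, requested, dependencies):
    """Keep only the input columns needed to produce the requested outputs.

    dependencies maps an output column to the input columns it reads;
    columns not listed are treated as pass-through (they depend on themselves).
    """
    needed = set()
    for col in requested:
        needed.update(dependencies.get(col, {col}))
    return {name: value for name, value in row.items() if name in needed}

# Header and Footer are both computed from HtmlContent.
deps = {"Header": {"HtmlContent"}, "Footer": {"HtmlContent"}}
row = {"ClientId": 1, "Market": "en-us", "HtmlContent": "<html>", "Other": 0}

print(prune_columns(row, ["ClientId", "Market", "Header"], deps))
print(prune_columns(row, ["ClientId", "Market"], deps))
```

When neither Header nor Footer is requested downstream, the expensive HtmlContent column never has to be read or shuffled, which is the I/O and CPU saving listed above.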

Page 52: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

UDO Tips and Warnings

• Tips when using UDOs:
  • READONLY clause to allow pushing predicates through UDOs
  • REQUIRED clause to allow column pruning through UDOs
  • PRESORT on REDUCE if you need global order
  • Hint cardinality if the optimizer chooses the wrong plan

• Warnings and better alternatives:
  • Use SELECT with UDFs instead of PROCESS
  • Use user-defined aggregators instead of REDUCE
  • Learn to use windowing functions (OVER expression)

• Good use-cases for PROCESS/REDUCE/COMBINE:
  • The logic needs to dynamically access the input and/or output schema, e.g. to create a JSON doc for the data in the row where the columns are not known a priori.
  • Your UDF-based solution creates too much memory pressure and you can write your code more memory-efficiently in a UDO.
  • You need an ordered aggregator or need to produce more than 1 row per group.

Page 53: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Tuning for Performance – Data insertion

Page 54: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

INSERT: Multiple INSERTs into same table

• Generates a separate file per insert in physical storage:
  • Can lead to performance degradation

• Recommendations:
  • Try to avoid small inserts
  • Rebuild the table after frequent insertions with:
    ALTER TABLE T REBUILD;

Page 55: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)


Future Items: GA and beyond

• Tooling
  • Resource planning based on $-cost

• Storage support
  • Storage compression (available since this week!)
  • Columnar Storage/Index
  • Secondary Index

Page 56: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Additional Resources

Blogs and community page:
• http://usql.io (U-SQL Github)
• http://blogs.msdn.microsoft.com/mrys/
• http://blogs.msdn.microsoft.com/azuredatalake/
• https://channel9.msdn.com/Search?term=U-SQL#ch9Search

Documentation and articles:
• http://aka.ms/usql_reference
• https://azure.microsoft.com/en-us/documentation/services/data-lake-analytics/
• https://msdn.microsoft.com/en-us/magazine/mt614251

ADL forums and feedback:
• http://aka.ms/adlfeedback
• https://social.msdn.microsoft.com/Forums/azure/en-US/home?forum=AzureDataLake
• http://stackoverflow.com/questions/tagged/u-sql

Slide decks:
• http://www.Slideshare.net/MichaelRys

Page 58: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Session Evaluations

3 ways to access:
• Go to passSummit.com
• Download the GuideBook App and search: PASS Summit 2016
• Follow the QR code link displayed on session signage throughout the conference venue and in the program guide

Submit by 5pm Friday November 6th to WIN prizes

Your feedback is important and valuable.

Page 59: Tuning and Optimizing U-SQL Queries (SQLPASS 2016)

Thank You

Learn more from Michael Rys or follow @MikeDoesBigData