THE QUERY COMPILER
Prepared by: Ankit Patel (226)

Post on 22-Dec-2015




REFERENCES

H. Garcia-Molina, J. Ullman, and J. Widom, “Database Systems: The Complete Book,” second edition, pp. 897-913, Prentice Hall, New Jersey, 2008.


COMPILATION OF QUERIES

Compilation turns a query into a physical query plan that the query engine can execute.

Steps of query compilation:

- Parsing
- Semantic checking
- Selection of the preferred logical query plan
- Generating the best physical plan


THE PARSER

- Parsing is the first step of SQL query processing.
- The parser generates a parse tree.
- Nodes in the parse tree correspond to SQL constructs.
- The parser is similar to the parser in a compiler for a programming language.


VIEW EXPANSION

- View expansion is a critical part of query compilation.
- It replaces each view reference in the query tree with the definition of that view.
- This introduces several opportunities to optimize the complete query.
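
View expansion can be sketched as a substitution on the query tree. The Python below is only an illustration: the tuple encoding of plans, the function name `expand_views`, and the example relation and view names are assumptions, and views are assumed to be non-recursive.

```python
def expand_views(plan, views):
    """Replace every reference to a view with the view's own plan.

    Plans are nested tuples (a hypothetical encoding):
      ("rel", name), ("select", condition, child), ("join", left, right)
    `views` maps a view name to the plan that defines it.
    """
    op = plan[0]
    if op == "rel":
        name = plan[1]
        if name in views:
            # A view may be defined over other views, so expand recursively.
            return expand_views(views[name], views)
        return plan  # a stored base relation
    if op == "select":
        _, condition, child = plan
        return ("select", condition, expand_views(child, views))
    if op == "join":
        _, left, right = plan
        return ("join", expand_views(left, views), expand_views(right, views))
    raise ValueError(f"unknown operator: {op}")
```

After expansion, the selection inside the view sits in the same tree as the rest of the query, so the optimizer can treat the whole query as one expression.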


SEMANTIC CHECKING

- Checks the semantics of a SQL query.
- Examines the parse tree.
- Checks:
  - Attributes
  - Relation names
  - Types
- Resolves attribute references.
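
A minimal sketch of these checks, assuming a hypothetical catalog that maps each relation name to its attributes and their types. `CATALOG`, `resolve_attribute`, `check_comparison`, and the example relations are illustrative names, not part of any real system.

```python
# Hypothetical catalog: relation name -> {attribute name: Python type}.
CATALOG = {
    "Movies": {"title": str, "year": int},
    "StarsIn": {"title": str, "starName": str},
}

def resolve_attribute(attr, relations):
    """Return (relation, attribute) for an attribute reference.

    Raises ValueError for an unknown relation, or for an attribute that
    is missing or ambiguous among the relations of the FROM list.
    """
    for rel in relations:
        if rel not in CATALOG:
            raise ValueError(f"unknown relation: {rel}")
    if "." in attr:  # qualified reference such as Movies.title
        rel, name = attr.split(".", 1)
        if rel not in relations or name not in CATALOG[rel]:
            raise ValueError(f"unknown attribute: {attr}")
        return rel, name
    owners = [rel for rel in relations if attr in CATALOG[rel]]
    if not owners:
        raise ValueError(f"unknown attribute: {attr}")
    if len(owners) > 1:
        raise ValueError(f"ambiguous attribute: {attr}")
    return owners[0], attr

def check_comparison(attr, constant, relations):
    """Check that a constant has the type of the attribute it is compared to."""
    rel, name = resolve_attribute(attr, relations)
    if not isinstance(constant, CATALOG[rel][name]):
        raise TypeError(f"type mismatch for {rel}.{name}")
    return rel, name
```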


CONVERSION TO A LOGICAL QUERY PLAN

Converts the semantically checked parse tree to a relational-algebra expression.

Conversion is mostly straightforward, but subqueries need special treatment.

One approach is to introduce a two-argument selection that puts the subquery in the condition of the selection, and then apply appropriate transformations for the common special cases.


ALGEBRAIC TRANSFORMATION

There are many ways to transform a logical query plan into an equivalent, better plan using algebraic transformations.

The laws used for this transformation:

- Commutative and associative laws
- Laws involving selection
- Pushing selection
- Laws involving projection
- Laws about joins and products
- Laws involving duplicate elimination
- Laws involving grouping and aggregation
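
As an illustration of one of these laws, the sketch below pushes a selection below a natural join when the condition mentions only attributes of one argument (or of both). The tuple encoding of plans, the single-attribute conditions, and the names `schema` and `push_selection` are assumptions made for this example.

```python
def schema(plan, schemas):
    """Return the set of attributes produced by a plan node."""
    op = plan[0]
    if op == "rel":
        return set(schemas[plan[1]])
    if op == "select":
        return schema(plan[2], schemas)
    if op == "join":
        return schema(plan[1], schemas) | schema(plan[2], schemas)
    raise ValueError(f"unknown operator: {op}")

def push_selection(plan, schemas):
    """Push selections of the form (attribute, value) below natural joins."""
    op = plan[0]
    if op == "rel":
        return plan
    if op == "join":
        return ("join",
                push_selection(plan[1], schemas),
                push_selection(plan[2], schemas))
    _, cond, child = plan  # op == "select"
    child = push_selection(child, schemas)
    if child[0] == "join":
        attr = cond[0]
        left, right = child[1], child[2]
        in_left = attr in schema(left, schemas)
        in_right = attr in schema(right, schemas)
        if in_left or in_right:
            # Apply the selection to every argument that has the attribute.
            if in_left:
                left = push_selection(("select", cond, left), schemas)
            if in_right:
                right = push_selection(("select", cond, right), schemas)
            return ("join", left, right)
    return ("select", cond, child)
```

Pushing the selection down means the join sees fewer tuples, which is why this law is usually applied early.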


ESTIMATING SIZES OF RELATIONS

When selecting the best logical plan, the estimated size of intermediate relations is used as a stand-in for the true running time.

Two statistics matter most when estimating the sizes of relations:

- Size of each relation (number of tuples)
- Number of distinct values for each attribute of each relation

Histograms are used by some systems.
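
The two statistics above feed the usual textbook estimates. The sketch below assumes values are uniformly distributed; `t_r` stands for the number of tuples of R and `v_r_a` for the number of distinct values of attribute a in R. The function names are chosen for this example.

```python
def estimate_equality_selection(t_r, v_r_a):
    """Estimated tuples in the selection a = constant on R: T(R) / V(R, a)."""
    return t_r / v_r_a

def estimate_join(t_r, t_s, v_r_y, v_s_y):
    """Estimated tuples in the natural join of R and S on attribute Y:
    T(R) * T(S) / max(V(R, Y), V(S, Y))."""
    return t_r * t_s / max(v_r_y, v_s_y)
```

For example, a relation of 10000 tuples with 50 distinct values of a is estimated to yield 200 tuples for an equality selection on a.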


COST BASED OPTIMIZING

The best physical query plan is the one with the least estimated cost. Factors that decide the cost of a query plan:

- The order and grouping of associative and commutative operations such as joins, unions, and intersections.
- The algorithm chosen for each operator, for example a nested-loop join or a hash join.
- Additional scanning and sorting operations.
- Whether intermediate results are stored.
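
To show how the choice of join algorithm affects cost, the sketch below compares approximate disk I/O counts for a block-based nested-loop join and a two-pass hash join. The formulas are common textbook approximations, not measurements, and they ignore the cost of writing the result; `b_r` and `b_s` are block counts, `m` is the number of memory buffers, and the function names are assumptions. The hash-join estimate only applies when memory is large enough for a two-pass algorithm.

```python
from math import ceil

def nested_loop_cost(b_r, b_s, m):
    """Block-based nested-loop join: read the smaller relation once, in
    chunks of m - 1 blocks, and scan the larger relation once per chunk."""
    outer, inner = min(b_r, b_s), max(b_r, b_s)
    chunks = ceil(outer / (m - 1))
    return outer + chunks * inner

def hash_join_cost(b_r, b_s):
    """Two-pass hash join: read, write, and reread both relations."""
    return 3 * (b_r + b_s)

def cheaper_join(b_r, b_s, m):
    """Return (algorithm name, estimated I/Os) for the cheaper option."""
    options = {
        "nested-loop": nested_loop_cost(b_r, b_s, m),
        "hash": hash_join_cost(b_r, b_s),
    }
    name = min(options, key=options.get)
    return name, options[name]
```

With 1000 and 500 blocks and 101 buffers the hash join is cheaper under these formulas, while with 501 buffers the smaller relation fits in memory and the nested-loop join wins.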


HISTOGRAMS

Some systems keep histograms of the values of a given attribute.

This information can be used to obtain better estimates of intermediate relation sizes than the simple methods.
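
A small sketch of how a histogram can refine an estimate. It assumes an equal-width histogram stored as `(low, high, count)` buckets over half-open ranges, with values spread uniformly inside each bucket. The bucket layout and the name `estimate_less_than` are assumptions for illustration.

```python
def estimate_less_than(buckets, c):
    """Estimate how many tuples satisfy a < c.

    buckets: list of (low, high, count), each covering the range [low, high).
    Whole buckets below c count fully; the bucket containing c contributes
    a fraction proportional to how much of its range lies below c.
    """
    total = 0.0
    for low, high, count in buckets:
        if c >= high:
            total += count
        elif c > low:
            total += count * (c - low) / (high - low)
    return total
```

With buckets `(0, 10, 100)`, `(10, 20, 300)`, `(20, 30, 50)` and `c = 15`, the estimate is 250 tuples. Assuming the 450 tuples were uniform over the whole range 0 to 30 would give 225.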


PLAN ENUMERATION STRATEGIES

Common approaches for searching the space of physical plans for the best one:

- Dynamic programming: tabularizing the best plan for each subexpression.
- Selinger-style dynamic programming: also recording the sort order of the results as part of the table.
- Greedy approaches: making a series of locally optimal decisions.
- Branch-and-bound: enumerating only plans that are not already known to be worse than the best plan found so far.
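
The dynamic programming idea can be sketched for join ordering. This is a simplified illustration, not a full optimizer: it considers only left-deep orders, it measures cost as the sum of the estimated sizes of all join results, and it estimates sizes from per-pair selectivities. The names `set_size`, `best_left_deep_order`, `sizes`, and `selectivity` are assumptions.

```python
from itertools import combinations

def set_size(rels, sizes, selectivity):
    """Estimated size of joining a set of relations: product of the sizes
    times the selectivity of every pair (1.0 when no join condition)."""
    size = 1.0
    for r in rels:
        size *= sizes[r]
    for a, b in combinations(sorted(rels), 2):
        size *= selectivity.get(frozenset((a, b)), 1.0)
    return size

def best_left_deep_order(sizes, selectivity):
    """Return (cost, join order) using a table of best plans per subset."""
    names = sorted(sizes)
    # Table entry: subset of relations -> (cost, order). Single relations cost 0.
    best = {frozenset([r]): (0.0, [r]) for r in names}
    for k in range(2, len(names) + 1):
        for subset in combinations(names, k):
            s = frozenset(subset)
            out = set_size(s, sizes, selectivity)
            candidates = []
            for r in subset:  # r is the relation joined last
                cost, order = best[s - {r}]
                candidates.append((cost + out, order + [r]))
            best[s] = min(candidates)
    return best[frozenset(names)]
```

Each subset is solved once and reused, which is what "tabularizing the best plan for each subexpression" refers to.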


LEFT-DEEP JOIN TREES

Left-deep join trees are binary trees with a single spine down the left edge and with leaves as right children.

Restricting the search to left-deep join trees reduces the number of plans to be considered for the best physical plan.

The search is restricted to left-deep join trees when picking a grouping and order for the join of several relations.
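
The size of the reduction can be checked with a short calculation. For n relations there are n! left-deep trees (one per ordering of the leaves), while the number of all binary join trees is n! times the number of tree shapes with n leaves, which is the Catalan number C(n-1). The function names below are chosen for this example.

```python
from math import comb, factorial

def left_deep_count(n):
    """Number of left-deep join trees for n relations: n!."""
    return factorial(n)

def all_tree_count(n):
    """Number of all binary join trees: (tree shapes with n leaves) * n!."""
    shapes = comb(2 * (n - 1), n - 1) // n  # Catalan number C(n-1)
    return shapes * factorial(n)
```

For six relations this gives 720 left-deep trees against 30240 trees of all shapes.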


PHYSICAL PLANS FOR SELECTION

A selection can be broken into an index-scan of the relation, followed by a filter operation.

The filter then examines the tuples retrieved by the index-scan.

It passes only those tuples that meet the portions of the selection condition not already handled by the index-scan.
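
A minimal sketch of this plan, with an in-memory dictionary standing in for a real index. The helper names `build_index`, `index_scan`, and `filter_rows`, and the example attributes, are hypothetical.

```python
from collections import defaultdict

def build_index(rows, attr):
    """Build a simple index: attribute value -> list of matching rows."""
    index = defaultdict(list)
    for row in rows:
        index[row[attr]].append(row)
    return index

def index_scan(index, value):
    """Retrieve only the rows whose indexed attribute equals value."""
    yield from index.get(value, [])

def filter_rows(rows, predicate):
    """Pass through only the rows that meet the rest of the condition."""
    return (row for row in rows if predicate(row))
```

For a condition such as `year = 1996 AND length > 120`, the index-scan would handle `year = 1996` and the filter would test `length > 120` on the tuples it retrieves.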


PIPELINING VERSUS MATERIALIZING

Ideally, each operator consumes the result of another operator, and that result is passed between them in main memory.

Controlling this flow of data between the operators implements “pipelining”.

Sometimes it is better to store an intermediate result, which frees main memory for other operators.

This technique is called “materialization”.

The physical query plan generator should consider both pipelining and materialization.
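
The difference can be sketched with Python generators. This is an analogy rather than a database implementation: generators play the role of pipelined operators, and a Python list stands in for a stored intermediate result. The `log` argument is a hypothetical device that records which rows the scan has produced.

```python
def scan(rows, log):
    """Produce rows one at a time, recording each row as it is read."""
    for row in rows:
        log.append(row)
        yield row

def select(rows, predicate):
    """Pipelined selection: pulls one input row at a time on demand."""
    for row in rows:
        if predicate(row):
            yield row

def materialize(rows):
    """Consume the whole input and store the complete result."""
    return list(rows)
```

Asking the pipelined plan `select(scan(data, log), pred)` for its first result reads only as many rows as needed to find it, whereas `materialize(scan(data, log))` reads every row before the next operator can start.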