teradata query optimization guidelines

7/26/2019 Teradata Query Optimization Guidelines


Introduction

Optimization is the technique of selecting the least expensive (fastest) plan for a query to fetch results. The optimizer considers the possible query plans for a given input query and attempts to determine which of those plans will be the most efficient.

Teradata performance tuning is the process of making a query run faster with minimal use of CPU resources.

The typical goal of an SQL optimization is to get the result (data set) with fewer computing resources consumed and/or a shorter response time.

Query Optimization Process

The following list gives the logical sequence of the processes undertaken by the Optimizer as it optimizes a DML request. The processes listed here do not include the influence of parameterized value peeking to determine whether the Optimizer should generate a specific plan or a generic plan for a given request.

The input to the Optimizer is the Query Rewrite ResTree. The Optimizer then produces the optimized white tree, which it passes to an Optimizer subcomponent called the Generator.

The Optimizer engages in the following process stages.

1. Receives the Query Rewrite ResTree as input.
2. Processes correlated subqueries by converting them to unnested SELECTs or simple joins.
3. Processes non-correlated subqueries by materializing the subquery and placing its value in the USING row for the query regardless of whether the subquery is on the LHS or the RHS of the operator in the predicate.
4. Searches for a relevant join or hash index.
5. Materializes subqueries to spool files.
6. Analyzes the materialized subqueries for optimization possibilities.
   a. Separates conditions from one another.
   b. Pushes down predicates.
   c. Generates connection information.
   d. Locates any complex joins.
   e. Discovers aggregations and opportunities for partial GROUP BY optimizations.
7. Generates size and content estimates of spool files required for further processing.
8. Generates an optimal single-table access path.
9. Simplifies and optimizes any complex joins identified in stage 6d.
10. Maps join columns from a join (spool) relation to the list of field IDs from the input base tables to prepare the relation for join planning.



11. Generates information about local connections. A connecting condition is one that connects an outer query and a subquery. A direct connection exists between two tables if either of the following conditions is found.
   a. ANDed bind term: miscellaneous terms such as inequalities, ANDs, and ORs; cross, outer, or minus join term that satisfies the dependent information between the two tables.
   b. A spool file of an uncorrelated subquery EXIST predicate that connects with any outer table.
12. Generates information about indexes that might be used in join planning, including the primary indexes for the relevant tables and pointers to the table descriptors of any other useful indexes.
13. Performs row and column partition elimination for partitioned tables.
14. Uses a recursive greedy 1-table lookahead algorithm to generate the best join plan.
15. If the join plan identified in step 14 does not meet the heuristics-based criteria for an adequate join plan, generates another best join plan using an n-table lookahead algorithm.
16. Selects the better join plan of the two plans generated in steps 14 and 15.
19. Passes the optimized white tree to the Generator.

The Generator then generates plastic steps for the chosen plan.
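As an illustration of stage 2, the pair of queries below shows the kind of rewrite meant by unnesting a correlated subquery into a simple join. This is only a sketch of the idea, not the Optimizer's actual internal rewrite. The employee and department tables and their columns are hypothetical, and the two forms return the same rows only under the assumption that department.dept_id is unique.

```sql
-- Hypothetical tables: employee(emp_id, emp_name, dept_id), department(dept_id, loc).

-- Correlated subquery form: the inner query refers to the outer row (e.dept_id).
SELECT e.emp_id, e.emp_name
FROM employee e
WHERE EXISTS (
    SELECT 1
    FROM department d
    WHERE d.dept_id = e.dept_id
      AND d.loc = 'USA'
);

-- Unnested form: the same condition written as a simple join.
-- Equivalent to the query above only when department.dept_id is unique.
SELECT e.emp_id, e.emp_name
FROM employee e
JOIN department d
  ON d.dept_id = e.dept_id
WHERE d.loc = 'USA';
```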

Methodologies

Optimization is one of the most talked-about techniques for Teradata today. Because of the huge amount of data in a Teradata database, it becomes very important to get optimized performance from it; otherwise the queries will perform poorly and the benefit of parallelism will be lost.

In order to select the least expensive plan for the query to fetch results, the following techniques or practices can be followed:

(1) STATISTICS

Collecting statistics is one of the most important steps in Teradata query optimization. Statistics collection is essential for the optimal performance of the Teradata query optimizer. The query optimizer relies on statistics to help it determine the best way to access data. Statistics also help the optimizer ascertain how many rows exist in tables being queried and predict how many rows will qualify for given conditions.



Lack of statistics, or outdated statistics, might result in the optimizer choosing a less-than-optimal method for accessing data tables.

Also, statistics help Teradata determine the spool file size needed to contain the resulting data. Accurate statistics could make the difference between a successful query and a query that runs out of spool space.

Syntax:

To check whether statistics are defined for the table:

Help stats table_name;

To collect or refresh the statistics:

Collect stats on table_name [index/column] (col_name, col_name, ...);
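A concrete instance of the syntax above might look like the following. The table name employee and the columns dept_id and emp_id are hypothetical and stand in for real objects. The INDEX form assumes an index on exactly those columns already exists.

```sql
-- Hypothetical table and column names, for illustration only.

-- List the statistics currently defined on the table.
HELP STATISTICS employee;

-- Collect statistics on a single column.
COLLECT STATISTICS ON employee COLUMN (dept_id);

-- Collect statistics on a multi-column index (the index must already exist).
COLLECT STATISTICS ON employee INDEX (dept_id, emp_id);
```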

DIAGNOSTIC STATEMENT

DIAGNOSTIC HELPSTATS ON FOR SESSION;

The above statement can be used to determine the statistics that might be required to improve the performance of the SQL. The EXPLAIN plan needs to be executed after the above statement to find the statistics suggestions.
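The sequence described above could be run as in the sketch below. The query, the employee table, and its columns are hypothetical. The recommendations appear as suggested COLLECT STATISTICS statements at the end of the EXPLAIN text.

```sql
-- Turn on statistics recommendations for the current session.
DIAGNOSTIC HELPSTATS ON FOR SESSION;

-- Explain the query to be tuned (hypothetical table and columns).
-- The plan text ends with the statistics the optimizer suggests collecting.
EXPLAIN
SELECT e.emp_id, e.emp_name
FROM employee e
WHERE e.dept_id = 10;

-- Turn the recommendations off again when done.
DIAGNOSTIC HELPSTATS NOT ON FOR SESSION;
```

The suggestions are candidates rather than a required list, so it is usually worth collecting only those that match columns actually used in joins and WHERE conditions.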

Stats will qualify for one of the confidence levels below:

1) No Confidence - no statistics defined for a table.
2) Low Confidence - stats are difficult to use precisely.
3) High Confidence - Optimizer is sure of results based on the stats available.

Statistics need to be collected for:

1. All non-unique indexes.
2. UPI of small tables (tables with fewer than x rows per AMP, depending on the available number of AMPs).
3. All indexes of a join index.



Always collect statistics at the column level even when collecting on an index. This is because indexes can be dropped at any time, and they are often dropped and recreated.

When to collect statistics:

After the following:

1. Fast loads
2. Multi loads
3. Non-utility (TPump/BTEQ/ODBC/JDBC)

Collect statistics after a significant percentage of data values have changed.
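A refresh after a load could be as simple as the sketch below, which assumes statistics were already defined on a hypothetical employee table.

```sql
-- Hypothetical table name. Run after a load has changed a large share of the data.

-- With no COLUMN or INDEX clause, this recollects every statistic
-- already defined on the table.
COLLECT STATISTICS ON employee;

-- Check the date and time each statistic was last collected.
HELP STATISTICS employee;
```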



SELECT M.EMP_ID, M.EMP_NAME, KN.DEPT_DESC, COUNT(*)
FROM EMPLOYEE M, DEPARTMENT KN
WHERE M.DEPT_ID = KN.DEPT_ID
AND KN.LOC = 'USA'
AND M.DEPT_ID IN

(UHLV>U,UIL?U,UU,UV7=@,@HV7



While joining two tables, make sure that both join columns use the same character set. Otherwise, implicit conversion of one to the other takes place, resulting in poor performance.
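One way to follow this advice is to declare the join columns with the same character set when the tables are created, as in the sketch below. The customer and orders tables, their columns, and the choice of LATIN are all assumptions made for illustration.

```sql
-- Hypothetical tables. The join column cust_code is declared with the
-- same character set (LATIN) on both sides.
CREATE TABLE customer (
    cust_code VARCHAR(10) CHARACTER SET LATIN NOT NULL,
    cust_name VARCHAR(100) CHARACTER SET LATIN
) PRIMARY INDEX (cust_code);

CREATE TABLE orders (
    order_id  INTEGER NOT NULL,
    cust_code VARCHAR(10) CHARACTER SET LATIN NOT NULL
) PRIMARY INDEX (order_id);

-- Both sides of the join condition are LATIN, so no implicit
-- character set conversion is needed to compare them.
SELECT o.order_id, c.cust_name
FROM orders o
JOIN customer c
  ON o.cust_code = c.cust_code;
```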

(7) DATE COMPARISON

When comparing date values in a particular range, the query may result in a product join.

This can be avoided by using SYS_CALENDAR.CALENDAR, the calendar view in Teradata's built-in SYS_CALENDAR database.

Example:

Insert into table_a select t2.a1, t2.a2, t2.a3, t2.a4 from

table_2 t2 join table_3 t3 on t2.a1 = t3.a1 and t2.a5_dt >= t3.a



t2.a1, t2.a2, t2.a3, t2.a4 from

table_2 t2 join table_3 t3 on t2.a1 = t3.a1 join table_4 t
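Because both statements above are cut off, the following is only a sketch of one common way to use SYS_CALENDAR.CALENDAR, not a reconstruction of the original example. It assumes that table_3 stores a date range in two hypothetical columns, start_dt and end_dt. Each range is expanded to one row per day through the calendar view, so that table_2 can then be joined on equality instead of on a range.

```sql
-- Assumptions: table_3 has columns start_dt and end_dt (hypothetical names),
-- and table_a accepts the four selected columns.
INSERT INTO table_a
SELECT t2.a1, t2.a2, t2.a3, t2.a4
FROM table_2 t2
JOIN (
    -- One row per (a1, day) for every day covered by a table_3 range.
    SELECT t3.a1, c.calendar_date
    FROM table_3 t3
    JOIN Sys_Calendar.Calendar c
      ON c.calendar_date BETWEEN t3.start_dt AND t3.end_dt
) d
  ON t2.a1 = d.a1
 AND t2.a5_dt = d.calendar_date;  -- equality on the date, not a range test
```

If ranges in table_3 can overlap for the same a1, this form returns a row per matching range, just as a direct range join would.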
