Teradata Query Optimization Guidelines
7/26/2019 Teradata Query Optimization Guidelines
Introduction
Optimization is the technique of selecting the least expensive (fastest) plan for a query to fetch its results. The optimizer considers the possible query plans for a given input query and attempts to determine which of those plans will be the most efficient.
Teradata performance tuning is the practice of improving a query so that it runs faster with minimal use of CPU resources.
The typical goal of SQL optimization is to get the result (data set) with fewer computing resources consumed and/or with a shorter response time.
Query Optimization Process
The following list gives the logical sequence of the processes undertaken by the Optimizer as it optimizes a DML request. The processes listed here do not include the influence of parameterized value peeking to determine whether the Optimizer should generate a specific plan or a generic plan for a given request.
The input to the Optimizer is the Query Rewrite ResTree. The Optimizer then produces the optimized white tree, which it passes to an Optimizer subcomponent called the Generator.
The Optimizer engages in the following process stages.
1. Receives the Query Rewrite ResTree as input.
2. Processes correlated subqueries by converting them to unnested SELECTs or simple joins.
3. Processes non-correlated subqueries by materializing the subquery and placing its value in the USING row for the query, regardless of whether the subquery is on the LHS or the RHS of the operator in the predicate.
4. Searches for a relevant join or hash index.
5. Materializes subqueries to spool files.
6. Analyzes the materialized subqueries for optimization possibilities.
   a. Separates conditions from one another.
   b. Pushes down predicates.
   c. Generates connection information.
   d. Locates any complex joins.
   e. Discovers aggregations and opportunities for partial GROUP BY optimizations.
7. Generates size and content estimates of spool files required for further processing.
8. Generates an optimal single-table access path.
9. Simplifies and optimizes any complex joins identified in stage 6d.
10. Maps join columns from a join (spool) relation to the list of field IDs from the input base tables to prepare the relation for join planning.
11. Generates information about local connections. A connecting condition is one that connects an outer query and a subquery. A direct connection exists between two tables if either of the following conditions is found.
   • ANDed bind term; miscellaneous terms such as inequalities, ANDs, and ORs; cross, outer, or minus join term that satisfies the dependent information between the two tables
   • A spool file of an uncorrelated subquery EXIST predicate that connects with any outer table
12. Generates information about indexes that might be used in join planning, including the primary indexes for the relevant tables and pointers to the table descriptors of any other useful indexes.
13. Performs row and column partition elimination for partitioned tables.
14. Uses a recursive greedy 1-table lookahead algorithm to generate the best join plan.
15. If the join plan identified in step 14 does not meet the heuristics-based criteria for an adequate join plan, generates another best join plan using an n-table lookahead algorithm.
16. Selects the better join plan of the two plans generated in steps 14 and 15.
19. Passes the optimized white tree to the Generator.
The Generator then generates plastic steps for the plan the Optimizer chose.
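The greedy 1-table lookahead used for join planning can be illustrated with a toy model. The sketch below is not Teradata's algorithm or cost model: the function name `greedy_join_order`, the table names, the row counts, and the selectivities are all hypothetical, and the only "cost" considered is the estimated size of the intermediate result. It shows the general idea of looking one table ahead and always joining the table that keeps the intermediate result smallest.

```python
def greedy_join_order(row_counts, selectivity):
    """Toy greedy join ordering with a one-table lookahead.

    row_counts:  dict mapping table name -> estimated row count
    selectivity: dict mapping frozenset({a, b}) -> fraction of the cross
                 product kept by the join predicate between a and b.
                 A missing pair is treated as 1.0 (a product join).
    Returns (join order, estimated rows of the final result).
    """
    remaining = set(row_counts)
    # Assumption of this sketch: start from the smallest table.
    first = min(sorted(remaining), key=lambda t: row_counts[t])
    remaining.remove(first)
    order = [first]
    rows = row_counts[first]

    while remaining:
        def result_rows(candidate):
            # Estimated rows after joining `candidate` to what is joined so far.
            sel = 1.0
            for t in order:
                sel *= selectivity.get(frozenset((t, candidate)), 1.0)
            return rows * row_counts[candidate] * sel

        # One-table lookahead: try each remaining table, keep the cheapest.
        best = min(sorted(remaining), key=result_rows)
        rows = result_rows(best)
        remaining.remove(best)
        order.append(best)

    return order, rows


# Hypothetical inputs, for illustration only.
counts = {"employee": 1_000_000, "department": 100, "location": 10}
sels = {
    frozenset(("employee", "department")): 0.01,
    frozenset(("department", "location")): 0.1,
}
order, estimated_rows = greedy_join_order(counts, sels)
print(order)
```

With these made-up numbers the sketch joins the two small tables first and brings in the large table last, which avoids a product join between `location` and `employee`. A greedy choice like this can miss a better overall plan, which is one reason a deeper lookahead is sometimes worth trying.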
Methodologies
Optimization is one of the most talked-about techniques for Teradata today. Because of the huge amount of data in a Teradata database, it becomes very important to get optimized performance out of it; otherwise queries will perform poorly and the benefit of parallelism will be lost.
In order to select the least expensive plan for a query to fetch results, the following techniques or practices can be followed:
(1) STATISTICS
Collecting statistics is one of the primary steps in Teradata query optimization.
Statistics collection is essential for the optimal performance of the Teradata query optimizer. The query optimizer relies on statistics to help it determine the best way to access data. Statistics also help the optimizer ascertain how many rows exist in the tables being queried and predict how many rows will qualify for given conditions.
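To make the idea of predicting qualifying rows concrete, here is a minimal sketch of a textbook estimate for an equality condition. It is a simplification and not how the Teradata optimizer computes its estimates: it assumes values are uniformly distributed, and the function name and the fallback fraction `GUESS_FRACTION_WITHOUT_STATS` are hypothetical values chosen only for illustration.

```python
# Hypothetical fallback used by this sketch when no statistics exist.
GUESS_FRACTION_WITHOUT_STATS = 0.10


def estimate_equality_rows(table_rows, distinct_values=None):
    """Estimate rows qualifying for `column = value`.

    With statistics (row count and number of distinct values), assume a
    uniform distribution: rows / distinct values.
    Without statistics, fall back to a fixed guess, which can be far off.
    """
    if distinct_values is None:
        return table_rows * GUESS_FRACTION_WITHOUT_STATS
    if distinct_values <= 0:
        raise ValueError("distinct_values must be positive")
    return table_rows / distinct_values


with_stats = estimate_equality_rows(1_000_000, distinct_values=50)
without_stats = estimate_equality_rows(1_000_000)
print(with_stats, without_stats)
```

The two estimates differ by a factor of five for the same table, which illustrates why a plan built without statistics can size joins and spool very differently from one built with them.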
Lack of statistics, or outdated statistics, might result in the optimizer choosing a less-than-optimal method for accessing data tables.
Also, statistics help Teradata determine the spool file size needed to contain the resulting data. Accurate statistics could make the difference between a successful query and a query that runs out of spool space.
Syntax:
To check whether statistics are defined for the table:
Help stats table_name;
To collect or refresh the statistics:
Collect stats on table_name [index/column] (col_name, col_name, …);
DIAGNOSTIC STATEMENT
DIAGNOSTIC HELPSTATS ON FOR SESSION;
The above statement can be used to determine the stats that might be required to improve the performance of the SQL. The EXPLAIN plan needs to be executed after the above statement to find the stats suggestions.
Stats will qualify for one of the confidence levels below:
1) No Confidence - no statistics are defined for a table.
2) Low Confidence - stats are difficult to use precisely.
3) High Confidence - the Optimizer is sure of results based on the stats available.
Statistics need to be collected for:
1. All non-unique indexes.
2. UPI of small tables (tables with fewer than x rows per AMP; this depends on the available number of AMPs).
3. All indexes of a join index.
Always collect statistics at the column level even when collecting on an index. This is because indexes can be dropped at any time, and they are often dropped and recreated.
When to collect statistics:
After the following:
1. Fast loads
2. Multi loads
3. Non-utility loads (TPump/BTEQ/ODBC/JDBC): collect statistics after a significant percentage of data values have changed.
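The rule "collect after a significant percentage of data values have changed" can be turned into a simple check. The sketch below is only an illustration: the function name is hypothetical, and the 10% default threshold is an assumption of this example, since the text does not say what counts as significant.

```python
def needs_recollect(rows_at_last_collect, rows_changed_since, threshold=0.10):
    """Return True when enough rows changed to justify recollecting stats.

    threshold is the changed fraction treated as "significant"; the 10%
    default is an assumption of this sketch, not a documented value.
    """
    if rows_at_last_collect == 0:
        # An empty table that has received any rows is always stale.
        return rows_changed_since > 0
    return rows_changed_since / rows_at_last_collect >= threshold


print(needs_recollect(1000, 150))  # 15% changed
print(needs_recollect(1000, 50))   # 5% changed
```

A check like this could run after each non-utility load window, so that statistics are refreshed only for tables whose contents have actually drifted.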
SELECT M.EMP_ID, M.EMP_NAME, D.DEPT_DESC, COUNT(*)
FROM EMPLOYEE M, DEPARTMENT D
WHERE M.DEPT_ID = D.DEPT_ID
AND D.LOC = 'USA'
AND M.DEPT_ID IN (…
While joining two tables, make sure that both join columns use the same character set. Otherwise, an implicit conversion of one to the other takes place, resulting in poor performance.
(7) DATE COMPARISON
When comparing date values in a particular range, the query may result in a product join.
This can be avoided by using SYS_CALENDAR.CALENDAR, the calendar view in Teradata's built-in SYS_CALENDAR database.
Example:
Insert into table_a select t2.a1, t2.a2, t2.a3, t2.a4 from
table_2 t2 join table_3 t3 on t2.a1 = t3.a1 and t2.a5_dt … t3.a
t2.a1, t2.a2, t2.a3, t2.a4 from
table_2 t2 join table_3 t3 on t2.a1 = t3.a1 join table_4 t