1 | /* |
2 | Copyright (c) 2017 MariaDB |
3 | |
4 | This program is free software; you can redistribute it and/or modify |
5 | it under the terms of the GNU General Public License as published by |
6 | the Free Software Foundation; version 2 of the License. |
7 | |
8 | This program is distributed in the hope that it will be useful, |
9 | but WITHOUT ANY WARRANTY; without even the implied warranty of |
10 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
11 | GNU General Public License for more details. |
12 | |
13 | You should have received a copy of the GNU General Public License |
14 | along with this program; if not, write to the Free Software |
15 | Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */ |
16 | |
17 | /* |
18 | This file contains functions to support the splitting technique. |
19 | This optimization technique can be applied to equi-joins involving |
20 | materialized tables such as materialized views, materialized derived tables |
21 | and materialized CTEs. The technique also could be applied to materialized |
22 | semi-joins though the code below does not support this usage yet. |
23 | |
24 | Here are the main ideas behind this technique that we'll call SM optimization |
25 | (SplitMaterialization). |
26 | |
27 | Consider the query |
28 | SELECT t1.a, t.min |
29 | FROM t1, (SELECT t2.a, MIN(t2.b) as min FROM t2 GROUP BY t2.a) t |
30 | WHERE t1.a = t.a and t1.b < const |
31 | |
32 | Re-write the query into |
33 | SELECT t1.a, t.min |
34 | FROM t1, LATERAL (SELECT t2.a, MIN(t2.b) as min |
35 | FROM t2 WHERE t2.a = t1.a GROUP BY t2.a) t |
36 | WHERE t1.b < const |
37 | |
38 | The execution of the original query (Q1) does the following: |
39 | 1. Executes the query in the specification of the derived table |
40 | and puts the result set into a temporary table with an index |
41 | on the first column. |
42 | 2. Joins t1 with the temporary table using the its index. |
43 | |
44 | The execution of the transformed query (Q1R) follows these steps: |
45 | 1. For each row of t1 where t1.b < const a temporary table |
46 | containing all rows of of t2 with t2.a = t1.a is created |
47 | 2. If there are any rows in the temporary table aggregation |
48 | is performed for them |
49 | 3. The result of the aggregation is joined with t1. |
50 | |
51 | The second execution can win if: |
52 | a) There is an efficient way to select rows of t2 for which t2.a = t1.a |
53 | (For example if there is an index on t2.a) |
54 | and |
55 | b) The number of temporary tables created for partitions |
56 | is much smaller that the total number of partitions |
57 | |
58 | It should be noted that for the transformed query aggregation |
59 | for a partition may be performed several times. |
60 | |
61 | As we can see the optimization basically splits table t2 into |
62 | partitions and performs aggregation over each of them |
63 | independently. |
64 | |
65 | If we have only one equi-join condition then we either push it as |
66 | for Q1R or we don't. In a general case we may have much more options. |
67 | Consider the query (Q3) |
68 | SELECT |
69 | FROM t1,t2 (SELECT t3.a, t3.b, MIN(t3.c) as min |
70 | FROM t3 GROUP BY a,b) t |
71 | WHERE t.a = t1.a AND t.b = t2.b |
72 | AND t1.c < c1 and t2.c < c2 |
73 | AND P(t1,t2); |
74 | (P(t1,t2) designates some additional conditions over columns of t1,t2). |
75 | |
76 | Assuming that there indexes on t3(a,b) and t3(b) here we have several |
77 | reasonable options to push equi-join conditions into the derived. |
78 | All these options should be taken into account when the optimizer |
79 | evaluates different join orders. When the join order (t1,t,t2) is |
80 | evaluated there is only one way of splitting : to push the condition |
81 | t.a = t1.a into t. With the join order (t2,t,t1) only the condition |
82 | t.b = t2.b can be pushed. When the join orders (t1,t2,t) and (t2,t1,t) |
83 | are evaluated then the optimizer should consider pushing t.a = t1.a, |
84 | t.b = t2.b and (t.a = t1.a AND t.b = t2.b) to choose the best condition |
85 | for splitting. Apparently here last condition is the best one because |
86 | it provides the miximum possible number of partitions. |
87 | |
88 | If we dropped the index on t3(a,b) and created the index on t3(a) instead |
89 | then we would have two options for splitting: to push t.a = t1.a or to |
90 | push t.b = t2.b. If the selectivity of the index t3(a) is better than |
91 | the selectivity of t3(b) then the first option is preferred. |
92 | |
93 | Although the condition (t.a = t1.a AND t.b = t2.b) provides a better |
94 | splitting than the condition t.a = t1.a the latter will be used for |
95 | splitting if the execution plan with the join order (t1,t,t2) turns out |
96 | to be the cheapest one. It's quite possible when the join condition |
97 | P(t1,t2) has a bad selectivity. |
98 | |
99 | Whenever the optimizer evaluates the cost of using a splitting it |
100 | compares it with the cost of materialization without splitting. |
101 | |
102 | If we just drop the index on t3(a,b) the chances that the splitting |
103 | will be used becomes much lower but they still exists providing that |
104 | the fanout of the partial join of t1 and t2 is small enough. |
105 | */ |
106 | |
107 | /* |
108 | Splitting can be applied to a materialized table specified by the query |
109 | with post-join operations that require partitioning of the result set produced |
110 | by the join expression used in the FROM clause the query such as GROUP BY |
111 | operation and window function operation. In any of these cases the post-join |
112 | operation can be executed independently for any partition only over the rows |
113 | of this partition. Also if the set of all partitions is divided into disjoint |
114 | subsets the operation can applied to each subset independently. In this case |
115 | all rows are first partitioned into the groups each of which contains all the |
116 | rows from the partitions belonging the same subset and then each group |
117 | is subpartitioned into groups in the the post join operation. |
118 | |
119 | The set of all rows belonging to the union of several partitions is called |
120 | here superpartition. If a grouping operation is defined by the list |
121 | e_1,...,e_n then any set S = {e_i1,...,e_ik} can be used to devide all rows |
122 | into superpartions such that for any two rows r1, r2 the following holds: |
123 | e_ij(r1) = e_ij(r2) for each e_ij from S. We use the splitting technique |
124 | only if S consists of references to colums of the joined tables. |
125 | For example if the GROUP BY list looks like this a, g(b), c we can consider |
126 | applying the splitting technique to the superpartitions defined by {a,c}, |
127 | {a}, {c} (a and c here may be the references to the columns from different |
128 | tables). |
129 | */ |
130 | |
131 | /* |
132 | The following describes when and how the optimizer decides whether it |
133 | makes sense to employ the splitting technique. |
134 | |
135 | 1. For each instance of a materialized table (derived/view/CTE) it is |
136 | checked that it is potentially splittable. Now it is done right after the |
137 | execution plan for the select specifying this table has been chosen. |
138 | |
139 | 2. Any potentially splittable materialized table T is subject to two-phase |
140 | optimization. It means that the optimizer first builds the best execution |
141 | plan for join that specifies T. Then the control is passed back to the |
142 | optimization process of the embedding select Q. After the execution plan |
143 | for Q has been chosen the optimizer finishes the optimization of the join |
144 | specifying T. |
145 | |
146 | 3. When the optimizer builds the container with the KEYUSE structures |
147 | for the join of embedding select it detects the equi-join conditions |
148 | PC that potentially could be pushed into a potentially splittable |
149 | materialized table T. The collected information about such conditions |
150 | is stored together with other facts on potential splittings for table T. |
151 | |
152 | 4. When the optimizer starts looking for the best execution plan for the |
153 | embedding select Q for each potentially splittable materialized table T |
154 | it creates special KEYUSE structures for pushable equi-join conditions |
155 | PC. These structures are used to add new elements to the container |
156 | of KEYUSE structures built for T. The specifics of these elements is |
157 | that they can be ebabled and disabled during the process of choosing |
158 | the best plan for Q. |
159 | |
160 | 5. When the optimizer extends a partial join order with a potentially |
161 | splittable materialized table T (in function best_access_path) it |
162 | first evaluates a new execution plan for the modified specification |
163 | of T that adds all equi-join conditions that can be pushed with |
164 | current join prefix to the WHERE conditions of the original |
165 | specification of T. If the cost of the new plan is better than the |
166 | the cost of the original materialized table then the optimizer |
167 | prefers to use splitting for the current join prefix. As the cost |
168 | of the plan depends only on the pushed conditions it makes sense |
169 | to cache this plan for other prefixes. |
170 | |
171 | 6. The optimizer takes into account the cost of splitting / materialization |
172 | of a potentially splittable materialized table T as a startup cost |
173 | to access table T. |
174 | |
175 | 7. When the optimizer finally chooses the best execution plan for |
176 | the embedding select Q and this plan prefers using splitting |
177 | for table T with pushed equi-join conditions PC then the execution |
178 | plan for the underlying join with these conditions is chosen for T. |
179 | */ |
180 | |
181 | /* |
182 | The implementation of the splitting technique below allows to apply |
183 | the technique only to a materialized derived table / view / CTE whose |
184 | specification is either a select with GROUP BY or a non-grouping select |
185 | with window functions that share the same PARTITION BY list. |
186 | */ |
187 | |
188 | #include "mariadb.h" |
189 | #include "sql_select.h" |
190 | |
191 | /* Info on a splitting field */ |
192 | struct SplM_field_info |
193 | { |
194 | /* Splitting field in the materialized table T */ |
195 | Field *mat_field; |
196 | /* The item from the select list of the specification of T */ |
197 | Item *producing_item; |
198 | /* The corresponding splitting field from the specification of T */ |
199 | Field *underlying_field; |
200 | }; |
201 | |
202 | |
203 | /* Info on the splitting execution plan saved in SplM_opt_info::cache */ |
204 | struct SplM_plan_info |
205 | { |
206 | /* The cached splitting execution plan P */ |
207 | struct st_position *best_positions; |
208 | /* The cost of the above plan */ |
209 | double cost; |
210 | /* Selectivity of splitting used in P */ |
211 | double split_sel; |
212 | /* For fast search of KEYUSE_EXT elements used for splitting in P */ |
213 | struct KEYUSE_EXT *keyuse_ext_start; |
214 | /* The tables that contains the fields used for splitting in P */ |
215 | TABLE *table; |
216 | /* The number of the key from 'table' used for splitting in P */ |
217 | uint key; |
218 | /* Number of the components of 'key' used for splitting in P */ |
219 | uint parts; |
220 | }; |
221 | |
222 | |
223 | /* |
224 | The structure contains the information that is used by the optimizer |
225 | for potentially splittable materialization of T that is a materialized |
226 | derived_table / view / CTE |
227 | */ |
228 | class SplM_opt_info : public Sql_alloc |
229 | { |
230 | public: |
231 | /* The join for the select specifying T */ |
232 | JOIN *join; |
233 | /* The map of tables from 'join' whose columns can be used for partitioning */ |
234 | table_map tables_usable_for_splitting; |
235 | /* Info about the fields of the joined tables usable for splitting */ |
236 | SplM_field_info *spl_fields; |
237 | /* The number of elements in the above list */ |
238 | uint spl_field_cnt; |
239 | /* Contains the structures to generate all KEYUSEs for pushable equalities */ |
240 | List<KEY_FIELD> added_key_fields; |
241 | /* The cache of evaluated execution plans for 'join' with pushed equalities */ |
242 | List<SplM_plan_info> plan_cache; |
243 | /* Cost of best execution plan for join when nothing is pushed */ |
244 | double unsplit_cost; |
245 | /* Cardinality of T when nothing is pushed */ |
246 | double unsplit_card; |
247 | /* Lastly evaluated execution plan for 'join' with pushed equalities */ |
248 | SplM_plan_info *last_plan; |
249 | |
250 | SplM_plan_info *find_plan(TABLE *table, uint key, uint parts); |
251 | }; |
252 | |
253 | |
254 | void TABLE::set_spl_opt_info(SplM_opt_info *spl_info) |
255 | { |
256 | if (spl_info) |
257 | spl_info->join->spl_opt_info= spl_info; |
258 | spl_opt_info= spl_info; |
259 | } |
260 | |
261 | |
262 | void TABLE::deny_splitting() |
263 | { |
264 | DBUG_ASSERT(spl_opt_info != NULL); |
265 | spl_opt_info->join->spl_opt_info= NULL; |
266 | spl_opt_info= NULL; |
267 | } |
268 | |
269 | |
270 | /* This structure is auxiliary and used only in the function that follows it */ |
271 | struct SplM_field_ext_info: public SplM_field_info |
272 | { |
273 | uint item_no; |
274 | bool is_usable_for_ref_access; |
275 | }; |
276 | |
277 | |
278 | /** |
279 | @brief |
280 | Check whether this join is one for potentially splittable materialized table |
281 | |
282 | @details |
283 | The function checks whether this join is for select that specifies |
284 | a potentially splittable materialized table T. If so, the collected |
285 | info on potential splittability of T is attached to the field spl_opt_info |
286 | of the TABLE structure for T. |
287 | |
288 | The function returns a positive answer if the following holds: |
289 | 1. the optimizer switch 'split_materialized' is set 'on' |
290 | 2. the select owning this join specifies a materialized derived/view/cte T |
291 | 3. this is the only select in the specification of T |
292 | 4. condition pushdown is not prohibited into T |
293 | 5. T is not recursive |
294 | 6. not all of this join are constant or optimized away |
295 | 7. T is either |
296 | 7.1. a grouping table with GROUP BY list P |
297 | or |
298 | 7.2. a non-grouping table with window functions over the same non-empty |
299 | partition specified by the PARTITION BY list P |
300 | 8. P contains some references on the columns of the joined tables C |
301 | occurred also in the select list of this join |
302 | 9. There are defined some keys usable for ref access of fields from C |
303 | with available statistics. |
304 | |
305 | @retval |
306 | true if the answer is positive |
307 | false otherwise |
308 | */ |
309 | |
310 | bool JOIN::check_for_splittable_materialized() |
311 | { |
312 | ORDER *partition_list= 0; |
313 | st_select_lex_unit *unit= select_lex->master_unit(); |
314 | TABLE_LIST *derived= unit->derived; |
315 | if (!(optimizer_flag(thd, OPTIMIZER_SWITCH_SPLIT_MATERIALIZED)) || // !(1) |
316 | !(derived && derived->is_materialized_derived()) || // !(2) |
317 | (unit->first_select()->next_select()) || // !(3) |
318 | (derived->prohibit_cond_pushdown) || // !(4) |
319 | (derived->is_recursive_with_table()) || // !(5) |
320 | (table_count == 0 || const_tables == top_join_tab_count)) // !(6) |
321 | return false; |
322 | if (group_list) // (7.1) |
323 | { |
324 | if (!select_lex->have_window_funcs()) |
325 | partition_list= group_list; |
326 | } |
327 | else if (select_lex->have_window_funcs() && |
328 | select_lex->window_specs.elements == 1) // (7.2) |
329 | { |
330 | partition_list= |
331 | select_lex->window_specs.head()->partition_list->first; |
332 | } |
333 | if (!partition_list) |
334 | return false; |
335 | |
336 | ORDER *ord; |
337 | Dynamic_array<SplM_field_ext_info> candidates; |
338 | |
339 | /* |
340 | Select from partition_list all candidates for splitting. |
341 | A candidate must be |
342 | - field item or refer to such (8.1) |
343 | - item mentioned in the select list (8.2) |
344 | Put info about such candidates into the array candidates |
345 | */ |
346 | table_map usable_tables= 0; // tables that contains the candidate |
347 | for (ord= partition_list; ord; ord= ord->next) |
348 | { |
349 | Item *ord_item= *ord->item; |
350 | if (ord_item->real_item()->type() != Item::FIELD_ITEM) // !(8.1) |
351 | continue; |
352 | |
353 | Field *ord_field= ((Item_field *) (ord_item->real_item()))->field; |
354 | |
355 | /* Ignore fields from of inner tables of outer joins */ |
356 | TABLE_LIST *tbl= ord_field->table->pos_in_table_list; |
357 | if (tbl->is_inner_table_of_outer_join()) |
358 | continue; |
359 | |
360 | List_iterator<Item> li(fields_list); |
361 | Item *item; |
362 | uint item_no= 0; |
363 | while ((item= li++)) |
364 | { |
365 | if ((*ord->item)->eq(item, 0)) // (8.2) |
366 | { |
367 | SplM_field_ext_info new_elem; |
368 | new_elem.producing_item= item; |
369 | new_elem.item_no= item_no; |
370 | new_elem.mat_field= derived->table->field[item_no]; |
371 | new_elem.underlying_field= ord_field; |
372 | new_elem.is_usable_for_ref_access= false; |
373 | candidates.push(new_elem); |
374 | usable_tables|= ord_field->table->map; |
375 | break; |
376 | } |
377 | item_no++; |
378 | } |
379 | } |
380 | if (candidates.elements() == 0) // no candidates satisfying (8.1) && (8.2) |
381 | return false; |
382 | |
383 | /* |
384 | For each table from this join find the keys that can be used for ref access |
385 | of the fields mentioned in the 'array candidates' |
386 | */ |
387 | |
388 | SplM_field_ext_info *cand; |
389 | SplM_field_ext_info *cand_start= &candidates.at(0); |
390 | SplM_field_ext_info *cand_end= cand_start + candidates.elements(); |
391 | |
392 | for (JOIN_TAB *tab= join_tab; |
393 | tab < join_tab + top_join_tab_count; tab++) |
394 | { |
395 | TABLE *table= tab->table; |
396 | if (!(table->map & usable_tables)) |
397 | continue; |
398 | |
399 | table->keys_usable_for_splitting.clear_all(); |
400 | uint i; |
401 | for (i= 0; i < table->s->keys; i++) |
402 | { |
403 | if (!table->keys_in_use_for_query.is_set(i)) |
404 | continue; |
405 | KEY *key_info= table->key_info + i; |
406 | uint key_parts= table->actual_n_key_parts(key_info); |
407 | uint usable_kp_cnt= 0; |
408 | for ( ; usable_kp_cnt < key_parts; usable_kp_cnt++) |
409 | { |
410 | if (key_info->actual_rec_per_key(usable_kp_cnt) == 0) |
411 | break; |
412 | int fldnr= key_info->key_part[usable_kp_cnt].fieldnr; |
413 | |
414 | for (cand= cand_start; cand < cand_end; cand++) |
415 | { |
416 | if (cand->underlying_field->field_index + 1 == fldnr) |
417 | { |
418 | cand->is_usable_for_ref_access= true; |
419 | break; |
420 | } |
421 | } |
422 | if (cand == cand_end) |
423 | break; |
424 | } |
425 | if (usable_kp_cnt) |
426 | table->keys_usable_for_splitting.set_bit(i); |
427 | } |
428 | } |
429 | |
430 | /* Count the candidate fields that can be accessed by ref */ |
431 | uint spl_field_cnt= (uint)candidates.elements(); |
432 | for (cand= cand_start; cand < cand_end; cand++) |
433 | { |
434 | if (!cand->is_usable_for_ref_access) |
435 | spl_field_cnt--; |
436 | } |
437 | |
438 | if (!spl_field_cnt) // No candidate field can be accessed by ref => !(9) |
439 | return false; |
440 | |
441 | /* |
442 | Create a structure of the type SplM_opt_info and fill it with |
443 | the collected info on potential splittability of T |
444 | */ |
445 | SplM_opt_info *spl_opt_info= new (thd->mem_root) SplM_opt_info(); |
446 | SplM_field_info *spl_field= |
447 | (SplM_field_info *) (thd->calloc(sizeof(SplM_field_info) * |
448 | spl_field_cnt)); |
449 | |
450 | if (!(spl_opt_info && spl_field)) // consider T as not good for splitting |
451 | return false; |
452 | |
453 | spl_opt_info->join= this; |
454 | spl_opt_info->tables_usable_for_splitting= 0; |
455 | spl_opt_info->spl_field_cnt= spl_field_cnt; |
456 | spl_opt_info->spl_fields= spl_field; |
457 | for (cand= cand_start; cand < cand_end; cand++) |
458 | { |
459 | if (!cand->is_usable_for_ref_access) |
460 | continue; |
461 | spl_field->producing_item= cand->producing_item; |
462 | spl_field->underlying_field= cand->underlying_field; |
463 | spl_field->mat_field= cand->mat_field; |
464 | spl_opt_info->tables_usable_for_splitting|= |
465 | cand->underlying_field->table->map; |
466 | spl_field++; |
467 | } |
468 | |
469 | /* Attach this info to the table T */ |
470 | derived->table->set_spl_opt_info(spl_opt_info); |
471 | |
472 | return true; |
473 | } |
474 | |
475 | |
476 | /** |
477 | @brief |
478 | Collect info on KEY_FIELD usable for splitting |
479 | |
480 | @param |
481 | key_field KEY_FIELD to collect info on |
482 | |
483 | @details |
484 | The function assumes that this table is potentially splittable. |
485 | The function checks whether the KEY_FIELD structure key_field built for |
486 | this table was created for a splitting field f. If so, the function does |
487 | the following using info from key_field: |
488 | 1. Builds an equality of the form f = key_field->val that could be |
489 | pushed into this table. |
490 | 2. Creates a new KEY_FIELD structure for this equality and stores |
491 | a reference to this structure in this->spl_opt_info. |
492 | */ |
493 | |
494 | void TABLE::add_splitting_info_for_key_field(KEY_FIELD *key_field) |
495 | { |
496 | DBUG_ASSERT(spl_opt_info != NULL); |
497 | JOIN *join= spl_opt_info->join; |
498 | Field *field= key_field->field; |
499 | SplM_field_info *spl_field= spl_opt_info->spl_fields; |
500 | uint i= spl_opt_info->spl_field_cnt; |
501 | for ( ; i; i--, spl_field++) |
502 | { |
503 | if (spl_field->mat_field == field) |
504 | break; |
505 | } |
506 | if (!i) // field is not usable for splitting |
507 | return; |
508 | |
509 | /* |
510 | Any equality condition that can be potentially pushed into the |
511 | materialized derived table is constructed now though later it may turn out |
512 | that it is not needed, because it is not used for splitting. |
513 | The reason for this is that the failure to construct it when it has to be |
514 | injected causes denial for further processing of the query. |
515 | Formally this equality is needed in the KEY_FIELD structure constructed |
516 | here that will be used to generate additional keyuses usable for splitting. |
517 | However key_field.cond could be used for this purpose (see implementations |
518 | of virtual function can_optimize_keypart_ref()). |
519 | |
520 | The condition is built in such a form that it can be added to the WHERE |
521 | condition of the select that specifies this table. |
522 | */ |
523 | THD *thd= in_use; |
524 | Item *left_item= spl_field->producing_item->build_clone(thd); |
525 | Item *right_item= key_field->val->build_clone(thd); |
526 | Item_func_eq *eq_item= 0; |
527 | if (left_item && right_item) |
528 | { |
529 | right_item->walk(&Item::set_fields_as_dependent_processor, |
530 | false, join->select_lex); |
531 | right_item->update_used_tables(); |
532 | eq_item= new (thd->mem_root) Item_func_eq(thd, left_item, right_item); |
533 | } |
534 | if (!eq_item) |
535 | return; |
536 | KEY_FIELD *added_key_field= |
537 | (KEY_FIELD *) thd->alloc(sizeof(KEY_FIELD)); |
538 | if (!added_key_field || |
539 | spl_opt_info->added_key_fields.push_back(added_key_field,thd->mem_root)) |
540 | return; |
541 | added_key_field->field= spl_field->underlying_field; |
542 | added_key_field->cond= eq_item; |
543 | added_key_field->val= key_field->val; |
544 | added_key_field->level= 0; |
545 | added_key_field->optimize= KEY_OPTIMIZE_EQ; |
546 | added_key_field->eq_func= true; |
547 | |
548 | Item *real= key_field->val->real_item(); |
549 | if ((real->type() == Item::FIELD_ITEM) && |
550 | ((Item_field*)real)->field->maybe_null()) |
551 | added_key_field->null_rejecting= true; |
552 | else |
553 | added_key_field->null_rejecting= false; |
554 | |
555 | added_key_field->cond_guard= NULL; |
556 | added_key_field->sj_pred_no= UINT_MAX; |
557 | return; |
558 | } |
559 | |
560 | |
561 | static bool |
562 | add_ext_keyuse_for_splitting(Dynamic_array<KEYUSE_EXT> *ext_keyuses, |
563 | KEY_FIELD *added_key_field, uint key, uint part) |
564 | { |
565 | KEYUSE_EXT keyuse_ext; |
566 | Field *field= added_key_field->field; |
567 | |
568 | JOIN_TAB *tab=field->table->reginfo.join_tab; |
569 | key_map possible_keys=field->get_possible_keys(); |
570 | possible_keys.intersect(field->table->keys_usable_for_splitting); |
571 | tab->keys.merge(possible_keys); |
572 | |
573 | Item_func_eq *eq_item= (Item_func_eq *) (added_key_field->cond); |
574 | keyuse_ext.table= field->table; |
575 | keyuse_ext.val= eq_item->arguments()[1]; |
576 | keyuse_ext.key= key; |
577 | keyuse_ext.keypart=part; |
578 | keyuse_ext.keypart_map= (key_part_map) 1 << part; |
579 | keyuse_ext.used_tables= keyuse_ext.val->used_tables(); |
580 | keyuse_ext.optimize= added_key_field->optimize & KEY_OPTIMIZE_REF_OR_NULL; |
581 | keyuse_ext.ref_table_rows= 0; |
582 | keyuse_ext.null_rejecting= added_key_field->null_rejecting; |
583 | keyuse_ext.cond_guard= added_key_field->cond_guard; |
584 | keyuse_ext.sj_pred_no= added_key_field->sj_pred_no; |
585 | keyuse_ext.validity_ref= 0; |
586 | keyuse_ext.needed_in_prefix= added_key_field->val->used_tables(); |
587 | keyuse_ext.validity_var= false; |
588 | return ext_keyuses->push(keyuse_ext); |
589 | } |
590 | |
591 | |
592 | static int |
593 | sort_ext_keyuse(KEYUSE_EXT *a, KEYUSE_EXT *b) |
594 | { |
595 | if (a->table->tablenr != b->table->tablenr) |
596 | return (int) (a->table->tablenr - b->table->tablenr); |
597 | if (a->key != b->key) |
598 | return (int) (a->key - b->key); |
599 | return (int) (a->keypart - b->keypart); |
600 | } |
601 | |
602 | |
603 | static void |
604 | sort_ext_keyuses(Dynamic_array<KEYUSE_EXT> *keyuses) |
605 | { |
606 | KEYUSE_EXT *first_keyuse= &keyuses->at(0); |
607 | my_qsort(first_keyuse, keyuses->elements(), sizeof(KEYUSE_EXT), |
608 | (qsort_cmp) sort_ext_keyuse); |
609 | } |
610 | |
611 | |
612 | /** |
613 | @brief |
614 | Add info on keyuses usable for splitting into an array |
615 | */ |
616 | |
617 | static bool |
618 | add_ext_keyuses_for_splitting_field(Dynamic_array<KEYUSE_EXT> *ext_keyuses, |
619 | KEY_FIELD *added_key_field) |
620 | { |
621 | Field *field= added_key_field->field; |
622 | TABLE *table= field->table; |
623 | for (uint key= 0; key < table->s->keys; key++) |
624 | { |
625 | if (!(table->keys_usable_for_splitting.is_set(key))) |
626 | continue; |
627 | KEY *key_info= table->key_info + key; |
628 | uint key_parts= table->actual_n_key_parts(key_info); |
629 | KEY_PART_INFO *key_part_info= key_info->key_part; |
630 | for (uint part=0; part < key_parts; part++, key_part_info++) |
631 | { |
632 | if (!field->eq(key_part_info->field)) |
633 | continue; |
634 | if (add_ext_keyuse_for_splitting(ext_keyuses, added_key_field, key, part)) |
635 | return true; |
636 | } |
637 | } |
638 | return false; |
639 | } |
640 | |
641 | |
642 | /* |
643 | @brief |
644 | Cost of the post join operation used in specification of splittable table |
645 | */ |
646 | |
647 | static |
648 | double spl_postjoin_oper_cost(THD *thd, double join_record_count, uint rec_len) |
649 | { |
650 | double cost; |
651 | cost= get_tmp_table_write_cost(thd, join_record_count,rec_len) * |
652 | join_record_count; // cost to fill tmp table |
653 | cost+= get_tmp_table_lookup_cost(thd, join_record_count,rec_len) * |
654 | join_record_count; // cost to perform post join operation used here |
655 | cost+= get_tmp_table_lookup_cost(thd, join_record_count, rec_len) + |
656 | (join_record_count == 0 ? 0 : |
657 | join_record_count * log2 (join_record_count)) * |
658 | SORT_INDEX_CMP_COST; // cost to perform sorting |
659 | return cost; |
660 | } |
661 | |
662 | /** |
663 | @brief |
664 | Add KEYUSE structures that can be usable for splitting |
665 | |
666 | @details |
667 | This function is called only for joins created for potentially |
668 | splittable materialized tables. The function does the following: |
669 | 1. Creates the dynamic array ext_keyuses_for_splitting of KEYUSE_EXT |
670 | structures and fills is with info about all keyuses that |
671 | could be used for splitting. |
672 | 2. Sort the array ext_keyuses_for_splitting for fast access by key |
673 | on certain columns. |
674 | 3. Collects and stores cost and cardinality info on the best execution |
675 | plan that does not use splitting and save this plan together with |
676 | corresponding array of keyuses. |
677 | 4. Expand this array with KEYUSE elements built from the info stored |
678 | in ext_keyuses_for_splitting that could be produced by pushed |
679 | equalities employed for splitting. |
680 | 5. Prepare the extended array of keyuses to be used in the function |
681 | best_access_plan() |
682 | */ |
683 | |
684 | void JOIN::add_keyuses_for_splitting() |
685 | { |
686 | uint i; |
687 | uint idx; |
688 | KEYUSE_EXT *keyuse_ext; |
689 | KEYUSE_EXT keyuse_ext_end; |
690 | double oper_cost; |
691 | uint rec_len; |
692 | uint added_keyuse_count; |
693 | TABLE *table= select_lex->master_unit()->derived->table; |
694 | List_iterator_fast<KEY_FIELD> li(spl_opt_info->added_key_fields); |
695 | KEY_FIELD *added_key_field; |
696 | if (!spl_opt_info->added_key_fields.elements) |
697 | goto err; |
698 | if (!(ext_keyuses_for_splitting= new Dynamic_array<KEYUSE_EXT>)) |
699 | goto err; |
700 | while ((added_key_field= li++)) |
701 | { |
702 | (void) add_ext_keyuses_for_splitting_field(ext_keyuses_for_splitting, |
703 | added_key_field); |
704 | } |
705 | added_keyuse_count= (uint)ext_keyuses_for_splitting->elements(); |
706 | if (!added_keyuse_count) |
707 | goto err; |
708 | sort_ext_keyuses(ext_keyuses_for_splitting); |
709 | bzero((char*) &keyuse_ext_end, sizeof(keyuse_ext_end)); |
710 | if (ext_keyuses_for_splitting->push(keyuse_ext_end)) |
711 | goto err; |
712 | |
713 | spl_opt_info->unsplit_card= join_record_count; |
714 | |
715 | rec_len= table->s->rec_buff_length; |
716 | |
717 | oper_cost= spl_postjoin_oper_cost(thd, join_record_count, rec_len); |
718 | |
719 | spl_opt_info->unsplit_cost= best_positions[table_count-1].read_time + |
720 | oper_cost; |
721 | |
722 | if (!(save_qep= new Join_plan_state(table_count + 1))) |
723 | goto err; |
724 | |
725 | save_query_plan(save_qep); |
726 | |
727 | if (!keyuse.buffer && |
728 | my_init_dynamic_array(&keyuse, sizeof(KEYUSE), 20, 64, |
729 | MYF(MY_THREAD_SPECIFIC))) |
730 | goto err; |
731 | |
732 | if (allocate_dynamic(&keyuse, |
733 | save_qep->keyuse.elements + |
734 | added_keyuse_count)) |
735 | goto err; |
736 | |
737 | memcpy(keyuse.buffer, |
738 | save_qep->keyuse.buffer, |
739 | (size_t) save_qep->keyuse.elements * keyuse.size_of_element); |
740 | keyuse.elements= save_qep->keyuse.elements; |
741 | |
742 | keyuse_ext= &ext_keyuses_for_splitting->at(0); |
743 | idx= save_qep->keyuse.elements; |
744 | for (i=0; i < added_keyuse_count; i++, keyuse_ext++, idx++) |
745 | { |
746 | set_dynamic(&keyuse, (KEYUSE *) keyuse_ext, idx); |
747 | KEYUSE *added_keyuse= ((KEYUSE *) (keyuse.buffer)) + idx; |
748 | added_keyuse->validity_ref= &keyuse_ext->validity_var; |
749 | } |
750 | |
751 | if (sort_and_filter_keyuse(thd, &keyuse, true)) |
752 | goto err; |
753 | optimize_keyuse(this, &keyuse); |
754 | |
755 | for (uint i= 0; i < table_count; i++) |
756 | { |
757 | JOIN_TAB *tab= join_tab + i; |
758 | map2table[tab->table->tablenr]= tab; |
759 | } |
760 | |
761 | return; |
762 | |
763 | err: |
764 | if (save_qep) |
765 | restore_query_plan(save_qep); |
766 | table->deny_splitting(); |
767 | return; |
768 | } |
769 | |
770 | |
771 | /** |
772 | @brief |
773 | Add KEYUSE structures that can be usable for splitting of this joined table |
774 | */ |
775 | |
776 | void JOIN_TAB::add_keyuses_for_splitting() |
777 | { |
778 | DBUG_ASSERT(table->spl_opt_info != NULL); |
779 | SplM_opt_info *spl_opt_info= table->spl_opt_info; |
780 | spl_opt_info->join->add_keyuses_for_splitting(); |
781 | } |
782 | |
783 | |
784 | /* |
785 | @brief |
786 | Find info on the splitting plan by the splitting key |
787 | */ |
788 | |
789 | SplM_plan_info *SplM_opt_info::find_plan(TABLE *table, uint key, uint parts) |
790 | { |
791 | List_iterator_fast<SplM_plan_info> li(plan_cache); |
792 | SplM_plan_info *spl_plan; |
793 | while ((spl_plan= li++)) |
794 | { |
795 | if (spl_plan->table == table && |
796 | spl_plan->key == key && |
797 | spl_plan->parts == parts) |
798 | break; |
799 | } |
800 | return spl_plan; |
801 | } |
802 | |
803 | |
804 | /* |
805 | @breaf |
806 | Enable/Disable a keyuses that can be used for splitting |
807 | */ |
808 | |
809 | static |
810 | void reset_validity_vars_for_keyuses(KEYUSE_EXT *key_keyuse_ext_start, |
811 | TABLE *table, uint key, |
812 | table_map remaining_tables, |
813 | bool validity_val) |
814 | { |
815 | KEYUSE_EXT *keyuse_ext= key_keyuse_ext_start; |
816 | do |
817 | { |
818 | if (!(keyuse_ext->needed_in_prefix & remaining_tables)) |
819 | { |
820 | /* |
821 | The enabling/disabling flags are set just in KEYUSE_EXT structures. |
822 | Yet keyuses that are used by best_access_path() have pointers |
823 | to these flags. |
824 | */ |
825 | keyuse_ext->validity_var= validity_val; |
826 | } |
827 | keyuse_ext++; |
828 | } |
829 | while (keyuse_ext->key == key && keyuse_ext->table == table); |
830 | } |
831 | |
832 | |
833 | /** |
834 | @brief |
835 | Choose the best splitting to extend the evaluated partial join |
836 | |
837 | @param |
838 | record_count estimated cardinality of the extended partial join |
839 | remaining_tables tables not joined yet |
840 | |
841 | @details |
842 | This function is called during the search for the best execution |
843 | plan of the join that contains this table T. The function is called |
844 | every time when the optimizer tries to extend a partial join by |
845 | joining it with table T. Depending on what tables are already in the |
846 | partial join different equalities usable for splitting can be pushed |
847 | into T. The function evaluates different variants and chooses the |
848 | best one. Then the function finds the plan for the materializing join |
849 | with the chosen equality conditions pushed into it. If the cost of the |
850 | plan turns out to be less than the cost of the best plan without |
851 | splitting the function set it as the true plan of materialization |
852 | of the table T. |
853 | The function caches the found plans for materialization of table T |
854 | together if the info what key was used for splitting. Next time when |
855 | the optimizer prefers to use the same key the plan is taken from |
856 | the cache of plans |
857 | |
858 | @retval |
859 | Pointer to the info on the found plan that employs the pushed equalities |
860 | if the plan has been chosen, NULL - otherwise. |
861 | */ |
862 | |
863 | SplM_plan_info * JOIN_TAB::choose_best_splitting(double record_count, |
864 | table_map remaining_tables) |
865 | { |
866 | SplM_opt_info *spl_opt_info= table->spl_opt_info; |
867 | DBUG_ASSERT(spl_opt_info != NULL); |
868 | JOIN *join= spl_opt_info->join; |
869 | THD *thd= join->thd; |
870 | table_map tables_usable_for_splitting= |
871 | spl_opt_info->tables_usable_for_splitting; |
872 | KEYUSE_EXT *keyuse_ext= &join->ext_keyuses_for_splitting->at(0); |
873 | KEYUSE_EXT *UNINIT_VAR(best_key_keyuse_ext_start); |
874 | TABLE *best_table= 0; |
875 | double best_rec_per_key= DBL_MAX; |
876 | SplM_plan_info *spl_plan= 0; |
877 | uint best_key= 0; |
878 | uint best_key_parts= 0; |
879 | |
880 | /* |
881 | Check whether there are keys that can be used to join T employing splitting |
882 | and if so, select the best out of such keys |
883 | */ |
884 | for (uint tablenr= 0; tablenr < join->table_count; tablenr++) |
885 | { |
886 | if (!((1ULL << tablenr) & tables_usable_for_splitting)) |
887 | continue; |
888 | JOIN_TAB *tab= join->map2table[tablenr]; |
889 | TABLE *table= tab->table; |
890 | do |
891 | { |
892 | uint key= keyuse_ext->key; |
893 | KEYUSE_EXT *key_keyuse_ext_start= keyuse_ext; |
894 | key_part_map found_parts= 0; |
895 | do |
896 | { |
897 | if (keyuse_ext->needed_in_prefix & remaining_tables) |
898 | { |
899 | keyuse_ext++; |
900 | continue; |
901 | } |
902 | if (!(keyuse_ext->keypart_map & found_parts)) |
903 | { |
904 | if ((!found_parts && !keyuse_ext->keypart) || |
905 | (found_parts && ((keyuse_ext->keypart_map >> 1) & found_parts))) |
906 | found_parts|= keyuse_ext->keypart_map; |
907 | else |
908 | { |
909 | do |
910 | { |
911 | keyuse_ext++; |
912 | } |
913 | while (keyuse_ext->key == key && keyuse_ext->table == table); |
914 | break; |
915 | } |
916 | } |
917 | KEY *key_info= table->key_info + key; |
918 | double rec_per_key= |
919 | key_info->actual_rec_per_key(keyuse_ext->keypart); |
920 | if (rec_per_key < best_rec_per_key) |
921 | { |
922 | best_table= keyuse_ext->table; |
923 | best_key= keyuse_ext->key; |
924 | best_key_parts= keyuse_ext->keypart + 1; |
925 | best_rec_per_key= rec_per_key; |
926 | best_key_keyuse_ext_start= key_keyuse_ext_start; |
927 | } |
928 | keyuse_ext++; |
929 | } |
930 | while (keyuse_ext->key == key && keyuse_ext->table == table); |
931 | } |
932 | while (keyuse_ext->table == table); |
933 | } |
934 | spl_opt_info->last_plan= 0; |
935 | if (best_table) |
936 | { |
937 | /* |
938 | The key for splitting was chosen, look for the plan for this key |
939 | in the cache |
940 | */ |
941 | spl_plan= spl_opt_info->find_plan(best_table, best_key, best_key_parts); |
942 | if (!spl_plan && |
943 | (spl_plan= (SplM_plan_info *) thd->alloc(sizeof(SplM_plan_info))) && |
944 | (spl_plan->best_positions= |
945 | (POSITION *) thd->alloc(sizeof(POSITION) * join->table_count)) && |
946 | !spl_opt_info->plan_cache.push_back(spl_plan)) |
947 | { |
948 | /* |
949 | The plan for the chosen key has not been found in the cache. |
950 | Build a new plan and save info on it in the cache |
951 | */ |
952 | table_map all_table_map= (1 << join->table_count) - 1; |
953 | reset_validity_vars_for_keyuses(best_key_keyuse_ext_start, best_table, |
954 | best_key, remaining_tables, true); |
955 | choose_plan(join, all_table_map & ~join->const_table_map); |
956 | spl_plan->keyuse_ext_start= best_key_keyuse_ext_start; |
957 | spl_plan->table= best_table; |
958 | spl_plan->key= best_key; |
959 | spl_plan->parts= best_key_parts; |
960 | spl_plan->split_sel= best_rec_per_key / |
961 | (spl_opt_info->unsplit_card ? |
962 | spl_opt_info->unsplit_card : 1); |
963 | |
964 | uint rec_len= table->s->rec_buff_length; |
965 | |
966 | double split_card= spl_opt_info->unsplit_card * spl_plan->split_sel; |
967 | double oper_cost= split_card * |
968 | spl_postjoin_oper_cost(thd, split_card, rec_len); |
969 | spl_plan->cost= join->best_positions[join->table_count-1].read_time + |
970 | + oper_cost; |
971 | |
972 | memcpy((char *) spl_plan->best_positions, |
973 | (char *) join->best_positions, |
974 | sizeof(POSITION) * join->table_count); |
975 | reset_validity_vars_for_keyuses(best_key_keyuse_ext_start, best_table, |
976 | best_key, remaining_tables, false); |
977 | } |
978 | if (spl_plan) |
979 | { |
980 | if(record_count * spl_plan->cost < spl_opt_info->unsplit_cost) |
981 | { |
982 | /* |
983 | The best plan that employs splitting is cheaper than |
984 | the plan without splitting |
985 | */ |
986 | spl_opt_info->last_plan= spl_plan; |
987 | } |
988 | } |
989 | } |
990 | |
991 | /* Set the cost of the preferred materialization for this partial join */ |
992 | records= (ha_rows)spl_opt_info->unsplit_card; |
993 | spl_plan= spl_opt_info->last_plan; |
994 | if (spl_plan) |
995 | { |
996 | startup_cost= record_count * spl_plan->cost; |
997 | records= (ha_rows) (records * spl_plan->split_sel); |
998 | } |
999 | else |
1000 | startup_cost= spl_opt_info->unsplit_cost; |
1001 | return spl_plan; |
1002 | } |
1003 | |
1004 | |
1005 | /** |
1006 | @brief |
1007 | Inject equalities for splitting used by the materialization join |
1008 | |
1009 | @param |
1010 | remaining_tables used to filter out the equalities that cannot |
1011 | be pushed. |
1012 | |
1013 | @details |
1014 | This function is called by JOIN_TAB::fix_splitting that is used |
1015 | to fix the chosen splitting of a splittable materialized table T |
1016 | in the final query execution plan. In this plan the table T |
1017 | is joined just before the 'remaining_tables'. So all equalities |
1018 | usable for splitting whose right parts do not depend on any of |
1019 | remaining tables can be pushed into join for T. |
1020 | The function also marks the select that specifies T as |
1021 | UNCACHEABLE_DEPENDENT_INJECTED. |
1022 | |
1023 | @retval |
1024 | false on success |
1025 | true on failure |
1026 | */ |
1027 | |
1028 | bool JOIN::inject_best_splitting_cond(table_map remaining_tables) |
1029 | { |
1030 | Item *inj_cond= 0; |
1031 | List<Item> inj_cond_list; |
1032 | List_iterator<KEY_FIELD> li(spl_opt_info->added_key_fields); |
1033 | KEY_FIELD *added_key_field; |
1034 | while ((added_key_field= li++)) |
1035 | { |
1036 | if (remaining_tables & added_key_field->val->used_tables()) |
1037 | continue; |
1038 | if (inj_cond_list.push_back(added_key_field->cond, thd->mem_root)) |
1039 | return true; |
1040 | } |
1041 | DBUG_ASSERT(inj_cond_list.elements); |
1042 | switch (inj_cond_list.elements) { |
1043 | case 1: |
1044 | inj_cond= inj_cond_list.head(); break; |
1045 | default: |
1046 | inj_cond= new (thd->mem_root) Item_cond_and(thd, inj_cond_list); |
1047 | if (!inj_cond) |
1048 | return true; |
1049 | } |
1050 | if (inj_cond) |
1051 | inj_cond->fix_fields(thd,0); |
1052 | |
1053 | if (inject_cond_into_where(inj_cond)) |
1054 | return true; |
1055 | |
1056 | select_lex->uncacheable|= UNCACHEABLE_DEPENDENT_INJECTED; |
1057 | st_select_lex_unit *unit= select_lex->master_unit(); |
1058 | unit->uncacheable|= UNCACHEABLE_DEPENDENT_INJECTED; |
1059 | |
1060 | return false; |
1061 | } |
1062 | |
1063 | |
1064 | /** |
1065 | @brief |
1066 | Fix the splitting chosen for a splittable table in the final query plan |
1067 | |
1068 | @param |
1069 | spl_plan info on the splitting plan chosen for the splittable table T |
1070 | remaining_tables the table T is joined just before these tables |
1071 | |
1072 | @details |
1073 | If in the final query plan the optimizer has chosen a splitting plan |
1074 | then the function sets this plan as the final execution plan to |
1075 | materialized the table T. Otherwise the plan that does not use |
1076 | splitting is set for the materialization. |
1077 | |
1078 | @retval |
1079 | false on success |
1080 | true on failure |
1081 | */ |
1082 | |
1083 | bool JOIN_TAB::fix_splitting(SplM_plan_info *spl_plan, |
1084 | table_map remaining_tables) |
1085 | { |
1086 | SplM_opt_info *spl_opt_info= table->spl_opt_info; |
1087 | DBUG_ASSERT(table->spl_opt_info != 0); |
1088 | JOIN *md_join= spl_opt_info->join; |
1089 | if (spl_plan) |
1090 | { |
1091 | memcpy((char *) md_join->best_positions, |
1092 | (char *) spl_plan->best_positions, |
1093 | sizeof(POSITION) * md_join->table_count); |
1094 | if (md_join->inject_best_splitting_cond(remaining_tables)) |
1095 | return true; |
1096 | /* |
1097 | This is called for a proper work of JOIN::get_best_combination() |
1098 | called for the join that materializes T |
1099 | */ |
1100 | reset_validity_vars_for_keyuses(spl_plan->keyuse_ext_start, |
1101 | spl_plan->table, |
1102 | spl_plan->key, |
1103 | remaining_tables, |
1104 | true); |
1105 | } |
1106 | else |
1107 | { |
1108 | md_join->restore_query_plan(md_join->save_qep); |
1109 | } |
1110 | return false; |
1111 | } |
1112 | |
1113 | |
1114 | /** |
1115 | @brief |
1116 | Fix the splittings chosen splittable tables in the final query plan |
1117 | |
1118 | @details |
1119 | The function calls JOIN_TAB::fix_splittins for all potentially |
1120 | splittable tables in this join to set all final materialization |
1121 | plans chosen for these tables. |
1122 | |
1123 | @retval |
1124 | false on success |
1125 | true on failure |
1126 | */ |
1127 | |
1128 | bool JOIN::fix_all_splittings_in_plan() |
1129 | { |
1130 | table_map prev_tables= 0; |
1131 | table_map all_tables= (1 << table_count) - 1; |
1132 | for (uint tablenr= 0; tablenr < table_count; tablenr++) |
1133 | { |
1134 | POSITION *cur_pos= &best_positions[tablenr]; |
1135 | JOIN_TAB *tab= cur_pos->table; |
1136 | if (tablenr >= const_tables && tab->table->is_splittable()) |
1137 | { |
1138 | SplM_plan_info *spl_plan= cur_pos->spl_plan; |
1139 | if (tab->fix_splitting(spl_plan, all_tables & ~prev_tables)) |
1140 | return true; |
1141 | } |
1142 | prev_tables|= tab->table->map; |
1143 | } |
1144 | return false; |
1145 | } |
1146 | |