opt_split.cc source code [MariaDB/sql/opt_split.cc]

1	/*
2	Copyright (c) 2017 MariaDB
3
4	This program is free software; you can redistribute it and/or modify
5	it under the terms of the GNU General Public License as published by
6	the Free Software Foundation; version 2 of the License.
7
8	This program is distributed in the hope that it will be useful,
9	but WITHOUT ANY WARRANTY; without even the implied warranty of
10	MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
11	GNU General Public License for more details.
12
13	You should have received a copy of the GNU General Public License
14	along with this program; if not, write to the Free Software
15	Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA /*
16
17	/*
18	This file contains functions to support the splitting technique.
19	This optimization technique can be applied to equi-joins involving
20	materialized tables such as materialized views, materialized derived tables
21	and materialized CTEs. The technique also could be applied to materialized
22	semi-joins though the code below does not support this usage yet.
23
24	Here are the main ideas behind this technique that we'll call SM optimization
25	(SplitMaterialization).
26
27	Consider the query
28	SELECT t1.a, t.min
29	FROM t1, (SELECT t2.a, MIN(t2.b) as min FROM t2 GROUP BY t2.a) t
30	WHERE t1.a = t.a and t1.b < const
31
32	Re-write the query into
33	SELECT t1.a, t.min
34	FROM t1, LATERAL (SELECT t2.a, MIN(t2.b) as min
35	FROM t2 WHERE t2.a = t1.a GROUP BY t2.a) t
36	WHERE t1.b < const
37
38	The execution of the original query (Q1) does the following:
39	1. Executes the query in the specification of the derived table
40	and puts the result set into a temporary table with an index
41	on the first column.
42	2. Joins t1 with the temporary table using the its index.
43
44	The execution of the transformed query (Q1R) follows these steps:
45	1. For each row of t1 where t1.b < const a temporary table
46	containing all rows of of t2 with t2.a = t1.a is created
47	2. If there are any rows in the temporary table aggregation
48	is performed for them
49	3. The result of the aggregation is joined with t1.
50
51	The second execution can win if:
52	a) There is an efficient way to select rows of t2 for which t2.a = t1.a
53	(For example if there is an index on t2.a)
54	and
55	b) The number of temporary tables created for partitions
56	is much smaller that the total number of partitions
57
58	It should be noted that for the transformed query aggregation
59	for a partition may be performed several times.
60
61	As we can see the optimization basically splits table t2 into
62	partitions and performs aggregation over each of them
63	independently.
64
65	If we have only one equi-join condition then we either push it as
66	for Q1R or we don't. In a general case we may have much more options.
67	Consider the query (Q3)
68	SELECT
69	FROM t1,t2 (SELECT t3.a, t3.b, MIN(t3.c) as min
70	FROM t3 GROUP BY a,b) t
71	WHERE t.a = t1.a AND t.b = t2.b
72	AND t1.c < c1 and t2.c < c2
73	AND P(t1,t2);
74	(P(t1,t2) designates some additional conditions over columns of t1,t2).
75
76	Assuming that there indexes on t3(a,b) and t3(b) here we have several
77	reasonable options to push equi-join conditions into the derived.
78	All these options should be taken into account when the optimizer
79	evaluates different join orders. When the join order (t1,t,t2) is
80	evaluated there is only one way of splitting : to push the condition
81	t.a = t1.a into t. With the join order (t2,t,t1) only the condition
82	t.b = t2.b can be pushed. When the join orders (t1,t2,t) and (t2,t1,t)
83	are evaluated then the optimizer should consider pushing t.a = t1.a,
84	t.b = t2.b and (t.a = t1.a AND t.b = t2.b) to choose the best condition
85	for splitting. Apparently here last condition is the best one because
86	it provides the miximum possible number of partitions.
87
88	If we dropped the index on t3(a,b) and created the index on t3(a) instead
89	then we would have two options for splitting: to push t.a = t1.a or to
90	push t.b = t2.b. If the selectivity of the index t3(a) is better than
91	the selectivity of t3(b) then the first option is preferred.
92
93	Although the condition (t.a = t1.a AND t.b = t2.b) provides a better
94	splitting than the condition t.a = t1.a the latter will be used for
95	splitting if the execution plan with the join order (t1,t,t2) turns out
96	to be the cheapest one. It's quite possible when the join condition
97	P(t1,t2) has a bad selectivity.
98
99	Whenever the optimizer evaluates the cost of using a splitting it
100	compares it with the cost of materialization without splitting.
101
102	If we just drop the index on t3(a,b) the chances that the splitting
103	will be used becomes much lower but they still exists providing that
104	the fanout of the partial join of t1 and t2 is small enough.
105	*/
106
107	/*
108	Splitting can be applied to a materialized table specified by the query
109	with post-join operations that require partitioning of the result set produced
110	by the join expression used in the FROM clause the query such as GROUP BY
111	operation and window function operation. In any of these cases the post-join
112	operation can be executed independently for any partition only over the rows
113	of this partition. Also if the set of all partitions is divided into disjoint
114	subsets the operation can applied to each subset independently. In this case
115	all rows are first partitioned into the groups each of which contains all the
116	rows from the partitions belonging the same subset and then each group
117	is subpartitioned into groups in the the post join operation.
118
119	The set of all rows belonging to the union of several partitions is called
120	here superpartition. If a grouping operation is defined by the list
121	e_1,...,e_n then any set S = {e_i1,...,e_ik} can be used to devide all rows
122	into superpartions such that for any two rows r1, r2 the following holds:
123	e_ij(r1) = e_ij(r2) for each e_ij from S. We use the splitting technique
124	only if S consists of references to colums of the joined tables.
125	For example if the GROUP BY list looks like this a, g(b), c we can consider
126	applying the splitting technique to the superpartitions defined by {a,c},
127	{a}, {c} (a and c here may be the references to the columns from different
128	tables).
129	*/
130
131	/*
132	The following describes when and how the optimizer decides whether it
133	makes sense to employ the splitting technique.
134
135	1. For each instance of a materialized table (derived/view/CTE) it is
136	checked that it is potentially splittable. Now it is done right after the
137	execution plan for the select specifying this table has been chosen.
138
139	2. Any potentially splittable materialized table T is subject to two-phase
140	optimization. It means that the optimizer first builds the best execution
141	plan for join that specifies T. Then the control is passed back to the
142	optimization process of the embedding select Q. After the execution plan
143	for Q has been chosen the optimizer finishes the optimization of the join
144	specifying T.
145
146	3. When the optimizer builds the container with the KEYUSE structures
147	for the join of embedding select it detects the equi-join conditions
148	PC that potentially could be pushed into a potentially splittable
149	materialized table T. The collected information about such conditions
150	is stored together with other facts on potential splittings for table T.
151
152	4. When the optimizer starts looking for the best execution plan for the
153	embedding select Q for each potentially splittable materialized table T
154	it creates special KEYUSE structures for pushable equi-join conditions
155	PC. These structures are used to add new elements to the container
156	of KEYUSE structures built for T. The specifics of these elements is
157	that they can be ebabled and disabled during the process of choosing
158	the best plan for Q.
159
160	5. When the optimizer extends a partial join order with a potentially
161	splittable materialized table T (in function best_access_path) it
162	first evaluates a new execution plan for the modified specification
163	of T that adds all equi-join conditions that can be pushed with
164	current join prefix to the WHERE conditions of the original
165	specification of T. If the cost of the new plan is better than the
166	the cost of the original materialized table then the optimizer
167	prefers to use splitting for the current join prefix. As the cost
168	of the plan depends only on the pushed conditions it makes sense
169	to cache this plan for other prefixes.
170
171	6. The optimizer takes into account the cost of splitting / materialization
172	of a potentially splittable materialized table T as a startup cost
173	to access table T.
174
175	7. When the optimizer finally chooses the best execution plan for
176	the embedding select Q and this plan prefers using splitting
177	for table T with pushed equi-join conditions PC then the execution
178	plan for the underlying join with these conditions is chosen for T.
179	*/
180
181	/*
182	The implementation of the splitting technique below allows to apply
183	the technique only to a materialized derived table / view / CTE whose
184	specification is either a select with GROUP BY or a non-grouping select
185	with window functions that share the same PARTITION BY list.
186	*/
187
188	#include "mariadb.h"
189	#include "sql_select.h"
190
191	/ Info on a splitting field /
192	struct SplM_field_info
193	{
194	/ Splitting field in the materialized table T /
195	Field *mat_field;
196	/ The item from the select list of the specification of T /
197	Item *producing_item;
198	/ The corresponding splitting field from the specification of T /
199	Field *underlying_field;
200	};
201
202
203	/ Info on the splitting execution plan saved in SplM_opt_info::cache /
204	struct SplM_plan_info
205	{
206	/ The cached splitting execution plan P /
207	struct st_position *best_positions;
208	/ The cost of the above plan /
209	double cost;
210	/ Selectivity of splitting used in P /
211	double split_sel;
212	/ For fast search of KEYUSE_EXT elements used for splitting in P /
213	struct KEYUSE_EXT *keyuse_ext_start;
214	/ The tables that contains the fields used for splitting in P /
215	TABLE *table;
216	/ The number of the key from 'table' used for splitting in P /
217	uint key;
218	/ Number of the components of 'key' used for splitting in P /
219	uint parts;
220	};
221
222
223	/*
224	The structure contains the information that is used by the optimizer
225	for potentially splittable materialization of T that is a materialized
226	derived_table / view / CTE
227	*/
228	class SplM_opt_info : public Sql_alloc
229	{
230	public:
231	/ The join for the select specifying T /
232	JOIN *join;
233	/ The map of tables from 'join' whose columns can be used for partitioning /
234	table_map tables_usable_for_splitting;
235	/ Info about the fields of the joined tables usable for splitting /
236	SplM_field_info *spl_fields;
237	/ The number of elements in the above list /
238	uint spl_field_cnt;
239	/ Contains the structures to generate all KEYUSEs for pushable equalities /
240	List<KEY_FIELD> added_key_fields;
241	/ The cache of evaluated execution plans for 'join' with pushed equalities /
242	List<SplM_plan_info> plan_cache;
243	/ Cost of best execution plan for join when nothing is pushed /
244	double unsplit_cost;
245	/ Cardinality of T when nothing is pushed /
246	double unsplit_card;
247	/ Lastly evaluated execution plan for 'join' with pushed equalities /
248	SplM_plan_info *last_plan;
249
250	SplM_plan_info find_plan(TABLE table, uint key, uint parts);
251	};
252
253
254	void TABLE::set_spl_opt_info(SplM_opt_info *spl_info)
255	{
256	if (spl_info)
257	spl_info->join->spl_opt_info= spl_info;
258	spl_opt_info= spl_info;
259	}
260
261
262	void TABLE::deny_splitting()
263	{
264	DBUG_ASSERT(spl_opt_info != NULL);
265	spl_opt_info->join->spl_opt_info= NULL;
266	spl_opt_info= NULL;
267	}
268
269
270	/ This structure is auxiliary and used only in the function that follows it /
271	struct SplM_field_ext_info: public SplM_field_info
272	{
273	uint item_no;
274	bool is_usable_for_ref_access;
275	};
276
277
278	/**
279	@brief
280	Check whether this join is one for potentially splittable materialized table
281
282	@details
283	The function checks whether this join is for select that specifies
284	a potentially splittable materialized table T. If so, the collected
285	info on potential splittability of T is attached to the field spl_opt_info
286	of the TABLE structure for T.
287
288	The function returns a positive answer if the following holds:
289	1. the optimizer switch 'split_materialized' is set 'on'
290	2. the select owning this join specifies a materialized derived/view/cte T
291	3. this is the only select in the specification of T
292	4. condition pushdown is not prohibited into T
293	5. T is not recursive
294	6. not all of this join are constant or optimized away
295	7. T is either
296	7.1. a grouping table with GROUP BY list P
297	or
298	7.2. a non-grouping table with window functions over the same non-empty
299	partition specified by the PARTITION BY list P
300	8. P contains some references on the columns of the joined tables C
301	occurred also in the select list of this join
302	9. There are defined some keys usable for ref access of fields from C
303	with available statistics.
304
305	@retval
306	true if the answer is positive
307	false otherwise
308	*/
309
310	bool JOIN::check_for_splittable_materialized()
311	{
312	ORDER *partition_list= `0`;
313	st_select_lex_unit *unit= select_lex->master_unit();
314	TABLE_LIST *derived= unit->derived;
315	if (!(optimizer_flag(thd, OPTIMIZER_SWITCH_SPLIT_MATERIALIZED)) \|\| // !(1)
316	!(derived && derived->is_materialized_derived()) \|\| // !(2)
317	(unit->first_select()->next_select()) \|\| // !(3)
318	(derived->prohibit_cond_pushdown) \|\| // !(4)
319	(derived->is_recursive_with_table()) \|\| // !(5)
320	(table_count == `0` \|\| const_tables == top_join_tab_count)) // !(6)
321	return false;
322	if (group_list) // (7.1)
323	{
324	if (!select_lex->have_window_funcs())
325	partition_list= group_list;
326	}
327	else if (select_lex->have_window_funcs() &&
328	select_lex->window_specs.elements == `1`) // (7.2)
329	{
330	partition_list=
331	select_lex->window_specs.head()->partition_list->first;
332	}
333	if (!partition_list)
334	return false;
335
336	ORDER *ord;
337	Dynamic_array<SplM_field_ext_info> candidates;
338
339	/*
340	Select from partition_list all candidates for splitting.
341	A candidate must be
342	- field item or refer to such (8.1)
343	- item mentioned in the select list (8.2)
344	Put info about such candidates into the array candidates
345	*/
346	table_map usable_tables= `0`; // tables that contains the candidate
347	for (ord= partition_list; ord; ord= ord->next)
348	{
349	Item ord_item= ord->item;
350	if (ord_item->real_item()->type() != Item::FIELD_ITEM) // !(8.1)
351	continue;
352
353	Field ord_field= ((Item_field ) (ord_item->real_item()))->field;
354
355	/ Ignore fields from of inner tables of outer joins /
356	TABLE_LIST *tbl= ord_field->table->pos_in_table_list;
357	if (tbl->is_inner_table_of_outer_join())
358	continue;
359
360	List_iterator<Item> li(fields_list);
361	Item *item;
362	uint item_no= `0`;
363	while ((item= li ++))
364	{
365	if ((ord->item)->eq(item, `0`)) // (8.2)*
366	{
367	SplM_field_ext_info new_elem;
368	new_elem.producing_item= item;
369	new_elem.item_no= item_no;
370	new_elem.mat_field= derived->table->field[item_no];
371	new_elem.underlying_field= ord_field;
372	new_elem.is_usable_for_ref_access= false;
373	candidates.push(new_elem);
374	usable_tables\|= ord_field->table->map;
375	break;
376	}
377	item_no++;
378	}
379	}
380	if (candidates.elements() == `0`) // no candidates satisfying (8.1) && (8.2)
381	return false;
382
383	/*
384	For each table from this join find the keys that can be used for ref access
385	of the fields mentioned in the 'array candidates'
386	*/
387
388	SplM_field_ext_info *cand;
389	SplM_field_ext_info *cand_start= &candidates.at(`0`);
390	SplM_field_ext_info *cand_end= cand_start + candidates.elements();
391
392	for (JOIN_TAB *tab= join_tab;
393	tab < join_tab + top_join_tab_count; tab++)
394	{
395	TABLE *table= tab->table;
396	if (!(table->map & usable_tables))
397	continue;
398
399	table->keys_usable_for_splitting.clear_all();
400	uint i;
401	for (i= `0`; i < table->s->keys; i++)
402	{
403	if (!table->keys_in_use_for_query.is_set(i))
404	continue;
405	KEY *key_info= table->key_info + i;
406	uint key_parts= table->actual_n_key_parts(key_info);
407	uint usable_kp_cnt= `0`;
408	for ( ; usable_kp_cnt < key_parts; usable_kp_cnt++)
409	{
410	if (key_info->actual_rec_per_key(usable_kp_cnt) == `0`)
411	break;
412	int fldnr= key_info->key_part[usable_kp_cnt].fieldnr;
413
414	for (cand= cand_start; cand < cand_end; cand++)
415	{
416	if (cand->underlying_field->field_index + `1` == fldnr)
417	{
418	cand->is_usable_for_ref_access= true;
419	break;
420	}
421	}
422	if (cand == cand_end)
423	break;
424	}
425	if (usable_kp_cnt)
426	table->keys_usable_for_splitting.set_bit(i);
427	}
428	}
429
430	/ Count the candidate fields that can be accessed by ref /
431	uint spl_field_cnt= (uint)candidates.elements();
432	for (cand= cand_start; cand < cand_end; cand++)
433	{
434	if (!cand->is_usable_for_ref_access)
435	spl_field_cnt--;
436	}
437
438	if (!spl_field_cnt) // No candidate field can be accessed by ref => !(9)
439	return false;
440
441	/*
442	Create a structure of the type SplM_opt_info and fill it with
443	the collected info on potential splittability of T
444	*/
445	SplM_opt_info spl_opt_info= new* (thd->mem_root) SplM_opt_info ();
446	SplM_field_info *spl_field=
447	(SplM_field_info ) (thd->calloc(sizeof(SplM_field_info)
448	spl_field_cnt));
449
450	if (!(spl_opt_info && spl_field)) // consider T as not good for splitting
451	return false;
452
453	spl_opt_info->join= this;
454	spl_opt_info->tables_usable_for_splitting= `0`;
455	spl_opt_info->spl_field_cnt= spl_field_cnt;
456	spl_opt_info->spl_fields= spl_field;
457	for (cand= cand_start; cand < cand_end; cand++)
458	{
459	if (!cand->is_usable_for_ref_access)
460	continue;
461	spl_field->producing_item= cand->producing_item;
462	spl_field->underlying_field= cand->underlying_field;
463	spl_field->mat_field= cand->mat_field;
464	spl_opt_info->tables_usable_for_splitting\|=
465	cand->underlying_field->table->map;
466	spl_field++;
467	}
468
469	/ Attach this info to the table T /
470	derived->table->set_spl_opt_info(spl_opt_info);
471
472	return true;
473	}
474
475
476	/**
477	@brief
478	Collect info on KEY_FIELD usable for splitting
479
480	@param
481	key_field KEY_FIELD to collect info on
482
483	@details
484	The function assumes that this table is potentially splittable.
485	The function checks whether the KEY_FIELD structure key_field built for
486	this table was created for a splitting field f. If so, the function does
487	the following using info from key_field:
488	1. Builds an equality of the form f = key_field->val that could be
489	pushed into this table.
490	2. Creates a new KEY_FIELD structure for this equality and stores
491	a reference to this structure in this->spl_opt_info.
492	*/
493
494	void TABLE::add_splitting_info_for_key_field(KEY_FIELD *key_field)
495	{
496	DBUG_ASSERT(spl_opt_info != NULL);
497	JOIN *join= spl_opt_info->join;
498	Field *field= key_field->field;
499	SplM_field_info *spl_field= spl_opt_info->spl_fields;
500	uint i= spl_opt_info->spl_field_cnt;
501	for ( ; i; i--, spl_field++)
502	{
503	if (spl_field->mat_field == field)
504	break;
505	}
506	if (!i) // field is not usable for splitting
507	return;
508
509	/*
510	Any equality condition that can be potentially pushed into the
511	materialized derived table is constructed now though later it may turn out
512	that it is not needed, because it is not used for splitting.
513	The reason for this is that the failure to construct it when it has to be
514	injected causes denial for further processing of the query.
515	Formally this equality is needed in the KEY_FIELD structure constructed
516	here that will be used to generate additional keyuses usable for splitting.
517	However key_field.cond could be used for this purpose (see implementations
518	of virtual function can_optimize_keypart_ref()).
519
520	The condition is built in such a form that it can be added to the WHERE
521	condition of the select that specifies this table.
522	*/
523	THD *thd= in_use;
524	Item *left_item= spl_field->producing_item->build_clone(thd);
525	Item *right_item= key_field->val->build_clone(thd);
526	Item_func_eq *eq_item= `0`;
527	if (left_item && right_item)
528	{
529	right_item->walk(&Item::set_fields_as_dependent_processor,
530	false, join->select_lex);
531	right_item->update_used_tables();
532	eq_item= new (thd->mem_root) Item_func_eq (thd, left_item, right_item);
533	}
534	if (!eq_item)
535	return;
536	KEY_FIELD *added_key_field=
537	(KEY_FIELD ) thd->alloc(sizeof*(KEY_FIELD));
538	if (!added_key_field \|\|
539	spl_opt_info->added_key_fields.push_back(added_key_field,thd->mem_root))
540	return;
541	added_key_field->field= spl_field->underlying_field;
542	added_key_field->cond= eq_item;
543	added_key_field->val= key_field->val;
544	added_key_field->level= `0`;
545	added_key_field->optimize= KEY_OPTIMIZE_EQ;
546	added_key_field->eq_func= true;
547
548	Item *real= key_field->val->real_item();
549	if ((real->type() == Item::FIELD_ITEM) &&
550	((Item_field*)real)->field->maybe_null())
551	added_key_field->null_rejecting= true;
552	else
553	added_key_field->null_rejecting= false;
554
555	added_key_field->cond_guard= NULL;
556	added_key_field->sj_pred_no= UINT_MAX;
557	return;
558	}
559
560
561	static bool
562	add_ext_keyuse_for_splitting(Dynamic_array<KEYUSE_EXT> *ext_keyuses,
563	KEY_FIELD *added_key_field, uint key, uint part)
564	{
565	KEYUSE_EXT keyuse_ext;
566	Field *field= added_key_field->field;
567
568	JOIN_TAB *tab=field->table->reginfo.join_tab;
569	key_map possible_keys=field->get_possible_keys();
570	possible_keys.intersect(field->table->keys_usable_for_splitting);
571	tab->keys.merge(possible_keys);
572
573	Item_func_eq eq_item= (Item_func_eq ) (added_key_field->cond);
574	keyuse_ext.table= field->table;
575	keyuse_ext.val= eq_item->arguments()[`1`];
576	keyuse_ext.key= key;
577	keyuse_ext.keypart=part;
578	keyuse_ext.keypart_map= (key_part_map) `1` << part;
579	keyuse_ext.used_tables= keyuse_ext.val->used_tables();
580	keyuse_ext.optimize= added_key_field->optimize & KEY_OPTIMIZE_REF_OR_NULL;
581	keyuse_ext.ref_table_rows= `0`;
582	keyuse_ext.null_rejecting= added_key_field->null_rejecting;
583	keyuse_ext.cond_guard= added_key_field->cond_guard;
584	keyuse_ext.sj_pred_no= added_key_field->sj_pred_no;
585	keyuse_ext.validity_ref= `0`;
586	keyuse_ext.needed_in_prefix= added_key_field->val->used_tables();
587	keyuse_ext.validity_var= false;
588	return ext_keyuses->push(keyuse_ext);
589	}
590
591
592	static int
593	sort_ext_keyuse(KEYUSE_EXT a, KEYUSE_EXT b)
594	{
595	if (a->table->tablenr != b->table->tablenr)
596	return (int) (a->table->tablenr - b->table->tablenr);
597	if (a->key != b->key)
598	return (int) (a->key - b->key);
599	return (int) (a->keypart - b->keypart);
600	}
601
602
603	static void
604	sort_ext_keyuses(Dynamic_array<KEYUSE_EXT> *keyuses)
605	{
606	KEYUSE_EXT *first_keyuse= &keyuses->at(`0`);
607	my_qsort(first_keyuse, keyuses->elements(), sizeof(KEYUSE_EXT),
608	(qsort_cmp) sort_ext_keyuse);
609	}
610
611
612	/**
613	@brief
614	Add info on keyuses usable for splitting into an array
615	*/
616
617	static bool
618	add_ext_keyuses_for_splitting_field(Dynamic_array<KEYUSE_EXT> *ext_keyuses,
619	KEY_FIELD *added_key_field)
620	{
621	Field *field= added_key_field->field;
622	TABLE *table= field->table;
623	for (uint key= `0`; key < table->s->keys; key++)
624	{
625	if (!(table->keys_usable_for_splitting.is_set(key)))
626	continue;
627	KEY *key_info= table->key_info + key;
628	uint key_parts= table->actual_n_key_parts(key_info);
629	KEY_PART_INFO *key_part_info= key_info->key_part;
630	for (uint part=`0`; part < key_parts; part++, key_part_info++)
631	{
632	if (!field->eq(key_part_info->field))
633	continue;
634	if (add_ext_keyuse_for_splitting(ext_keyuses, added_key_field, key, part))
635	return true;
636	}
637	}
638	return false;
639	}
640
641
642	/*
643	@brief
644	Cost of the post join operation used in specification of splittable table
645	*/
646
647	static
648	double spl_postjoin_oper_cost(THD thd, double* join_record_count, uint rec_len)
649	{
650	double cost;
651	cost= get_tmp_table_write_cost(thd, join_record_count,rec_len) *
652	join_record_count; // cost to fill tmp table
653	cost+= get_tmp_table_lookup_cost(thd, join_record_count,rec_len) *
654	join_record_count; // cost to perform post join operation used here
655	cost+= get_tmp_table_lookup_cost(thd, join_record_count, rec_len) +
656	(join_record_count == `0` ? `0` :
657	join_record_count * log2 (join_record_count)) *
658	SORT_INDEX_CMP_COST; // cost to perform sorting
659	return cost;
660	}
661
662	/**
663	@brief
664	Add KEYUSE structures that can be usable for splitting
665
666	@details
667	This function is called only for joins created for potentially
668	splittable materialized tables. The function does the following:
669	1. Creates the dynamic array ext_keyuses_for_splitting of KEYUSE_EXT
670	structures and fills is with info about all keyuses that
671	could be used for splitting.
672	2. Sort the array ext_keyuses_for_splitting for fast access by key
673	on certain columns.
674	3. Collects and stores cost and cardinality info on the best execution
675	plan that does not use splitting and save this plan together with
676	corresponding array of keyuses.
677	4. Expand this array with KEYUSE elements built from the info stored
678	in ext_keyuses_for_splitting that could be produced by pushed
679	equalities employed for splitting.
680	5. Prepare the extended array of keyuses to be used in the function
681	best_access_plan()
682	*/
683
684	void JOIN::add_keyuses_for_splitting()
685	{
686	uint i;
687	uint idx;
688	KEYUSE_EXT *keyuse_ext;
689	KEYUSE_EXT keyuse_ext_end;
690	double oper_cost;
691	uint rec_len;
692	uint added_keyuse_count;
693	TABLE *table= select_lex->master_unit()->derived->table;
694	List_iterator_fast<KEY_FIELD> li(spl_opt_info->added_key_fields);
695	KEY_FIELD *added_key_field;
696	if (!spl_opt_info->added_key_fields.elements)
697	goto err;
698	if (!(ext_keyuses_for_splitting= new Dynamic_array<KEYUSE_EXT>))
699	goto err;
700	while ((added_key_field= li ++))
701	{
702	(void) add_ext_keyuses_for_splitting_field(ext_keyuses_for_splitting,
703	added_key_field);
704	}
705	added_keyuse_count= (uint)ext_keyuses_for_splitting->elements();
706	if (!added_keyuse_count)
707	goto err;
708	sort_ext_keyuses(ext_keyuses_for_splitting);
709	bzero((char) &keyuse_ext_end, sizeof*(keyuse_ext_end));
710	if (ext_keyuses_for_splitting->push(keyuse_ext_end))
711	goto err;
712
713	spl_opt_info->unsplit_card= join_record_count;
714
715	rec_len= table->s->rec_buff_length;
716
717	oper_cost= spl_postjoin_oper_cost(thd, join_record_count, rec_len);
718
719	spl_opt_info->unsplit_cost= best_positions[table_count-`1`].read_time +
720	oper_cost;
721
722	if (!(save_qep= new Join_plan_state (table_count + `1`)))
723	goto err;
724
725	save_query_plan(save_qep);
726
727	if (!keyuse.buffer &&
728	my_init_dynamic_array(&keyuse, sizeof(KEYUSE), `20`, `64`,
729	MYF(MY_THREAD_SPECIFIC)))
730	goto err;
731
732	if (allocate_dynamic(&keyuse,
733	save_qep->keyuse.elements +
734	added_keyuse_count))
735	goto err;
736
737	memcpy(keyuse.buffer,
738	save_qep->keyuse.buffer,
739	(size_t) save_qep->keyuse.elements * keyuse.size_of_element);
740	keyuse.elements= save_qep->keyuse.elements;
741
742	keyuse_ext= &ext_keyuses_for_splitting->at(`0`);
743	idx= save_qep->keyuse.elements;
744	for (i=`0`; i < added_keyuse_count; i++, keyuse_ext++, idx++)
745	{
746	set_dynamic(&keyuse, (KEYUSE *) keyuse_ext, idx);
747	KEYUSE added_keyuse= ((KEYUSE ) (keyuse.buffer)) + idx;
748	added_keyuse->validity_ref= &keyuse_ext->validity_var;
749	}
750
751	if (sort_and_filter_keyuse(thd, &keyuse, true))
752	goto err;
753	optimize_keyuse(this, &keyuse);
754
755	for (uint i= `0`; i < table_count; i++)
756	{
757	JOIN_TAB *tab= join_tab + i;
758	map2table[tab->table->tablenr]= tab;
759	}
760
761	return;
762
763	err:
764	if (save_qep)
765	restore_query_plan(save_qep);
766	table->deny_splitting();
767	return;
768	}
769
770
771	/**
772	@brief
773	Add KEYUSE structures that can be usable for splitting of this joined table
774	*/
775
776	void JOIN_TAB::add_keyuses_for_splitting()
777	{
778	DBUG_ASSERT(table->spl_opt_info != NULL);
779	SplM_opt_info *spl_opt_info= table->spl_opt_info;
780	spl_opt_info->join->add_keyuses_for_splitting();
781	}
782
783
784	/*
785	@brief
786	Find info on the splitting plan by the splitting key
787	*/
788
789	SplM_plan_info SplM_opt_info::find_plan(TABLE table, uint key, uint parts)
790	{
791	List_iterator_fast<SplM_plan_info> li(plan_cache);
792	SplM_plan_info *spl_plan;
793	while ((spl_plan= li ++))
794	{
795	if (spl_plan->table == table &&
796	spl_plan->key == key &&
797	spl_plan->parts == parts)
798	break;
799	}
800	return spl_plan;
801	}
802
803
804	/*
805	@breaf
806	Enable/Disable a keyuses that can be used for splitting
807	*/
808
809	static
810	void reset_validity_vars_for_keyuses(KEYUSE_EXT *key_keyuse_ext_start,
811	TABLE *table, uint key,
812	table_map remaining_tables,
813	bool validity_val)
814	{
815	KEYUSE_EXT *keyuse_ext= key_keyuse_ext_start;
816	do
817	{
818	if (!(keyuse_ext->needed_in_prefix & remaining_tables))
819	{
820	/*
821	The enabling/disabling flags are set just in KEYUSE_EXT structures.
822	Yet keyuses that are used by best_access_path() have pointers
823	to these flags.
824	*/
825	keyuse_ext->validity_var= validity_val;
826	}
827	keyuse_ext++;
828	}
829	while (keyuse_ext->key == key && keyuse_ext->table == table);
830	}
831
832
833	/**
834	@brief
835	Choose the best splitting to extend the evaluated partial join
836
837	@param
838	record_count estimated cardinality of the extended partial join
839	remaining_tables tables not joined yet
840
841	@details
842	This function is called during the search for the best execution
843	plan of the join that contains this table T. The function is called
844	every time when the optimizer tries to extend a partial join by
845	joining it with table T. Depending on what tables are already in the
846	partial join different equalities usable for splitting can be pushed
847	into T. The function evaluates different variants and chooses the
848	best one. Then the function finds the plan for the materializing join
849	with the chosen equality conditions pushed into it. If the cost of the
850	plan turns out to be less than the cost of the best plan without
851	splitting the function set it as the true plan of materialization
852	of the table T.
853	The function caches the found plans for materialization of table T
854	together if the info what key was used for splitting. Next time when
855	the optimizer prefers to use the same key the plan is taken from
856	the cache of plans
857
858	@retval
859	Pointer to the info on the found plan that employs the pushed equalities
860	if the plan has been chosen, NULL - otherwise.
861	*/
862
863	SplM_plan_info * JOIN_TAB::choose_best_splitting(double record_count,
864	table_map remaining_tables)
865	{
866	SplM_opt_info *spl_opt_info= table->spl_opt_info;
867	DBUG_ASSERT(spl_opt_info != NULL);
868	JOIN *join= spl_opt_info->join;
869	THD *thd= join->thd;
870	table_map tables_usable_for_splitting=
871	spl_opt_info->tables_usable_for_splitting;
872	KEYUSE_EXT *keyuse_ext= &join->ext_keyuses_for_splitting->at(`0`);
873	KEYUSE_EXT *UNINIT_VAR(best_key_keyuse_ext_start);
874	TABLE *best_table= `0`;
875	double best_rec_per_key= DBL_MAX;
876	SplM_plan_info *spl_plan= `0`;
877	uint best_key= `0`;
878	uint best_key_parts= `0`;
879
880	/*
881	Check whether there are keys that can be used to join T employing splitting
882	and if so, select the best out of such keys
883	*/
884	for (uint tablenr= `0`; tablenr < join->table_count; tablenr++)
885	{
886	if (!((`1ULL` << tablenr) & tables_usable_for_splitting))
887	continue;
888	JOIN_TAB *tab= join->map2table[tablenr];
889	TABLE *table= tab->table;
890	do
891	{
892	uint key= keyuse_ext->key;
893	KEYUSE_EXT *key_keyuse_ext_start= keyuse_ext;
894	key_part_map found_parts= `0`;
895	do
896	{
897	if (keyuse_ext->needed_in_prefix & remaining_tables)
898	{
899	keyuse_ext++;
900	continue;
901	}
902	if (!(keyuse_ext->keypart_map & found_parts))
903	{
904	if ((!found_parts && !keyuse_ext->keypart) \|\|
905	(found_parts && ((keyuse_ext->keypart_map >> `1`) & found_parts)))
906	found_parts\|= keyuse_ext->keypart_map;
907	else
908	{
909	do
910	{
911	keyuse_ext++;
912	}
913	while (keyuse_ext->key == key && keyuse_ext->table == table);
914	break;
915	}
916	}
917	KEY *key_info= table->key_info + key;
918	double rec_per_key=
919	key_info->actual_rec_per_key(keyuse_ext->keypart);
920	if (rec_per_key < best_rec_per_key)
921	{
922	best_table= keyuse_ext->table;
923	best_key= keyuse_ext->key;
924	best_key_parts= keyuse_ext->keypart + `1`;
925	best_rec_per_key= rec_per_key;
926	best_key_keyuse_ext_start= key_keyuse_ext_start;
927	}
928	keyuse_ext++;
929	}
930	while (keyuse_ext->key == key && keyuse_ext->table == table);
931	}
932	while (keyuse_ext->table == table);
933	}
934	spl_opt_info->last_plan= `0`;
935	if (best_table)
936	{
937	/*
938	The key for splitting was chosen, look for the plan for this key
939	in the cache
940	*/
941	spl_plan= spl_opt_info->find_plan(best_table, best_key, best_key_parts);
942	if (!spl_plan &&
943	(spl_plan= (SplM_plan_info ) thd->alloc(sizeof*(SplM_plan_info))) &&
944	(spl_plan->best_positions=
945	(POSITION ) thd->alloc(sizeof(POSITION) join->table_count)) &&
946	!spl_opt_info->plan_cache.push_back(spl_plan))
947	{
948	/*
949	The plan for the chosen key has not been found in the cache.
950	Build a new plan and save info on it in the cache
951	*/
952	table_map all_table_map= (`1` << join->table_count) - `1`;
953	reset_validity_vars_for_keyuses(best_key_keyuse_ext_start, best_table,
954	best_key, remaining_tables, true);
955	choose_plan(join, all_table_map & ~join->const_table_map);
956	spl_plan->keyuse_ext_start= best_key_keyuse_ext_start;
957	spl_plan->table= best_table;
958	spl_plan->key= best_key;
959	spl_plan->parts= best_key_parts;
960	spl_plan->split_sel= best_rec_per_key /
961	(spl_opt_info->unsplit_card ?
962	spl_opt_info->unsplit_card : `1`);
963
964	uint rec_len= table->s->rec_buff_length;
965
966	double split_card= spl_opt_info->unsplit_card * spl_plan->split_sel;
967	double oper_cost= split_card *
968	spl_postjoin_oper_cost(thd, split_card, rec_len);
969	spl_plan->cost= join->best_positions[join->table_count-`1`].read_time +
970	+ oper_cost;
971
972	memcpy((char *) spl_plan->best_positions,
973	(char *) join->best_positions,
974	sizeof(POSITION) * join->table_count);
975	reset_validity_vars_for_keyuses(best_key_keyuse_ext_start, best_table,
976	best_key, remaining_tables, false);
977	}
978	if (spl_plan)
979	{
980	if(record_count * spl_plan->cost < spl_opt_info->unsplit_cost)
981	{
982	/*
983	The best plan that employs splitting is cheaper than
984	the plan without splitting
985	*/
986	spl_opt_info->last_plan= spl_plan;
987	}
988	}
989	}
990
991	/ Set the cost of the preferred materialization for this partial join /
992	records= (ha_rows)spl_opt_info->unsplit_card;
993	spl_plan= spl_opt_info->last_plan;
994	if (spl_plan)
995	{
996	startup_cost= record_count * spl_plan->cost;
997	records= (ha_rows) (records * spl_plan->split_sel);
998	}
999	else
1000	startup_cost= spl_opt_info->unsplit_cost;
1001	return spl_plan;
1002	}
1003
1004
1005	/**
1006	@brief
1007	Inject equalities for splitting used by the materialization join
1008
1009	@param
1010	remaining_tables used to filter out the equalities that cannot
1011	be pushed.
1012
1013	@details
1014	This function is called by JOIN_TAB::fix_splitting that is used
1015	to fix the chosen splitting of a splittable materialized table T
1016	in the final query execution plan. In this plan the table T
1017	is joined just before the 'remaining_tables'. So all equalities
1018	usable for splitting whose right parts do not depend on any of
1019	remaining tables can be pushed into join for T.
1020	The function also marks the select that specifies T as
1021	UNCACHEABLE_DEPENDENT_INJECTED.
1022
1023	@retval
1024	false on success
1025	true on failure
1026	*/
1027
1028	bool JOIN::inject_best_splitting_cond(table_map remaining_tables)
1029	{
1030	Item *inj_cond= `0`;
1031	List<Item> inj_cond_list;
1032	List_iterator<KEY_FIELD> li(spl_opt_info->added_key_fields);
1033	KEY_FIELD *added_key_field;
1034	while ((added_key_field= li ++))
1035	{
1036	if (remaining_tables & added_key_field->val->used_tables())
1037	continue;
1038	if (inj_cond_list.push_back(added_key_field->cond, thd->mem_root))
1039	return true;
1040	}
1041	DBUG_ASSERT(inj_cond_list.elements);
1042	switch (inj_cond_list.elements) {
1043	case `1`:
1044	inj_cond= inj_cond_list.head(); break;
1045	default:
1046	inj_cond= new (thd->mem_root) Item_cond_and (thd, inj_cond_list);
1047	if (!inj_cond)
1048	return true;
1049	}
1050	if (inj_cond)
1051	inj_cond->fix_fields(thd,`0`);
1052
1053	if (inject_cond_into_where(inj_cond))
1054	return true;
1055
1056	select_lex->uncacheable\|= UNCACHEABLE_DEPENDENT_INJECTED;
1057	st_select_lex_unit *unit= select_lex->master_unit();
1058	unit->uncacheable\|= UNCACHEABLE_DEPENDENT_INJECTED;
1059
1060	return false;
1061	}
1062
1063
1064	/**
1065	@brief
1066	Fix the splitting chosen for a splittable table in the final query plan
1067
1068	@param
1069	spl_plan info on the splitting plan chosen for the splittable table T
1070	remaining_tables the table T is joined just before these tables
1071
1072	@details
1073	If in the final query plan the optimizer has chosen a splitting plan
1074	then the function sets this plan as the final execution plan to
1075	materialized the table T. Otherwise the plan that does not use
1076	splitting is set for the materialization.
1077
1078	@retval
1079	false on success
1080	true on failure
1081	*/
1082
1083	bool JOIN_TAB::fix_splitting(SplM_plan_info *spl_plan,
1084	table_map remaining_tables)
1085	{
1086	SplM_opt_info *spl_opt_info= table->spl_opt_info;
1087	DBUG_ASSERT(table->spl_opt_info != `0`);
1088	JOIN *md_join= spl_opt_info->join;
1089	if (spl_plan)
1090	{
1091	memcpy((char *) md_join->best_positions,
1092	(char *) spl_plan->best_positions,
1093	sizeof(POSITION) * md_join->table_count);
1094	if (md_join->inject_best_splitting_cond(remaining_tables))
1095	return true;
1096	/*
1097	This is called for a proper work of JOIN::get_best_combination()
1098	called for the join that materializes T
1099	*/
1100	reset_validity_vars_for_keyuses(spl_plan->keyuse_ext_start,
1101	spl_plan->table,
1102	spl_plan->key,
1103	remaining_tables,
1104	true);
1105	}
1106	else
1107	{
1108	md_join->restore_query_plan(md_join->save_qep);
1109	}
1110	return false;
1111	}
1112
1113
1114	/**
1115	@brief
1116	Fix the splittings chosen splittable tables in the final query plan
1117
1118	@details
1119	The function calls JOIN_TAB::fix_splittins for all potentially
1120	splittable tables in this join to set all final materialization
1121	plans chosen for these tables.
1122
1123	@retval
1124	false on success
1125	true on failure
1126	*/
1127
1128	bool JOIN::fix_all_splittings_in_plan()
1129	{
1130	table_map prev_tables= `0`;
1131	table_map all_tables= (`1` << table_count) - `1`;
1132	for (uint tablenr= `0`; tablenr < table_count; tablenr++)
1133	{
1134	POSITION *cur_pos= &best_positions[tablenr];
1135	JOIN_TAB *tab= cur_pos->table;
1136	if (tablenr >= const_tables && tab->table->is_splittable())
1137	{
1138	SplM_plan_info *spl_plan= cur_pos->spl_plan;
1139	if (tab->fix_splitting(spl_plan, all_tables & ~prev_tables))
1140	return true;
1141	}
1142	prev_tables\|= tab->table->map;
1143	}
1144	return false;
1145	}
1146

Browse the source code of MariaDB/sql/opt_split.cc