| 1 | /* |
| 2 | * This Source Code Form is subject to the terms of the Mozilla Public |
| 3 | * License, v. 2.0. If a copy of the MPL was not distributed with this |
| 4 | * file, You can obtain one at http://mozilla.org/MPL/2.0/. |
| 5 | * |
| 6 | * Copyright 1997 - July 2008 CWI, August 2008 - 2019 MonetDB B.V. |
| 7 | */ |
| 8 | |
| 9 | /* |
| 10 | * @a M. Kersten |
| 11 | * @* Scheduler framework |
| 12 | * A key design decision for this MonetDB version is to create a clean |
| 13 | * processing pipeline: compile-, optimize-, schedule- and execute. |
| 14 | * The MAL scheduler controls the actual execution of a program through a number of |
| 15 | * techniques described here. It accepts a program with the intend to execute it |
| 16 | * as soon as possible with the best possible behavior. |
| 17 | * |
| 18 | * Note that the scheduler should only work on temporary plans. Keeping an |
| 19 | * execution schedule in the symbol table calls for a sound administration |
| 20 | * to invalidate it. This assumed that such decisions are taken by a cooperative |
| 21 | * front-end compiler, e.g. SQL. For the remainder we assume a volatile MAL |
| 22 | * program is available. |
| 23 | * |
| 24 | * Recall that the MAL optimizer already had an opportunity to massage the program |
| 25 | * into something that has a better execution behavior. These optimizations |
| 26 | * necessarily rely on static (or stale) information, such as histograms, indices, |
| 27 | * and rewrite heuristics. It is the task of the scheduler to achieve this |
| 28 | * performance objective (or better), while at the same time guaranteeing |
| 29 | * great throughput. |
| 30 | * |
| 31 | * Unlike the optimizer, the scheduler has access to the calling parameters, |
| 32 | * runtime stack for the about-to-be-executed plan, and system resources, |
| 33 | * such as globally cached results. Moreover, a scheduler can be called at any time |
| 34 | * during a MAL interpreter loop to re-consider the actions to be taken. |
| 35 | * This gives many more opportunities to improve performance. |
| 36 | * This does not hold for e.g. compiled code. |
| 37 | * |
| 38 | * Much like the optimizer framework, there are many aspects you can deal with |
| 39 | * at scheduling time. The easiest is to simply do nothing and rely on |
| 40 | * the MAL interpreter, which is the default action when no scheduler directive |
| 41 | * is found. |
| 42 | * |
| 43 | * In most cases, however, you would like a scheduler to take a specific action. |
| 44 | * It typically rewrites, extends, or reshuffles the query plan as it is being |
| 45 | * prepared for execution or even during its execution. |
| 46 | * In this process it can use the optimizer infrastructure, provided the |
| 47 | * changes are isolated to a private instance. This should be prepared |
| 48 | * before the interpreter is started and requires care during execution to |
| 49 | * ensure proper behaviour. The upfront cost attached is to make a complete |
| 50 | * copy of the program code and to assure that the definitions in the |
| 51 | * symbol table remain intact. |
| 52 | * |
| 53 | * To make its life easy, it is assumed that all scheduler decisions to be |
| 54 | * taken before the back-end is called reside in the module 'scheduler'. |
| 55 | * Any action to be taken during execution is kept in module 'rescheduler'. |
| 56 | * |
| 57 | * What scheduler operators are provided? |
| 58 | * |
| 59 | * The generic scheduler presented here shows a series of techniques that |
| 60 | * might be useful in construction of your own scheduler. |
| 61 | * |
| 62 | * @emph{Run isolation (RUNisolated)} |
| 63 | * Goal: to isolate changes to the query plan from others |
| 64 | * Rationale: A scheduler may change the order, and actual function calls. |
| 65 | * These changes should be confined to a single execution. The next time around |
| 66 | * there may be a different situation to take care off. This is achieved by |
| 67 | * replacing the current program with a private copy. |
| 68 | * Note that this process may involve a deep analysis to identify the pieces |
| 69 | * needed for isolation, e.g. for SQL we would extract a copy from the SQL query |
| 70 | * cache using code expansion. |
| 71 | * Just massaging the call to the cached plan is not effective. |
| 72 | * Impact: cost of program copying may become an issue. |
| 73 | * |
| 74 | * @emph{Run trace (RUNtracing)} |
| 75 | * Goal: to collect performance data for either direct monitoring or post-analysis |
| 76 | * Rationale:The performance profiling option in the interpreter is overly expensive, |
| 77 | * because it is also called for simple statements. One way out of this, is to |
| 78 | * inject specific performance metric calls at places where it counts. |
| 79 | * For example, you could wrap calls calls to a specific module or operation. |
| 80 | * |
| 81 | * @emph{Run notification (RUNnotification)} |
| 82 | * Goal: to inject notification calls to awaiting servers |
| 83 | * Rationale: Operations on the database may require side effects to take place, |
| 84 | * such as activating a trigger (implemented as a factory) |
| 85 | * |
| 86 | * @emph{Update notification (RUNupdateNotification)} |
| 87 | * Goal: Updates to the Boxes may have to forwarded to stand-by replicas. |
| 88 | * Rationale: actually a refinement of the notification scheme, but geared at |
| 89 | * maintaining a replicated system. |
| 90 | * |
| 91 | * @emph{Materialized Execution Scheduler (MEscheduler)} |
| 92 | * Goal:to transfor the program into fragment-based processing |
| 93 | * Rationale: to reduce the impact of resource limitations, such |
| 94 | * as main memory, in the face of materialized intermediate. |
| 95 | * |
| 96 | * @emph{Run parallelization (RUNparallel)} |
| 97 | * Goal: to improve throughput/response time by exploiting parallel hardware |
| 98 | * |
| 99 | * @emph{Run stepping RUNstepping} |
| 100 | * goal: to return to the scheduler at specific points in the program |
| 101 | * execution. For example to re-consider the scheduling actions. |
| 102 | * Note that this requires a scheme to 'backtrack' in the scenario |
| 103 | * |
| 104 | * @emph{Cache management (RUNbatCaching} |
| 105 | * Goal: to capture expensive BAT operations and to keep them around for |
| 106 | * re-use. |
| 107 | * |
| 108 | * @emph{JIT compilation} |
| 109 | * Use the Mcc just in time compilation feature to derive and link |
| 110 | * a small C-program. |
| 111 | * |
| 112 | * What the scheduler could do more? |
| 113 | * A separete thread of control can be used to administer the |
| 114 | * information needed by concurrent scheduler calls. |
| 115 | * |
| 116 | * One of the areas to be decided upon is whether the scheduler is |
| 117 | * also the place to manage the stack spaces. It certainly could |
| 118 | * reconcile parallel executions by inspection of the stack. |
| 119 | * |
| 120 | * Upon starting the scheduler decisions, we should (regularly) |
| 121 | * refresh our notion on the availability of resources. |
| 122 | * This is kept around in a global structure and is the basis for |
| 123 | * micro scheduling decisions. |
| 124 | */ |
| 125 | |
| 126 | #include "monetdb_config.h" |
| 127 | #include "run_pipeline.h" |
| 128 | #include "mal_interpreter.h" /* for showErrors() */ |
| 129 | #include "opt_prelude.h" |
| 130 | #include "opt_macro.h" |
| 131 | |
| 132 | /* |
| 133 | * The implementation approach of the scheduler aligns with that of the |
| 134 | * optimizer. We look for specific scheduler module calls and act accordingly. |
| 135 | * The result should be a MAL block that can be executed by the corresponding |
| 136 | * engine. |
| 137 | */ |
| 138 | /* |
| 139 | * The second example is derived from the SQL environment, which |
| 140 | * produces two MAL functions: one (already) stored in the query cache |
| 141 | * and a program to be executed in the client record. |
| 142 | * The latter lives only for the duration of the call and need |
| 143 | * not be safeguarded. Thus, we immediately call the scheduler |
| 144 | * to expand the code from the query cache, propagating the |
| 145 | * actual parameters. |
| 146 | * The SQL specific scheduler is called to bind the table |
| 147 | * columns and to make statistics known as properties. |
| 148 | * This creates a situation where a cost-based optimizer could |
| 149 | * step in and re-arrange the plan, which is ignored for now. |
| 150 | * @example |
| 151 | * function sql_cache.qry01(int A1, bit A2); |
| 152 | * t1:= sql.bind("schema","tables",0,0); |
| 153 | * ... |
| 154 | * end qry01; |
| 155 | * |
| 156 | * function main(); |
| 157 | * scheduler.inline("sql_cache","qry01"); |
| 158 | * scheduler.sqlbind(); |
| 159 | * sql_cache.qry01(12,false); |
| 160 | * @end example |
| 161 | * |
| 162 | * These scheduler interventions are called for by the SQL compiler. |
| 163 | * It may have flagged the cached version as a high potential for |
| 164 | * dynamic optimization, something that is not known at the MAL level. |
| 165 | * [The current optimizer also generates a factory, that should be |
| 166 | * done here and upon explicit request] |
| 167 | */ |
| 168 | #if 0 |
| 169 | str |
| 170 | RUNinline(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr p) |
| 171 | { |
| 172 | Symbol qc; |
| 173 | str modnme = getVarConstant(mb, getArg(p, 1)).val.sval; |
| 174 | str fcnnme = getVarConstant(mb, getArg(p, 2)).val.sval; |
| 175 | |
| 176 | (void) stk; |
| 177 | (void) p; |
| 178 | qc = findSymbol(cntxt ->nspace, getName(modnme), |
| 179 | putName(fcnnme)); |
| 180 | |
| 181 | if (qc) |
| 182 | MACROprocessor(cntxt, mb, qc); |
| 183 | |
| 184 | return MAL_SUCCEED; |
| 185 | } |
| 186 | #endif |
| 187 | |
| 188 | /* |
| 189 | * @- |
| 190 | * The SQL scheduler is presented merely as an example. The real one |
| 191 | * is located in the SQL libraries for it needs specific information. |
| 192 | */ |
| 193 | #if 0 |
| 194 | str |
| 195 | RUNsqlbind(Client cntxt, MalBlkPtr mb, MalStkPtr stk, InstrPtr p) |
| 196 | { |
| 197 | int i; |
| 198 | str msg = MAL_SUCCEED; |
| 199 | Symbol sqlbind = findSymbol(cntxt ->nspace, getName("sql" ), getName("bind" )); |
| 200 | MALfcn f = NULL; |
| 201 | |
| 202 | if (sqlbind ) |
| 203 | f = getSignature(sqlbind)->fcn; |
| 204 | |
| 205 | if ( f ) |
| 206 | for (i = 0; i < mb->stop; i++) { |
| 207 | p = getInstrPtr(mb, i); |
| 208 | if (p->fcn == f) { |
| 209 | if ((msg = reenterMAL(cntxt, mb, i, i + 1, stk))) |
| 210 | break; |
| 211 | /* fetch the BAT properties and turn off this instruction */ |
| 212 | p->token = NOOPsymbol; |
| 213 | } |
| 214 | } |
| 215 | #ifdef DEBUG_MAL_SCHEDULER |
| 216 | fprintf(stderr, "scheduler.sqlbind results\n" ); |
| 217 | fprintFunction(stderr, mb, stk, LIST_MAL_ALL); |
| 218 | #endif |
| 219 | return msg; |
| 220 | } |
| 221 | #endif |
| 222 | |
| 223 | |