  60 |  * convenient form for informal monitoring.
  61 |  *
  62 |  * <p> As is the case with other ExecutorServices, there are three
  63 <  * main task execution methods summarized in the following
  64 <  * table. These are designed to be used primarily by clients not
  65 <  * already engaged in fork/join computations in the current pool. The
  66 <  * main forms of these methods accept instances of {@code
  67 <  * ForkJoinTask}, but overloaded forms also allow mixed execution of
  68 <  * plain {@code Runnable}- or {@code Callable}- based activities as
  69 <  * well. However, tasks that are already executing in a pool should
  70 <  * normally instead use the within-computation forms listed in the
  71 <  * table unless using async event-style tasks that are not usually
  72 <  * joined, in which case there is little difference among choice of
  73 <  * methods.
  63 >  * main task execution methods summarized in the following table.
  64 >  * These are designed to be used primarily by clients not already
  65 >  * engaged in fork/join computations in the current pool. The main
  66 >  * forms of these methods accept instances of {@code ForkJoinTask},
  67 >  * but overloaded forms also allow mixed execution of plain {@code
  68 >  * Runnable}- or {@code Callable}- based activities as well. However,
  69 >  * tasks that are already executing in a pool should normally instead
  70 >  * use the within-computation forms listed in the table unless using
  71 >  * async event-style tasks that are not usually joined, in which case
  72 >  * there is little difference among choice of methods.
  73 |  *
  74 |  * <table BORDER CELLPADDING=3 CELLSPACING=1>
  75 |  * <tr>
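The distinction the changed Javadoc draws between client-side submission methods and within-computation forms can be sketched as follows. This is a minimal illustration, not code from the patch: the `SumTask` class and the range bound are hypothetical, chosen only to exercise each method.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

// Hypothetical toy task: sums [lo, hi) by splitting in half.
class SumTask extends RecursiveTask<Long> {
    final long lo, hi;
    SumTask(long lo, long hi) { this.lo = lo; this.hi = hi; }
    protected Long compute() {
        if (hi - lo <= 1000) {
            long s = 0;
            for (long i = lo; i < hi; ++i) s += i;
            return s;
        }
        long mid = (lo + hi) >>> 1;
        SumTask left = new SumTask(lo, mid);
        left.fork();                       // within-computation async form
        long right = new SumTask(mid, hi).compute();
        return right + left.join();        // within-computation join
    }
}

public class SubmitDemo {
    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool();
        // Client-side "arrange sync execution": invoke waits for the result.
        long a = pool.invoke(new SumTask(0, 100_000));
        // Client-side "arrange exec and obtain Future": submit, join later.
        ForkJoinTask<Long> f = pool.submit(new SumTask(0, 100_000));
        long b = f.join();
        // Client-side "arrange async execution": execute is fire-and-forget,
        // but the task object itself can still be joined.
        SumTask t = new SumTask(0, 100_000);
        pool.execute(t);
        long c = t.join();
        System.out.println(a == b && b == c); // prints true
        pool.shutdown();
    }
}
```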
 130 |  *
 131 |  * This class and its nested classes provide the main
 132 |  * functionality and control for a set of worker threads:
 133 <  * Submissions from non-FJ threads enter into submission
 134 <  * queues. Workers take these tasks and typically split them into
 135 <  * subtasks that may be stolen by other workers. Preference rules
 136 <  * give first priority to processing tasks from their own queues
 137 <  * (LIFO or FIFO, depending on mode), then to randomized FIFO
 138 <  * steals of tasks in other queues.
 133 >  * Submissions from non-FJ threads enter into submission queues.
 134 >  * Workers take these tasks and typically split them into subtasks
 135 >  * that may be stolen by other workers. Preference rules give
 136 >  * first priority to processing tasks from their own queues (LIFO
 137 >  * or FIFO, depending on mode), then to randomized FIFO steals of
 138 >  * tasks in other queues.
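The owner-LIFO / thief-FIFO discipline this comment describes can be modeled with a deliberately simplified, single-threaded sketch. `ToyWorkQueue` is hypothetical: the real WorkQueue uses a circular array with CAS on `base`, not a list, and this sketch only shows which end each party operates on.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a work-stealing queue: owner pushes/pops at the top
// (LIFO), thieves poll at the base (FIFO). Single-threaded sketch only.
class ToyWorkQueue<T> {
    private final List<T> slots = new ArrayList<>();
    private int base = 0;                 // next slot a thief would take
    // top is implicitly slots.size()

    void push(T task) { slots.add(task); }            // owner adds at top

    T pop() {                                         // owner takes newest
        int top = slots.size();
        return top > base ? slots.remove(top - 1) : null;
    }

    T poll() {                                        // thief steals oldest
        return slots.size() > base ? slots.get(base++) : null;
    }
}

public class StealDemo {
    public static void main(String[] args) {
        ToyWorkQueue<String> q = new ToyWorkQueue<>();
        q.push("t1"); q.push("t2"); q.push("t3");
        System.out.println(q.pop());   // owner sees newest first: t3
        System.out.println(q.poll());  // thief sees oldest first: t1
        System.out.println(q.pop());   // t2
    }
}
```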
 139 |  *
 140 <  * WorkQueues.
 140 >  * WorkQueues
 141 |  * ==========
 142 |  *
 143 |  * Most operations occur within work-stealing queues (in nested
 155 |  * (http://research.sun.com/scalable/pubs/index.html) and
 156 |  * "Idempotent work stealing" by Michael, Saraswat, and Vechev,
 157 |  * PPoPP 2009 (http://portal.acm.org/citation.cfm?id=1504186).
 158 <  * The main differences ultimately stem from gc requirements that
 158 >  * The main differences ultimately stem from GC requirements that
 159 |  * we null out taken slots as soon as we can, to maintain as small
 160 |  * a footprint as possible even in programs generating huge
 161 |  * numbers of tasks. To accomplish this, we shift the CAS
 187 |  * rarely provide the best possible performance on a given
 188 |  * machine, but portably provide good throughput by averaging over
 189 |  * these factors. (Further, even if we did try to use such
 190 <  * information, we do not usually have a basis for exploiting
 191 <  * it. For example, some sets of tasks profit from cache
 192 <  * affinities, but others are harmed by cache pollution effects.)
 190 >  * information, we do not usually have a basis for exploiting it.
 191 >  * For example, some sets of tasks profit from cache affinities,
 192 >  * but others are harmed by cache pollution effects.)
 193 |  *
 194 |  * WorkQueues are also used in a similar way for tasks submitted
 195 |  * to the pool. We cannot mix these tasks in the same queues used
 211 |  * because submitters encountering a busy queue try or create
 212 |  * others so never block.
 213 |  *
 214 <  * Management.
 214 >  * Management
 215 |  * ==========
 216 |  *
 217 |  * The main throughput advantages of work-stealing stem from
 221 |  * tactic for avoiding bottlenecks is packing nearly all
 222 |  * essentially atomic control state into two volatile variables
 223 |  * that are by far most often read (not written) as status and
 224 <  * consistency checks
 224 >  * consistency checks.
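The tactic named here, packing several counts into one volatile word so they can be read and updated in a single atomic step, can be sketched as follows. The two-field layout below is hypothetical and much simpler than the real `ctl` encoding; it only illustrates the shift/mask/CAS idea.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical packing of two 16-bit counts into one atomic word,
// illustrating the idea behind ctl (the actual ctl layout differs).
class PackedCounts {
    private static final int  SHIFT = 16;
    private static final long MASK  = 0xFFFFL;
    private final AtomicLong state = new AtomicLong();

    int activeCount() { return (int) (state.get() & MASK); }
    int totalCount()  { return (int) ((state.get() >>> SHIFT) & MASK); }

    // Atomically bump both counts with a single CAS, as when adding
    // a worker; readers never see one count updated without the other.
    void addWorker() {
        long c;
        do {
            c = state.get();
        } while (!state.compareAndSet(c, c + (1L << SHIFT) + 1L));
    }
}

public class PackDemo {
    public static void main(String[] args) {
        PackedCounts pc = new PackedCounts();
        pc.addWorker();
        pc.addWorker();
        System.out.println(pc.totalCount() + "," + pc.activeCount()); // 2,2
    }
}
```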
 225 |  *
 226 |  * Field "ctl" contains 64 bits holding all the information needed
 227 |  * to atomically decide to add, inactivate, enqueue (on an event
 300 |  * some other queued worker rather than itself, which has the same
 301 |  * net effect. Because enqueued workers may actually be rescanning
 302 |  * rather than waiting, we set and clear the "parker" field of
 303 <  * Workqueues to reduce unnecessary calls to unpark. (This
 303 >  * WorkQueues to reduce unnecessary calls to unpark. (This
 304 |  * requires a secondary recheck to avoid missed signals.) Note
 305 |  * the unusual conventions about Thread.interrupts surrounding
 306 |  * parking and other blocking: Because interrupts are used solely
 328 |  * terminating all workers after long periods of non-use.
 329 |  *
 330 |  * Shutdown and Termination. A call to shutdownNow atomically sets
 331 <  * a runState bit and then (non-atomically) sets each workers
 331 >  * a runState bit and then (non-atomically) sets each worker's
 332 |  * runState status, cancels all unprocessed tasks, and wakes up
 333 |  * all waiting workers. Detecting whether termination should
 334 |  * commence after a non-abrupt shutdown() call requires more work
 337 |  * indication but non-abrupt shutdown still requires a rechecking
 338 |  * scan for any workers that are inactive but not queued.
 339 |  *
 340 <  * Joining Tasks.
 341 <  * ==============
 340 >  * Joining Tasks
 341 >  * =============
 342 |  *
 343 |  * Any of several actions may be taken when one worker is waiting
 344 <  * to join a task stolen (or always held by) another. Because we
 344 >  * to join a task stolen (or always held) by another. Because we
 345 |  * are multiplexing many tasks on to a pool of workers, we can't
 346 |  * just let them block (as in Thread.join). We also cannot just
 347 |  * reassign the joiner's run-time stack with another and replace
 348 |  * it later, which would be a form of "continuation", that even if
 349 |  * possible is not necessarily a good idea since we sometimes need
 350 <  * both an unblocked task and its continuation to
 351 <  * progress. Instead we combine two tactics:
 350 >  * both an unblocked task and its continuation to progress.
 351 >  * Instead we combine two tactics:
 352 |  *
 353 |  * Helping: Arranging for the joiner to execute some task that it
 354 |  * would be running if the steal had not occurred.
 424 |  * enable the processing of billions of tasks per second, at the
 425 |  * expense of some ugliness.
 426 |  *
 427 <  * Methods signalWork() and scan() are the main bottlenecks so are
 427 >  * Methods signalWork() and scan() are the main bottlenecks, so are
 428 |  * especially heavily micro-optimized/mangled. There are lots of
 429 |  * inline assignments (of form "while ((local = field) != 0)")
 430 |  * which are usually the simplest way to ensure the required read
 436 |  * coding oddities that help some methods perform reasonably even
 437 |  * when interpreted (not compiled).
 438 |  *
 439 <  * The order of declarations in this file is: (1) declarations of
 440 <  * statics (2) fields (along with constants used when unpacking
 441 <  * some of them), listed in an order that tends to reduce
 442 <  * contention among them a bit under most JVMs; (3) nested
 443 <  * classes; (4) internal control methods; (5) callbacks and other
 444 <  * support for ForkJoinTask methods; (6) exported methods (plus a
 445 <  * few little helpers); (7) static block initializing all statics
 446 <  * in a minimally dependent order.
 439 >  * The order of declarations in this file is:
 440 >  * (1) statics
 441 >  * (2) fields (along with constants used when unpacking some of
 442 >  *     them), listed in an order that tends to reduce contention
 443 >  *     among them a bit under most JVMs;
 444 >  * (3) nested classes
 445 >  * (4) internal control methods
 446 >  * (5) callbacks and other support for ForkJoinTask methods
 447 >  * (6) exported methods (plus a few little helpers)
 448 >  * (7) static block initializing all statics in a minimally
 449 >  *     dependent order.
 450 |  */
 451 |
 452 | /**
 541 |  * consistency checks: Staleness of read-only operations on the
 542 |  * workers and queues arrays can be checked by comparing runState
 543 |  * before vs after the reads. The low 16 bits (i.e, anding with
 544 <  * SMASK) hold (the smallest power of two covering all worker
 544 >  * SMASK) hold the smallest power of two covering all worker
 545 |  * indices, minus one. The mask for queues (vs workers) is twice
 546 |  * this value plus 1.
 547 |  */
 734 |     }
 735 |
 736 |     /**
 737 <      * Returns number of tasks in the queue
 737 >      * Returns number of tasks in the queue.
 738 |      */
 739 |     final int queueSize() {
 740 |         int n = base - top; // non-owner callers must read base first
 745 |      * Pushes a task. Call only by owner in unshared queues.
 746 |      *
 747 |      * @param task the task. Caller must ensure non-null.
 748 <      * @param p, if non-null, pool to signal if necessary
 749 <      * @throw RejectedExecutionException if array cannot
 750 <      * be resized
 748 >      * @param p if non-null, pool to signal if necessary
 749 >      * @throw RejectedExecutionException if array cannot be resized
 750 |      */
 751 |     final void push(ForkJoinTask<?> task, ForkJoinPool p) {
 752 |         ForkJoinTask<?>[] a;
 981 |     }
 982 |
 983 |     /**
 984 <      * Removes and cancels all known tasks, ignoring any exceptions
 984 >      * Removes and cancels all known tasks, ignoring any exceptions.
 985 |      */
 986 |     final void cancelAll() {
 987 |         ForkJoinTask.cancelIgnoringExceptions(currentJoin);
1024 |     }
1025 |
1026 |     /**
1027 <      * Executes a non-top-level (stolen) task
1027 >      * Executes a non-top-level (stolen) task.
1028 |      */
1029 |     final void runSubtask(ForkJoinTask<?> t) {
1030 |         if (t != null) {
1088 |     }
1089 |
1090 |     /**
1091 |      * Per-thread records for (typically non-FJ) threads that submit
1092 |      * to pools. Currently holds only pseudo-random seed / index that
1093 <      * is used to chose submission queues in method doSubmit. In the
1093 >      * is used to choose submission queues in method doSubmit. In the
1094 |      * future, this may incorporate a means to implement different
1095 |      * task rejection and resubmission policies.
1096 |      */
1175 |
1176 |     /**
1177 |      * Callback from ForkJoinWorkerThread constructor to establish and
1178 <      * record its WorkQueue
1178 >      * record its WorkQueue.
1179 |      *
1180 |      * @param wt the worker thread
1181 |      */
1286 |     // Maintaining ctl counts
1287 |
1288 |     /**
1289 <      * Increments active count; mainly called upon return from blocking
1289 >      * Increments active count; mainly called upon return from blocking.
1290 |      */
1291 |     final void incrementActiveCount() {
1292 |         long c;
1294 |     }
1295 |
1296 |     /**
1297 <      * Activates or creates a worker
1297 >      * Activates or creates a worker.
1298 |      */
1299 |     final void signalWork() {
1300 |         /*
1448 |          * Thread.yield to help avoid interference with more useful
1449 |          * activities on the system.
1450 |          *
1451 <          * * If pool is terminating, terminate the worker
1451 >          * * If pool is terminating, terminate the worker.
1452 |          *
1453 |          * * If not already enqueued, try to inactivate and enqueue the
1454 |          * worker on wait queue.
1470 |         ForkJoinTask<?> task = null;
1471 |         for (int k = 0, j = -2 - m; ; ++j) {
1472 |             WorkQueue q; int b;
1473 <             if (j < 0) { // random probes while j negative
1473 >             if (j < 0) {                    // random probes while j negative
1474 |                 r ^= r << 13; r ^= r >>> 17; k = (r ^= r << 5) | (j & 1);
1475 <             } // worker (not submit) for odd j
1476 <             else // cyclic scan when j >= 0
1477 <             k += (m >>> 1) | 1; // step by half to reduce bias
1475 >             }                               // worker (not submit) for odd j
1476 >             else                            // cyclic scan when j >= 0
1477 >                 k += (m >>> 1) | 1;         // step by half to reduce bias
1478 |
1479 |             if ((q = ws[k & m]) != null && (b = q.base) - q.top < 0) {
1480 |                 if (ec >= 0)
1481 <                     task = q.pollAt(b); // steal
1481 >                     task = q.pollAt(b);     // steal
1482 |                 break;
1483 |             }
1484 |             else if (j > m) {
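The `r ^= r << 13; r ^= r >>> 17; r ^= r << 5` sequence in the scan loop above is Marsaglia's xorshift generator, used here as a cheap per-scan random-probe source. A standalone sketch (the seed and mask values below are arbitrary illustrations, not values from the patch):

```java
// Marsaglia xorshift: a cheap RNG whose step is three shift/xor pairs.
// From any non-zero 32-bit seed it never reaches zero, which is why the
// scan loop can fold it inline without a zero-state check.
public class XorShiftDemo {
    static int next(int r) {
        r ^= r << 13;
        r ^= r >>> 17;
        r ^= r << 5;
        return r;
    }

    public static void main(String[] args) {
        int r = 0x9E3779B9;                // arbitrary non-zero seed
        for (int i = 0; i < 5; ++i) {
            r = next(r);
            System.out.println(r != 0);    // prints true each iteration
        }
        // Masking with the queues-array mask maps the value to an index,
        // analogous to the (k & m) step in scan(); m = 15 is hypothetical.
        int m = 15;
        int index = next(r) & m;
        System.out.println(index >= 0 && index <= m); // prints true
    }
}
```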
1546 |
1547 |     /**
1548 |      * If inactivating worker w has caused pool to become quiescent,
1549 <      * check for pool termination, and, so long as this is not the
1550 <      * only worker, wait for event for up to SHRINK_RATE nanosecs On
1551 <      * timeout, if ctl has not changed, terminate the worker, which
1552 <      * will in turn wake up another worker to possibly repeat this
1553 <      * process.
1549 >      * checks for pool termination, and, so long as this is not the
1550 >      * only worker, waits for event for up to SHRINK_RATE nanosecs.
1551 >      * On timeout, if ctl has not changed, terminates the worker,
1552 >      * which will in turn wake up another worker to possibly repeat
1553 >      * this process.
1554 |      *
1555 |      * @param w the calling worker
1556 |      */
1746 |     }
1747 |
1748 |     /**
1749 <      * Gets and removes a local or stolen task for the given worker
1749 >      * Gets and removes a local or stolen task for the given worker.
1750 |      *
1751 |      * @return a task, if available
1752 |      */