[ViewVC] Diff of: jsr166/jsr166/src/jsr166y/ForkJoinPool.java

Comparing jsr166/src/jsr166y/ForkJoinPool.java (file contents):
Revision 1.59 by dl, Fri Jul 23 14:09:17 2010 UTC vs.
Revision 1.70 by dl, Sat Sep 4 11:33:53 2010 UTC

#	Line 7 | Line 7
7		package jsr166y;
8
9		import java.util.concurrent.*;
10	–
10		import java.util.ArrayList;
11		import java.util.Arrays;
12		import java.util.Collection;
#	Line 52 | Line 51 | import java.util.concurrent.CountDownLat
51		* convenient form for informal monitoring.
52		*
53		* <p> As is the case with other ExecutorServices, there are three
54	<	* main task execution methods summarized in the follwoing
54	>	* main task execution methods summarized in the following
55		* table. These are designed to be used by clients not already engaged
56		* in fork/join computations in the current pool. The main forms of
57		* these methods accept instances of {@code ForkJoinTask}, but
#	Line 69 | Line 68 | import java.util.concurrent.CountDownLat
68		* <td ALIGN=CENTER> <b>Call from within fork/join computations</b></td>
69		* </tr>
70		* <tr>
71	<	* <td> <b>Arange async execution</td>
71	>	* <td> <b>Arrange async execution</b></td>
72		* <td> {@link #execute(ForkJoinTask)}</td>
73		* <td> {@link ForkJoinTask#fork}</td>
74		* </tr>
#	Line 110 | Line 109 | import java.util.concurrent.CountDownLat
109		*
110		* <p>This implementation rejects submitted tasks (that is, by throwing
111		* {@link RejectedExecutionException}) only when the pool is shut down
112	<	* or internal resources have been exhuasted.
112	>	* or internal resources have been exhausted.
113		*
114		* @since 1.7
115		* @author Doug Lea
#	Line 138 | Line 137 | public class ForkJoinPool extends Abstra
137		* cache pollution effects.)
138		*
139		* Beyond work-stealing support and essential bookkeeping, the
140	<	* main responsibility of this framework is to arrange tactics for
141	<	* when one worker is waiting to join a task stolen (or always
142	<	* held by) another. Becauae we are multiplexing many tasks on to
143	<	* a pool of workers, we can't just let them block (as in
144	<	* Thread.join). We also cannot just reassign the joiner's
145	<	* run-time stack with another and replace it later, which would
146	<	* be a form of "continuation", that even if possible is not
147	<	* necessarily a good idea. Given that the creation costs of most
148	<	* threads on most systems mainly surrounds setting up runtime
149	<	* stacks, thread creation and switching is usually not much more
150	<	* expensive than stack creation and switching, and is more
151	<	* flexible). Instead we combine two tactics:
140	>	* main responsibility of this framework is to take actions when
141	>	* one worker is waiting to join a task stolen (or always held by)
142	>	* another. Because we are multiplexing many tasks on to a pool
143	>	* of workers, we can't just let them block (as in Thread.join).
144	>	* We also cannot just reassign the joiner's run-time stack with
145	>	* another and replace it later, which would be a form of
146	>	* "continuation", that even if possible is not necessarily a good
147	>	* idea. (Given that the creation costs of most threads on most
148	>	* systems mainly surround setting up runtime stacks, thread
149	>	* creation and switching is usually not much more expensive than
150	>	* stack creation and switching, and is more flexible). Instead we
151	>	* combine two tactics:
152		*
153	<	* 1. Arranging for the joiner to execute some task that it
153	>	* Helping: Arranging for the joiner to execute some task that it
154		* would be running if the steal had not occurred. Method
155		* ForkJoinWorkerThread.helpJoinTask tracks joining->stealing
156		* links to try to find such a task.
157		*
158	<	* 2. Unless there are already enough live threads, creating or
159	<	* or re-activating a spare thread to compensate for the
160	<	* (blocked) joiner until it unblocks. Spares then suspend
161	<	* at their next opportunity or eventually die if unused for
162	<	* too long. See below and the internal documentation
163	<	* for tryAwaitJoin for more details about compensation
164	<	* rules.
165	<	*
166	<	* Because the determining existence of conservatively safe
167	<	* helping targets, the availability of already-created spares,
168	<	* and the apparent need to create new spares are all racy and
169	<	* require heuristic guidance, joins (in
170	<	* ForkJoinWorkerThread.joinTask) interleave these options until
171	<	* successful. Creating a new spare always succeeds, but also
172	<	* increases application footprint, so we try to avoid it, within
173	<	* reason.
158	>	* Compensating: Unless there are already enough live threads,
159	>	* method helpMaintainParallelism() may create or
160	>	* re-activate a spare thread to compensate for blocked
161	>	* joiners until they unblock.
162	>	*
163	>	* It is impossible to keep exactly the target (parallelism)
164	>	* number of threads running at any given time. Determining
165	>	* existence of conservatively safe helping targets, the
166	>	* availability of already-created spares, and the apparent need
167	>	* to create new spares are all racy and require heuristic
168	>	* guidance, so we rely on multiple retries of each. Compensation
169	>	* occurs in slow-motion. It is triggered only upon timeouts of
170	>	* Object.wait used for joins. This reduces poor decisions that
171	>	* would otherwise be made when threads are waiting for others
172	>	* that are stalled because of unrelated activities such as
173	>	* garbage collection.
174		*
175	<	* The ManagedBlocker extension API can't use option (1) so uses a
176	<	* special version of (2) in method awaitBlocker.
175	>	* The ManagedBlocker extension API can't use helping so relies
176	>	* only on compensation in method awaitBlocker.
177		*
178		* The main throughput advantages of work-stealing stem from
179		* decentralized control -- workers mostly steal tasks from each
#	Line 207 | Line 206 | public class ForkJoinPool extends Abstra
206		* blocked workers. However, all other support code is set up to
207		* work with other policies.
208		*
209	+	* To ensure that we do not hold on to worker references that
210	+	* would prevent GC, ALL accesses to workers are via indices into
211	+	* the workers array (which is one source of some of the unusual
212	+	* code constructions here). In essence, the workers array serves
213	+	* as a WeakReference mechanism. Thus for example the event queue
214	+	* stores worker indices, not worker references. Access to the
215	+	* workers in associated methods (for example releaseEventWaiters)
216	+	* must both index-check and null-check the IDs. All such accesses
217	+	* ignore bad IDs by returning out early from what they are doing,
218	+	* since this can only be associated with shutdown, in which case
219	+	* it is OK to give up. On termination, we just clobber these
220	+	* data structures without trying to use them.
221	+	*
222		* 2. Bookkeeping for dynamically adding and removing workers. We
223		* aim to approximately maintain the given level of parallelism.
224		* When some workers are known to be blocked (on joins or via
225		* ManagedBlocker), we may create or resume others to take their
226		* place until they unblock (see below). Implementing this
227		* requires counts of the number of "running" threads (i.e., those
228	<	* that are neither blocked nor artifically suspended) as well as
228	>	* that are neither blocked nor artificially suspended) as well as
229		* the total number. These two values are packed into one field,
230		* "workerCounts" because we need accurate snapshots when deciding
231		* to create, resume or suspend. Note however that the
232	<	* correspondance of these counts to reality is not guaranteed. In
232	>	* correspondence of these counts to reality is not guaranteed. In
233		* particular updates for unblocked threads may lag until they
234		* actually wake up.
235		*
#	Line 248 | Line 260 | public class ForkJoinPool extends Abstra
260		* workers that previously could not find a task to now find one:
261		* Submission of a new task to the pool, or another worker pushing
262		* a task onto a previously empty queue. (We also use this
263	<	* mechanism for termination and reconfiguration actions that
263	>	* mechanism for configuration and termination actions that
264		* require wakeups of idle workers). Each worker maintains its
265		* last known event count, and blocks when a scan for work did not
266		* find a task AND its lastEventCount matches the current
#	Line 259 | Line 271 | public class ForkJoinPool extends Abstra
271		* a record (field nextEventWaiter) for the next waiting worker.
272		* In addition to allowing simpler decisions about need for
273		* wakeup, the event count bits in eventWaiters serve the role of
274	<	* tags to avoid ABA errors in Treiber stacks. To reduce delays
275	<	* in task diffusion, workers not otherwise occupied may invoke
276	<	* method releaseWaiters, that removes and signals (unparks)
277	<	* workers not waiting on current count. To minimize task
278	<	* production stalls associate with signalling, any worker pushing
279	<	* a task on an empty queue invokes the weaker method signalWork,
268	<	* that only releases idle workers until it detects interference
269	<	* by other threads trying to release, and lets them take
270	<	* over. The net effect is a tree-like diffusion of signals, where
271	<	* released threads (and possibly others) help with unparks. To
272	<	* further reduce contention effects a bit, failed CASes to
273	<	* increment field eventCount are tolerated without retries.
274	>	* tags to avoid ABA errors in Treiber stacks. Upon any wakeup,
275	>	* released threads also try to release at most two others. The
276	>	* net effect is a tree-like diffusion of signals, where released
277	>	* threads (and possibly others) help with unparks. To further
278	>	* reduce contention effects a bit, failed CASes to increment
279	>	* field eventCount are tolerated without retries in signalWork.
280		* Conceptually they are merged into the same event, which is OK
281		* when their only purpose is to enable workers to scan for work.
282		*
283	<	* 5. Managing suspension of extra workers. When a worker is about
284	<	* to block waiting for a join (or via ManagedBlockers), we may
285	<	* create a new thread to maintain parallelism level, or at least
286	<	* avoid starvation. Usually, extra threads are needed for only
287	<	* very short periods, yet join dependencies are such that we
288	<	* sometimes need them in bursts. Rather than create new threads
289	<	* each time this happens, we suspend no-longer-needed extra ones
290	<	* as "spares". For most purposes, we don't distinguish "extra"
291	<	* spare threads from normal "core" threads: On each call to
292	<	* preStep (the only point at which we can do this) a worker
293	<	* checks to see if there are now too many running workers, and if
294	<	* so, suspends itself. Methods tryAwaitJoin and awaitBlocker
295	<	* look for suspended threads to resume before considering
296	<	* creating a new replacement. We don't need a special data
297	<	* structure to maintain spares; simply scanning the workers array
298	<	* looking for worker.isSuspended() is fine because the calling
299	<	* thread is otherwise not doing anything useful anyway; we are at
300	<	* least as happy if after locating a spare, the caller doesn't
301	<	* actually block because the join is ready before we try to
302	<	* adjust and compensate. Note that this is intrinsically racy.
303	<	* One thread may become a spare at about the same time as another
304	<	* is needlessly being created. We counteract this and related
305	<	* slop in part by requiring resumed spares to immediately recheck
306	<	* (in preStep) to see whether they they should re-suspend. The
307	<	* only effective difference between "extra" and "core" threads is
308	<	* that we allow the "extra" ones to time out and die if they are
309	<	* not resumed within a keep-alive interval of a few seconds. This
310	<	* is implemented mainly within ForkJoinWorkerThread, but requires
311	<	* some coordination (isTrimmed() -- meaning killed while
312	<	* suspended) to correctly maintain pool counts.
313	<	*
314	<	* 6. Deciding when to create new workers. The main dynamic
315	<	* control in this class is deciding when to create extra threads,
316	<	* in methods awaitJoin and awaitBlocker. We always need to create
317	<	* one when the number of running threads would become zero and
318	<	* all workers are busy. However, this is not easy to detect
319	<	* reliably in the presence of transients so we use retries and
320	<	* allow slack (in tryAwaitJoin) to reduce false alarms. These
321	<	* effectively reduce churn at the price of systematically
322	<	* undershooting target parallelism when many threads are blocked.
323	<	* However, biasing toward undeshooting partially compensates for
324	<	* the above mechanics to suspend extra threads, that normally
325	<	* lead to overshoot because we can only suspend workers
326	<	* in-between top-level actions. It also better copes with the
327	<	* fact that some of the methods in this class tend to never
328	<	* become compiled (but are interpreted), so some components of
329	<	* the entire set of controls might execute many times faster than
330	<	* others. And similarly for cases where the apparent lack of work
331	<	* is just due to GC stalls and other transient system activity.
283	>	* 5. Managing suspension of extra workers. When a worker notices
284	>	* (usually upon timeout of a wait()) that there are too few
285	>	* running threads, we may create a new thread to maintain
286	>	* parallelism level, or at least avoid starvation. Usually, extra
287	>	* threads are needed for only very short periods, yet join
288	>	* dependencies are such that we sometimes need them in
289	>	* bursts. Rather than create new threads each time this happens,
290	>	* we suspend no-longer-needed extra ones as "spares". For most
291	>	* purposes, we don't distinguish "extra" spare threads from
292	>	* normal "core" threads: On each call to preStep (the only point
293	>	* at which we can do this) a worker checks to see if there are
294	>	* now too many running workers, and if so, suspends itself.
295	>	* Method helpMaintainParallelism looks for suspended threads to
296	>	* resume before considering creating a new replacement. The
297	>	* spares themselves are encoded on another variant of a Treiber
298	>	* Stack, headed at field "spareWaiters". Note that the use of
299	>	* spares is intrinsically racy. One thread may become a spare at
300	>	* about the same time as another is needlessly being created. We
301	>	* counteract this and related slop in part by requiring resumed
302	>	* spares to immediately recheck (in preStep) to see whether they
303	>	* should re-suspend.
304	>	*
305	>	* 6. Killing off unneeded workers. A timeout mechanism is used to
306	>	* shed unused workers: The oldest (first) event queue waiter uses
307	>	* a timed rather than hard wait. When this wait times out without
308	>	* a normal wakeup, it tries to shut down any one (for convenience
309	>	* the newest) other spare or event waiter via
310	>	* tryShutdownUnusedWorker. This eventually reduces the number of
311	>	* worker threads to a minimum of one after a long enough period
312	>	* without use.
313	>	*
314	>	* 7. Deciding when to create new workers. The main dynamic
315	>	* control in this class is deciding when to create extra threads
316	>	* in method helpMaintainParallelism. We would like to keep
317	>	* exactly #parallelism threads running, which is an impossible
318	>	* task. We always need to create one when the number of running
319	>	* threads would become zero and all workers are busy. Beyond
320	>	* this, we must rely on heuristics that work well in the
321	>	* presence of transient phenomena such as GC stalls, dynamic
322	>	* compilation, and wake-up lags. These transients are extremely
323	>	* common -- we are normally trying to fully saturate the CPUs on
324	>	* a machine, so almost any activity other than running tasks
325	>	* impedes accuracy. Our main defense is to allow parallelism to
326	>	* lapse for a while during joins, and use a timeout to see if,
327	>	* after the resulting settling, there is still a need for
328	>	* additional workers. This also better copes with the fact that
329	>	* some of the methods in this class tend to never become compiled
330	>	* (but are interpreted), so some components of the entire set of
331	>	* controls might execute 100 times faster than others. And
332	>	* similarly for cases where the apparent lack of work is just due
333	>	* to GC stalls and other transient system activity.
334		*
335		* Beware that there is a lot of representation-level coupling
336		* among classes ForkJoinPool, ForkJoinWorkerThread, and
#	Line 335 | Line 343 | public class ForkJoinPool extends Abstra
343		*
344		* Style notes: There are lots of inline assignments (of form
345		* "while ((local = field) != 0)") which are usually the simplest
346	<	* way to ensure read orderings. Also several occurrences of the
347	<	* unusual "do {} while(!cas...)" which is the simplest way to
348	<	* force an update of a CAS'ed variable. There are also other
349	<	* coding oddities that help some methods perform reasonably even
350	<	* when interpreted (not compiled), at the expense of messiness.
346	>	* way to ensure the required read orderings (which are sometimes
347	>	* critical). Also several occurrences of the unusual "do {}
348	>	* while (!cas...)" which is the simplest way to force an update of
349	>	* a CAS'ed variable. There are also other coding oddities that
350	>	* help some methods perform reasonably even when interpreted (not
351	>	* compiled), at the expense of some messy constructions that
352	>	* reduce byte code counts.
353		*
354		* The order of declarations in this file is: (1) statics (2)
355		* fields (along with constants used when unpacking some of them)
#	Line 407 | Line 417 | public class ForkJoinPool extends Abstra
417		new AtomicInteger();
418
419		/**
420	<	* Absolute bound for parallelism level. Twice this number must
421	<	* fit into a 16bit field to enable word-packing for some counts.
420	>	* The time to block in a join (see awaitJoin) before checking if
421	>	* a new worker should be (re)started to maintain parallelism
422	>	* level. The value should be short enough to maintain global
423	>	* responsiveness and progress but long enough to avoid
424	>	* counterproductive firings during GC stalls or unrelated system
425	>	* activity, and to not bog down systems with continual re-firings
426	>	* on GCs or legitimately long waits.
427	>	*/
428	>	private static final long JOIN_TIMEOUT_MILLIS = 250L; // 4 per second
429	>
430	>	/**
431	>	* The wakeup interval (in nanoseconds) for the oldest worker
432	>	* waiting for an event to invoke tryShutdownUnusedWorker to shrink
433	>	* the number of workers. The exact value does not matter too
434	>	* much, but should be long enough to slowly release resources
435	>	* during long periods without use without disrupting normal use.
436	>	*/
437	>	private static final long SHRINK_RATE_NANOS =
438	>	30L * 1000L * 1000L * 1000L; // 2 per minute
439	>
440	>	/**
441	>	* Absolute bound for parallelism level. Twice this number plus
442	>	* one (i.e., 0xffff) must fit into a 16-bit field to enable
443	>	* word-packing for some counts and indices.
444		*/
445	<	private static final int MAX_THREADS = 0x7fff;
445	>	private static final int MAX_WORKERS = 0x7fff;
446
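The bound in the comment above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of ForkJoinPool.java: the class name WorkerBoundCheck and the helper twicePlusOne are hypothetical, and only the value 0x7fff is taken from the listing.

```java
// Hypothetical helper class; only the constant 0x7fff comes from the listing.
final class WorkerBoundCheck {
    static final int MAX_WORKERS = 0x7fff;

    // Largest value an unsigned 16-bit field can hold.
    static final int FIELD_MAX_16 = (1 << 16) - 1;

    // "Twice this number plus one", as worded in the comment.
    static int twicePlusOne(int n) {
        return 2 * n + 1;
    }

    public static void main(String[] args) {
        int v = twicePlusOne(MAX_WORKERS);
        System.out.println(Integer.toHexString(v)); // prints "ffff"
        System.out.println(v <= FIELD_MAX_16);      // prints "true"
    }
}
```

With MAX_WORKERS = 0x7fff the result is exactly 0xffff, the largest value that still fits in 16 bits.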
447		/**
448		* Array holding all worker threads in the pool. Array size must
#	Line 450 | Line 482 | public class ForkJoinPool extends Abstra
482		private volatile long stealCount;
483
484		/**
485	<	* Encoded record of top of treiber stack of threads waiting for
485	>	* Encoded record of top of Treiber stack of threads waiting for
486		* events. The top 32 bits contain the count being waited for. The
487	<	* bottom word contains one plus the pool index of waiting worker
488	<	* thread.
487	>	* bottom 16 bits contain one plus the pool index of waiting
488	>	* worker thread. (Bits 16-31 are unused.)
489		*/
490		private volatile long eventWaiters;
491
492		private static final int EVENT_COUNT_SHIFT = 32;
493	<	private static final long WAITER_ID_MASK = (1L << EVENT_COUNT_SHIFT)-1L;
493	>	private static final long WAITER_ID_MASK = (1L << 16) - 1L;
494
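The eventWaiters layout described above (count in the top 32 bits, one plus the pool index in the bottom 16) can be illustrated in isolation. This is a minimal sketch: the two constants are copied from revision 1.70 in the listing, and the expressions mirror those used in eventSync and releaseEventWaiters, but the class EventWaitersCodec and its method names are hypothetical.

```java
// Hypothetical codec for the eventWaiters word; not part of ForkJoinPool.java.
final class EventWaitersCodec {
    static final int EVENT_COUNT_SHIFT = 32;
    static final long WAITER_ID_MASK = (1L << 16) - 1L;

    // Packs the awaited event count and (poolIndex + 1) into one long.
    static long encode(int eventCount, int poolIndex) {
        return (((long) eventCount) << EVENT_COUNT_SHIFT) | ((long) (poolIndex + 1));
    }

    // Recovers the pool index; -1 means "no waiter" (an empty stack).
    static int waiterIndex(long h) {
        return ((int) (h & WAITER_ID_MASK)) - 1;
    }

    // Recovers the event count the top waiter is blocked on.
    static int waitedCount(long h) {
        return (int) (h >>> EVENT_COUNT_SHIFT);
    }

    public static void main(String[] args) {
        long h = encode(7, 3);
        System.out.println(waiterIndex(h));  // prints 3
        System.out.println(waitedCount(h));  // prints 7
        System.out.println(waiterIndex(0L)); // prints -1
    }
}
```

Storing the index plus one lets the value 0L stand for an empty stack, which is why callers can precheck eventWaiters != 0L before doing any array lookups.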
495		/**
496		* A counter for events that may wake up worker threads:
497		* - Submission of a new task to the pool
498		* - A worker pushing a task on an empty queue
499	<	* - termination and reconfiguration
499	>	* - termination
500		*/
501		private volatile int eventCount;
502
503		/**
504	+	* Encoded record of top of Treiber stack of spare threads waiting
505	+	* for resumption. The top 16 bits contain an arbitrary count to
506	+	* avoid ABA effects. The bottom 16 bits contain one plus the pool
507	+	* index of waiting worker thread.
508	+	*/
509	+	private volatile int spareWaiters;
510	+
511	+	private static final int SPARE_COUNT_SHIFT = 16;
512	+	private static final int SPARE_ID_MASK = (1 << 16) - 1;
513	+
514	+	/**
515		* Lifecycle control. The low word contains the number of workers
516		* that are (probably) executing tasks. This value is atomically
517		* incremented before a worker gets a task to run, and decremented
#	Line 479 | Line 522 | public class ForkJoinPool extends Abstra
522		* These are bundled together to ensure consistent read for
523		* termination checks (i.e., that runLevel is at least SHUTDOWN
524		* and active threads is zero).
525	+	*
526	+	* Notes: Most direct CASes are dependent on these bitfield
527	+	* positions. Also, this field is non-private to enable direct
528	+	* performance-sensitive CASes in ForkJoinWorkerThread.
529		*/
530	<	private volatile int runState;
530	>	volatile int runState;
531
532		// Note: The order among run level values matters.
533		private static final int RUNLEVEL_SHIFT = 16;
#	Line 488 | Line 535 | public class ForkJoinPool extends Abstra
535		private static final int TERMINATING = 1 << (RUNLEVEL_SHIFT + 1);
536		private static final int TERMINATED = 1 << (RUNLEVEL_SHIFT + 2);
537		private static final int ACTIVE_COUNT_MASK = (1 << RUNLEVEL_SHIFT) - 1;
491	–	private static final int ONE_ACTIVE = 1; // active update delta
538
539		/**
540		* Holds number of total (i.e., created and not yet terminated)
#	Line 497 | Line 543 | public class ForkJoinPool extends Abstra
543		* making decisions about creating and suspending spare
544		* threads. Updated only by CAS. Note that adding a new worker
545		* requires incrementing both counts, since workers start off in
546	<	* running state. This field is also used for memory-fencing
501	<	* configuration parameters.
546	>	* running state.
547		*/
548		private volatile int workerCounts;
549
#	Line 530 | Line 575 | public class ForkJoinPool extends Abstra
575		*/
576		private final int poolNumber;
577
578	<	// Utilities for CASing fields. Note that several of these
579	<	// are manually inlined by callers
578	>	// Utilities for CASing fields. Note that most of these
579	>	// are usually manually inlined by callers
580
581		/**
582	<	* Increments running count. Also used by ForkJoinTask.
582	>	* Increments running count part of workerCounts.
583		*/
584		final void incrementRunningCount() {
585		int c;
#	Line 555 | Line 600 | public class ForkJoinPool extends Abstra
600		}
601
602		/**
603	<	* Tries to increment running count
604	<	*/
560	<	final boolean tryIncrementRunningCount() {
561	<	int wc;
562	<	return UNSAFE.compareAndSwapInt(this, workerCountsOffset,
563	<	wc = workerCounts, wc + ONE_RUNNING);
564	<	}
565	<
566	<	/**
567	<	* Tries incrementing active count; fails on contention.
568	<	* Called by workers before executing tasks.
603	>	* Forces decrement of encoded workerCounts, awaiting nonzero if
604	>	* (rarely) necessary when other count updates lag.
605		*
606	<	* @return true on success
606	>	* @param dr -- either zero or ONE_RUNNING
607	>	* @param dt -- either zero or ONE_TOTAL
608		*/
609	<	final boolean tryIncrementActiveCount() {
610	<	int c;
611	<	return UNSAFE.compareAndSwapInt(this, runStateOffset,
612	<	c = runState, c + ONE_ACTIVE);
609	>	private void decrementWorkerCounts(int dr, int dt) {
610	>	for (;;) {
611	>	int wc = workerCounts;
612	>	if ((wc & RUNNING_COUNT_MASK) - dr < 0 ||
613	>	(wc >>> TOTAL_COUNT_SHIFT) - dt < 0) {
614	>	if ((runState & TERMINATED) != 0)
615	>	return; // lagging termination on a backout
616	>	Thread.yield();
617	>	}
618	>	if (UNSAFE.compareAndSwapInt(this, workerCountsOffset,
619	>	wc, wc - (dr + dt)))
620	>	return;
621	>	}
622		}
623
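The idea behind keeping both counts in one int is that a single CAS updates them together, so readers always see a consistent pair. The sketch below shows that technique under stated assumptions. The field layout (running count in the low 16 bits, total in the high 16) and the constant values are assumed, since their definitions are not shown in this diff. It uses AtomicInteger rather than Unsafe, takes deltas as plain worker counts, and returns false instead of yielding when a count would go negative. The class PackedCounts is hypothetical and is a simplified variant, not the method shown above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simplified variant of a packed two-counter word.
final class PackedCounts {
    // Assumed layout: low 16 bits = running count, high 16 bits = total count.
    static final int TOTAL_COUNT_SHIFT = 16;
    static final int RUNNING_COUNT_MASK = (1 << TOTAL_COUNT_SHIFT) - 1;
    static final int ONE_RUNNING = 1;
    static final int ONE_TOTAL = 1 << TOTAL_COUNT_SHIFT;

    private final AtomicInteger counts = new AtomicInteger();

    // New workers start out running, so both counts go up together.
    void addWorker() {
        counts.addAndGet(ONE_RUNNING + ONE_TOTAL);
    }

    // Subtracts the given numbers of running and total workers atomically.
    // Returns false (without changing anything) if either count would go negative.
    boolean tryDecrement(int runningDelta, int totalDelta) {
        for (;;) {
            int wc = counts.get();
            if ((wc & RUNNING_COUNT_MASK) - runningDelta < 0 ||
                (wc >>> TOTAL_COUNT_SHIFT) - totalDelta < 0)
                return false;
            int next = wc - (runningDelta * ONE_RUNNING + totalDelta * ONE_TOTAL);
            if (counts.compareAndSet(wc, next))
                return true; // both fields changed in one atomic step
        }
    }

    int running() { return counts.get() & RUNNING_COUNT_MASK; }
    int total()   { return counts.get() >>> TOTAL_COUNT_SHIFT; }

    public static void main(String[] args) {
        PackedCounts p = new PackedCounts();
        p.addWorker();
        p.addWorker();
        System.out.println(p.tryDecrement(1, 1));          // prints true
        System.out.println(p.running() + "/" + p.total()); // prints 1/1
    }
}
```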
624		/**
#	Line 582 | Line 628 | public class ForkJoinPool extends Abstra
628		final boolean tryDecrementActiveCount() {
629		int c;
630		return UNSAFE.compareAndSwapInt(this, runStateOffset,
631	<	c = runState, c - ONE_ACTIVE);
631	>	c = runState, c - 1);
632		}
633
634		/**
#	Line 611 | Line 657 | public class ForkJoinPool extends Abstra
657		lock.lock();
658		try {
659		ForkJoinWorkerThread[] ws = workers;
660	<	int nws = ws.length;
661	<	if (k < 0 || k >= nws || ws[k] != null) {
662	<	for (k = 0; k < nws && ws[k] != null; ++k)
660	>	int n = ws.length;
661	>	if (k < 0 || k >= n || ws[k] != null) {
662	>	for (k = 0; k < n && ws[k] != null; ++k)
663		;
664	<	if (k == nws)
665	<	ws = Arrays.copyOf(ws, nws << 1);
664	>	if (k == n)
665	>	ws = Arrays.copyOf(ws, n << 1);
666		}
667		ws[k] = w;
668		workers = ws; // volatile array write ensures slot visibility
#	Line 631 | Line 677 | public class ForkJoinPool extends Abstra
677		*/
678		private void forgetWorker(ForkJoinWorkerThread w) {
679		int idx = w.poolIndex;
680	<	// Locking helps method recordWorker avoid unecessary expansion
680	>	// Locking helps method recordWorker avoid unnecessary expansion
681		final ReentrantLock lock = this.workerLock;
682		lock.lock();
683		try {
#	Line 643 | Line 689 | public class ForkJoinPool extends Abstra
689		}
690		}
691
646	–	// adding and removing workers
647	–
692		/**
693	<	* Tries to create and add new worker. Assumes that worker counts
694	<	* are already updated to accommodate the worker, so adjusts on
695	<	* failure.
693	>	* Final callback from terminating worker. Removes record of
694	>	* worker from array, and adjusts counts. If pool is shutting
695	>	* down, tries to complete termination.
696		*
697	<	* @return new worker or null if creation failed
697	>	* @param w the worker
698		*/
699	<	private ForkJoinWorkerThread addWorker() {
700	<	ForkJoinWorkerThread w = null;
701	<	try {
702	<	w = factory.newThread(this);
703	<	} finally { // Adjust on either null or exceptional factory return
704	<	if (w == null) {
661	<	onWorkerCreationFailure();
662	<	return null;
663	<	}
664	<	}
665	<	w.start(recordWorker(w), ueh);
666	<	return w;
699	>	final void workerTerminated(ForkJoinWorkerThread w) {
700	>	forgetWorker(w);
701	>	decrementWorkerCounts(w.isTrimmed()? 0 : ONE_RUNNING, ONE_TOTAL);
702	>	while (w.stealCount != 0) // collect final count
703	>	tryAccumulateStealCount(w);
704	>	tryTerminate(false);
705		}
706
707	+	// Waiting for and signalling events
708	+
709		/**
710	<	* Adjusts counts upon failure to create worker
710	>	* Releases workers blocked on a count not equal to current count.
711	>	* Normally called after precheck that eventWaiters isn't zero to
712	>	* avoid wasted array checks. Gives up upon a change in count or
713	>	* upon releasing two workers, letting others take over.
714		*/
715	<	private void onWorkerCreationFailure() {
716	<	for (;;) {
717	<	int wc = workerCounts;
718	<	if ((wc >>> TOTAL_COUNT_SHIFT) == 0)
719	<	Thread.yield(); // wait for other counts to settle
720	<	else if (UNSAFE.compareAndSwapInt(this, workerCountsOffset, wc,
721	<	wc - (ONE_RUNNING|ONE_TOTAL)))
715	>	private void releaseEventWaiters() {
716	>	ForkJoinWorkerThread[] ws = workers;
717	>	int n = ws.length;
718	>	long h = eventWaiters;
719	>	int ec = eventCount;
720	>	boolean releasedOne = false;
721	>	ForkJoinWorkerThread w; int id;
722	>	while ((id = ((int)(h & WAITER_ID_MASK)) - 1) >= 0 &&
723	>	(int)(h >>> EVENT_COUNT_SHIFT) != ec &&
724	>	id < n && (w = ws[id]) != null) {
725	>	if (UNSAFE.compareAndSwapLong(this, eventWaitersOffset,
726	>	h, w.nextWaiter)) {
727	>	LockSupport.unpark(w);
728	>	if (releasedOne) // exit on second release
729	>	break;
730	>	releasedOne = true;
731	>	}
732	>	if (eventCount != ec)
733		break;
734	+	h = eventWaiters;
735		}
681	–	tryTerminate(false); // in case of failure during shutdown
736		}
737
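The waiter queue manipulated above is a Treiber stack whose links are array indices rather than object references, with a count stored in the head word as an ABA tag. The following minimal sketch shows that structure on its own. The class IndexedWaiterStack, its next array (standing in for each worker's nextWaiter field), and the caller-supplied tag are all hypothetical simplifications. It assumes a slot is never pushed again while it is still on the stack, and it does no parking or unparking.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical index-based Treiber stack with a tagged head word.
final class IndexedWaiterStack {
    static final int TAG_SHIFT = 32;
    static final long ID_MASK = (1L << 16) - 1L;

    private final AtomicLong head = new AtomicLong(); // 0L means empty
    private final long[] next;                        // next[i] = head word seen when slot i was pushed

    IndexedWaiterStack(int capacity) {
        next = new long[capacity];
    }

    // Pushes slot 'index' with the given tag (for example, an event count).
    void push(int index, int tag) {
        long nh = (((long) tag) << TAG_SHIFT) | (long) (index + 1);
        for (;;) {
            long h = head.get();
            next[index] = h;                 // link to the previous top before publishing
            if (head.compareAndSet(h, nh))
                return;
        }
    }

    // Pops the top slot index, or returns -1 if the stack is empty
    // or the decoded index is out of range (treated as "give up").
    int pop() {
        for (;;) {
            long h = head.get();
            int id = ((int) (h & ID_MASK)) - 1;
            if (id < 0 || id >= next.length)
                return -1;
            if (head.compareAndSet(h, next[id]))
                return id;
        }
    }

    public static void main(String[] args) {
        IndexedWaiterStack s = new IndexedWaiterStack(8);
        s.push(2, 1);
        s.push(5, 1);
        System.out.println(s.pop()); // prints 5
        System.out.println(s.pop()); // prints 2
        System.out.println(s.pop()); // prints -1
    }
}
```

Because the stack holds indices only, a stale entry can at worst decode to an out-of-range or empty slot, which the range check handles by returning early, in the same spirit as the index-check and null-check described in the class comment.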
738		/**
739	<	* Creates and/or resumes enough workers to establish target
740	<	* parallelism, giving up if terminating or addWorker fails
687	<	*
688	<	* TODO: recast this to support lazier creation and automated
689	<	* parallelism maintenance
739	>	* Tries to advance eventCount and releases waiters. Called only
740	>	* from workers.
741		*/
742	<	private void ensureEnoughWorkers() {
743	<	while ((runState & TERMINATING) == 0) {
744	<	int pc = parallelism;
745	<	int wc = workerCounts;
746	<	int rc = wc & RUNNING_COUNT_MASK;
696	<	int tc = wc >>> TOTAL_COUNT_SHIFT;
697	<	if (tc < pc) {
698	<	if (UNSAFE.compareAndSwapInt
699	<	(this, workerCountsOffset,
700	<	wc, wc + (ONE_RUNNING|ONE_TOTAL)) &&
701	<	addWorker() == null)
702	<	break;
703	<	}
704	<	else if (tc > pc && rc < pc &&
705	<	tc > (runState & ACTIVE_COUNT_MASK)) {
706	<	ForkJoinWorkerThread spare = null;
707	<	ForkJoinWorkerThread[] ws = workers;
708	<	int nws = ws.length;
709	<	for (int i = 0; i < nws; ++i) {
710	<	ForkJoinWorkerThread w = ws[i];
711	<	if (w != null && w.isSuspended()) {
712	<	if ((workerCounts & RUNNING_COUNT_MASK) > pc)
713	<	return;
714	<	if (w.tryResumeSpare())
715	<	incrementRunningCount();
716	<	break;
717	<	}
718	<	}
719	<	}
720	<	else
721	<	break;
722	<	}
742	>	final void signalWork() {
743	>	int c; // try to increment event count -- CAS failure OK
744	>	UNSAFE.compareAndSwapInt(this, eventCountOffset, c = eventCount, c+1);
745	>	if (eventWaiters != 0L)
746	>	releaseEventWaiters();
747		}
748
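The "CAS failure OK" comment in signalWork relies on the fact that a failed CAS means some other thread advanced the count at about the same time, so the two signals merge into one event. A minimal sketch of that single-attempt increment follows. The class EventCounter is hypothetical and uses AtomicInteger in place of the Unsafe call.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical counter showing a single-attempt (no retry) CAS increment.
final class EventCounter {
    private final AtomicInteger eventCount = new AtomicInteger();

    // Makes exactly one attempt to advance the count.
    // Returns true if this caller's increment was applied; false means
    // another thread changed the count first, which is tolerated.
    boolean signal() {
        int c = eventCount.get();
        return eventCount.compareAndSet(c, c + 1);
    }

    int current() {
        return eventCount.get();
    }

    public static void main(String[] args) {
        EventCounter ec = new EventCounter();
        // Single-threaded, so the one attempt cannot be interfered with.
        System.out.println(ec.signal());  // prints true
        System.out.println(ec.current()); // prints 1
    }
}
```

Under contention the final count may be lower than the number of signal() calls, but it always advances at least once per burst, which is all a waiter comparing against its last known count needs.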
749		/**
750	<	* Final callback from terminating worker. Removes record of
751	<	* worker from array, and adjusts counts. If pool is shutting
728	<	* down, tries to complete terminatation, else possibly replaces
729	<	* the worker.
750	>	* Adds the given worker to event queue and blocks until
751	>	* terminating or event count advances from the given value
752		*
753	<	* @param w the worker
753	>	* @param w the calling worker thread
754	>	* @param ec the count
755		*/
756	<	final void workerTerminated(ForkJoinWorkerThread w) {
757	<	if (w.active) { // force inactive
758	<	w.active = false;
759	<	do {} while (!tryDecrementActiveCount());
760	<	}
761	<	forgetWorker(w);
762	<
763	<	// Decrement total count, and if was running, running count
764	<	// Spin (waiting for other updates) if either would be negative
765	<	int nr = w.isTrimmed() ? 0 : ONE_RUNNING;
743	<	int unit = ONE_TOTAL + nr;
744	<	for (;;) {
745	<	int wc = workerCounts;
746	<	int rc = wc & RUNNING_COUNT_MASK;
747	<	if (rc - nr < 0 || (wc >>> TOTAL_COUNT_SHIFT) == 0)
748	<	Thread.yield(); // back off if waiting for other updates
749	<	else if (UNSAFE.compareAndSwapInt(this, workerCountsOffset,
750	<	wc, wc - unit))
756	>	private void eventSync(ForkJoinWorkerThread w, int ec) {
757	>	long nh = (((long)ec) << EVENT_COUNT_SHIFT) | ((long)(w.poolIndex+1));
758	>	long h;
759	>	while ((runState < SHUTDOWN || !tryTerminate(false)) &&
760	>	(((int)((h = eventWaiters) & WAITER_ID_MASK)) == 0 ||
761	>	(int)(h >>> EVENT_COUNT_SHIFT) == ec) &&
762	>	eventCount == ec) {
763	>	if (UNSAFE.compareAndSwapLong(this, eventWaitersOffset,
764	>	w.nextWaiter = h, nh)) {
765	>	awaitEvent(w, ec);
766		break;
767	+	}
768		}
753	–
754	–	accumulateStealCount(w); // collect final count
755	–	if (!tryTerminate(false))
756	–	ensureEnoughWorkers();
769		}
770
759	–	// Waiting for and signalling events
760	–
771		/**
772	<	* Releases workers blocked on a count not equal to current count.
773	<	* @return true if any released
772	>	* Blocks the given worker (that has already been entered as an
773	>	* event waiter) until terminating or event count advances from
774	>	* the given value. The oldest (first) waiter uses a timed wait to
775	>	* occasionally one-by-one shrink the number of workers (to a
776	>	* minimum of one) if the pool has not been used for extended
777	>	* periods.
778	>	*
779	>	* @param w the calling worker thread
780	>	* @param ec the count
781		*/
782	<	private void releaseWaiters() {
783	<	long top;
784	<	while ((top = eventWaiters) != 0L) {
785	<	ForkJoinWorkerThread[] ws = workers;
786	<	int n = ws.length;
787	<	for (;;) {
788	<	int i = ((int)(top & WAITER_ID_MASK)) - 1;
789	<	if (i < 0 \|\| (int)(top >>> EVENT_COUNT_SHIFT) == eventCount)
790	<	return;
791	<	ForkJoinWorkerThread w;
792	<	if (i < n && (w = ws[i]) != null &&
793	<	UNSAFE.compareAndSwapLong(this, eventWaitersOffset,
794	<	top, w.nextWaiter)) {
795	<	LockSupport.unpark(w);
796	<	top = eventWaiters;
782	>	private void awaitEvent(ForkJoinWorkerThread w, int ec) {
783	>	while (eventCount == ec) {
784	>	if (tryAccumulateStealCount(w)) { // transfer while idle
785	>	boolean untimed = (w.nextWaiter != 0L \|\|
786	>	(workerCounts & RUNNING_COUNT_MASK) <= 1);
787	>	long startTime = untimed? 0 : System.nanoTime();
788	>	Thread.interrupted(); // clear/ignore interrupt
789	>	if (eventCount != ec \|\| w.runState != 0 \|\|
790	>	runState >= TERMINATING) // recheck after clear
791	>	break;
792	>	if (untimed)
793	>	LockSupport.park(w);
794	>	else {
795	>	LockSupport.parkNanos(w, SHRINK_RATE_NANOS);
796	>	if (eventCount != ec \|\| w.runState != 0 \|\|
797	>	runState >= TERMINATING)
798	>	break;
799	>	if (System.nanoTime() - startTime >= SHRINK_RATE_NANOS)
800	>	tryShutdownUnusedWorker(ec);
801		}
781	–	else
782	–	break; // possibly stale; reread
802		}
803		}
804		}
805
806	+	// Maintaining parallelism
807	+
808		/**
809	<	* Ensures eventCount on exit is different (mod 2^32) than on
789	<	* entry and wakes up all waiters
809	>	* Pushes worker onto the spare stack
810		*/
811	<	private void signalEvent() {
812	<	int c;
813	<	do {} while (!UNSAFE.compareAndSwapInt(this, eventCountOffset,
814	<	c = eventCount, c+1));
795	<	releaseWaiters();
811	>	final void pushSpare(ForkJoinWorkerThread w) {
812	>	int ns = (++w.spareCount << SPARE_COUNT_SHIFT) \| (w.poolIndex + 1);
813	>	do {} while (!UNSAFE.compareAndSwapInt(this, spareWaitersOffset,
814	>	w.nextSpare = spareWaiters,ns));
815		}
816
817		/**
818	<	* Advances eventCount and releases waiters until interference by
819	<	* other releasing threads is detected.
818	>	* Tries (once) to resume a spare if the number of running
819	>	* threads is less than target.
820		*/
821	<	final void signalWork() {
822	<	int c;
823	<	UNSAFE.compareAndSwapInt(this, eventCountOffset, c=eventCount, c+1);
824	<	long top;
825	<	while ((top = eventWaiters) != 0L) {
826	<	int ec = eventCount;
827	<	ForkJoinWorkerThread[] ws = workers;
828	<	int n = ws.length;
829	<	for (;;) {
830	<	int i = ((int)(top & WAITER_ID_MASK)) - 1;
831	<	if (i < 0 \|\| (int)(top >>> EVENT_COUNT_SHIFT) == ec)
832	<	return;
833	<	ForkJoinWorkerThread w;
834	<	if (i < n && (w = ws[i]) != null &&
835	<	UNSAFE.compareAndSwapLong(this, eventWaitersOffset,
836	<	top, top = w.nextWaiter)) {
837	<	LockSupport.unpark(w);
838	<	if (top != eventWaiters) // let someone else take over
839	<	return;
840	<	}
822	<	else
823	<	break; // possibly stale; reread
824	<	}
821	>	private void tryResumeSpare() {
822	>	int sw, id;
823	>	ForkJoinWorkerThread[] ws = workers;
824	>	int n = ws.length;
825	>	ForkJoinWorkerThread w;
826	>	if ((sw = spareWaiters) != 0 &&
827	>	(id = (sw & SPARE_ID_MASK) - 1) >= 0 &&
828	>	id < n && (w = ws[id]) != null &&
829	>	(workerCounts & RUNNING_COUNT_MASK) < parallelism &&
830	>	spareWaiters == sw &&
831	>	UNSAFE.compareAndSwapInt(this, spareWaitersOffset,
832	>	sw, w.nextSpare)) {
833	>	int c; // increment running count before resume
834	>	do {} while (!UNSAFE.compareAndSwapInt
835	>	(this, workerCountsOffset,
836	>	c = workerCounts, c + ONE_RUNNING));
837	>	if (w.tryUnsuspend())
838	>	LockSupport.unpark(w);
839	>	else // back out if w was shutdown
840	>	decrementWorkerCounts(ONE_RUNNING, 0);
841		}
842		}
843
844		/**
845	<	* If worker is inactive, blocks until terminating or event count
846	<	* advances from last value held by worker; in any case helps
847	<	* release others.
848	<	*
849	<	* @param w the calling worker thread
834	<	* @param retries the number of scans by caller failing to find work
835	<	* @return false if now too many threads running
845	>	* Tries to increase the number of running workers if below target
846	>	* parallelism: If a spare exists tries to resume it via
847	>	* tryResumeSpare. Otherwise, if not enough total workers or all
848	>	* existing workers are busy, adds a new worker. In all cases also
849	>	* helps wake up releasable workers waiting for work.
850		*/
851	<	private boolean eventSync(ForkJoinWorkerThread w, int retries) {
852	<	int wec = w.lastEventCount;
853	<	if (retries > 1) { // can only block after 2nd miss
854	<	long nextTop = (((long)wec << EVENT_COUNT_SHIFT) \|
855	<	((long)(w.poolIndex + 1)));
856	<	long top;
857	<	while ((runState < SHUTDOWN \|\| !tryTerminate(false)) &&
858	<	(((int)(top = eventWaiters) & WAITER_ID_MASK) == 0 \|\|
859	<	(int)(top >>> EVENT_COUNT_SHIFT) == wec) &&
860	<	eventCount == wec) {
861	<	if (UNSAFE.compareAndSwapLong(this, eventWaitersOffset,
862	<	w.nextWaiter = top, nextTop)) {
863	<	accumulateStealCount(w); // transfer steals while idle
864	<	Thread.interrupted(); // clear/ignore interrupt
865	<	while (eventCount == wec)
866	<	w.doPark();
851	>	private void helpMaintainParallelism() {
852	>	int pc = parallelism;
853	>	int wc, rs, tc;
854	>	while (((wc = workerCounts) & RUNNING_COUNT_MASK) < pc &&
855	>	(rs = runState) < TERMINATING) {
856	>	if (spareWaiters != 0)
857	>	tryResumeSpare();
858	>	else if ((tc = wc >>> TOTAL_COUNT_SHIFT) >= MAX_WORKERS \|\|
859	>	(tc >= pc && (rs & ACTIVE_COUNT_MASK) != tc))
860	>	break; // enough total
861	>	else if (runState == rs && workerCounts == wc &&
862	>	UNSAFE.compareAndSwapInt(this, workerCountsOffset, wc,
863	>	wc + (ONE_RUNNING\|ONE_TOTAL))) {
864	>	ForkJoinWorkerThread w = null;
865	>	try {
866	>	w = factory.newThread(this);
867	>	} finally { // adjust on null or exceptional factory return
868	>	if (w == null) {
869	>	decrementWorkerCounts(ONE_RUNNING, ONE_TOTAL);
870	>	tryTerminate(false); // handle failure during shutdown
871	>	}
872	>	}
873	>	if (w == null)
874		break;
875	+	w.start(recordWorker(w), ueh);
876	+	if ((workerCounts >>> TOTAL_COUNT_SHIFT) >= pc) {
877	+	int c; // advance event count
878	+	UNSAFE.compareAndSwapInt(this, eventCountOffset,
879	+	c = eventCount, c+1);
880	+	break; // add at most one unless total below target
881		}
882		}
856	–	wec = eventCount;
883		}
884	<	releaseWaiters();
885	<	int wc = workerCounts;
886	<	if ((wc & RUNNING_COUNT_MASK) <= parallelism) {
887	<	w.lastEventCount = wec;
888	<	return true;
884	>	if (eventWaiters != 0L)
885	>	releaseEventWaiters();
886	>	}
887	>
888	>	/**
889	>	* Callback from the oldest waiter in awaitEvent waking up after a
890	>	* period of non-use. If all workers are idle, tries (once) to
891	>	* shutdown an event waiter or a spare, if one exists. Note that
892	>	* we don't need CAS or locks here because the method is called
893	>	* only from one thread occasionally waking (and even misfires are
894	>	* OK). Note that until the shutdown worker fully terminates,
895	>	* workerCounts will overestimate total count, which is tolerable.
896	>	*
897	>	* @param ec the event count waited on by caller (to abort
898	>	* attempt if count has since changed).
899	>	*/
900	>	private void tryShutdownUnusedWorker(int ec) {
901	>	if (runState == 0 && eventCount == ec) { // only trigger if all idle
902	>	ForkJoinWorkerThread[] ws = workers;
903	>	int n = ws.length;
904	>	ForkJoinWorkerThread w = null;
905	>	boolean shutdown = false;
906	>	int sw;
907	>	long h;
908	>	if ((sw = spareWaiters) != 0) { // prefer killing spares
909	>	int id = (sw & SPARE_ID_MASK) - 1;
910	>	if (id >= 0 && id < n && (w = ws[id]) != null &&
911	>	UNSAFE.compareAndSwapInt(this, spareWaitersOffset,
912	>	sw, w.nextSpare))
913	>	shutdown = true;
914	>	}
915	>	else if ((h = eventWaiters) != 0L) {
916	>	long nh;
917	>	int id = ((int)(h & WAITER_ID_MASK)) - 1;
918	>	if (id >= 0 && id < n && (w = ws[id]) != null &&
919	>	(nh = w.nextWaiter) != 0L && // keep at least one worker
920	>	UNSAFE.compareAndSwapLong(this, eventWaitersOffset, h, nh))
921	>	shutdown = true;
922	>	}
923	>	if (w != null && shutdown) {
924	>	w.shutdown();
925	>	LockSupport.unpark(w);
926	>	}
927		}
928	<	if (wec != w.lastEventCount) // back up if may re-wait
865	<	w.lastEventCount = wec - (wc >>> TOTAL_COUNT_SHIFT);
866	<	return false;
928	>	releaseEventWaiters(); // in case of interference
929		}
930
931		/**
932		* Callback from workers invoked upon each top-level action (i.e.,
933	<	* stealing a task or taking a submission and running
934	<	* it). Performs one or both of the following:
933	>	* stealing a task or taking a submission and running it).
934	>	* Performs one or more of the following:
935		*
936	<	* * If the worker cannot find work, updates its active status to
937	<	* inactive and updates activeCount unless there is contention, in
938	<	* which case it may try again (either in this or a subsequent
939	<	* call). Additionally, awaits the next task event and/or helps
940	<	* wake up other releasable waiters.
941	<	*
942	<	* * If there are too many running threads, suspends this worker
943	<	* (first forcing inactivation if necessary). If it is not
944	<	* resumed before a keepAlive elapses, the worker may be "trimmed"
945	<	* -- killed while suspended within suspendAsSpare. Otherwise,
946	<	* upon resume it rechecks to make sure that it is still needed.
936	>	* 1. If the worker is active and either did not run a task
937	>	* or there are too many workers, try to set its active status
938	>	* to inactive and update activeCount. On contention, we may
939	>	* try again in this or a subsequent call.
940	>	*
941	>	* 2. If not enough total workers, help create some.
942	>	*
943	>	* 3. If there are too many running workers, suspend this worker
944	>	* (first forcing inactive if necessary). If it is not needed,
945	>	* it may be shutdown while suspended (via
946	>	* tryShutdownUnusedWorker). Otherwise, upon resume it
947	>	* rechecks running thread count and need for event sync.
948	>	*
949	>	* 4. If worker did not run a task, await the next task event via
950	>	* eventSync if necessary (first forcing inactivation), upon
951	>	* which the worker may be shutdown via
952	>	* tryShutdownUnusedWorker. Otherwise, help release any
953	>	* existing event waiters that are now releasable.
954		*
955		* @param w the worker
956	<	* @param retries the number of scans by caller failing to find work
888	<	* find any (in which case it may block waiting for work).
956	>	* @param ran true if worker ran a task since last call to this method
957		*/
958	<	final void preStep(ForkJoinWorkerThread w, int retries) {
958	>	final void preStep(ForkJoinWorkerThread w, boolean ran) {
959	>	int wec = w.lastEventCount;
960		boolean active = w.active;
961	<	boolean inactivate = active && retries != 0;
962	<	for (;;) {
963	<	int rs, wc;
964	<	if (inactivate &&
965	<	UNSAFE.compareAndSwapInt(this, runStateOffset,
966	<	rs = runState, rs - ONE_ACTIVE))
961	>	boolean inactivate = false;
962	>	int pc = parallelism;
963	>	int rs;
964	>	while (w.runState == 0 && (rs = runState) < TERMINATING) {
965	>	if ((inactivate \|\| (active && (rs & ACTIVE_COUNT_MASK) >= pc)) &&
966	>	UNSAFE.compareAndSwapInt(this, runStateOffset, rs, rs - 1))
967		inactivate = active = w.active = false;
968	<	if (((wc = workerCounts) & RUNNING_COUNT_MASK) <= parallelism) {
969	<	if (active \|\| eventSync(w, retries))
968	>	int wc = workerCounts;
969	>	if ((wc & RUNNING_COUNT_MASK) > pc) {
970	>	if (!(inactivate \|= active) && // must inactivate to suspend
971	>	workerCounts == wc && // try to suspend as spare
972	>	UNSAFE.compareAndSwapInt(this, workerCountsOffset,
973	>	wc, wc - ONE_RUNNING))
974	>	w.suspendAsSpare();
975	>	}
976	>	else if ((wc >>> TOTAL_COUNT_SHIFT) < pc)
977	>	helpMaintainParallelism(); // not enough workers
978	>	else if (!ran) {
979	>	long h = eventWaiters;
980	>	int ec = eventCount;
981	>	if (h != 0L && (int)(h >>> EVENT_COUNT_SHIFT) != ec)
982	>	releaseEventWaiters(); // release others before waiting
983	>	else if (ec != wec) {
984	>	w.lastEventCount = ec; // no need to wait
985		break;
986	+	}
987	+	else if (!(inactivate \|= active))
988	+	eventSync(w, wec); // must inactivate before sync
989		}
990	<	else if (!(inactivate \|= active) && // must inactivate to suspend
904	<	UNSAFE.compareAndSwapInt(this, workerCountsOffset,
905	<	wc, wc - ONE_RUNNING) &&
906	<	!w.suspendAsSpare()) // false if trimmed
990	>	else
991		break;
992		}
993		}
994
995		/**
996	<	* Awaits join of the given task if enough threads, or can resume
997	<	* or create a spare. Fails (in which case the given task might
914	<	* not be done) upon contention or lack of decision about
915	<	* blocking. Returns void because caller must check
916	<	* task status on return anyway.
917	<	*
918	<	* We allow blocking if:
919	<	*
920	<	* 1. There would still be at least as many running threads as
921	<	* parallelism level if this thread blocks.
922	<	*
923	<	* 2. A spare is resumed to replace this worker. We tolerate
924	<	* slop in the decision to replace if a spare is found without
925	<	* first decrementing run count. This may release too many,
926	<	* but if so, the superfluous ones will re-suspend via
927	<	* preStep().
928	<	*
929	<	* 3. After #spares repeated checks, there are no fewer than #spare
930	<	* threads not running. We allow this slack to avoid hysteresis
931	<	* and as a hedge against lag/uncertainty of running count
932	<	* estimates when signalling or unblocking stalls.
933	<	*
934	<	* 4. All existing workers are busy (as rechecked via repeated
935	<	* retries by caller) and a new spare is created.
936	<	*
937	<	* If none of the above hold, we try to escape out by
938	<	* re-incrementing count and returning to caller, which can retry
939	<	* later.
996	>	* Helps and/or blocks awaiting join of the given task.
997	>	* See above for explanation.
998		*
999		* @param joinMe the task to join
1000	<	* @param retries if negative, then serve only as a precheck
943	<	* that the thread can be replaced by a spare. Otherwise,
944	<	* the number of repeated calls to this method returning busy
945	<	* @return true if the call must be retried because there
946	<	* none of the blocking checks hold
1000	>	* @param worker the current worker thread
1001		*/
1002	<	final boolean tryAwaitJoin(ForkJoinTask<?> joinMe, int retries) {
1003	<	if (joinMe.status < 0) // precheck for cancellation
1004	<	return false;
1005	<	if ((runState & TERMINATING) != 0) { // shutting down
1006	<	joinMe.cancelIgnoringExceptions();
1007	<	return false;
1008	<	}
1009	<
1010	<	int pc = parallelism;
1011	<	boolean running = true; // false when running count decremented
1012	<	outer:for (;;) {
1013	<	int wc = workerCounts;
1014	<	int rc = wc & RUNNING_COUNT_MASK;
1015	<	int tc = wc >>> TOTAL_COUNT_SHIFT;
1016	<	if (running) { // replace with spare or decrement count
1017	<	if (rc <= pc && tc > pc &&
1018	<	(retries > 0 \|\| tc > (runState & ACTIVE_COUNT_MASK))) {
1019	<	ForkJoinWorkerThread[] ws = workers;
1020	<	int nws = ws.length;
1021	<	for (int i = 0; i < nws; ++i) { // search for spare
1022	<	ForkJoinWorkerThread w = ws[i];
1023	<	if (w != null) {
1024	<	if (joinMe.status < 0)
1025	<	return false;
1026	<	if (w.isSuspended()) {
1027	<	if ((workerCounts & RUNNING_COUNT_MASK)>=pc &&
1028	<	w.tryResumeSpare()) {
975	<	running = false;
976	<	break outer;
977	<	}
978	<	continue outer; // rescan
979	<	}
980	<	}
981	<	}
982	<	}
983	<	if (retries < 0 \|\| // < 0 means replacement check only
984	<	rc == 0 \|\| joinMe.status < 0 \|\| workerCounts != wc \|\|
985	<	!UNSAFE.compareAndSwapInt(this, workerCountsOffset,
986	<	wc, wc - ONE_RUNNING))
987	<	return false; // done or inconsistent or contended
988	<	running = false;
989	<	if (rc > pc)
990	<	break;
991	<	}
992	<	else { // allow blocking if enough threads
993	<	if (rc >= pc \|\| joinMe.status < 0)
994	<	break;
995	<	int sc = tc - pc + 1; // = spare threads, plus the one to add
996	<	if (retries > sc) {
997	<	if (rc > 0 && rc >= pc - sc) // allow slack
998	<	break;
999	<	if (tc < MAX_THREADS &&
1000	<	tc == (runState & ACTIVE_COUNT_MASK) &&
1001	<	workerCounts == wc &&
1002	<	UNSAFE.compareAndSwapInt(this, workerCountsOffset, wc,
1003	<	wc+(ONE_RUNNING\|ONE_TOTAL))) {
1004	<	addWorker();
1005	<	break;
1006	<	}
1007	<	}
1008	<	if (workerCounts == wc && // back out to allow rescan
1009	<	UNSAFE.compareAndSwapInt (this, workerCountsOffset,
1010	<	wc, wc + ONE_RUNNING)) {
1011	<	releaseWaiters(); // help others progress
1012	<	return true; // let caller retry
1013	<	}
1002	>	final void awaitJoin(ForkJoinTask<?> joinMe, ForkJoinWorkerThread worker) {
1003	>	int retries = 2 + (parallelism >> 2); // #helpJoins before blocking
1004	>	while (joinMe.status >= 0) {
1005	>	int wc;
1006	>	worker.helpJoinTask(joinMe);
1007	>	if (joinMe.status < 0)
1008	>	break;
1009	>	else if (retries > 0)
1010	>	--retries;
1011	>	else if (((wc = workerCounts) & RUNNING_COUNT_MASK) != 0 &&
1012	>	UNSAFE.compareAndSwapInt(this, workerCountsOffset,
1013	>	wc, wc - ONE_RUNNING)) {
1014	>	int stat, c; long h;
1015	>	while ((stat = joinMe.status) >= 0 &&
1016	>	(h = eventWaiters) != 0L && // help release others
1017	>	(int)(h >>> EVENT_COUNT_SHIFT) != eventCount)
1018	>	releaseEventWaiters();
1019	>	if (stat >= 0 &&
1020	>	((workerCounts & RUNNING_COUNT_MASK) == 0 \|\|
1021	>	(stat =
1022	>	joinMe.internalAwaitDone(JOIN_TIMEOUT_MILLIS)) >= 0))
1023	>	helpMaintainParallelism(); // timeout or no running workers
1024	>	do {} while (!UNSAFE.compareAndSwapInt
1025	>	(this, workerCountsOffset,
1026	>	c = workerCounts, c + ONE_RUNNING));
1027	>	if (stat < 0)
1028	>	break; // else restart
1029		}
1030		}
1016	–	// arrive here if can block
1017	–	joinMe.internalAwaitDone();
1018	–	int c; // to inline incrementRunningCount
1019	–	do {} while (!UNSAFE.compareAndSwapInt
1020	–	(this, workerCountsOffset,
1021	–	c = workerCounts, c + ONE_RUNNING));
1022	–	return false;
1031		}
1032
1033		/**
1034	<	* Same idea as (and shares many code snippets with) tryAwaitJoin,
1027	<	* but self-contained because there are no caller retries.
1028	<	* TODO: Rework to use simpler API.
1034	>	* Same idea as awaitJoin, but no helping, retries, or timeouts.
1035		*/
1036		final void awaitBlocker(ManagedBlocker blocker)
1037		throws InterruptedException {
1038	<	boolean done;
1033	<	if (done = blocker.isReleasable())
1034	<	return;
1035	<	int pc = parallelism;
1036	<	int retries = 0;
1037	<	boolean running = true; // false when running count decremented
1038	<	outer:for (;;) {
1038	>	while (!blocker.isReleasable()) {
1039		int wc = workerCounts;
1040	<	int rc = wc & RUNNING_COUNT_MASK;
1041	<	int tc = wc >>> TOTAL_COUNT_SHIFT;
1042	<	if (running) {
1043	<	if (rc <= pc && tc > pc &&
1044	<	(retries > 0 \|\| tc > (runState & ACTIVE_COUNT_MASK))) {
1045	<	ForkJoinWorkerThread[] ws = workers;
1046	<	int nws = ws.length;
1047	<	for (int i = 0; i < nws; ++i) {
1048	<	ForkJoinWorkerThread w = ws[i];
1049	<	if (w != null) {
1050	<	if (done = blocker.isReleasable())
1051	<	return;
1052	<	if (w.isSuspended()) {
1053	<	if ((workerCounts & RUNNING_COUNT_MASK)>=pc &&
1054	<	w.tryResumeSpare()) {
1055	<	running = false;
1056	<	break outer;
1057	<	}
1058	<	continue outer; // rescan
1059	<	}
1060	<	}
1061	<	}
1062	<	}
1063	<	if (done = blocker.isReleasable())
1064	<	return;
1065	<	if (rc == 0 \|\| workerCounts != wc \|\|
1066	<	!UNSAFE.compareAndSwapInt(this, workerCountsOffset,
1067	<	wc, wc - ONE_RUNNING))
1068	<	continue;
1069	<	running = false;
1070	<	if (rc > pc)
1071	<	break;
1072	<	}
1073	<	else {
1074	<	if (rc >= pc \|\| (done = blocker.isReleasable()))
1075	<	break;
1076	<	int sc = tc - pc + 1;
1077	<	if (retries++ > sc) {
1078	<	if (rc > 0 && rc >= pc - sc)
1079	<	break;
1080	<	if (tc < MAX_THREADS &&
1081	<	tc == (runState & ACTIVE_COUNT_MASK) &&
1082	<	workerCounts == wc &&
1083	<	UNSAFE.compareAndSwapInt(this, workerCountsOffset, wc,
1084	<	wc+(ONE_RUNNING\|ONE_TOTAL))) {
1085	<	addWorker();
1086	<	break;
1040	>	if ((wc & RUNNING_COUNT_MASK) != 0 &&
1041	>	UNSAFE.compareAndSwapInt(this, workerCountsOffset,
1042	>	wc, wc - ONE_RUNNING)) {
1043	>	try {
1044	>	while (!blocker.isReleasable()) {
1045	>	long h = eventWaiters;
1046	>	if (h != 0L &&
1047	>	(int)(h >>> EVENT_COUNT_SHIFT) != eventCount)
1048	>	releaseEventWaiters();
1049	>	else if ((workerCounts & RUNNING_COUNT_MASK) == 0 &&
1050	>	runState < TERMINATING)
1051	>	helpMaintainParallelism();
1052	>	else if (blocker.block())
1053	>	break;
1054		}
1055	+	} finally {
1056	+	int c;
1057	+	do {} while (!UNSAFE.compareAndSwapInt
1058	+	(this, workerCountsOffset,
1059	+	c = workerCounts, c + ONE_RUNNING));
1060		}
1061	<	Thread.yield();
1090	<	}
1091	<	}
1092	<
1093	<	try {
1094	<	if (!done)
1095	<	do {} while (!blocker.isReleasable() && !blocker.block());
1096	<	} finally {
1097	<	if (!running) {
1098	<	int c;
1099	<	do {} while (!UNSAFE.compareAndSwapInt
1100	<	(this, workerCountsOffset,
1101	<	c = workerCounts, c + ONE_RUNNING));
1061	>	break;
1062		}
1063		}
1064		}
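The new `awaitBlocker` reduces to one pattern: give up one unit of running count with a CAS, block, and restore the count in a `finally` clause. The following is a minimal sketch of that pattern only, not the pool's code. The class `RunningCountSketch`, the `Blocker` interface, and the 16-bit packing of the running count are assumptions made for illustration, and the helping steps (`releaseEventWaiters`, `helpMaintainParallelism`) are left out.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class RunningCountSketch {
    // Assumed packing: running count in the low 16 bits, total count above it.
    static final int RUNNING_COUNT_MASK = (1 << 16) - 1;
    static final int ONE_RUNNING = 1;

    final AtomicInteger workerCounts;

    RunningCountSketch(int total, int running) {
        workerCounts = new AtomicInteger((total << 16) | running);
    }

    int running() { return workerCounts.get() & RUNNING_COUNT_MASK; }

    /** Hypothetical stand-in for ManagedBlocker (no checked exceptions). */
    interface Blocker {
        boolean isReleasable();
        boolean block();
    }

    void awaitBlocker(Blocker b) {
        while (!b.isReleasable()) {
            int wc = workerCounts.get();
            // Only block if the running count is nonzero and our CAS wins;
            // otherwise loop and recheck isReleasable.
            if ((wc & RUNNING_COUNT_MASK) != 0 &&
                workerCounts.compareAndSet(wc, wc - ONE_RUNNING)) {
                try {
                    while (!b.isReleasable()) {
                        if (b.block())
                            break;
                    }
                } finally {
                    // Restore the running count even if block() throws.
                    workerCounts.addAndGet(ONE_RUNNING);
                }
                break;
            }
        }
    }
}
```

While the caller is blocked, `running()` reads one lower than before, which is what lets other threads notice that parallelism has dropped below target.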
#	Line 1131 \| Line 1091 \| public class ForkJoinPool extends Abstra
1091
1092		/**
1093		* Actions on transition to TERMINATING
1094	+	*
1095	+	* Runs up to four passes through workers: (0) shutting down each
1096	+	* (without waking up if parked) to quickly spread notifications
1097	+	* without unnecessary bouncing around event queues etc (1) wake
1098	+	* up and help cancel tasks (2) interrupt (3) mop up races with
1099	+	* interrupted workers
1100		*/
1101		private void startTerminating() {
1102	<	for (int i = 0; i < 2; ++i) { // twice to mop up newly created workers
1103	<	cancelSubmissions();
1104	<	shutdownWorkers();
1105	<	cancelWorkerTasks();
1106	<	signalEvent();
1107	<	interruptWorkers();
1102	>	cancelSubmissions();
1103	>	for (int passes = 0; passes < 4 && workerCounts != 0; ++passes) {
1104	>	int c; // advance event count
1105	>	UNSAFE.compareAndSwapInt(this, eventCountOffset,
1106	>	c = eventCount, c+1);
1107	>	eventWaiters = 0L; // clobber lists
1108	>	spareWaiters = 0;
1109	>	ForkJoinWorkerThread[] ws = workers;
1110	>	int n = ws.length;
1111	>	for (int i = 0; i < n; ++i) {
1112	>	ForkJoinWorkerThread w = ws[i];
1113	>	if (w != null) {
1114	>	w.shutdown();
1115	>	if (passes > 0 && !w.isTerminated()) {
1116	>	w.cancelTasks();
1117	>	LockSupport.unpark(w);
1118	>	if (passes > 1) {
1119	>	try {
1120	>	w.interrupt();
1121	>	} catch (SecurityException ignore) {
1122	>	}
1123	>	}
1124	>	}
1125	>	}
1126	>	}
1127		}
1128		}
1129
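The escalation described in the new `startTerminating` comment can be traced with a small model. This is a sketch under stated assumptions: `Worker` here is a hypothetical stand-in that only logs the actions applied to it, and the early exit on `workerCounts == 0` and the clobbering of the waiter lists are omitted.

```java
import java.util.ArrayList;
import java.util.List;

final class TerminationPassesSketch {
    /** Hypothetical worker that records what was done to it. */
    static final class Worker {
        final List<String> log = new ArrayList<String>();
        boolean terminated;
    }

    static void startTerminating(Worker[] ws) {
        for (int passes = 0; passes < 4; ++passes) {
            for (Worker w : ws) {
                if (w == null)
                    continue;
                w.log.add("shutdown");              // every pass, including pass 0
                if (passes > 0 && !w.terminated) {
                    w.log.add("cancel+unpark");     // passes 1 to 3
                    if (passes > 1)
                        w.log.add("interrupt");     // passes 2 and 3
                }
            }
        }
    }
}
```

In this model a worker that never terminates is shut down four times, woken three times, and interrupted twice, while an already terminated worker only sees the four cheap shutdown calls.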
#	Line 1155 \| Line 1140 \| public class ForkJoinPool extends Abstra
1140		}
1141		}
1142
1158	–	/**
1159	–	* Sets all worker run states to at least shutdown,
1160	–	* also resuming suspended workers
1161	–	*/
1162	–	private void shutdownWorkers() {
1163	–	ForkJoinWorkerThread[] ws = workers;
1164	–	int nws = ws.length;
1165	–	for (int i = 0; i < nws; ++i) {
1166	–	ForkJoinWorkerThread w = ws[i];
1167	–	if (w != null)
1168	–	w.shutdown();
1169	–	}
1170	–	}
1171	–
1172	–	/**
1173	–	* Clears out and cancels all locally queued tasks
1174	–	*/
1175	–	private void cancelWorkerTasks() {
1176	–	ForkJoinWorkerThread[] ws = workers;
1177	–	int nws = ws.length;
1178	–	for (int i = 0; i < nws; ++i) {
1179	–	ForkJoinWorkerThread w = ws[i];
1180	–	if (w != null)
1181	–	w.cancelTasks();
1182	–	}
1183	–	}
1184	–
1185	–	/**
1186	–	* Unsticks all workers blocked on joins etc
1187	–	*/
1188	–	private void interruptWorkers() {
1189	–	ForkJoinWorkerThread[] ws = workers;
1190	–	int nws = ws.length;
1191	–	for (int i = 0; i < nws; ++i) {
1192	–	ForkJoinWorkerThread w = ws[i];
1193	–	if (w != null && !w.isTerminated()) {
1194	–	try {
1195	–	w.interrupt();
1196	–	} catch (SecurityException ignore) {
1197	–	}
1198	–	}
1199	–	}
1200	–	}
1201	–
1143		// misc support for ForkJoinWorkerThread
1144
1145		/**
#	Line 1209 \| Line 1150 \| public class ForkJoinPool extends Abstra
1150		}
1151
1152		/**
1153	<	* Accumulates steal count from a worker, clearing
1154	<	* the worker's value
1153	>	* Tries to accumulate steal count from a worker, clearing
1154	>	* the worker's value.
1155	>	*
1156	>	* @return true if worker steal count now zero
1157		*/
1158	<	final void accumulateStealCount(ForkJoinWorkerThread w) {
1158	>	final boolean tryAccumulateStealCount(ForkJoinWorkerThread w) {
1159		int sc = w.stealCount;
1160	<	if (sc != 0) {
1161	<	long c;
1162	<	w.stealCount = 0;
1163	<	do {} while (!UNSAFE.compareAndSwapLong(this, stealCountOffset,
1164	<	c = stealCount, c + sc));
1160	>	long c = stealCount;
1161	>	// CAS even if zero, for fence effects
1162	>	if (UNSAFE.compareAndSwapLong(this, stealCountOffset, c, c + sc)) {
1163	>	if (sc != 0)
1164	>	w.stealCount = 0;
1165	>	return true;
1166		}
1167	+	return sc == 0;
1168		}
1169
1170		/**
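The change from `accumulateStealCount` to `tryAccumulateStealCount` replaces a retry loop with a single CAS attempt whose result the caller checks. A minimal sketch of that contract, using `AtomicLong` in place of `UNSAFE` and a hypothetical `Worker` holder class:

```java
import java.util.concurrent.atomic.AtomicLong;

final class StealCountSketch {
    /** Hypothetical stand-in for the per-worker counter. */
    static final class Worker { int stealCount; }

    final AtomicLong total = new AtomicLong();

    /** Makes one CAS attempt; returns true if the worker's count is now zero. */
    boolean tryAccumulate(Worker w) {
        int sc = w.stealCount;
        long c = total.get();
        // The CAS is attempted even when sc == 0, mirroring the
        // "for fence effects" comment in the diff.
        if (total.compareAndSet(c, c + sc)) {
            if (sc != 0)
                w.stealCount = 0;
            return true;
        }
        return sc == 0; // a failed CAS is harmless if there was nothing to add
    }
}
```

A caller such as `awaitEvent` can simply loop until this returns true before parking, so the transfer happens only while the worker is otherwise idle.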
#	Line 1228 \| Line 1173 \| public class ForkJoinPool extends Abstra
1173		*/
1174		final int idlePerActive() {
1175		int pc = parallelism; // use parallelism, not rc
1176	<	int ac = runState; // no mask -- artifically boosts during shutdown
1176	>	int ac = runState; // no mask -- artificially boosts during shutdown
1177		// Use exact results for small values, saturate past 4
1178		return pc <= ac? 0 : pc >>> 1 <= ac? 1 : pc >>> 2 <= ac? 3 : pc >>> 3;
1179		}
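The nested conditional in `idlePerActive` is easier to read with concrete numbers. The sketch below copies the expression into a standalone method; taking the active count as a parameter (instead of reading the unmasked `runState`) is an assumption made so the values can be checked in isolation.

```java
final class IdlePerActiveSketch {
    /** Same expression as in the diff, with ac passed in explicitly. */
    static int idlePerActive(int pc, int ac) {
        return pc <= ac ? 0 : pc >>> 1 <= ac ? 1 : pc >>> 2 <= ac ? 3 : pc >>> 3;
    }
}
```

With `pc = 64` the result is 0 when all 64 are active, 1 at 32 active, 3 at 16 active, and saturates at `64 >>> 3 = 8` below that. For small pools the saturated branch can be smaller than the exact ones: `idlePerActive(8, 1)` is 1, while `idlePerActive(8, 2)` is 3.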
#	Line 1302 \| Line 1247 \| public class ForkJoinPool extends Abstra
1247		checkPermission();
1248		if (factory == null)
1249		throw new NullPointerException();
1250	<	if (parallelism <= 0 \|\| parallelism > MAX_THREADS)
1250	>	if (parallelism <= 0 \|\| parallelism > MAX_WORKERS)
1251		throw new IllegalArgumentException();
1252		this.parallelism = parallelism;
1253		this.factory = factory;
#	Line 1321 \| Line 1266 \| public class ForkJoinPool extends Abstra
1266		* @param pc the initial parallelism level
1267		*/
1268		private static int initialArraySizeFor(int pc) {
1269	<	// See Hackers Delight, sec 3.2. We know MAX_THREADS < (1 >>> 16)
1270	<	int size = pc < MAX_THREADS ? pc + 1 : MAX_THREADS;
1269	>	// If possible, initially allocate enough space for one spare
1270	>	int size = pc < MAX_WORKERS ? pc + 1 : MAX_WORKERS;
1271	>	// See Hackers Delight, sec 3.2. We know MAX_WORKERS < (1 << 16)
1272		size \|= size >>> 1;
1273		size \|= size >>> 2;
1274		size \|= size >>> 4;
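The hunk above cuts off partway through the bit-smearing sequence. The sketch below shows the usual form of that technique (round up to a power of two by OR-ing in right shifts, then adding one). The value of `MAX_WORKERS`, the final `>>> 8` step, and the `size + 1` return are assumptions, since the tail of the method is not visible in this diff.

```java
final class ArraySizeSketch {
    static final int MAX_WORKERS = 0x7fff; // assumed value, below 1 << 16

    static int initialArraySizeFor(int pc) {
        // Leave room for one spare where possible.
        int size = pc < MAX_WORKERS ? pc + 1 : MAX_WORKERS;
        // Smear the highest set bit into every lower position.
        size |= size >>> 1;
        size |= size >>> 2;
        size |= size >>> 4;
        size |= size >>> 8; // enough for values below 1 << 16
        return size + 1;    // all-ones plus one is a power of two
    }
}
```

Under these assumptions a parallelism of 4 yields an array of 8, and a parallelism of 7 or 15 yields 16 or 32, so the result is always strictly greater than `pc + 1`.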
#	Line 1341 \| Line 1287 \| public class ForkJoinPool extends Abstra
1287		if (runState >= SHUTDOWN)
1288		throw new RejectedExecutionException();
1289		submissionQueue.offer(task);
1290	<	signalEvent();
1291	<	ensureEnoughWorkers();
1290	>	int c; // try to increment event count -- CAS failure OK
1291	>	UNSAFE.compareAndSwapInt(this, eventCountOffset, c = eventCount, c+1);
1292	>	helpMaintainParallelism(); // create, start, or resume some workers
1293		}
1294
1295		/**
1296		* Performs the given task, returning its result upon completion.
1350	–	* If the caller is already engaged in a fork/join computation in
1351	–	* the current pool, this method is equivalent in effect to
1352	–	* {@link ForkJoinTask#invoke}.
1297		*
1298		* @param task the task
1299		* @return the task's result
#	Line 1364 \| Line 1308 \| public class ForkJoinPool extends Abstra
1308
1309		/**
1310		* Arranges for (asynchronous) execution of the given task.
1367	–	* If the caller is already engaged in a fork/join computation in
1368	–	* the current pool, this method is equivalent in effect to
1369	–	* {@link ForkJoinTask#fork}.
1311		*
1312		* @param task the task
1313		* @throws NullPointerException if the task is null
#	Line 1395 \| Line 1336 \| public class ForkJoinPool extends Abstra
1336
1337		/**
1338		* Submits a ForkJoinTask for execution.
1398	–	* If the caller is already engaged in a fork/join computation in
1399	–	* the current pool, this method is equivalent in effect to
1400	–	* {@link ForkJoinTask#fork}.
1339		*
1340		* @param task the task to submit
1341		* @return the task
#	Line 1589 \| Line 1527 \| public class ForkJoinPool extends Abstra
1527		public long getQueuedTaskCount() {
1528		long count = 0;
1529		ForkJoinWorkerThread[] ws = workers;
1530	<	int nws = ws.length;
1531	<	for (int i = 0; i < nws; ++i) {
1530	>	int n = ws.length;
1531	>	for (int i = 0; i < n; ++i) {
1532		ForkJoinWorkerThread w = ws[i];
1533		if (w != null)
1534		count += w.getQueueSize();
#	Line 1648 \| Line 1586 \| public class ForkJoinPool extends Abstra
1586		* @return the number of elements transferred
1587		*/
1588		protected int drainTasksTo(Collection<? super ForkJoinTask<?>> c) {
1589	<	int n = submissionQueue.drainTo(c);
1652	<	ForkJoinWorkerThread[] ws = workers;
1653	<	int nws = ws.length;
1654	<	for (int i = 0; i < nws; ++i) {
1655	<	ForkJoinWorkerThread w = ws[i];
1656	<	if (w != null)
1657	<	n += w.drainTasksTo(c);
1658	<	}
1659	<	return n;
1660	<	}
1661	<
1662	<	/**
1663	<	* Returns count of total parks by existing workers.
1664	<	* Used during development only since not meaningful to users.
1665	<	*/
1666	<	private int collectParkCount() {
1667	<	int count = 0;
1589	>	int count = submissionQueue.drainTo(c);
1590		ForkJoinWorkerThread[] ws = workers;
1591	<	int nws = ws.length;
1592	<	for (int i = 0; i < nws; ++i) {
1591	>	int n = ws.length;
1592	>	for (int i = 0; i < n; ++i) {
1593		ForkJoinWorkerThread w = ws[i];
1594		if (w != null)
1595	<	count += w.parkCount;
1595	>	count += w.drainTasksTo(c);
1596		}
1597		return count;
1598		}
#	Line 1692 \| Line 1614 \| public class ForkJoinPool extends Abstra
1614		int pc = parallelism;
1615		int rs = runState;
1616		int ac = rs & ACTIVE_COUNT_MASK;
1695	–	// int pk = collectParkCount();
1617		return super.toString() +
1618		"[" + runLevelToString(rs) +
1619		", parallelism = " + pc +
#	Line 1702 \| Line 1623 \| public class ForkJoinPool extends Abstra
1623		", steals = " + st +
1624		", tasks = " + qt +
1625		", submissions = " + qs +
1705	–	// ", parks = " + pk +
1626		"]";
1627		}
1628
#	Line 1800 \| Line 1720 \| public class ForkJoinPool extends Abstra
1720		throws InterruptedException {
1721		try {
1722		return termination.awaitAdvanceInterruptibly(0, timeout, unit) > 0;
1723	<	} catch(TimeoutException ex) {
1723	>	} catch (TimeoutException ex) {
1724		return false;
1725		}
1726		}
#	Line 1809 \| Line 1729 \| public class ForkJoinPool extends Abstra
1729		* Interface for extending managed parallelism for tasks running
1730		* in {@link ForkJoinPool}s.
1731		*
1732	<	* <p>A {@code ManagedBlocker} provides two methods.
1733	<	* Method {@code isReleasable} must return {@code true} if
1734	<	* blocking is not necessary. Method {@code block} blocks the
1735	<	* current thread if necessary (perhaps internally invoking
1736	<	* {@code isReleasable} before actually blocking).
1732	>	* <p>A {@code ManagedBlocker} provides two methods. Method
1733	>	* {@code isReleasable} must return {@code true} if blocking is
1734	>	* not necessary. Method {@code block} blocks the current thread
1735	>	* if necessary (perhaps internally invoking {@code isReleasable}
1736	>	* before actually blocking). The unusual methods in this API
1737	>	* accommodate synchronizers that may, but don't usually, block
1738	>	* for long periods. Similarly, they allow more efficient internal
1739	>	* handling of cases in which additional workers may be, but
1740	>	* usually are not, needed to ensure sufficient parallelism.
1741	>	* Toward this end, implementations of method {@code isReleasable}
1742	>	* must be amenable to repeated invocation.
1743		*
1744		* <p>For example, here is a ManagedBlocker based on a
1745		* ReentrantLock:
#	Line 1831 \| Line 1757 \| public class ForkJoinPool extends Abstra
1757		* return hasLock || (hasLock = lock.tryLock());
1758		* }
1759		* }}</pre>
1760	+	*
1761	+	* <p>Here is a class that possibly blocks waiting for an
1762	+	* item on a given queue:
1763	+	* <pre> {@code
1764	+	* class QueueTaker<E> implements ManagedBlocker {
1765	+	* final BlockingQueue<E> queue;
1766	+	* volatile E item = null;
1767	+	* QueueTaker(BlockingQueue<E> q) { this.queue = q; }
1768	+	* public boolean block() throws InterruptedException {
1769	+	* if (item == null)
1770	+	* item = queue.take();
1771	+	* return true;
1772	+	* }
1773	+	* public boolean isReleasable() {
1774	+	* return item != null || (item = queue.poll()) != null;
1775	+	* }
1776	+	* public E getItem() { // call after pool.managedBlock completes
1777	+	* return item;
1778	+	* }
1779	+	* }}</pre>
1780		*/
1781		public static interface ManagedBlocker {
1782		/**
#	Line 1873 \| Line 1819 \| public class ForkJoinPool extends Abstra
1819		public static void managedBlock(ManagedBlocker blocker)
1820		throws InterruptedException {
1821		Thread t = Thread.currentThread();
1822	<	if (t instanceof ForkJoinWorkerThread)
1823	<	((ForkJoinWorkerThread) t).pool.awaitBlocker(blocker);
1822	>	if (t instanceof ForkJoinWorkerThread) {
1823	>	ForkJoinWorkerThread w = (ForkJoinWorkerThread) t;
1824	>	w.pool.awaitBlocker(blocker);
1825	>	}
1826		else {
1827		do {} while (!blocker.isReleasable() && !blocker.block());
1828		}
#	Line 1905 \| Line 1853 \| public class ForkJoinPool extends Abstra
1853		objectFieldOffset("eventWaiters",ForkJoinPool.class);
1854		private static final long stealCountOffset =
1855		objectFieldOffset("stealCount",ForkJoinPool.class);
1856	+	private static final long spareWaitersOffset =
1857	+	objectFieldOffset("spareWaiters",ForkJoinPool.class);
1858
1859		private static long objectFieldOffset(String field, Class<?> klazz) {
1860		try {
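
Usage note for the ManagedBlocker changes above. The following is a minimal sketch, not part of either revision, of how the QueueTaker example from the new Javadoc might be driven through managedBlock. It assumes the java.util.concurrent versions of these classes (Java 8 or later) rather than the jsr166y package shown in the diff. The names ManagedBlockDemo, takeManaged and runDemo are hypothetical and exist only for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinPool.ManagedBlocker;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RecursiveTask;

// Hypothetical demo class; only QueueTaker mirrors code from the diff.
class ManagedBlockDemo {

    /** The QueueTaker from the Javadoc in the diff, same logic. */
    static final class QueueTaker<E> implements ManagedBlocker {
        final BlockingQueue<E> queue;
        volatile E item = null;
        QueueTaker(BlockingQueue<E> q) { this.queue = q; }
        public boolean block() throws InterruptedException {
            if (item == null)
                item = queue.take();   // may block until an item arrives
            return true;               // no further blocking is needed
        }
        public boolean isReleasable() {
            // Safe to call repeatedly: once item is set, poll() is skipped.
            return item != null || (item = queue.poll()) != null;
        }
        public E getItem() { // call after managedBlock completes
            return item;
        }
    }

    /** Hypothetical helper: take one item, letting the pool manage the block. */
    static <E> E takeManaged(BlockingQueue<E> queue) throws InterruptedException {
        QueueTaker<E> taker = new QueueTaker<>(queue);
        ForkJoinPool.managedBlock(taker);
        return taker.getItem();
    }

    /** Runs the helper once from a pool task and once from the calling thread. */
    static String runDemo() throws Exception {
        final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        ForkJoinPool pool = new ForkJoinPool(1);
        try {
            // Producer delivers the first item slightly later, so the
            // task below usually has to wait for it.
            Thread producer = new Thread(() -> {
                try {
                    Thread.sleep(50);
                    queue.put("hello");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            producer.start();

            // Called from a task submitted to the pool.
            String fromTask = pool.invoke(new RecursiveTask<String>() {
                @Override protected String compute() {
                    try {
                        return takeManaged(queue);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return null;
                    }
                }
            });
            producer.join();

            // Called from an ordinary thread; the item is already queued,
            // so isReleasable() succeeds without blocking.
            queue.put("world");
            String fromCaller = takeManaged(queue);

            return fromTask + " " + fromCaller;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runDemo());
    }
}
```

The same helper works whether or not the caller is a ForkJoinWorkerThread, which is the point of the instanceof branch in managedBlock: worker threads hand the blocker to their pool, while other threads simply loop on isReleasable and block.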

Diff Legend

–	Removed lines
+	Added lines
<	Changed lines
>	Changed lines
