161 |
|
* re-activate a spare thread to compensate for blocked |
162 |
|
* joiners until they unblock. |
163 |
|
* |
164 |
< |
* Because determining the existence of conservatively safe |
165 |
< |
* helping targets, the availability of already-created spares, |
166 |
< |
* and the apparent need to create new spares are all racy and |
167 |
< |
* require heuristic guidance, we rely on multiple retries of |
168 |
< |
* each. Further, because it is impossible to keep exactly the |
169 |
< |
* target (parallelism) number of threads running at any given |
170 |
< |
* time, we allow compensation during joins to fail, and enlist |
171 |
< |
* all other threads to help out whenever they are not otherwise |
172 |
< |
* occupied (i.e., mainly in method preStep). |
164 |
> |
* It is impossible to keep exactly the target (parallelism) |
165 |
> |
* number of threads running at any given time. Determining the |
166 |
> |
* existence of conservatively safe helping targets, the |
167 |
> |
* availability of already-created spares, and the apparent need |
168 |
> |
* to create new spares are all racy and require heuristic |
169 |
> |
* guidance, so we rely on multiple retries of each. Compensation |
170 |
> |
* occurs in slow-motion. It is triggered only upon timeouts of |
171 |
> |
* Object.wait used for joins. This reduces poor decisions that |
172 |
> |
* would otherwise be made when threads are waiting for others |
173 |
> |
* that are stalled because of unrelated activities such as |
174 |
> |
* garbage collection. |
175 |
|
* |
176 |
|
* The ManagedBlocker extension API can't use helping so relies |
177 |
|
* only on compensation in method awaitBlocker. |
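A minimal, self-contained sketch of the slow-motion compensation idea above, using java.util.concurrent.FutureTask purely for illustration: the joiner blocks in timeout-sized slices, and only a full timeout triggers the compensation check. JOIN_TIMEOUT_MILLIS mirrors the constant defined later in this file; maybeCompensate is a hypothetical stand-in for the real helpMaintainParallelism logic, not the pool's code.

    import java.util.concurrent.FutureTask;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;

    final class SlowMotionJoinSketch {
        static final long JOIN_TIMEOUT_MILLIS = 250L; // mirrors constant below

        // Block on the join with a timeout; short GC pauses or scheduler
        // hiccups end before the timeout and never reach the (racy,
        // heuristic) compensation path.
        static <T> T awaitJoin(FutureTask<T> task, Runnable maybeCompensate)
                throws Exception {
            for (;;) {
                try {
                    return task.get(JOIN_TIMEOUT_MILLIS, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    maybeCompensate.run(); // hypothetical stand-in for
                                           // helpMaintainParallelism
                }
            }
        }
    }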
261 |
|
* workers that previously could not find a task to now find one: |
262 |
|
* Submission of a new task to the pool, or another worker pushing |
263 |
|
* a task onto a previously empty queue. (We also use this |
264 |
< |
* mechanism for termination actions that require wakeups of idle |
265 |
< |
* workers). Each worker maintains its last known event count, |
266 |
< |
* and blocks when a scan for work did not find a task AND its |
267 |
< |
* lastEventCount matches the current eventCount. Waiting idle |
268 |
< |
* workers are recorded in a variant of Treiber stack headed by |
269 |
< |
* field eventWaiters which, when nonzero, encodes the thread |
270 |
< |
* index and count awaited for by the worker thread most recently |
271 |
< |
* calling eventSync. This thread in turn has a record (field |
272 |
< |
* nextEventWaiter) for the next waiting worker. In addition to |
273 |
< |
* allowing simpler decisions about need for wakeup, the event |
274 |
< |
* count bits in eventWaiters serve the role of tags to avoid ABA |
275 |
< |
* errors in Treiber stacks. To reduce delays in task diffusion, |
276 |
< |
* workers not otherwise occupied may invoke method |
277 |
< |
* releaseEventWaiters, that removes and signals (unparks) workers |
278 |
< |
* not waiting on current count. To minimize |
279 |
< |
* task production stalls associated with signalling, any worker |
280 |
< |
* pushing a task on an empty queue invokes the weaker method |
279 |
< |
* signalWork, that only releases idle workers until it detects |
280 |
< |
* interference by other threads trying to release, and lets them |
281 |
< |
* take over. The net effect is a tree-like diffusion of signals, |
282 |
< |
* where released threads (and possibly others) help with unparks. |
283 |
< |
* To further reduce contention effects a bit, failed CASes to |
284 |
< |
* increment field eventCount are tolerated without retries. |
264 |
> |
* mechanism for configuration and termination actions that |
265 |
> |
* require wakeups of idle workers). Each worker maintains its |
266 |
> |
* last known event count, and blocks when a scan for work did not |
267 |
> |
* find a task AND its lastEventCount matches the current |
268 |
> |
* eventCount. Waiting idle workers are recorded in a variant of |
269 |
> |
* Treiber stack headed by field eventWaiters which, when nonzero, |
270 |
> |
* encodes the thread index and count awaited for by the worker |
271 |
> |
* thread most recently calling eventSync. This thread in turn has |
272 |
> |
* a record (field nextEventWaiter) for the next waiting worker. |
273 |
> |
* In addition to allowing simpler decisions about need for |
274 |
> |
* wakeup, the event count bits in eventWaiters serve the role of |
275 |
> |
* tags to avoid ABA errors in Treiber stacks. Upon any wakeup, |
276 |
> |
* released threads also try to release at most two others. The |
277 |
> |
* net effect is a tree-like diffusion of signals, where released |
278 |
> |
* threads (and possibly others) help with unparks. To further |
279 |
> |
* reduce contention effects a bit, failed CASes to increment |
280 |
> |
* field eventCount are tolerated without retries in signalWork. |
281 |
|
* Conceptually they are merged into the same event, which is OK |
282 |
|
* when their only purpose is to enable workers to scan for work. |
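To make the encoding just described concrete: the high 32 bits of eventWaiters hold the event count being awaited (doubling as the ABA tag), and the low bits hold one plus the waiter's pool index, so a zero word means no waiters. A standalone sketch; the shift and mask values follow the comments in this file, but the helper names are ours:

    final class EventWaiterWordSketch {
        static final int  EVENT_COUNT_SHIFT = 32;             // top 32 bits: count
        static final long WAITER_ID_MASK    = (1L << 16) - 1; // low bits: index + 1

        static long encode(int awaitedCount, int poolIndex) {
            return (((long) awaitedCount) << EVENT_COUNT_SHIFT)
                | ((long) (poolIndex + 1));
        }
        static int awaitedCount(long h) { return (int) (h >>> EVENT_COUNT_SHIFT); }
        static int poolIndex(long h)    { return ((int) (h & WAITER_ID_MASK)) - 1; }

        public static void main(String[] args) {
            long h = encode(7, 3);
            System.out.println(awaitedCount(h) + ", " + poolIndex(h)); // 7, 3
        }
    }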
283 |
|
* |
284 |
< |
* 5. Managing suspension of extra workers. When a worker is about |
285 |
< |
* to block waiting for a join (or via ManagedBlockers), we may |
286 |
< |
* create a new thread to maintain parallelism level, or at least |
287 |
< |
* avoid starvation. Usually, extra threads are needed for only |
288 |
< |
* very short periods, yet join dependencies are such that we |
289 |
< |
* sometimes need them in bursts. Rather than create new threads |
290 |
< |
* each time this happens, we suspend no-longer-needed extra ones |
291 |
< |
* as "spares". For most purposes, we don't distinguish "extra" |
292 |
< |
* spare threads from normal "core" threads: On each call to |
293 |
< |
* preStep (the only point at which we can do this) a worker |
294 |
< |
* checks to see if there are now too many running workers, and if |
295 |
< |
* so, suspends itself. Method helpMaintainParallelism looks for |
296 |
< |
* suspended threads to resume before considering creating a new |
297 |
< |
* replacement. The spares themselves are encoded on another |
298 |
< |
* variant of a Treiber Stack, headed at field "spareWaiters". |
299 |
< |
* Note that the use of spares is intrinsically racy. One thread |
300 |
< |
* may become a spare at about the same time as another is |
301 |
< |
* needlessly being created. We counteract this and related slop |
302 |
< |
* in part by requiring resumed spares to immediately recheck (in |
303 |
< |
* preStep) to see whether they should re-suspend. To avoid |
304 |
< |
* long-term build-up of spares, the oldest spare (see |
305 |
< |
* ForkJoinWorkerThread.suspendAsSpare) occasionally wakes up if |
306 |
< |
* not signalled and calls tryTrimSpare, which uses two different |
307 |
< |
* thresholds: Always killing if the number of spares is greater |
308 |
< |
* than 25% of total, and killing others only at a slower rate |
309 |
< |
* (UNUSED_SPARE_TRIM_RATE_NANOS). |
284 |
> |
* 5. Managing suspension of extra workers. When a worker notices |
285 |
> |
* (usually upon timeout of a wait()) that there are too few |
286 |
> |
* running threads, we may create a new thread to maintain |
287 |
> |
* parallelism level, or at least avoid starvation. Usually, extra |
288 |
> |
* threads are needed for only very short periods, yet join |
289 |
> |
* dependencies are such that we sometimes need them in |
290 |
> |
* bursts. Rather than create new threads each time this happens, |
291 |
> |
* we suspend no-longer-needed extra ones as "spares". For most |
292 |
> |
* purposes, we don't distinguish "extra" spare threads from |
293 |
> |
* normal "core" threads: On each call to preStep (the only point |
294 |
> |
* at which we can do this) a worker checks to see if there are |
295 |
> |
* now too many running workers, and if so, suspends itself. |
296 |
> |
* Method helpMaintainParallelism looks for suspended threads to |
297 |
> |
* resume before considering creating a new replacement. The |
298 |
> |
* spares themselves are encoded on another variant of a Treiber |
299 |
> |
* Stack, headed at field "spareWaiters". Note that the use of |
300 |
> |
* spares is intrinsically racy. One thread may become a spare at |
301 |
> |
* about the same time as another is needlessly being created. We |
302 |
> |
* counteract this and related slop in part by requiring resumed |
303 |
> |
* spares to immediately recheck (in preStep) to see whether |
304 |
> |
* they should re-suspend. |
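Both spareWaiters and eventWaiters are int/long-packed variants of the classic Treiber stack; for reference, a sketch of the reference-based shape they adapt (not the pool's code):

    import java.util.concurrent.atomic.AtomicReference;

    final class TreiberStack<E> {
        private static final class Node<E> {
            final E item; Node<E> next;
            Node(E item) { this.item = item; }
        }
        private final AtomicReference<Node<E>> top =
            new AtomicReference<Node<E>>();

        void push(E item) {                  // CAS the new node onto the head
            Node<E> n = new Node<E>(item);
            do { n.next = top.get(); } while (!top.compareAndSet(n.next, n));
        }

        E pop() {                            // CAS the head off; null if empty
            Node<E> t;
            do {
                t = top.get();
                if (t == null) return null;
            } while (!top.compareAndSet(t, t.next));
            return t.item;
        }
    }

The pool's variants replace node references with packed worker indices, which is why the count bits must double as ABA tags: a bare index could be pushed, popped, and re-pushed between a reader's load and its CAS.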
305 |
> |
* |
306 |
> |
* 6. Killing off unneeded workers. A timeout mechanism is used to |
307 |
> |
* shed unused workers: The oldest (first) event queue waiter uses |
308 |
> |
* a timed rather than hard wait. When this wait times out without |
309 |
> |
* a normal wakeup, it tries to shutdown any one (for convenience |
310 |
> |
* the newest) other spare or event waiter via |
311 |
> |
* tryShutdownUnusedWorker. This eventually reduces the number of |
312 |
> |
* worker threads to a minimum of one after a long enough period |
313 |
> |
* without use. |
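A sketch of the timed-wait pattern this relies on; SHRINK_RATE_NANOS mirrors the constant defined below, and the elapsed-time recheck matters because parkNanos can return early on unpark, interrupt, or spuriously (as awaitEvent itself rechecks):

    import java.util.concurrent.locks.LockSupport;

    final class IdleShrinkSketch {
        static final long SHRINK_RATE_NANOS = 30L * 1000L * 1000L * 1000L;

        // Returns true only if a full quiet period elapsed; a normal
        // wakeup (unpark) returns early and must not shed a worker.
        static boolean parkedThroughQuietPeriod() {
            long start = System.nanoTime();
            LockSupport.parkNanos(SHRINK_RATE_NANOS);
            return System.nanoTime() - start >= SHRINK_RATE_NANOS;
        }
    }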
314 |
|
* |
315 |
< |
* 6. Deciding when to create new workers. The main dynamic |
315 |
> |
* 7. Deciding when to create new workers. The main dynamic |
316 |
|
* control in this class is deciding when to create extra threads |
317 |
|
* in method helpMaintainParallelism. We would like to keep |
318 |
|
* exactly #parallelism threads running, which is an impossible |
323 |
|
* compilation, and wake-up lags. These transients are extremely |
324 |
|
* common -- we are normally trying to fully saturate the CPUs on |
325 |
|
* a machine, so almost any activity other than running tasks |
326 |
< |
* impedes accuracy. Our main defense is to allow some slack in |
327 |
< |
* creation thresholds, using rules that reflect the fact that the |
328 |
< |
* more threads we have running, the more likely that we are |
329 |
< |
* underestimating the number of running threads. The rules also |
330 |
< |
* better cope with the fact that some of the methods in this |
331 |
< |
* class tend to never become compiled (but are interpreted), so |
332 |
< |
* some components of the entire set of controls might execute 100 |
333 |
< |
* times faster than others. And similarly for cases where the |
334 |
< |
* apparent lack of work is just due to GC stalls and other |
335 |
< |
* transient system activity. |
326 |
> |
* impedes accuracy. Our main defense is to allow parallelism to |
327 |
> |
* lapse for a while during joins, and use a timeout to see if, |
328 |
> |
* after the resulting settling, there is still a need for |
329 |
> |
* additional workers. This also better copes with the fact that |
330 |
> |
* some of the methods in this class tend to never become compiled |
331 |
> |
* (but are interpreted), so some components of the entire set of |
332 |
> |
* controls might execute 100 times faster than others. And |
333 |
> |
* similarly for cases where the apparent lack of work is just due |
334 |
> |
* to GC stalls and other transient system activity. |
335 |
|
* |
336 |
|
* Beware that there is a lot of representation-level coupling |
337 |
|
* among classes ForkJoinPool, ForkJoinWorkerThread, and |
418 |
|
new AtomicInteger(); |
419 |
|
|
420 |
|
/** |
421 |
+ |
* The time to block in a join (see awaitJoin) before checking if |
422 |
+ |
* a new worker should be (re)started to maintain parallelism |
423 |
+ |
* level. The value should be short enough to maintain global |
424 |
+ |
* responsiveness and progress but long enough to avoid |
425 |
+ |
* counterproductive firings during GC stalls or unrelated system |
426 |
+ |
* activity, and to not bog down systems with continual re-firings |
427 |
+ |
* on GCs or legitimately long waits. |
428 |
+ |
*/ |
429 |
+ |
private static final long JOIN_TIMEOUT_MILLIS = 250L; // 4 per second |
430 |
+ |
|
431 |
+ |
/** |
432 |
+ |
* The wakeup interval (in nanoseconds) for the oldest worker |
433 |
+ |
* waiting for an event before invoking tryShutdownUnusedWorker to shrink |
434 |
+ |
* the number of workers. The exact value does not matter too |
435 |
+ |
* much, but should be long enough to slowly release resources |
436 |
+ |
* during long periods without use, without disrupting normal use. |
437 |
+ |
*/ |
438 |
+ |
private static final long SHRINK_RATE_NANOS = |
439 |
+ |
30L * 1000L * 1000L * 1000L; // 2 per minute |
440 |
+ |
|
441 |
+ |
/** |
442 |
|
* Absolute bound for parallelism level. Twice this number plus |
443 |
|
* one (i.e., 0xffff) must fit into a 16-bit field to enable |
444 |
|
* word-packing for some counts and indices. |
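A sketch of the word-packing this bound enables: workerCounts keeps the running count in the low 16 bits and the total count above it, so both can be read or CASed together. Constant names mirror this file's; the AtomicInteger stands in for the Unsafe-based CASes, and the class itself is illustrative:

    import java.util.concurrent.atomic.AtomicInteger;

    final class PackedCountsSketch {
        static final int TOTAL_COUNT_SHIFT  = 16;
        static final int RUNNING_COUNT_MASK = (1 << 16) - 1;
        static final int ONE_RUNNING        = 1;
        static final int ONE_TOTAL          = 1 << TOTAL_COUNT_SHIFT;

        final AtomicInteger counts = new AtomicInteger();

        void onWorkerStart() { counts.addAndGet(ONE_RUNNING | ONE_TOTAL); }

        void onWorkerExit() {              // shape of decrementWorkerCounts
            int wc;
            do { wc = counts.get(); }
            while (!counts.compareAndSet(wc, wc - (ONE_RUNNING | ONE_TOTAL)));
        }

        int running() { return counts.get() & RUNNING_COUNT_MASK; }
        int total()   { return counts.get() >>> TOTAL_COUNT_SHIFT; }
    }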
483 |
|
private volatile long stealCount; |
484 |
|
|
485 |
|
/** |
466 |
– |
* The last nanoTime that a spare thread was trimmed |
467 |
– |
*/ |
468 |
– |
private volatile long trimTime; |
469 |
– |
|
470 |
– |
/** |
471 |
– |
* The rate at which to trim unused spares |
472 |
– |
*/ |
473 |
– |
static final long UNUSED_SPARE_TRIM_RATE_NANOS = |
474 |
– |
1000L * 1000L * 1000L; // 1 sec |
475 |
– |
|
476 |
– |
/** |
486 |
|
* Encoded record of top of Treiber stack of threads waiting for |
487 |
|
* events. The top 32 bits contain the count being waited for. The |
488 |
|
* bottom 16 bits contain one plus the pool index of waiting |
523 |
|
* These are bundled together to ensure consistent read for |
524 |
|
* termination checks (i.e., that runLevel is at least SHUTDOWN |
525 |
|
* and active threads is zero). |
526 |
+ |
* |
527 |
+ |
* Notes: Most direct CASes are dependent on these bitfield |
528 |
+ |
* positions. Also, this field is non-private to enable direct |
529 |
+ |
* performance-sensitive CASes in ForkJoinWorkerThread. |
530 |
|
*/ |
531 |
< |
private volatile int runState; |
531 |
> |
volatile int runState; |
532 |
|
|
533 |
|
// Note: The order among run level values matters. |
534 |
|
private static final int RUNLEVEL_SHIFT = 16; |
536 |
|
private static final int TERMINATING = 1 << (RUNLEVEL_SHIFT + 1); |
537 |
|
private static final int TERMINATED = 1 << (RUNLEVEL_SHIFT + 2); |
538 |
|
private static final int ACTIVE_COUNT_MASK = (1 << RUNLEVEL_SHIFT) - 1; |
526 |
– |
private static final int ONE_ACTIVE = 1; // active update delta |
539 |
|
|
540 |
|
/** |
541 |
|
* Holds number of total (i.e., created and not yet terminated) |
576 |
|
*/ |
577 |
|
private final int poolNumber; |
578 |
|
|
579 |
< |
|
580 |
< |
// Utilities for CASing fields. Note that several of these |
569 |
< |
// are manually inlined by callers |
579 |
> |
// Utilities for CASing fields. Note that most of these |
580 |
> |
// are usually manually inlined by callers |
581 |
|
|
582 |
|
/** |
583 |
|
* Increments running count part of workerCounts |
610 |
|
private void decrementWorkerCounts(int dr, int dt) { |
611 |
|
for (;;) { |
612 |
|
int wc = workerCounts; |
602 |
– |
if (wc == 0 && (runState & TERMINATED) != 0) |
603 |
– |
return; // lagging termination on a backout |
613 |
|
if ((wc & RUNNING_COUNT_MASK) - dr < 0 || |
614 |
< |
(wc >>> TOTAL_COUNT_SHIFT) - dt < 0) |
614 |
> |
(wc >>> TOTAL_COUNT_SHIFT) - dt < 0) { |
615 |
> |
if ((runState & TERMINATED) != 0) |
616 |
> |
return; // lagging termination on a backout |
617 |
|
Thread.yield(); |
618 |
+ |
} |
619 |
|
if (UNSAFE.compareAndSwapInt(this, workerCountsOffset, |
620 |
|
wc, wc - (dr + dt))) |
621 |
|
return; |
623 |
|
} |
624 |
|
|
625 |
|
/** |
614 |
– |
* Increments event count |
615 |
– |
*/ |
616 |
– |
private void advanceEventCount() { |
617 |
– |
int c; |
618 |
– |
do {} while(!UNSAFE.compareAndSwapInt(this, eventCountOffset, |
619 |
– |
c = eventCount, c+1)); |
620 |
– |
} |
621 |
– |
|
622 |
– |
/** |
623 |
– |
* Tries incrementing active count; fails on contention. |
624 |
– |
* Called by workers before executing tasks. |
625 |
– |
* |
626 |
– |
* @return true on success |
627 |
– |
*/ |
628 |
– |
final boolean tryIncrementActiveCount() { |
629 |
– |
int c; |
630 |
– |
return UNSAFE.compareAndSwapInt(this, runStateOffset, |
631 |
– |
c = runState, c + ONE_ACTIVE); |
632 |
– |
} |
633 |
– |
|
634 |
– |
/** |
626 |
|
* Tries decrementing active count; fails on contention. |
627 |
|
* Called when workers cannot find tasks to run. |
628 |
|
*/ |
629 |
|
final boolean tryDecrementActiveCount() { |
630 |
|
int c; |
631 |
|
return UNSAFE.compareAndSwapInt(this, runStateOffset, |
632 |
< |
c = runState, c - ONE_ACTIVE); |
632 |
> |
c = runState, c - 1); |
633 |
|
} |
634 |
|
|
635 |
|
/** |
690 |
|
} |
691 |
|
} |
692 |
|
|
702 |
– |
// adding and removing workers |
703 |
– |
|
704 |
– |
/** |
705 |
– |
* Tries to create and add new worker. Assumes that worker counts |
706 |
– |
* are already updated to accommodate the worker, so adjusts on |
707 |
– |
* failure. |
708 |
– |
*/ |
709 |
– |
private void addWorker() { |
710 |
– |
ForkJoinWorkerThread w = null; |
711 |
– |
try { |
712 |
– |
w = factory.newThread(this); |
713 |
– |
} finally { // Adjust on either null or exceptional factory return |
714 |
– |
if (w == null) { |
715 |
– |
decrementWorkerCounts(ONE_RUNNING, ONE_TOTAL); |
716 |
– |
tryTerminate(false); // in case of failure during shutdown |
717 |
– |
} |
718 |
– |
} |
719 |
– |
if (w != null) |
720 |
– |
w.start(recordWorker(w), ueh); |
721 |
– |
} |
722 |
– |
|
693 |
|
/** |
694 |
|
* Final callback from terminating worker. Removes record of |
695 |
|
* worker from array, and adjusts counts. If pool is shutting |
710 |
|
/** |
711 |
|
* Releases workers blocked on a count not equal to current count. |
712 |
|
* Normally called after precheck that eventWaiters isn't zero to |
713 |
< |
* avoid wasted array checks. |
714 |
< |
* |
745 |
< |
* @param signalling true if caller is a signalling worker so can |
746 |
< |
* exit upon (conservatively) detected contention by other threads |
747 |
< |
* who will continue to release |
713 |
> |
* avoid wasted array checks. Gives up upon a change in count or |
714 |
> |
* upon releasing two workers, letting others take over. |
715 |
|
*/ |
716 |
< |
private void releaseEventWaiters(boolean signalling) { |
716 |
> |
private void releaseEventWaiters() { |
717 |
|
ForkJoinWorkerThread[] ws = workers; |
718 |
|
int n = ws.length; |
719 |
< |
long h; // head of stack |
720 |
< |
ForkJoinWorkerThread w; int id, ec; |
721 |
< |
while ((id = ((int)((h = eventWaiters) & WAITER_ID_MASK)) - 1) >= 0 && |
722 |
< |
(int)(h >>> EVENT_COUNT_SHIFT) != (ec = eventCount) && |
719 |
> |
long h = eventWaiters; |
720 |
> |
int ec = eventCount; |
721 |
> |
boolean releasedOne = false; |
722 |
> |
ForkJoinWorkerThread w; int id; |
723 |
> |
while ((id = ((int)(h & WAITER_ID_MASK)) - 1) >= 0 && |
724 |
> |
(int)(h >>> EVENT_COUNT_SHIFT) != ec && |
725 |
|
id < n && (w = ws[id]) != null) { |
726 |
|
if (UNSAFE.compareAndSwapLong(this, eventWaitersOffset, |
727 |
< |
h, h = w.nextWaiter)) |
727 |
> |
h, w.nextWaiter)) { |
728 |
|
LockSupport.unpark(w); |
729 |
< |
if (signalling && (eventCount != ec || eventWaiters != h)) |
729 |
> |
if (releasedOne) // exit on second release |
730 |
> |
break; |
731 |
> |
releasedOne = true; |
732 |
> |
} |
733 |
> |
if (eventCount != ec) |
734 |
|
break; |
735 |
+ |
h = eventWaiters; |
736 |
|
} |
737 |
|
} |
738 |
|
|
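The two-release limit above is what produces the tree-like diffusion of signals described in the overview; a toy calculation of the wakeup depth (illustrative arithmetic only, not pool code):

    final class DiffusionSketch {
        // Each released thread releases at most two more, so the awake
        // set roughly doubles per generation: n waiters wake in O(log n)
        // generations rather than one signaller unparking all n.
        static int generations(int waiters) {
            int awake = 0, frontier = 1, gens = 0; // frontier = signaller
            while (awake < waiters) {
                int released = Math.min(2 * frontier, waiters - awake);
                awake += released;
                frontier = released;
                gens++;
            }
            return gens;
        }

        public static void main(String[] args) {
            System.out.println(generations(1000)); // 9, about log2(1000)
        }
    }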
744 |
|
int c; // try to increment event count -- CAS failure OK |
745 |
|
UNSAFE.compareAndSwapInt(this, eventCountOffset, c = eventCount, c+1); |
746 |
|
if (eventWaiters != 0L) |
747 |
< |
releaseEventWaiters(true); |
747 |
> |
releaseEventWaiters(); |
748 |
|
} |
749 |
|
|
750 |
|
/** |
751 |
< |
* Blocks worker until terminating or event count |
752 |
< |
* advances from last value held by worker |
751 |
> |
* Adds the given worker to event queue and blocks until |
752 |
> |
* terminating or event count advances from the given value |
753 |
|
* |
754 |
|
* @param w the calling worker thread |
755 |
+ |
* @param ec the count |
756 |
|
*/ |
757 |
< |
private void eventSync(ForkJoinWorkerThread w) { |
758 |
< |
int wec = w.lastEventCount; |
784 |
< |
long nh = (((long)wec) << EVENT_COUNT_SHIFT) | ((long)(w.poolIndex+1)); |
757 |
> |
private void eventSync(ForkJoinWorkerThread w, int ec) { |
758 |
> |
long nh = (((long)ec) << EVENT_COUNT_SHIFT) | ((long)(w.poolIndex+1)); |
759 |
|
long h; |
760 |
|
while ((runState < SHUTDOWN || !tryTerminate(false)) && |
761 |
< |
((h = eventWaiters) == 0L || |
762 |
< |
(int)(h >>> EVENT_COUNT_SHIFT) == wec) && |
763 |
< |
eventCount == wec) { |
761 |
> |
(((int)((h = eventWaiters) & WAITER_ID_MASK)) == 0 || |
762 |
> |
(int)(h >>> EVENT_COUNT_SHIFT) == ec) && |
763 |
> |
eventCount == ec) { |
764 |
|
if (UNSAFE.compareAndSwapLong(this, eventWaitersOffset, |
765 |
|
w.nextWaiter = h, nh)) { |
766 |
< |
while (runState < TERMINATING && eventCount == wec) { |
767 |
< |
if (!tryAccumulateStealCount(w)) // transfer while idle |
768 |
< |
continue; |
769 |
< |
Thread.interrupted(); // clear/ignore interrupt |
770 |
< |
if (eventCount != wec) |
771 |
< |
break; |
766 |
> |
awaitEvent(w, ec); |
767 |
> |
break; |
768 |
> |
} |
769 |
> |
} |
770 |
> |
} |
771 |
> |
|
772 |
> |
/** |
773 |
> |
* Blocks the given worker (that has already been entered as an |
774 |
> |
* event waiter) until terminating or event count advances from |
775 |
> |
* the given value. The oldest (first) waiter uses a timed wait to |
776 |
> |
* occasionally one-by-one shrink the number of workers (to a |
777 |
> |
* minimum of one) if the pool has not been used for extended |
778 |
> |
* periods. |
779 |
> |
* |
780 |
> |
* @param w the calling worker thread |
781 |
> |
* @param ec the count |
782 |
> |
*/ |
783 |
> |
private void awaitEvent(ForkJoinWorkerThread w, int ec) { |
784 |
> |
while (eventCount == ec) { |
785 |
> |
if (tryAccumulateStealCount(w)) { // transfer while idle |
786 |
> |
boolean untimed = (w.nextWaiter != 0L || |
787 |
> |
(workerCounts & RUNNING_COUNT_MASK) <= 1); |
788 |
> |
long startTime = untimed ? 0 : System.nanoTime(); |
789 |
> |
Thread.interrupted(); // clear/ignore interrupt |
790 |
> |
if (eventCount != ec || w.runState != 0 || |
791 |
> |
runState >= TERMINATING) // recheck after clear |
792 |
> |
break; |
793 |
> |
if (untimed) |
794 |
|
LockSupport.park(w); |
795 |
+ |
else { |
796 |
+ |
LockSupport.parkNanos(w, SHRINK_RATE_NANOS); |
797 |
+ |
if (eventCount != ec || w.runState != 0 || |
798 |
+ |
runState >= TERMINATING) |
799 |
+ |
break; |
800 |
+ |
if (System.nanoTime() - startTime >= SHRINK_RATE_NANOS) |
801 |
+ |
tryShutdownUnusedWorker(ec); |
802 |
|
} |
800 |
– |
break; |
803 |
|
} |
804 |
|
} |
803 |
– |
w.lastEventCount = eventCount; |
805 |
|
} |
806 |
|
|
807 |
< |
// Maintaining spares |
807 |
> |
// Maintaining parallelism |
808 |
|
|
809 |
|
/** |
810 |
|
* Pushes worker onto the spare stack |
811 |
|
*/ |
812 |
|
final void pushSpare(ForkJoinWorkerThread w) { |
813 |
< |
int ns = (++w.spareCount << SPARE_COUNT_SHIFT) | (w.poolIndex+1); |
813 |
> |
int ns = (++w.spareCount << SPARE_COUNT_SHIFT) | (w.poolIndex + 1); |
814 |
|
do {} while (!UNSAFE.compareAndSwapInt(this, spareWaitersOffset, |
815 |
|
w.nextSpare = spareWaiters, ns)); |
816 |
|
} |
817 |
|
|
818 |
|
/** |
819 |
< |
* Tries (once) to resume a spare if running count is less than |
820 |
< |
* target parallelism. Fails on contention or stale workers. |
819 |
> |
* Tries (once) to resume a spare if the number of running |
820 |
> |
* threads is less than target. |
821 |
|
*/ |
822 |
|
private void tryResumeSpare() { |
823 |
|
int sw, id; |
824 |
+ |
ForkJoinWorkerThread[] ws = workers; |
825 |
+ |
int n = ws.length; |
826 |
|
ForkJoinWorkerThread w; |
827 |
< |
ForkJoinWorkerThread[] ws; |
828 |
< |
if ((id = ((sw = spareWaiters) & SPARE_ID_MASK) - 1) >= 0 && |
829 |
< |
id < (ws = workers).length && (w = ws[id]) != null && |
827 |
> |
if ((sw = spareWaiters) != 0 && |
828 |
> |
(id = (sw & SPARE_ID_MASK) - 1) >= 0 && |
829 |
> |
id < n && (w = ws[id]) != null && |
830 |
|
(workerCounts & RUNNING_COUNT_MASK) < parallelism && |
828 |
– |
eventWaiters == 0L && |
831 |
|
spareWaiters == sw && |
832 |
|
UNSAFE.compareAndSwapInt(this, spareWaitersOffset, |
833 |
< |
sw, w.nextSpare) && |
834 |
< |
w.tryUnsuspend()) { |
835 |
< |
int c; // try increment; if contended, finish after unpark |
836 |
< |
boolean inc = UNSAFE.compareAndSwapInt(this, workerCountsOffset, |
837 |
< |
c = workerCounts, |
838 |
< |
c + ONE_RUNNING); |
839 |
< |
LockSupport.unpark(w); |
840 |
< |
if (!inc) { |
841 |
< |
do {} while(!UNSAFE.compareAndSwapInt(this, workerCountsOffset, |
840 |
< |
c = workerCounts, |
841 |
< |
c + ONE_RUNNING)); |
842 |
< |
} |
833 |
> |
sw, w.nextSpare)) { |
834 |
> |
int c; // increment running count before resume |
835 |
> |
do {} while (!UNSAFE.compareAndSwapInt |
836 |
> |
(this, workerCountsOffset, |
837 |
> |
c = workerCounts, c + ONE_RUNNING)); |
838 |
> |
if (w.tryUnsuspend()) |
839 |
> |
LockSupport.unpark(w); |
840 |
> |
else // back out if w was shutdown |
841 |
> |
decrementWorkerCounts(ONE_RUNNING, 0); |
842 |
|
} |
843 |
|
} |
844 |
|
|
845 |
|
/** |
846 |
< |
* Callback from oldest spare occasionally waking up. Tries |
847 |
< |
* (once) to shutdown a spare if more than 25% spare overage, or |
848 |
< |
* if UNUSED_SPARE_TRIM_RATE_NANOS have elapsed and there are at |
849 |
< |
* least #parallelism running threads. Note that we don't need CAS |
850 |
< |
* or locks here because the method is called only from the oldest |
851 |
< |
* suspended spare occasionally waking (and even misfires are OK). |
852 |
< |
* |
854 |
< |
* @param now the wake up nanoTime of caller |
855 |
< |
*/ |
856 |
< |
final void tryTrimSpare(long now) { |
857 |
< |
long lastTrim = trimTime; |
858 |
< |
trimTime = now; |
859 |
< |
helpMaintainParallelism(); // first, help wake up any needed spares |
860 |
< |
int sw, id; |
861 |
< |
ForkJoinWorkerThread w; |
862 |
< |
ForkJoinWorkerThread[] ws; |
846 |
> |
* Tries to increase the number of running workers if below target |
847 |
> |
* parallelism: If a spare exists, tries to resume it via |
848 |
> |
* tryResumeSpare. Otherwise, if not enough total workers or all |
849 |
> |
* existing workers are busy, adds a new worker. In all cases also |
850 |
> |
* helps wake up releasable workers waiting for work. |
851 |
> |
*/ |
852 |
> |
private void helpMaintainParallelism() { |
853 |
|
int pc = parallelism; |
854 |
< |
int wc = workerCounts; |
855 |
< |
if ((wc & RUNNING_COUNT_MASK) >= pc && |
856 |
< |
(((wc >>> TOTAL_COUNT_SHIFT) - pc) > (pc >>> 2) + 1 ||// approx 25% |
857 |
< |
now - lastTrim >= UNUSED_SPARE_TRIM_RATE_NANOS) && |
858 |
< |
(id = ((sw = spareWaiters) & SPARE_ID_MASK) - 1) >= 0 && |
859 |
< |
id < (ws = workers).length && (w = ws[id]) != null && |
860 |
< |
UNSAFE.compareAndSwapInt(this, spareWaitersOffset, |
861 |
< |
sw, w.nextSpare)) |
862 |
< |
w.shutdown(false); |
854 |
> |
int wc, rs, tc; |
855 |
> |
while (((wc = workerCounts) & RUNNING_COUNT_MASK) < pc && |
856 |
> |
(rs = runState) < TERMINATING) { |
857 |
> |
if (spareWaiters != 0) |
858 |
> |
tryResumeSpare(); |
859 |
> |
else if ((tc = wc >>> TOTAL_COUNT_SHIFT) >= MAX_WORKERS || |
860 |
> |
(tc >= pc && (rs & ACTIVE_COUNT_MASK) != tc)) |
861 |
> |
break; // enough total |
862 |
> |
else if (runState == rs && workerCounts == wc && |
863 |
> |
UNSAFE.compareAndSwapInt(this, workerCountsOffset, wc, |
864 |
> |
wc + (ONE_RUNNING|ONE_TOTAL))) { |
865 |
> |
ForkJoinWorkerThread w = null; |
866 |
> |
try { |
867 |
> |
w = factory.newThread(this); |
868 |
> |
} finally { // adjust on null or exceptional factory return |
869 |
> |
if (w == null) { |
870 |
> |
decrementWorkerCounts(ONE_RUNNING, ONE_TOTAL); |
871 |
> |
tryTerminate(false); // handle failure during shutdown |
872 |
> |
} |
873 |
> |
} |
874 |
> |
if (w == null) |
875 |
> |
break; |
876 |
> |
w.start(recordWorker(w), ueh); |
877 |
> |
if ((workerCounts >>> TOTAL_COUNT_SHIFT) >= pc) { |
878 |
> |
int c; // advance event count |
879 |
> |
UNSAFE.compareAndSwapInt(this, eventCountOffset, |
880 |
> |
c = eventCount, c+1); |
881 |
> |
break; // add at most one unless total below target |
882 |
> |
} |
883 |
> |
} |
884 |
> |
} |
885 |
> |
if (eventWaiters != 0L) |
886 |
> |
releaseEventWaiters(); |
887 |
|
} |
888 |
|
|
889 |
|
/** |
890 |
< |
* Does at most one of: |
891 |
< |
* |
892 |
< |
* 1. Help wake up existing workers waiting for work via |
893 |
< |
* releaseEventWaiters. (If any exist, then it probably doesn't |
894 |
< |
* matter right now if under target parallelism level.) |
895 |
< |
* |
896 |
< |
* 2. If below parallelism level and a spare exists, try (once) |
883 |
< |
* to resume it via tryResumeSpare. |
890 |
> |
* Callback from the oldest waiter in awaitEvent waking up after a |
891 |
> |
* period of non-use. If all workers are idle, tries (once) to |
892 |
> |
* shutdown an event waiter or a spare, if one exists. Note that |
893 |
> |
* we don't need CAS or locks here because the method is called |
894 |
> |
* only from one thread occasionally waking (and even misfires are |
895 |
> |
* OK). Until the shutdown worker fully terminates, workerCounts |
896 |
> |
* will overestimate the total count, which is tolerable. |
897 |
|
* |
898 |
< |
* 3. If neither of the above, tries (once) to add a new |
899 |
< |
* worker if either there are not enough total, or if all |
887 |
< |
* existing workers are busy, there are either no running |
888 |
< |
* workers or the deficit is at least twice the surplus. |
898 |
> |
* @param ec the event count waited on by caller (to abort |
899 |
> |
* attempt if count has since changed). |
900 |
|
*/ |
901 |
< |
private void helpMaintainParallelism() { |
902 |
< |
// uglified to work better when not compiled |
903 |
< |
int pc, wc, rc, tc, rs; long h; |
904 |
< |
if ((h = eventWaiters) != 0L) { |
905 |
< |
if ((int)(h >>> EVENT_COUNT_SHIFT) != eventCount) |
906 |
< |
releaseEventWaiters(false); // avoid useless call |
907 |
< |
} |
908 |
< |
else if ((pc = parallelism) > |
909 |
< |
(rc = ((wc = workerCounts) & RUNNING_COUNT_MASK))) { |
910 |
< |
if (spareWaiters != 0) |
911 |
< |
tryResumeSpare(); |
912 |
< |
else if ((rs = runState) < TERMINATING && |
913 |
< |
((tc = wc >>> TOTAL_COUNT_SHIFT) < pc || |
914 |
< |
(tc == (rs & ACTIVE_COUNT_MASK) && // all busy |
915 |
< |
(rc == 0 || // must add |
916 |
< |
rc < pc - ((tc - pc) << 1)) && // within slack |
917 |
< |
tc < MAX_WORKERS && runState == rs)) && // recheck busy |
918 |
< |
workerCounts == wc && |
919 |
< |
UNSAFE.compareAndSwapInt(this, workerCountsOffset, wc, |
920 |
< |
wc + (ONE_RUNNING|ONE_TOTAL))) |
921 |
< |
addWorker(); |
901 |
> |
private void tryShutdownUnusedWorker(int ec) { |
902 |
> |
if (runState == 0 && eventCount == ec) { // only trigger if all idle |
903 |
> |
ForkJoinWorkerThread[] ws = workers; |
904 |
> |
int n = ws.length; |
905 |
> |
ForkJoinWorkerThread w = null; |
906 |
> |
boolean shutdown = false; |
907 |
> |
int sw; |
908 |
> |
long h; |
909 |
> |
if ((sw = spareWaiters) != 0) { // prefer killing spares |
910 |
> |
int id = (sw & SPARE_ID_MASK) - 1; |
911 |
> |
if (id >= 0 && id < n && (w = ws[id]) != null && |
912 |
> |
UNSAFE.compareAndSwapInt(this, spareWaitersOffset, |
913 |
> |
sw, w.nextSpare)) |
914 |
> |
shutdown = true; |
915 |
> |
} |
916 |
> |
else if ((h = eventWaiters) != 0L) { |
917 |
> |
long nh; |
918 |
> |
int id = ((int)(h & WAITER_ID_MASK)) - 1; |
919 |
> |
if (id >= 0 && id < n && (w = ws[id]) != null && |
920 |
> |
(nh = w.nextWaiter) != 0L && // keep at least one worker |
921 |
> |
UNSAFE.compareAndSwapLong(this, eventWaitersOffset, h, nh)) |
922 |
> |
shutdown = true; |
923 |
> |
} |
924 |
> |
if (w != null && shutdown) { |
925 |
> |
w.shutdown(); |
926 |
> |
LockSupport.unpark(w); |
927 |
> |
} |
928 |
|
} |
929 |
+ |
releaseEventWaiters(); // in case of interference |
930 |
|
} |
931 |
|
|
932 |
|
/** |
933 |
|
* Callback from workers invoked upon each top-level action (i.e., |
934 |
< |
* stealing a task or taking a submission and running |
935 |
< |
* it). Performs one or more of the following: |
918 |
< |
* |
919 |
< |
* 1. If the worker cannot find work (misses > 0), updates its |
920 |
< |
* active status to inactive and updates activeCount unless |
921 |
< |
* this is the first miss and there is contention, in which |
922 |
< |
* case it may try again (either in this or a subsequent |
923 |
< |
* call). |
924 |
< |
* |
925 |
< |
* 2. If there are at least 2 misses, awaits the next task event |
926 |
< |
* via eventSync |
927 |
< |
* |
928 |
< |
* 3. If there are too many running threads, suspends this worker |
929 |
< |
* (first forcing inactivation if necessary). If it is not |
930 |
< |
* needed, it may be killed while suspended via |
931 |
< |
* tryTrimSpare. Otherwise, upon resume it rechecks to make |
932 |
< |
* sure that it is still needed. |
934 |
> |
* stealing a task or taking a submission and running it). |
935 |
> |
* Performs one or more of the following: |
936 |
|
* |
937 |
< |
* 4. Helps release and/or reactivate other workers via |
938 |
< |
* helpMaintainParallelism |
937 |
> |
* 1. If the worker is active and either did not run a task |
938 |
> |
* or there are too many workers, try to set its active status |
939 |
> |
* to inactive and update activeCount. On contention, we may |
940 |
> |
* try again in this or a subsequent call. |
941 |
> |
* |
942 |
> |
* 2. If not enough total workers, help create some. |
943 |
> |
* |
944 |
> |
* 3. If there are too many running workers, suspend this worker |
945 |
> |
* (first forcing inactive if necessary). If it is not needed, |
946 |
> |
* it may be shutdown while suspended (via |
947 |
> |
* tryShutdownUnusedWorker). Otherwise, upon resume it |
948 |
> |
* rechecks running thread count and need for event sync. |
949 |
> |
* |
950 |
> |
* 4. If worker did not run a task, await the next task event via |
951 |
> |
* eventSync if necessary (first forcing inactivation), upon |
952 |
> |
* which the worker may be shutdown via |
953 |
> |
* tryShutdownUnusedWorker. Otherwise, help release any |
954 |
> |
* existing event waiters that are now releasable. |
955 |
|
* |
956 |
|
* @param w the worker |
957 |
< |
* @param misses the number of scans by caller failing to find work |
939 |
< |
* (saturating at 2 just to avoid wraparound) |
957 |
> |
* @param ran true if worker ran a task since last call to this method |
958 |
|
*/ |
959 |
< |
final void preStep(ForkJoinWorkerThread w, int misses) { |
959 |
> |
final void preStep(ForkJoinWorkerThread w, boolean ran) { |
960 |
> |
int wec = w.lastEventCount; |
961 |
|
boolean active = w.active; |
962 |
+ |
boolean inactivate = false; |
963 |
|
int pc = parallelism; |
964 |
< |
for (;;) { |
964 |
> |
int rs; |
965 |
> |
while (w.runState == 0 && (rs = runState) < TERMINATING) { |
966 |
> |
if ((inactivate || (active && (rs & ACTIVE_COUNT_MASK) >= pc)) && |
967 |
> |
UNSAFE.compareAndSwapInt(this, runStateOffset, rs, rs - 1)) |
968 |
> |
inactivate = active = w.active = false; |
969 |
|
int wc = workerCounts; |
970 |
< |
int rc = wc & RUNNING_COUNT_MASK; |
971 |
< |
if (active && (misses > 0 || rc > pc)) { |
972 |
< |
int rs; // try inactivate |
949 |
< |
if (UNSAFE.compareAndSwapInt(this, runStateOffset, |
950 |
< |
rs = runState, rs - ONE_ACTIVE)) |
951 |
< |
active = w.active = false; |
952 |
< |
else if (misses > 1 || rc > pc || |
953 |
< |
(rs & ACTIVE_COUNT_MASK) >= pc) |
954 |
< |
continue; // force inactivate |
955 |
< |
} |
956 |
< |
if (misses > 1) { |
957 |
< |
misses = 0; // don't re-sync |
958 |
< |
eventSync(w); // continue loop to recheck rc |
959 |
< |
} |
960 |
< |
else if (rc > pc) { |
961 |
< |
if (workerCounts == wc && // try to suspend as spare |
970 |
> |
if ((wc & RUNNING_COUNT_MASK) > pc) { |
971 |
> |
if (!(inactivate |= active) && // must inactivate to suspend |
972 |
> |
workerCounts == wc && // try to suspend as spare |
973 |
|
UNSAFE.compareAndSwapInt(this, workerCountsOffset, |
974 |
< |
wc, wc - ONE_RUNNING) && |
975 |
< |
!w.suspendAsSpare()) // false if killed |
974 |
> |
wc, wc - ONE_RUNNING)) |
975 |
> |
w.suspendAsSpare(); |
976 |
> |
} |
977 |
> |
else if ((wc >>> TOTAL_COUNT_SHIFT) < pc) |
978 |
> |
helpMaintainParallelism(); // not enough workers |
979 |
> |
else if (!ran) { |
980 |
> |
long h = eventWaiters; |
981 |
> |
int ec = eventCount; |
982 |
> |
if (h != 0L && (int)(h >>> EVENT_COUNT_SHIFT) != ec) |
983 |
> |
releaseEventWaiters(); // release others before waiting |
984 |
> |
else if (ec != wec) { |
985 |
> |
w.lastEventCount = ec; // no need to wait |
986 |
|
break; |
987 |
+ |
} |
988 |
+ |
else if (!(inactivate |= active)) |
989 |
+ |
eventSync(w, wec); // must inactivate before sync |
990 |
|
} |
991 |
< |
else { |
968 |
< |
if (rc < pc || eventWaiters != 0L) |
969 |
< |
helpMaintainParallelism(); |
991 |
> |
else |
992 |
|
break; |
971 |
– |
} |
993 |
|
} |
994 |
|
} |
995 |
|
|
996 |
|
/** |
997 |
|
* Helps and/or blocks awaiting join of the given task. |
998 |
< |
* Alternates between helpJoinTask() and helpMaintainParallelism() |
978 |
< |
* as many times as there is a deficit in running count (or longer |
979 |
< |
* if running count would become zero), then blocks if task still |
980 |
< |
* not done. |
998 |
> |
* See above for explanation. |
999 |
|
* |
1000 |
|
* @param joinMe the task to join |
1001 |
+ |
* @param worker the current worker thread |
1002 |
|
*/ |
1003 |
|
final void awaitJoin(ForkJoinTask<?> joinMe, ForkJoinWorkerThread worker) { |
1004 |
< |
int threshold = parallelism; // descend blocking thresholds |
1004 |
> |
int retries = 2 + (parallelism >> 2); // #helpJoins before blocking |
1005 |
|
while (joinMe.status >= 0) { |
1006 |
< |
boolean block; int wc; |
1006 |
> |
int wc; |
1007 |
|
worker.helpJoinTask(joinMe); |
1008 |
|
if (joinMe.status < 0) |
1009 |
|
break; |
1010 |
< |
if (((wc = workerCounts) & RUNNING_COUNT_MASK) <= threshold) { |
1011 |
< |
if (threshold > 0) |
1012 |
< |
--threshold; |
1013 |
< |
else |
1014 |
< |
advanceEventCount(); // force release |
1015 |
< |
block = false; |
1016 |
< |
} |
1017 |
< |
else |
1018 |
< |
block = UNSAFE.compareAndSwapInt(this, workerCountsOffset, |
1019 |
< |
wc, wc - ONE_RUNNING); |
1020 |
< |
helpMaintainParallelism(); |
1021 |
< |
if (block) { |
1022 |
< |
int c; |
1023 |
< |
joinMe.internalAwaitDone(); |
1010 |
> |
else if (retries > 0) |
1011 |
> |
--retries; |
1012 |
> |
else if (((wc = workerCounts) & RUNNING_COUNT_MASK) != 0 && |
1013 |
> |
UNSAFE.compareAndSwapInt(this, workerCountsOffset, |
1014 |
> |
wc, wc - ONE_RUNNING)) { |
1015 |
> |
int stat, c; long h; |
1016 |
> |
while ((stat = joinMe.status) >= 0 && |
1017 |
> |
(h = eventWaiters) != 0L && // help release others |
1018 |
> |
(int)(h >>> EVENT_COUNT_SHIFT) != eventCount) |
1019 |
> |
releaseEventWaiters(); |
1020 |
> |
if (stat >= 0 && |
1021 |
> |
((workerCounts & RUNNING_COUNT_MASK) == 0 || |
1022 |
> |
(stat = |
1023 |
> |
joinMe.internalAwaitDone(JOIN_TIMEOUT_MILLIS)) >= 0)) |
1024 |
> |
helpMaintainParallelism(); // timeout or no running workers |
1025 |
|
do {} while (!UNSAFE.compareAndSwapInt |
1026 |
|
(this, workerCountsOffset, |
1027 |
|
c = workerCounts, c + ONE_RUNNING)); |
1028 |
< |
break; |
1028 |
> |
if (stat < 0) |
1029 |
> |
break; // else restart |
1030 |
|
} |
1031 |
|
} |
1032 |
|
} |
1033 |
|
|
1034 |
|
/** |
1035 |
< |
* Same idea as awaitJoin, but no helping |
1035 |
> |
* Same idea as awaitJoin, but no helping, retries, or timeouts. |
1036 |
|
*/ |
1037 |
|
final void awaitBlocker(ManagedBlocker blocker) |
1038 |
|
throws InterruptedException { |
1018 |
– |
int threshold = parallelism; |
1039 |
|
while (!blocker.isReleasable()) { |
1040 |
< |
boolean block; int wc; |
1041 |
< |
if (((wc = workerCounts) & RUNNING_COUNT_MASK) <= threshold) { |
1042 |
< |
if (threshold > 0) |
1043 |
< |
--threshold; |
1024 |
< |
else |
1025 |
< |
advanceEventCount(); |
1026 |
< |
block = false; |
1027 |
< |
} |
1028 |
< |
else |
1029 |
< |
block = UNSAFE.compareAndSwapInt(this, workerCountsOffset, |
1030 |
< |
wc, wc - ONE_RUNNING); |
1031 |
< |
helpMaintainParallelism(); |
1032 |
< |
if (block) { |
1040 |
> |
int wc = workerCounts; |
1041 |
> |
if ((wc & RUNNING_COUNT_MASK) != 0 && |
1042 |
> |
UNSAFE.compareAndSwapInt(this, workerCountsOffset, |
1043 |
> |
wc, wc - ONE_RUNNING)) { |
1044 |
|
try { |
1045 |
< |
do {} while (!blocker.isReleasable() && !blocker.block()); |
1045 |
> |
while (!blocker.isReleasable()) { |
1046 |
> |
long h = eventWaiters; |
1047 |
> |
if (h != 0L && |
1048 |
> |
(int)(h >>> EVENT_COUNT_SHIFT) != eventCount) |
1049 |
> |
releaseEventWaiters(); |
1050 |
> |
else if ((workerCounts & RUNNING_COUNT_MASK) == 0 && |
1051 |
> |
runState < TERMINATING) |
1052 |
> |
helpMaintainParallelism(); |
1053 |
> |
else if (blocker.block()) |
1054 |
> |
break; |
1055 |
> |
} |
1056 |
|
} finally { |
1057 |
|
int c; |
1058 |
|
do {} while (!UNSAFE.compareAndSwapInt |
1094 |
|
* Actions on transition to TERMINATING |
1095 |
|
* |
1096 |
|
* Runs up to four passes through workers: (0) shutting down each |
1097 |
< |
* quietly (without waking up if parked) to quickly spread |
1098 |
< |
* notifications without unnecessary bouncing around event queues |
1099 |
< |
* etc (1) wake up and help cancel tasks (2) interrupt (3) mop up |
1100 |
< |
* races with interrupted workers |
1097 |
> |
* (without waking up if parked) to quickly spread notifications |
1098 |
> |
* without unnecessary bouncing around event queues etc (1) wake |
1099 |
> |
* up and help cancel tasks (2) interrupt (3) mop up races with |
1100 |
> |
* interrupted workers |
1101 |
|
*/ |
1102 |
|
private void startTerminating() { |
1103 |
|
cancelSubmissions(); |
1104 |
|
for (int passes = 0; passes < 4 && workerCounts != 0; ++passes) { |
1105 |
< |
advanceEventCount(); |
1105 |
> |
int c; // advance event count |
1106 |
> |
UNSAFE.compareAndSwapInt(this, eventCountOffset, |
1107 |
> |
c = eventCount, c+1); |
1108 |
|
eventWaiters = 0L; // clobber lists |
1109 |
|
spareWaiters = 0; |
1110 |
|
ForkJoinWorkerThread[] ws = workers; |
1112 |
|
for (int i = 0; i < n; ++i) { |
1113 |
|
ForkJoinWorkerThread w = ws[i]; |
1114 |
|
if (w != null) { |
1115 |
< |
w.shutdown(true); |
1115 |
> |
w.shutdown(); |
1116 |
|
if (passes > 0 && !w.isTerminated()) { |
1117 |
|
w.cancelTasks(); |
1118 |
|
LockSupport.unpark(w); |
1260 |
|
this.workerLock = new ReentrantLock(); |
1261 |
|
this.termination = new Phaser(1); |
1262 |
|
this.poolNumber = poolNumberGenerator.incrementAndGet(); |
1240 |
– |
this.trimTime = System.nanoTime(); |
1263 |
|
} |
1264 |
|
|
1265 |
|
/** |
1267 |
|
* @param pc the initial parallelism level |
1268 |
|
*/ |
1269 |
|
private static int initialArraySizeFor(int pc) { |
1270 |
< |
// See Hacker's Delight, sec 3.2. We know MAX_WORKERS < (1 << 16) |
1270 |
> |
// If possible, initially allocate enough space for one spare |
1271 |
|
int size = pc < MAX_WORKERS ? pc + 1 : MAX_WORKERS; |
1272 |
+ |
// See Hacker's Delight, sec 3.2. We know MAX_WORKERS < (1 << 16) |
1273 |
|
size |= size >>> 1; |
1274 |
|
size |= size >>> 2; |
1275 |
|
size |= size >>> 4; |
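The smearing shown here is presumably completed by a size |= size >>> 8 step and a final increment in the lines the diff elides; for reference, the full Hacker's Delight, sec 3.2 construction under the MAX_WORKERS < (1 << 16) bound, with a worked example (a sketch under that assumption):

    final class PowerOfTwoSketch {
        // Smear the highest set bit into every lower position, then add
        // one: the result is the smallest power of two strictly greater
        // than size. Four smears suffice for 16-bit inputs.
        static int ceilingPowerOfTwo(int size) { // assumes 0 < size < (1 << 16)
            size |= size >>> 1;
            size |= size >>> 2;
            size |= size >>> 4;
            size |= size >>> 8;
            return size + 1;
        }

        public static void main(String[] args) {
            // 10 (0b01010) smears to 0b01111 = 15, and 15 + 1 = 16.
            System.out.println(ceilingPowerOfTwo(10)); // 16
        }
    }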
1288 |
|
if (runState >= SHUTDOWN) |
1289 |
|
throw new RejectedExecutionException(); |
1290 |
|
submissionQueue.offer(task); |
1291 |
< |
advanceEventCount(); |
1292 |
< |
helpMaintainParallelism(); // start or wake up workers |
1291 |
> |
int c; // try to increment event count -- CAS failure OK |
1292 |
> |
UNSAFE.compareAndSwapInt(this, eventCountOffset, c = eventCount, c+1); |
1293 |
> |
helpMaintainParallelism(); // create, start, or resume some workers |
1294 |
|
} |
1295 |
|
|
1296 |
|
/** |
1297 |
|
* Performs the given task, returning its result upon completion. |
1274 |
– |
* If the caller is already engaged in a fork/join computation in |
1275 |
– |
* the current pool, this method is equivalent in effect to |
1276 |
– |
* {@link ForkJoinTask#invoke}. |
1298 |
|
* |
1299 |
|
* @param task the task |
1300 |
|
* @return the task's result |
1309 |
|
|
1310 |
|
/** |
1311 |
|
* Arranges for (asynchronous) execution of the given task. |
1291 |
– |
* If the caller is already engaged in a fork/join computation in |
1292 |
– |
* the current pool, this method is equivalent in effect to |
1293 |
– |
* {@link ForkJoinTask#fork}. |
1312 |
|
* |
1313 |
|
* @param task the task |
1314 |
|
* @throws NullPointerException if the task is null |
1337 |
|
|
1338 |
|
/** |
1339 |
|
* Submits a ForkJoinTask for execution. |
1322 |
– |
* If the caller is already engaged in a fork/join computation in |
1323 |
– |
* the current pool, this method is equivalent in effect to |
1324 |
– |
* {@link ForkJoinTask#fork}. |
1340 |
|
* |
1341 |
|
* @param task the task to submit |
1342 |
|
* @return the task |
1768 |
|
* QueueTaker(BlockingQueue<E> q) { this.queue = q; } |
1769 |
|
* public boolean block() throws InterruptedException { |
1770 |
|
* if (item == null) |
1771 |
< |
* item = queue.take |
1771 |
> |
* item = queue.take(); |
1772 |
|
* return true; |
1773 |
|
* } |
1774 |
|
* public boolean isReleasable() { |
1775 |
< |
* return item != null || (item = queue.poll) != null; |
1775 |
> |
* return item != null || (item = queue.poll()) != null; |
1776 |
|
* } |
1777 |
|
* public E getItem() { // call after pool.managedBlock completes |
1778 |
|
* return item; |