[ViewVC] Diff of: jsr166/jsr166/src/jsr166y/ForkJoinPool.java

Comparing jsr166/src/jsr166y/ForkJoinPool.java (file contents):
Revision 1.57 by dl, Wed Jul 7 19:52:31 2010 UTC vs.
Revision 1.58 by dl, Fri Jul 23 13:07:43 2010 UTC

#	Line 60 \| Line 60 \| import java.util.concurrent.CountDownLat
60		* Runnable}- or {@code Callable}- based activities as well. However,
61		* tasks that are already executing in a pool should normally
62		* <em>NOT</em> use these pool execution methods, but instead use the
63	<	* within-computation forms listed in the table. To avoid inadvertant
64	<	* cyclic task dependencies and to improve performance, task
65	<	* submissions to the current pool by an ongoing fork/join
66	<	* computations may be implicitly translated to the corresponding
67	<	* ForkJoinTask forms.
63	>	* within-computation forms listed in the table.
64		*
65		* <table BORDER CELLPADDING=3 CELLSPACING=1>
66		* <tr>
#	Line 113 \| Line 109 \| import java.util.concurrent.CountDownLat
109		* {@code IllegalArgumentException}.
110		*
111		* <p>This implementation rejects submitted tasks (that is, by throwing
112	<	* {@link RejectedExecutionException}) only when the pool is shut down.
112	>	* {@link RejectedExecutionException}) only when the pool is shut down
113	>	* or internal resources have been exhausted.
114		*
115		* @since 1.7
116		* @author Doug Lea
#	Line 140 \| Line 137 \| public class ForkJoinPool extends Abstra
137		* of tasks profit from cache affinities, but others are harmed by
138		* cache pollution effects.)
139		*
140	+	* Beyond work-stealing support and essential bookkeeping, the
141	+	* main responsibility of this framework is to arrange tactics for
142	+	* when one worker is waiting to join a task stolen (or always
143	+	* held by) another. Because we are multiplexing many tasks on to
144	+	* a pool of workers, we can't just let them block (as in
145	+	* Thread.join). We also cannot just reassign the joiner's
146	+	* run-time stack with another and replace it later, which would
147	+	* be a form of "continuation", that even if possible is not
148	+	* necessarily a good idea. (Given that the creation costs of most
149	+	* threads on most systems mainly surround setting up runtime
150	+	* stacks, thread creation and switching are usually not much more
151	+	* expensive than stack creation and switching, and are more
152	+	* flexible.) Instead we combine two tactics:
153	+	*
154	+	* 1. Arranging for the joiner to execute some task that it
155	+	* would be running if the steal had not occurred. Method
156	+	* ForkJoinWorkerThread.helpJoinTask tracks joining->stealing
157	+	* links to try to find such a task.
158	+	*
159	+	* 2. Unless there are already enough live threads, creating or
160	+	* re-activating a spare thread to compensate for the
161	+	* (blocked) joiner until it unblocks. Spares then suspend
162	+	* at their next opportunity or eventually die if unused for
163	+	* too long. See below and the internal documentation
164	+	* for tryAwaitJoin for more details about compensation
165	+	* rules.
166	+	*
167	+	* Because determining the existence of conservatively safe
168	+	* helping targets, the availability of already-created spares,
169	+	* and the apparent need to create new spares are all racy and
170	+	* require heuristic guidance, joins (in
171	+	* ForkJoinWorkerThread.joinTask) interleave these options until
172	+	* successful. Creating a new spare always succeeds, but also
173	+	* increases application footprint, so we try to avoid it, within
174	+	* reason.
175	+	*
176	+	* The ManagedBlocker extension API can't use option (1) so uses a
177	+	* special version of (2) in method awaitBlocker.
178	+	*
179		* The main throughput advantages of work-stealing stem from
180		* decentralized control -- workers mostly steal tasks from each
181		* other. We do not want to negate this by creating bottlenecks
182	<	* implementing the management responsibilities of this class. So
183	<	* we use a collection of techniques that avoid, reduce, or cope
184	<	* well with contention. These entail several instances of
185	<	* bit-packing into CASable fields to maintain only the minimally
186	<	* required atomicity. To enable such packing, we restrict maximum
187	<	* parallelism to (1<<15)-1 (enabling twice this to fit into a 16
188	<	* bit field), which is far in excess of normal operating range.
189	<	* Even though updates to some of these bookkeeping fields do
190	<	* sometimes contend with each other, they don't normally
191	<	* cache-contend with updates to others enough to warrant memory
192	<	* padding or isolation. So they are all held as fields of
193	<	* ForkJoinPool objects. The main capabilities are as follows:
182	>	* implementing other management responsibilities. So we use a
183	>	* collection of techniques that avoid, reduce, or cope well with
184	>	* contention. These entail several instances of bit-packing into
185	>	* CASable fields to maintain only the minimally required
186	>	* atomicity. To enable such packing, we restrict maximum
187	>	* parallelism to (1<<15)-1 (enabling twice this (to accommodate
188	>	* unbalanced increments and decrements) to fit into a 16 bit
189	>	* field), which is far in excess of normal operating range. Even
190	>	* though updates to some of these bookkeeping fields do sometimes
191	>	* contend with each other, they don't normally cache-contend with
192	>	* updates to others enough to warrant memory padding or
193	>	* isolation. So they are all held as fields of ForkJoinPool
194	>	* objects. The main capabilities are as follows:
195		*
196		* 1. Creating and removing workers. Workers are recorded in the
197		* "workers" array. This is an array as opposed to some other data
#	Line 179 \| Line 216 \| public class ForkJoinPool extends Abstra
216		* that are neither blocked nor artificially suspended) as well as
217		* the total number. These two values are packed into one field,
218		* "workerCounts" because we need accurate snapshots when deciding
219	<	* to create, resume or suspend. To support these decisions,
220	<	* updates to spare counts must be prospective (not
221	<	* retrospective). For example, the running count is decremented
222	<	* before blocking by a thread about to block as a spare, but
186	<	* incremented by the thread about to unblock it. Updates upon
187	<	* resumption ofr threads blocking in awaitJoin or awaitBlocker
188	<	* cannot usually be prospective, so the running count is in
189	<	* general an upper bound of the number of productively running
190	<	* threads Updates to the workerCounts field sometimes transiently
191	<	* encounter a fair amount of contention when join dependencies
192	<	* are such that many threads block or unblock at about the same
193	<	* time. We alleviate this by sometimes performing an alternative
194	<	* action on contention like releasing waiters or locating spares.
219	>	* to create, resume or suspend. Note however that the
220	>	* correspondence of these counts to reality is not guaranteed. In
221	>	* particular updates for unblocked threads may lag until they
222	>	* actually wake up.
223		*
224		* 3. Maintaining global run state. The run state of the pool
225		* consists of a runLevel (SHUTDOWN, TERMINATING, etc) similar to
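
The packing of running and total counts into the single "workerCounts" field, described in item 2 of the hunk above, can be illustrated with a minimal sketch. It is not the pool's code: it uses `AtomicInteger` in place of `Unsafe`, the class name `PackedCountsSketch` is hypothetical, and the 16-bit layout (total count in the high half, running count in the low half) is an assumption based on the "16 bit field" remark. The constant names are borrowed from the diff, but their values here are assumed.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; not the ForkJoinPool implementation.
final class PackedCountsSketch {
    // Assumed layout: total count in the high 16 bits, running count in the low 16.
    static final int TOTAL_COUNT_SHIFT = 16;
    static final int RUNNING_COUNT_MASK = (1 << TOTAL_COUNT_SHIFT) - 1;
    static final int ONE_RUNNING = 1;
    static final int ONE_TOTAL = 1 << TOTAL_COUNT_SHIFT;

    private final AtomicInteger workerCounts = new AtomicInteger();

    static int running(int wc) { return wc & RUNNING_COUNT_MASK; }
    static int total(int wc)   { return wc >>> TOTAL_COUNT_SHIFT; }

    /** Adds one worker to both counts in a single atomic update. */
    void addWorker() {
        for (;;) {
            int wc = workerCounts.get();
            if (workerCounts.compareAndSet(wc, wc + (ONE_RUNNING | ONE_TOTAL)))
                return;
        }
    }

    /** One-shot attempt to drop the running count; fails if zero or contended. */
    boolean tryDecrementRunning() {
        int wc = workerCounts.get();
        return running(wc) > 0 && workerCounts.compareAndSet(wc, wc - ONE_RUNNING);
    }

    /** Both counts, read together from one field. */
    int snapshot() { return workerCounts.get(); }

    public static void main(String[] args) {
        PackedCountsSketch p = new PackedCountsSketch();
        p.addWorker();
        p.addWorker();
        p.tryDecrementRunning();
        int wc = p.snapshot();
        System.out.println("running=" + running(wc) + " total=" + total(wc));
    }
}
```

Because both counts live in one word, a single read yields a mutually consistent pair, which is the property the comment asks for when deciding whether to create, resume or suspend.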
#	Line 249 \| Line 277 \| public class ForkJoinPool extends Abstra
277		* 5. Managing suspension of extra workers. When a worker is about
278		* to block waiting for a join (or via ManagedBlockers), we may
279		* create a new thread to maintain parallelism level, or at least
280	<	* avoid starvation (see below). Usually, extra threads are needed
281	<	* for only very short periods, yet join dependencies are such
282	<	* that we sometimes need them in bursts. Rather than create new
283	<	* threads each time this happens, we suspend no-longer-needed
284	<	* extra ones as "spares". For most purposes, we don't distinguish
285	<	* "extra" spare threads from normal "core" threads: On each call
286	<	* to preStep (the only point at which we can do this) a worker
280	>	* avoid starvation. Usually, extra threads are needed for only
281	>	* very short periods, yet join dependencies are such that we
282	>	* sometimes need them in bursts. Rather than create new threads
283	>	* each time this happens, we suspend no-longer-needed extra ones
284	>	* as "spares". For most purposes, we don't distinguish "extra"
285	>	* spare threads from normal "core" threads: On each call to
286	>	* preStep (the only point at which we can do this) a worker
287		* checks to see if there are now too many running workers, and if
288	<	* so, suspends itself. Methods awaitJoin and awaitBlocker look
289	<	* for suspended threads to resume before considering creating a
290	<	* new replacement. We don't need a special data structure to
291	<	* maintain spares; simply scanning the workers array looking for
292	<	* worker.isSuspended() is fine because the calling thread is
293	<	* otherwise not doing anything useful anyway; we are at least as
294	<	* happy if after locating a spare, the caller doesn't actually
295	<	* block because the join is ready before we try to adjust and
296	<	* compensate. Note that this is intrinsically racy. One thread
297	<	* may become a spare at about the same time as another is
298	<	* needlessly being created. We counteract this and related slop
299	<	* in part by requiring resumed spares to immediately recheck (in
300	<	* preStep) to see whether they they should re-suspend. The only
301	<	* effective difference between "extra" and "core" threads is that
302	<	* we allow the "extra" ones to time out and die if they are not
303	<	* resumed within a keep-alive interval of a few seconds. This is
304	<	* implemented mainly within ForkJoinWorkerThread, but requires
288	>	* so, suspends itself. Methods tryAwaitJoin and awaitBlocker
289	>	* look for suspended threads to resume before considering
290	>	* creating a new replacement. We don't need a special data
291	>	* structure to maintain spares; simply scanning the workers array
292	>	* looking for worker.isSuspended() is fine because the calling
293	>	* thread is otherwise not doing anything useful anyway; we are at
294	>	* least as happy if after locating a spare, the caller doesn't
295	>	* actually block because the join is ready before we try to
296	>	* adjust and compensate. Note that this is intrinsically racy.
297	>	* One thread may become a spare at about the same time as another
298	>	* is needlessly being created. We counteract this and related
299	>	* slop in part by requiring resumed spares to immediately recheck
300	>	* (in preStep) to see whether they should re-suspend. The
301	>	* only effective difference between "extra" and "core" threads is
302	>	* that we allow the "extra" ones to time out and die if they are
303	>	* not resumed within a keep-alive interval of a few seconds. This
304	>	* is implemented mainly within ForkJoinWorkerThread, but requires
305		* some coordination (isTrimmed() -- meaning killed while
306		* suspended) to correctly maintain pool counts.
307		*
308		* 6. Deciding when to create new workers. The main dynamic
309		* control in this class is deciding when to create extra threads,
310		* in methods awaitJoin and awaitBlocker. We always need to create
311	<	* one when the number of running threads becomes zero. But
312	<	* because blocked joins are typically dependent, we don't
313	<	* necessarily need or want one-to-one replacement. Instead, we
314	<	* use a combination of heuristics that adds threads only when the
315	<	* pool appears to be approaching starvation. These effectively
316	<	* reduce churn at the price of systematically undershooting
317	<	* target parallelism when many threads are blocked. However,
318	<	* biasing toward undeshooting partially compensates for the above
319	<	* mechanics to suspend extra threads, that normally lead to
320	<	* overshoot because we can only suspend workers in-between
321	<	* top-level actions. It also better copes with the fact that some
322	<	* of the methods in this class tend to never become compiled (but
323	<	* are interpreted), so some components of the entire set of
324	<	* controls might execute many times faster than others. And
325	<	* similarly for cases where the apparent lack of work is just due
298	<	* to GC stalls and other transient system activity.
311	>	* one when the number of running threads would become zero and
312	>	* all workers are busy. However, this is not easy to detect
313	>	* reliably in the presence of transients so we use retries and
314	>	* allow slack (in tryAwaitJoin) to reduce false alarms. These
315	>	* effectively reduce churn at the price of systematically
316	>	* undershooting target parallelism when many threads are blocked.
317	>	* However, biasing toward undershooting partially compensates for
318	>	* the above mechanics to suspend extra threads, that normally
319	>	* lead to overshoot because we can only suspend workers
320	>	* in-between top-level actions. It also better copes with the
321	>	* fact that some of the methods in this class tend to never
322	>	* become compiled (but are interpreted), so some components of
323	>	* the entire set of controls might execute many times faster than
324	>	* others. And similarly for cases where the apparent lack of work
325	>	* is just due to GC stalls and other transient system activity.
326		*
327		* Beware that there is a lot of representation-level coupling
328		* among classes ForkJoinPool, ForkJoinWorkerThread, and
#	Line 310 \| Line 337 \| public class ForkJoinPool extends Abstra
337		* "while ((local = field) != 0)") which are usually the simplest
338		* way to ensure read orderings. Also several occurrences of the
339		* unusual "do {} while(!cas...)" which is the simplest way to
340	<	* force an update of a CAS'ed variable. There are also a few
341	<	* other coding oddities that help some methods perform reasonably
342	<	* even when interpreted (not compiled).
340	>	* force an update of a CAS'ed variable. There are also other
341	>	* coding oddities that help some methods perform reasonably even
342	>	* when interpreted (not compiled), at the expense of messiness.
343		*
344		* The order of declarations in this file is: (1) statics (2)
345		* fields (along with constants used when unpacking some of them)
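
The two CAS styles mentioned in the comment above, the forced-update `do {} while(!cas...)` loop and the one-shot "try" form that fails on contention, can be contrasted in a small sketch. It assumes `AtomicInteger` as a stand-in for the `Unsafe`-based field access in the file, and the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; AtomicInteger stands in for Unsafe CAS on a volatile int.
final class CasIdiomsSketch {
    private final AtomicInteger count = new AtomicInteger();

    /** Forced update: retries until the CAS succeeds. */
    void increment() {
        int c;
        do {} while (!count.compareAndSet(c = count.get(), c + 1));
    }

    /** One-shot update: returns false if another thread changed the value first. */
    boolean tryIncrement() {
        int c;
        return count.compareAndSet(c = count.get(), c + 1);
    }

    int get() { return count.get(); }

    public static void main(String[] args) {
        CasIdiomsSketch s = new CasIdiomsSketch();
        s.increment();
        System.out.println(s.tryIncrement() + " " + s.get());
    }
}
```

The one-shot form lets a caller do something else useful on contention instead of spinning, which matches how the "try" methods in this revision are described.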
#	Line 431 \| Line 458 \| public class ForkJoinPool extends Abstra
458		private volatile long eventWaiters;
459
460		private static final int EVENT_COUNT_SHIFT = 32;
461	<	private static final long WAITER_INDEX_MASK = (1L << EVENT_COUNT_SHIFT)-1L;
461	>	private static final long WAITER_ID_MASK = (1L << EVENT_COUNT_SHIFT)-1L;
462
463		/**
464		* A counter for events that may wake up worker threads:
#	Line 503 \| Line 530 \| public class ForkJoinPool extends Abstra
530		*/
531		private final int poolNumber;
532
533	<	// utilities for updating fields
533	>	// Utilities for CASing fields. Note that several of these
534	>	// are manually inlined by callers
535
536		/**
537		* Increments running count. Also used by ForkJoinTask.
#	Line 514 \| Line 542 \| public class ForkJoinPool extends Abstra
542		c = workerCounts,
543		c + ONE_RUNNING));
544		}
545	<
545	>
546		/**
547		* Tries to decrement running count unless already zero
548		*/
#	Line 527 \| Line 555 \| public class ForkJoinPool extends Abstra
555		}
556
557		/**
558	+	* Tries to increment running count
559	+	*/
560	+	final boolean tryIncrementRunningCount() {
561	+	int wc;
562	+	return UNSAFE.compareAndSwapInt(this, workerCountsOffset,
563	+	wc = workerCounts, wc + ONE_RUNNING);
564	+	}
565	+
566	+	/**
567		* Tries incrementing active count; fails on contention.
568		* Called by workers before executing tasks.
569		*
#	Line 635 \| Line 672 \| public class ForkJoinPool extends Abstra
672		private void onWorkerCreationFailure() {
673		for (;;) {
674		int wc = workerCounts;
675	<	if ((wc >>> TOTAL_COUNT_SHIFT) > 0 &&
676	<	UNSAFE.compareAndSwapInt(this, workerCountsOffset,
677	<	wc, wc - (ONE_RUNNING\|ONE_TOTAL)))
675	>	if ((wc >>> TOTAL_COUNT_SHIFT) == 0)
676	>	Thread.yield(); // wait for other counts to settle
677	>	else if (UNSAFE.compareAndSwapInt(this, workerCountsOffset, wc,
678	>	wc - (ONE_RUNNING\|ONE_TOTAL)))
679		break;
680		}
681		tryTerminate(false); // in case of failure during shutdown
682		}
683
684		/**
685	<	* Create enough total workers to establish target parallelism,
686	<	* giving up if terminating or addWorker fails
685	>	* Creates and/or resumes enough workers to establish target
686	>	* parallelism, giving up if terminating or addWorker fails
687	>	*
688	>	* TODO: recast this to support lazier creation and automated
689	>	* parallelism maintenance
690		*/
691	<	private void ensureEnoughTotalWorkers() {
692	<	int wc;
693	<	while (((wc = workerCounts) >>> TOTAL_COUNT_SHIFT) < parallelism &&
694	<	runState < TERMINATING) {
695	<	if ((UNSAFE.compareAndSwapInt(this, workerCountsOffset,
696	<	wc, wc + (ONE_RUNNING\|ONE_TOTAL)) &&
697	<	addWorker() == null))
691	>	private void ensureEnoughWorkers() {
692	>	for (;;) {
693	>	int pc = parallelism;
694	>	int wc = workerCounts;
695	>	int rc = wc & RUNNING_COUNT_MASK;
696	>	int tc = wc >>> TOTAL_COUNT_SHIFT;
697	>	if (tc < pc) {
698	>	if (runState == TERMINATING \|\|
699	>	(UNSAFE.compareAndSwapInt
700	>	(this, workerCountsOffset,
701	>	wc, wc + (ONE_RUNNING\|ONE_TOTAL)) &&
702	>	addWorker() == null))
703	>	break;
704	>	}
705	>	else if (tc > pc && rc < pc &&
706	>	tc > (runState & ACTIVE_COUNT_MASK)) {
707	>	ForkJoinWorkerThread spare = null;
708	>	ForkJoinWorkerThread[] ws = workers;
709	>	int nws = ws.length;
710	>	for (int i = 0; i < nws; ++i) {
711	>	ForkJoinWorkerThread w = ws[i];
712	>	if (w != null && w.isSuspended()) {
713	>	if ((workerCounts & RUNNING_COUNT_MASK) > pc \|\|
714	>	runState == TERMINATING)
715	>	return;
716	>	if (w.tryResumeSpare())
717	>	incrementRunningCount();
718	>	break;
719	>	}
720	>	}
721	>	}
722	>	else
723		break;
724		}
725		}
#	Line 689 \| Line 755 \| public class ForkJoinPool extends Abstra
755
756		accumulateStealCount(w); // collect final count
757		if (!tryTerminate(false))
758	<	ensureEnoughTotalWorkers();
758	>	ensureEnoughWorkers();
759		}
760
761		// Waiting for and signalling events
762
763		/**
764		* Releases workers blocked on a count not equal to current count.
765	+	* @return true if any released
766		*/
767		private void releaseWaiters() {
768		long top;
769	<	int id;
703	<	while ((id = (int)((top = eventWaiters) & WAITER_INDEX_MASK)) > 0 &&
704	<	(int)(top >>> EVENT_COUNT_SHIFT) != eventCount) {
769	>	while ((top = eventWaiters) != 0L) {
770		ForkJoinWorkerThread[] ws = workers;
771	<	ForkJoinWorkerThread w;
772	<	if (ws.length >= id && (w = ws[id - 1]) != null &&
773	<	UNSAFE.compareAndSwapLong(this, eventWaitersOffset,
774	<	top, w.nextWaiter))
775	<	LockSupport.unpark(w);
771	>	int n = ws.length;
772	>	for (;;) {
773	>	int i = ((int)(top & WAITER_ID_MASK)) - 1;
774	>	if (i < 0 \|\| (int)(top >>> EVENT_COUNT_SHIFT) == eventCount)
775	>	return;
776	>	ForkJoinWorkerThread w;
777	>	if (i < n && (w = ws[i]) != null &&
778	>	UNSAFE.compareAndSwapLong(this, eventWaitersOffset,
779	>	top, w.nextWaiter)) {
780	>	LockSupport.unpark(w);
781	>	top = eventWaiters;
782	>	}
783	>	else
784	>	break; // possibly stale; reread
785	>	}
786		}
787		}
788
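
The way `eventWaiters` combines an event count with a waiter id in one `long` can be sketched as below. The shift and mask values are taken from the declarations shown in this diff, and the encode and decode expressions mirror those in `eventSync` and `releaseWaiters`. The class `WaiterTopSketch` and its method names are hypothetical helpers that do not exist in the file.

```java
// Illustrative sketch only; helper names are hypothetical.
final class WaiterTopSketch {
    static final int EVENT_COUNT_SHIFT = 32;
    static final long WAITER_ID_MASK = (1L << EVENT_COUNT_SHIFT) - 1L;

    /** Packs the count a worker last saw with its pool index plus one. */
    static long encode(int eventCount, int poolIndex) {
        return ((long) eventCount << EVENT_COUNT_SHIFT) | (long) (poolIndex + 1);
    }

    /** Index into the workers array, or -1 if the stack is empty (top == 0). */
    static int index(long top) {
        return ((int) (top & WAITER_ID_MASK)) - 1;
    }

    /** Event count recorded when the waiter was pushed. */
    static int count(long top) {
        return (int) (top >>> EVENT_COUNT_SHIFT);
    }

    public static void main(String[] args) {
        long top = encode(7, 4);
        System.out.println("index=" + index(top) + " count=" + count(top));
    }
}
```

Storing the index plus one keeps zero free to mean "no waiters", which is why the loops above can test `eventWaiters != 0L` before decoding.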
#	Line 727 \| Line 802 \| public class ForkJoinPool extends Abstra
802		* other releasing threads is detected.
803		*/
804		final void signalWork() {
805	<	// EventCount CAS failures are OK -- any change in count suffices.
806	<	int ec;
807	<	UNSAFE.compareAndSwapInt(this, eventCountOffset, ec=eventCount, ec+1);
808	<	outer:for (;;) {
809	<	long top = eventWaiters;
810	<	ec = eventCount;
805	>	int c;
806	>	UNSAFE.compareAndSwapInt(this, eventCountOffset, c=eventCount, c+1);
807	>	long top;
808	>	while ((top = eventWaiters) != 0L) {
809	>	int ec = eventCount;
810	>	ForkJoinWorkerThread[] ws = workers;
811	>	int n = ws.length;
812		for (;;) {
813	<	ForkJoinWorkerThread[] ws; ForkJoinWorkerThread w;
814	<	int id = (int)(top & WAITER_INDEX_MASK);
739	<	if (id <= 0 \|\| (int)(top >>> EVENT_COUNT_SHIFT) == ec)
740	<	return;
741	<	if ((ws = workers).length < id \|\| (w = ws[id - 1]) == null \|\|
742	<	!UNSAFE.compareAndSwapLong(this, eventWaitersOffset,
743	<	top, top = w.nextWaiter))
744	<	continue outer; // possibly stale; reread
745	<	LockSupport.unpark(w);
746	<	if (top != eventWaiters) // let someone else take over
813	>	int i = ((int)(top & WAITER_ID_MASK)) - 1;
814	>	if (i < 0 \|\| (int)(top >>> EVENT_COUNT_SHIFT) == ec)
815		return;
816	+	ForkJoinWorkerThread w;
817	+	if (i < n && (w = ws[i]) != null &&
818	+	UNSAFE.compareAndSwapLong(this, eventWaitersOffset,
819	+	top, top = w.nextWaiter)) {
820	+	LockSupport.unpark(w);
821	+	if (top != eventWaiters) // let someone else take over
822	+	return;
823	+	}
824	+	else
825	+	break; // possibly stale; reread
826		}
827		}
828		}
#	Line 755 \| Line 833 \| public class ForkJoinPool extends Abstra
833		* release others.
834		*
835		* @param w the calling worker thread
836	+	* @param retries the number of scans by caller failing to find work
837	+	* @return false if now too many threads running
838		*/
839	<	private void eventSync(ForkJoinWorkerThread w) {
840	<	if (!w.active) {
841	<	int prev = w.lastEventCount;
842	<	long nextTop = (((long)prev << EVENT_COUNT_SHIFT) \|
839	>	private boolean eventSync(ForkJoinWorkerThread w, int retries) {
840	>	int wec = w.lastEventCount;
841	>	if (retries > 1) { // can only block after 2nd miss
842	>	long nextTop = (((long)wec << EVENT_COUNT_SHIFT) \|
843		((long)(w.poolIndex + 1)));
844		long top;
845		while ((runState < SHUTDOWN \|\| !tryTerminate(false)) &&
846	<	(((int)(top = eventWaiters) & WAITER_INDEX_MASK) == 0 \|\|
847	<	(int)(top >>> EVENT_COUNT_SHIFT) == prev) &&
848	<	eventCount == prev) {
846	>	(((int)(top = eventWaiters) & WAITER_ID_MASK) == 0 \|\|
847	>	(int)(top >>> EVENT_COUNT_SHIFT) == wec) &&
848	>	eventCount == wec) {
849		if (UNSAFE.compareAndSwapLong(this, eventWaitersOffset,
850		w.nextWaiter = top, nextTop)) {
851		accumulateStealCount(w); // transfer steals while idle
852		Thread.interrupted(); // clear/ignore interrupt
853	<	while (eventCount == prev)
853	>	while (eventCount == wec)
854		w.doPark();
855		break;
856		}
857		}
858	<	w.lastEventCount = eventCount;
858	>	wec = eventCount;
859		}
860		releaseWaiters();
861	+	int wc = workerCounts;
862	+	if ((wc & RUNNING_COUNT_MASK) <= parallelism) {
863	+	w.lastEventCount = wec;
864	+	return true;
865	+	}
866	+	if (wec != w.lastEventCount) // back up if may re-wait
867	+	w.lastEventCount = wec - (wc >>> TOTAL_COUNT_SHIFT);
868	+	return false;
869		}
870
871		/**
#	Line 798 \| Line 886 \| public class ForkJoinPool extends Abstra
886		* upon resume it rechecks to make sure that it is still needed.
887		*
888		* @param w the worker
889	<	* @param worked false if the worker scanned for work but didn't
889	>	* @param retries the number of scans by caller failing to find work
890		* find any (in which case it may block waiting for work).
891		*/
892	<	final void preStep(ForkJoinWorkerThread w, boolean worked) {
892	>	final void preStep(ForkJoinWorkerThread w, int retries) {
893		boolean active = w.active;
894	<	boolean inactivate = !worked & active;
894	>	boolean inactivate = active && retries != 0;
895		for (;;) {
896	<	if (inactivate) {
897	<	int rs = runState;
898	<	if (UNSAFE.compareAndSwapInt(this, runStateOffset,
899	<	rs, rs - ONE_ACTIVE))
900	<	inactivate = active = w.active = false;
901	<	}
902	<	int wc = workerCounts;
903	<	if ((wc & RUNNING_COUNT_MASK) <= parallelism) {
816	<	if (!worked)
817	<	eventSync(w);
818	<	return;
896	>	int rs, wc;
897	>	if (inactivate &&
898	>	UNSAFE.compareAndSwapInt(this, runStateOffset,
899	>	rs = runState, rs - ONE_ACTIVE))
900	>	inactivate = active = w.active = false;
901	>	if (((wc = workerCounts) & RUNNING_COUNT_MASK) <= parallelism) {
902	>	if (active \|\| eventSync(w, retries))
903	>	break;
904		}
905	<	if (!(inactivate \|= active) && // must inactivate to suspend
905	>	else if (!(inactivate \|= active) && // must inactivate to suspend
906		UNSAFE.compareAndSwapInt(this, workerCountsOffset,
907		wc, wc - ONE_RUNNING) &&
908	<	!w.suspendAsSpare()) // false if trimmed
909	<	return;
908	>	!w.suspendAsSpare()) // false if trimmed
909	>	break;
910		}
911		}
912
913		/**
914	<	* Tries to decrement running count, and if so, possibly creates
915	<	* or resumes compensating threads before blocking on task joinMe.
916	<	* This code is sprawled out with manual inlining to evade some
917	<	* JIT oddities.
914	>	* Awaits join of the given task if enough threads, or can resume
915	>	* or create a spare. Fails (in which case the given task might
916	>	* not be done) upon contention or lack of decision about
917	>	* blocking. Returns void because caller must check
918	>	* task status on return anyway.
919	>	*
920	>	* We allow blocking if:
921	>	*
922	>	* 1. There would still be at least as many running threads as
923	>	* parallelism level if this thread blocks.
924	>	*
925	>	* 2. A spare is resumed to replace this worker. We tolerate
926	>	* slop in the decision to replace if a spare is found without
927	>	* first decrementing run count. This may release too many,
928	>	* but if so, the superfluous ones will re-suspend via
929	>	* preStep().
930	>	*
931	>	* 3. After #spares repeated checks, there are no fewer than #spare
932	>	* threads not running. We allow this slack to avoid hysteresis
933	>	* and as a hedge against lag/uncertainty of running count
934	>	* estimates when signalling or unblocking stalls.
935	>	*
936	>	* 4. All existing workers are busy (as rechecked via repeated
937	>	* retries by caller) and a new spare is created.
938	>	*
939	>	* If none of the above hold, we try to escape out by
940	>	* re-incrementing count and returning to caller, which can retry
941	>	* later.
942		*
943		* @param joinMe the task to join
944	<	* @return task status on exit
944	>	* @param retries if negative, then serve only as a precheck
945	>	* that the thread can be replaced by a spare. Otherwise,
946	>	* the number of repeated calls to this method returning busy
947	>	* @return true if the call must be retried because
948	>	* none of the blocking checks hold
949		*/
950	<	final int tryAwaitJoin(ForkJoinTask<?> joinMe) {
951	<	int cw = workerCounts; // read now to spoil CAS if counts change as ...
952	<	releaseWaiters(); // ... a byproduct of releaseWaiters
953	<	int stat = joinMe.status;
954	<	if (stat >= 0 && // inline variant of tryDecrementRunningCount
955	<	(cw & RUNNING_COUNT_MASK) > 0 &&
956	<	UNSAFE.compareAndSwapInt(this, workerCountsOffset,
957	<	cw, cw - ONE_RUNNING)) {
958	<	int pc = parallelism;
959	<	int scans = 0; // to require confirming passes to add threads
960	<	outer: while ((workerCounts & RUNNING_COUNT_MASK) < pc) {
961	<	if ((stat = joinMe.status) < 0)
962	<	break;
963	<	ForkJoinWorkerThread spare = null;
964	<	ForkJoinWorkerThread[] ws = workers;
965	<	int nws = ws.length;
966	<	for (int i = 0; i < nws; ++i) {
967	<	ForkJoinWorkerThread w = ws[i];
968	<	if (w != null && w.isSuspended()) {
969	<	spare = w;
970	<	break;
950	>	final boolean tryAwaitJoin(ForkJoinTask<?> joinMe, int retries) {
951	>	if (joinMe.status < 0) // precheck to prime loop
952	>	return false;
953	>	int pc = parallelism;
954	>	boolean running = true; // false when running count decremented
955	>	outer:for (;;) {
956	>	int wc = workerCounts;
957	>	int rc = wc & RUNNING_COUNT_MASK;
958	>	int tc = wc >>> TOTAL_COUNT_SHIFT;
959	>	if (running) { // replace with spare or decrement count
960	>	if (rc <= pc && tc > pc &&
961	>	(retries > 0 \|\| tc > (runState & ACTIVE_COUNT_MASK))) {
962	>	ForkJoinWorkerThread[] ws = workers;
963	>	int nws = ws.length;
964	>	for (int i = 0; i < nws; ++i) { // search for spare
965	>	ForkJoinWorkerThread w = ws[i];
966	>	if (w != null) {
967	>	if (joinMe.status < 0)
968	>	return false;
969	>	if (w.isSuspended()) {
970	>	if ((workerCounts & RUNNING_COUNT_MASK)>=pc &&
971	>	w.tryResumeSpare()) {
972	>	running = false;
973	>	break outer;
974	>	}
975	>	continue outer; // rescan
976	>	}
977	>	}
978		}
979		}
980	<	if ((stat = joinMe.status) < 0) // recheck to narrow race
980	>	if (retries < 0 \|\| // < 0 means replacement check only
981	>	rc == 0 \|\| joinMe.status < 0 \|\| workerCounts != wc \|\|
982	>	!UNSAFE.compareAndSwapInt(this, workerCountsOffset,
983	>	wc, wc - ONE_RUNNING))
984	>	return false; // done or inconsistent or contended
985	>	running = false;
986	>	if (rc > pc)
987		break;
988	<	int wc = workerCounts;
989	<	int rc = wc & RUNNING_COUNT_MASK;
990	<	if (rc >= pc)
988	>	}
989	>	else { // allow blocking if enough threads
990	>	if (rc >= pc \|\| joinMe.status < 0)
991		break;
992	<	if (spare != null) {
993	<	if (spare.tryUnsuspend()) {
994	<	int c; // inline incrementRunningCount
995	<	do {} while (!UNSAFE.compareAndSwapInt
996	<	(this, workerCountsOffset,
997	<	c = workerCounts, c + ONE_RUNNING));
998	<	LockSupport.unpark(spare);
992	>	int sc = tc - pc + 1; // = spare threads, plus the one to add
993	>	if (retries > sc) {
994	>	if (rc > 0 && rc >= pc - sc) // allow slack
995	>	break;
996	>	if (tc < MAX_THREADS &&
997	>	tc == (runState & ACTIVE_COUNT_MASK) &&
998	>	workerCounts == wc &&
999	>	UNSAFE.compareAndSwapInt(this, workerCountsOffset, wc,
1000	>	wc+(ONE_RUNNING\|ONE_TOTAL))) {
1001	>	addWorker();
1002		break;
874	–	}
875	–	continue;
876	–	}
877	–	int tc = wc >>> TOTAL_COUNT_SHIFT;
878	–	int sc = tc - pc;
879	–	if (rc > 0) {
880	–	int p = pc;
881	–	int s = sc;
882	–	while (s-- >= 0) { // try keeping 3/4 live
883	–	if (rc > (p -= (p >>> 2) + 1))
884	–	break outer;
1003		}
1004		}
1005	<	if (scans++ > sc && tc < MAX_THREADS &&
1006	<	UNSAFE.compareAndSwapInt(this, workerCountsOffset, wc,
1007	<	wc + (ONE_RUNNING\|ONE_TOTAL))) {
1008	<	addWorker();
1009	<	break;
1005	>	if (workerCounts == wc && // back out to allow rescan
1006	>	UNSAFE.compareAndSwapInt (this, workerCountsOffset,
1007	>	wc, wc + ONE_RUNNING)) {
1008	>	releaseWaiters(); // help others progress
1009	>	return true; // let caller retry
1010		}
1011		}
894	–	if (stat >= 0)
895	–	stat = joinMe.internalAwaitDone();
896	–	int c; // inline incrementRunningCount
897	–	do {} while (!UNSAFE.compareAndSwapInt
898	–	(this, workerCountsOffset,
899	–	c = workerCounts, c + ONE_RUNNING));
1012		}
1013	<	return stat;
1013	>	// arrive here if can block
1014	>	joinMe.internalAwaitDone();
1015	>	int c; // to inline incrementRunningCount
1016	>	do {} while (!UNSAFE.compareAndSwapInt
1017	>	(this, workerCountsOffset,
1018	>	c = workerCounts, c + ONE_RUNNING));
1019	>	return false;
1020		}
1021
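
The slack rule in the new `tryAwaitJoin` (the branch taken once the running count has already been decremented) can be isolated as a pure function to make the arithmetic easier to follow. This is a simplified sketch under stated assumptions: it covers only the "enough threads" and "allow slack" checks, leaves out the task-status test, spare creation and all CAS traffic, and the class and method names are hypothetical.

```java
// Illustrative sketch only; isolates the slack arithmetic from tryAwaitJoin.
final class SlackRuleSketch {
    /**
     * @param rc running count after this thread's decrement
     * @param tc total worker count
     * @param pc target parallelism
     * @param retries number of earlier calls that reported busy
     * @return true if blocking is acceptable under the two checks modeled here
     */
    static boolean mayBlock(int rc, int tc, int pc, int retries) {
        if (rc >= pc)
            return true;               // still enough running threads
        int sc = tc - pc + 1;          // spare threads, plus the one to add
        return retries > sc && rc > 0 && rc >= pc - sc;
    }

    public static void main(String[] args) {
        // parallelism 4, four workers, three still running
        System.out.println(mayBlock(3, 4, 4, 0)); // too few retries so far
        System.out.println(mayBlock(3, 4, 4, 2)); // within slack after retries
    }
}
```

With parallelism 4 and no spares, `sc` is 1, so a joiner may block with three running threads only after more than one retry. That is the undershoot the class comment describes accepting in exchange for less thread churn.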
1022		/**
1023	<	* Same idea as (and mostly pasted from) tryAwaitJoin, but
1024	<	* self-contained
1023	>	* Same idea as (and shares many code snippets with) tryAwaitJoin,
1024	>	* but self-contained because there are no caller retries.
1025	>	* TODO: Rework to use simpler API.
1026		*/
1027		final void awaitBlocker(ManagedBlocker blocker)
1028		throws InterruptedException {
1029	<	for (;;) {
1030	<	if (blocker.isReleasable())
1031	<	return;
913	<	int cw = workerCounts;
914	<	releaseWaiters();
915	<	if ((cw & RUNNING_COUNT_MASK) > 0 &&
916	<	UNSAFE.compareAndSwapInt(this, workerCountsOffset,
917	<	cw, cw - ONE_RUNNING))
918	<	break;
919	<	}
920	<	boolean done = false;
1029	>	boolean done;
1030	>	if (done = blocker.isReleasable())
1031	>	return;
1032		int pc = parallelism;
1033	<	int scans = 0;
1034	<	outer: while ((workerCounts & RUNNING_COUNT_MASK) < pc) {
1035	<	if (done = blocker.isReleasable())
925	<	break;
926	<	ForkJoinWorkerThread spare = null;
927	<	ForkJoinWorkerThread[] ws = workers;
928	<	int nws = ws.length;
929	<	for (int i = 0; i < nws; ++i) {
930	<	ForkJoinWorkerThread w = ws[i];
931	<	if (w != null && w.isSuspended()) {
932	<	spare = w;
933	<	break;
934	<	}
935	<	}
936	<	if (done = blocker.isReleasable())
937	<	break;
1033	>	int retries = 0;
1034	>	boolean running = true; // false when running count decremented
1035	>	outer:for (;;) {
1036		int wc = workerCounts;
1037		int rc = wc & RUNNING_COUNT_MASK;
940	–	if (rc >= pc)
941	–	break;
942	–	if (spare != null) {
943	–	if (spare.tryUnsuspend()) {
944	–	int c;
945	–	do {} while (!UNSAFE.compareAndSwapInt
946	–	(this, workerCountsOffset,
947	–	c = workerCounts, c + ONE_RUNNING));
948	–	LockSupport.unpark(spare);
949	–	break;
950	–	}
951	–	continue;
952	–	}
1038		int tc = wc >>> TOTAL_COUNT_SHIFT;
1039	<	int sc = tc - pc;
1040	<	if (rc > 0) {
1041	<	int p = pc;
1042	<	int s = sc;
1043	<	while (s-- >= 0) {
1044	<	if (rc > (p -= (p >>> 2) + 1))
1045	<	break outer;
1039	>	if (running) {
1040	>	if (rc <= pc && tc > pc &&
1041	>	(retries > 0 || tc > (runState & ACTIVE_COUNT_MASK))) {
1042	>	ForkJoinWorkerThread[] ws = workers;
1043	>	int nws = ws.length;
1044	>	for (int i = 0; i < nws; ++i) {
1045	>	ForkJoinWorkerThread w = ws[i];
1046	>	if (w != null) {
1047	>	if (done = blocker.isReleasable())
1048	>	return;
1049	>	if (w.isSuspended()) {
1050	>	if ((workerCounts & RUNNING_COUNT_MASK)>=pc &&
1051	>	w.tryResumeSpare()) {
1052	>	running = false;
1053	>	break outer;
1054	>	}
1055	>	continue outer; // rescan
1056	>	}
1057	>	}
1058	>	}
1059		}
1060	+	if (done = blocker.isReleasable())
1061	+	return;
1062	+	if (rc == 0 || workerCounts != wc ||
1063	+	!UNSAFE.compareAndSwapInt(this, workerCountsOffset,
1064	+	wc, wc - ONE_RUNNING))
1065	+	continue;
1066	+	running = false;
1067	+	if (rc > pc)
1068	+	break;
1069		}
1070	<	if (scans++ > sc && tc < MAX_THREADS &&
1071	<	UNSAFE.compareAndSwapInt(this, workerCountsOffset, wc,
1072	<	wc + (ONE_RUNNING|ONE_TOTAL))) {
1073	<	addWorker();
1074	<	break;
1070	>	else {
1071	>	if (rc >= pc || (done = blocker.isReleasable()))
1072	>	break;
1073	>	int sc = tc - pc + 1;
1074	>	if (retries++ > sc) {
1075	>	if (rc > 0 && rc >= pc - sc)
1076	>	break;
1077	>	if (tc < MAX_THREADS &&
1078	>	tc == (runState & ACTIVE_COUNT_MASK) &&
1079	>	workerCounts == wc &&
1080	>	UNSAFE.compareAndSwapInt(this, workerCountsOffset, wc,
1081	>	wc+(ONE_RUNNING|ONE_TOTAL))) {
1082	>	addWorker();
1083	>	break;
1084	>	}
1085	>	}
1086	>	Thread.yield();
1087		}
1088		}
1089	+
1090		try {
1091		if (!done)
1092	<	do {} while (!blocker.isReleasable() &&
973	<	!blocker.block());
1092	>	do {} while (!blocker.isReleasable() && !blocker.block());
1093		} finally {
1094	<	int c;
1095	<	do {} while (!UNSAFE.compareAndSwapInt
1096	<	(this, workerCountsOffset,
1097	<	c = workerCounts, c + ONE_RUNNING));
1094	>	if (!running) {
1095	>	int c;
1096	>	do {} while (!UNSAFE.compareAndSwapInt
1097	>	(this, workerCountsOffset,
1098	>	c = workerCounts, c + ONE_RUNNING));
1099	>	}
1100		}
1101		}
1102
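The revised awaitBlocker only restores the running count in its finally block when it actually decremented it (the new `running` flag). A minimal sketch of that compensation pattern follows. It is not the pool's implementation: it uses AtomicInteger in place of UNSAFE, leaves out the spare-resumption and addWorker paths, makes a single decrement attempt instead of retrying, and the bit layout of the packed count (a 16-bit running count in the low half) and the names RunningCountSketch and Blocker are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; names and bit layout are assumptions, not the pool's code.
class RunningCountSketch {
    // Assumed layout: total count in the upper 16 bits, running count in the lower 16.
    static final int TOTAL_COUNT_SHIFT = 16;
    static final int RUNNING_COUNT_MASK = (1 << TOTAL_COUNT_SHIFT) - 1;
    static final int ONE_RUNNING = 1;
    static final int ONE_TOTAL = 1 << TOTAL_COUNT_SHIFT;

    final AtomicInteger workerCounts = new AtomicInteger();

    // Hypothetical stand-in for ManagedBlocker, with the same two methods.
    interface Blocker {
        boolean isReleasable();
        boolean block() throws InterruptedException;
    }

    // One CAS attempt to drop the running count; fails if it is zero or contended.
    boolean tryDecrementRunning() {
        int wc = workerCounts.get();
        return (wc & RUNNING_COUNT_MASK) > 0
            && workerCounts.compareAndSet(wc, wc - ONE_RUNNING);
    }

    // Spin until the CAS succeeds, mirroring the inlined incrementRunningCount loop.
    void incrementRunning() {
        int c;
        do {} while (!workerCounts.compareAndSet(c = workerCounts.get(),
                                                 c + ONE_RUNNING));
    }

    void awaitBlocker(Blocker blocker) throws InterruptedException {
        if (blocker.isReleasable())
            return;
        // false when the running count was decremented
        boolean running = !tryDecrementRunning();
        try {
            do {} while (!blocker.isReleasable() && !blocker.block());
        } finally {
            if (!running)          // undo only what this call did
                incrementRunning();
        }
    }
}
```

Guarding the increment with the flag keeps the count balanced on the paths where the caller blocks without having decremented it, which the unconditional increment in revision 1.57 could not distinguish.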
#	Line 1103 \| Line 1224 \| public class ForkJoinPool extends Abstra
1224		* active thread.
1225		*/
1226		final int idlePerActive() {
1227	<	int pc = parallelism; // use targeted parallelism, not rc
1227	>	int pc = parallelism; // use parallelism, not rc
1228		int ac = runState; // no mask -- artifically boosts during shutdown
1229		// Use exact results for small values, saturate past 4
1230		return pc <= ac? 0 : pc >>> 1 <= ac? 1 : pc >>> 2 <= ac? 3 : pc >>> 3;
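The chained conditional in idlePerActive is easier to read with the shifts parenthesized (`>>>` binds tighter than `<=`). The sketch below reproduces only that return expression as a standalone function, assuming pc and ac are passed in as plain ints; the class name IdlePerActiveSketch is hypothetical.

```java
// Standalone copy of the return expression, for illustration only.
class IdlePerActiveSketch {
    static int idlePerActive(int pc, int ac) {
        return pc <= ac ? 0            // at least pc active: no idle per active
            : (pc >>> 1) <= ac ? 1     // at least half of pc active
            : (pc >>> 2) <= ac ? 3     // at least a quarter of pc active
            : pc >>> 3;                // otherwise use pc / 8
    }
}
```

With pc = 16, this yields 0 for ac = 16, 1 for ac = 8, 3 for ac = 4, and 2 for ac = 2.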
#	Line 1216 \| Line 1337 \| public class ForkJoinPool extends Abstra
1337		throw new NullPointerException();
1338		if (runState >= SHUTDOWN)
1339		throw new RejectedExecutionException();
1340	<	// Convert submissions to current pool into forks
1341	<	Thread t = Thread.currentThread();
1342	<	ForkJoinWorkerThread w;
1222	<	if ((t instanceof ForkJoinWorkerThread) &&
1223	<	(w = (ForkJoinWorkerThread) t).pool == this)
1224	<	w.pushTask(task);
1225	<	else {
1226	<	submissionQueue.offer(task);
1227	<	signalEvent();
1228	<	ensureEnoughTotalWorkers();
1229	<	}
1340	>	submissionQueue.offer(task);
1341	>	signalEvent();
1342	>	ensureEnoughWorkers();
1343		}
1344
1345		/**

Diff Legend

–	Removed lines
+	Added lines
<	Changed lines
>	Changed lines
