57      * considered individually, is not wait-free. One thief cannot
58      * successfully continue until another in-progress one (or, if
59      * previously empty, a push) completes. However, in the
60  <   * aggregate, we ensure at least probabilistic non-blockingness. If
61  <   * an attempted steal fails, a thief always chooses a different
62  <   * random victim target to try next. So, in order for one thief to
63  <   * progress, it suffices for any in-progress deq or new push on
64  <   * any empty queue to complete. One reason this works well here is
65  <   * that apparently-nonempty often means soon-to-be-stealable,
66  <   * which gives threads a chance to activate if necessary before
67  <   * stealing (see below).
60  >   * aggregate, we ensure at least probabilistic
61  >   * non-blockingness. If an attempted steal fails, a thief always
62  >   * chooses a different random victim target to try next. So, in
63  >   * order for one thief to progress, it suffices for any
64  >   * in-progress deq or new push on any empty queue to complete. One
65  >   * reason this works well here is that apparently-nonempty often
66  >   * means soon-to-be-stealable, which gives threads a chance to
67  >   * activate if necessary before stealing (see below).
68      *
69      * This approach also enables support for "async mode" where local
70      * task processing is in FIFO, not LIFO order; simply by using a
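To make the retry policy in this passage concrete, here is a minimal sketch, not the actual jsr166 code: WorkQueue, tryPoll, and scanForWork are illustrative names. Each failed steal picks a fresh random victim, so one thief's progress requires only that some in-progress deq, or a push onto an empty queue, complete somewhere.

    import java.util.concurrent.ForkJoinTask;
    import java.util.concurrent.ThreadLocalRandom;

    // Minimal stand-in for the per-worker deque; hypothetical interface.
    interface WorkQueue {
        boolean isEmpty();
        ForkJoinTask<?> tryPoll(); // one steal attempt; null on failure or race
    }

    final class StealSketch {
        // A failed steal never respins on the same victim: each retry
        // targets a different random queue, so aggregate progress needs
        // only *some* concurrent deq or push to finish.
        static ForkJoinTask<?> scanForWork(WorkQueue[] queues, int tries) {
            ThreadLocalRandom rnd = ThreadLocalRandom.current();
            while (tries-- > 0) {
                WorkQueue q = queues[rnd.nextInt(queues.length)];
                if (q != null && !q.isEmpty()) {       // apparently nonempty
                    ForkJoinTask<?> t = q.tryPoll();   // may fail under contention
                    if (t != null)
                        return t;
                }
            }
            return null; // every try failed; caller may inactivate (see below)
        }
    }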
80      * protected by volatile base reads, reads of the queue array and
81      * its slots do not need volatile load semantics, but writes (in
82      * push) require store order and CASes (in pop and deq) require
83  <   * (volatile) CAS semantics. Since these combinations aren't
84  <   * supported using ordinary volatiles, the only way to accomplish
85  <   * these efficiently is to use direct Unsafe calls. (Using external
83  >   * (volatile) CAS semantics. (See "Idempotent work stealing" by
84  >   * Michael, Saraswat, and Vechev, PPoPP 2009
85  >   * http://portal.acm.org/citation.cfm?id=1504186 for an algorithm
86  >   * with similar properties, but without support for nulling
87  >   * slots.) Since these combinations aren't supported using
88  >   * ordinary volatiles, the only way to accomplish these
89  >   * efficiently is to use direct Unsafe calls. (Using external
90      * AtomicIntegers and AtomicReferenceArrays for the indices and
91      * array is significantly slower because of memory locality and
92  <   * indirection effects.) Further, performance on most platforms is
93  <   * very sensitive to placement and sizing of the (resizable) queue
94  <   * array. Even though these queues don't usually become all that
95  <   * big, the initial size must be large enough to counteract cache
92  >   * indirection effects.)
93  >   *
94  >   * Further, performance on most platforms is very sensitive to
95  >   * placement and sizing of the (resizable) queue array. Even
96  >   * though these queues don't usually become all that big, the
97  >   * initial size must be large enough to counteract cache
98      * contention effects across multiple queues (especially in the
99      * presence of GC cardmarking). Also, to improve thread-locality,
100     * queues are currently initialized immediately after the thread
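As a rough illustration of these ordering rules, the sketch below pairs an ordered slot store in push with a volatile read of base plus a slot-nulling CAS in deq, all via direct Unsafe calls. It is a simplification under stated assumptions, not the actual jsr166 code: the field names, the fixed power-of-two array, and the elided resizing and pop path are all placeholders.

    import java.util.concurrent.ForkJoinTask;
    import sun.misc.Unsafe;

    final class OrderingSketch {
        static final Unsafe U = Unsafe.getUnsafe(); // only works on the bootclasspath
        static final long ABASE = U.arrayBaseOffset(ForkJoinTask[].class);
        static final int ASHIFT =
            31 - Integer.numberOfLeadingZeros(U.arrayIndexScale(ForkJoinTask[].class));

        volatile int base;                  // next slot to steal from
        volatile int sp;                    // next slot to push at
        final ForkJoinTask<?>[] queue =
            new ForkJoinTask<?>[1 << 13];   // power of two; resizing elided

        static long slotOffset(ForkJoinTask<?>[] q, int i) {
            return ABASE + ((long) (i & (q.length - 1)) << ASHIFT);
        }

        void push(ForkJoinTask<?> t) {      // owner only
            int s = sp;
            U.putOrderedObject(queue, slotOffset(queue, s), t); // store order only
            sp = s + 1;                     // bump publishes the slot
        }

        ForkJoinTask<?> deq() {             // FIFO end: stealers, async mode
            int b = base;                   // volatile read; the plain array and
            ForkJoinTask<?>[] q = queue;    // slot reads below need no fences
            if (sp - b <= 0)
                return null;                // apparently empty
            long off = slotOffset(q, b);
            ForkJoinTask<?> t = (ForkJoinTask<?>) U.getObject(q, off);
            if (t != null && U.compareAndSwapObject(q, off, t, null)) {
                base = b + 1;               // CAS nulled the slot; advance base
                return t;
            }
            return null;                    // lost a race; caller tries elsewhere
        }
    }

An owner-side pop would CAS at the sp end analogously, and the slot-nulling CAS is exactly the point of contrast with the Michael/Saraswat/Vechev algorithm cited above. Async mode also falls out naturally here: local processing can simply take tasks from the deq (FIFO) end instead of popping.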
111     * counter (activeCount) held by the pool. It uses an algorithm
112     * similar to that in Herlihy and Shavit section 17.6 to cause
113     * threads to eventually block when all threads declare they are
114 <   * inactive. (See variable "scans".) For this to work, threads
115 <   * must be declared active when executing tasks, and before
116 <   * stealing a task. They must be inactive before blocking on the
117 <   * Pool Barrier (awaiting a new submission or other Pool
118 <   * event). In between, there is some free play which we take
119 <   * advantage of to avoid contention and rapid flickering of the
120 <   * global activeCount: If inactive, we activate only if a victim
121 <   * queue appears to be nonempty (see above). Similarly, a thread
122 <   * tries to inactivate only after a full scan of other threads.
123 <   * The net effect is that contention on activeCount is rarely a
124 <   * measurable performance issue. (There are also a few other cases
125 <   * where we scan for work rather than retry/block upon
126 <   * contention.)
114 >   * inactive. For this to work, threads must be declared active
115 >   * when executing tasks, and before stealing a task. They must be
116 >   * inactive before blocking on the Pool Barrier (awaiting a new
117 >   * submission or other Pool event). In between, there is some free
118 >   * play which we take advantage of to avoid contention and rapid
119 >   * flickering of the global activeCount: If inactive, we activate
120 >   * only if a victim queue appears to be nonempty (see above).
121 >   * Similarly, a thread tries to inactivate only after a full scan
122 >   * of other threads. The net effect is that contention on
123 >   * activeCount is rarely a measurable performance issue. (There
124 >   * are also a few other cases where we scan for work rather than
125 >   * retry/block upon contention.)
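The two guarded transitions described here might look roughly like the following sketch. The names activeCount, activateBeforeSteal, and inactivateAfterFullScan are illustrative; the real code folds these checks into its scan loop rather than isolating them as methods.

    import java.util.concurrent.atomic.AtomicInteger;

    final class ActivationSketch {
        final AtomicInteger activeCount = new AtomicInteger(); // pool-wide counter
        boolean active;                     // this worker's declared state

        // Called only when a victim queue appears nonempty, so an
        // inactive thread touches the shared counter only when work
        // is likely to be stealable soon.
        void activateBeforeSteal() {
            if (!active) {
                active = true;
                activeCount.incrementAndGet();
            }
        }

        // Called only after a full scan of the other threads finds
        // nothing, which limits flickering of the global count.
        void inactivateAfterFullScan() {
            if (active) {
                active = false;
                if (activeCount.decrementAndGet() == 0) {
                    // all threads now inactive: workers may block on the
                    // pool barrier awaiting a new submission or pool event
                }
            }
        }
    }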
126     *
127     * 3. Selection control. We maintain policy of always choosing to
128     * run local tasks rather than stealing, and always trying to