105 |
|
* successful atomic operation per enq/deq pair. But it also |
106 |
|
* enables lower cost variants of queue maintenance mechanics. (A |
107 |
|
* variation of this idea applies even for non-dual queues that |
108 |
< |
* support deletion of embedded elements, such as |
108 |
> |
* support deletion of interior elements, such as |
109 |
|
* j.u.c.ConcurrentLinkedQueue.) |
110 |
|
* |
111 |
< |
* Once a node is matched, its item can never again change. We |
112 |
< |
* may thus arrange that the linked list of them contains a prefix |
113 |
< |
* of zero or more matched nodes, followed by a suffix of zero or |
114 |
< |
* more unmatched nodes. (Note that we allow both the prefix and |
115 |
< |
* suffix to be zero length, which in turn means that we do not |
116 |
< |
* use a dummy header.) If we were not concerned with either time |
117 |
< |
* or space efficiency, we could correctly perform enqueue and |
118 |
< |
* dequeue operations by traversing from a pointer to the initial |
119 |
< |
* node; CASing the item of the first unmatched node on match and |
120 |
< |
* CASing the next field of the trailing node on appends. While |
121 |
< |
* this would be a terrible idea in itself, it does have the |
122 |
< |
* benefit of not requiring ANY atomic updates on head/tail |
123 |
< |
* fields. |
111 |
> |
* Once a node is matched, its match status can never again |
112 |
> |
* change. We may thus arrange that the linked list of them |
113 |
> |
* contain a prefix of zero or more matched nodes, followed by a |
114 |
> |
* suffix of zero or more unmatched nodes. (Note that we allow |
115 |
> |
* both the prefix and suffix to be zero length, which in turn |
116 |
> |
* means that we do not use a dummy header.) If we were not |
117 |
> |
* concerned with either time or space efficiency, we could |
118 |
> |
* correctly perform enqueue and dequeue operations by traversing |
119 |
> |
* from a pointer to the initial node; CASing the item of the |
120 |
> |
* first unmatched node on match and CASing the next field of the |
121 |
> |
* trailing node on appends. (Plus some special-casing when |
122 |
> |
* initially empty). While this would be a terrible idea in |
123 |
> |
* itself, it does have the benefit of not requiring ANY atomic |
124 |
> |
* updates on head/tail fields. |
125 |
|
* |
126 |
|
* We introduce here an approach that lies between the extremes of |
127 |
< |
* never versus always updating queue (head and tail) pointers |
128 |
< |
* that reflects the tradeoff of sometimes requiring extra traversal |
129 |
< |
* steps to locate the first and/or last unmatched nodes, versus |
130 |
< |
* the reduced overhead and contention of fewer updates to queue |
131 |
< |
* pointers. For example, a possible snapshot of a queue is: |
127 |
> |
* never versus always updating queue (head and tail) pointers. |
128 |
> |
* This offers a tradeoff between sometimes requiring extra |
129 |
> |
* traversal steps to locate the first and/or last unmatched |
130 |
> |
* nodes, versus the reduced overhead and contention of fewer |
131 |
> |
* updates to queue pointers. For example, a possible snapshot of |
132 |
> |
* a queue is: |
133 |
|
* |
134 |
|
* head tail |
135 |
|
* | | |
141 |
|
* similarly for "tail") is an empirical matter. We have found |
142 |
|
* that using very small constants in the range of 1-3 work best |
143 |
|
* over a range of platforms. Larger values introduce increasing |
144 |
< |
* costs of cache misses and risks of long traversal chains. |
144 |
> |
* costs of cache misses and risks of long traversal chains, while |
145 |
> |
* smaller values increase CAS contentiona and overhead. |
146 |
|
* |
147 |
|
* Dual queues with slack differ from plain M&S dual queues by |
148 |
|
* virtue of only sometimes updating head or tail pointers when |
206 |
|
* additional GC bookkeeping ("write barriers") that are sometimes |
207 |
|
* more costly than the writes themselves because of contention). |
208 |
|
* |
209 |
< |
* Removal of internal nodes (due to timed out or interrupted |
209 |
> |
* Removal of interior nodes (due to timed out or interrupted |
210 |
|
* waits, or calls to remove or Iterator.remove) uses a scheme |
211 |
< |
* roughly similar to that in Scherer, Lea, and Scott |
211 |
> |
* roughly similar to that in Scherer, Lea, and Scott's |
212 |
|
* SynchronousQueue. Given a predecessor, we can unsplice any node |
213 |
|
* except the (actual) tail of the queue. To avoid build-up of |
214 |
|
* cancelled trailing nodes, upon a request to remove a trailing |
215 |
< |
* node, it is placed in field "cleanMe" to be unspliced later. |
215 |
> |
* node, it is placed in field "cleanMe" to be unspliced upon the |
216 |
> |
* next call to unsplice any other node. Situations needing such |
217 |
> |
* mechanics are not common but do occur in practice; for example |
218 |
> |
* when an unbounded series of short timed calls to poll |
219 |
> |
* repeatedly time out but never otherwise fall off the list |
220 |
> |
* because of an untimed call to take at the front of the |
221 |
> |
* queue. (Note that maintaining field cleanMe does not otherwise |
222 |
> |
* much impact garbage retention even if never cleared by some |
223 |
> |
* other call because the held node will eventually either |
224 |
> |
* directly or indirectly lead to a self-link once off the list.) |
225 |
|
* |
226 |
|
* *** Overview of implementation *** |
227 |
|
* |
229 |
|
* slack of two. The slack value is hard-wired: a path greater |
230 |
|
* than one is naturally implemented by checking equality of |
231 |
|
* traversal pointers except when the list has only one element, |
232 |
< |
* in which case we keep max slack at one. Avoiding tracking |
233 |
< |
* explicit counts across situations slightly simplifies an |
232 |
> |
* in which case we keep target slack at one. Avoiding tracking |
233 |
> |
* explicit counts across method calls slightly simplifies an |
234 |
|
* already-messy implementation. Using randomization would |
235 |
|
* probably work better if there were a low-quality dirt-cheap |
236 |
|
* per-thread one available, but even ThreadLocalRandom is too |
237 |
|
* heavy for these purposes. |
238 |
|
* |
239 |
< |
* With such a small slack value, path short-circuiting is rarely |
240 |
< |
* worthwhile. However, it is used (in awaitMatch) immediately |
241 |
< |
* before a waiting thread starts to block, as a final bit of |
242 |
< |
* helping at a point when contention with others is extremely |
243 |
< |
* unlikely (since if other threads that could release it are |
244 |
< |
* operating, then the current thread wouldn't be blocking). |
239 |
> |
* With such a small target slack value, it is rarely worthwhile |
240 |
> |
* to augment this with path short-circuiting; i.e., unsplicing |
241 |
> |
* nodes between head and the first unmatched node, or similarly |
242 |
> |
* for tail, rather than advancing head or tail proper. However, |
243 |
> |
* it is used (in awaitMatch) immediately before a waiting thread |
244 |
> |
* starts to block, as a final bit of helping at a point when |
245 |
> |
* contention with others is extremely unlikely (since if other |
246 |
> |
* threads that could release it are operating, then the current |
247 |
> |
* thread wouldn't be blocking). |
248 |
> |
* |
249 |
> |
* We allow both the head and tail fields to be null before any |
250 |
> |
* nodes are enqueued; initializing upon first append. This |
251 |
> |
* simplifies some other logic, as well as providing more |
252 |
> |
* efficient explicit control paths instead of letting JVMs insert |
253 |
> |
* implicit NullPointerExceptions when they are null. While not |
254 |
> |
* currently fully implemented, we also leave open the possibility |
255 |
> |
* of re-nulling these fields when empty (which is is complicated |
256 |
> |
* to arrange, for little benefit.) |
257 |
|
* |
258 |
|
* All enqueue/dequeue operations are handled by the single method |
259 |
|
* "xfer" with parameters indicating whether to act as some form |
273 |
|
* case matching it and returning, also if necessary updating |
274 |
|
* head to one past the matched node (or the node itself if the |
275 |
|
* list has no other unmatched nodes). If the CAS misses, then |
276 |
< |
* a retry loops until the slack is at most two. Traversals |
277 |
< |
* also check if the initial head is now off-list, in which |
278 |
< |
* case they start at the new head. |
276 |
> |
* a loop retries advancing head by two steps until either |
277 |
> |
* success or the slack is at most two. By requiring that each |
278 |
> |
* attempt advances head by two (if applicable), we ensure that |
279 |
> |
* the slack does not grow without bound. Traversals also check |
280 |
> |
* if the initial head is now off-list, in which case they |
281 |
> |
* start at the new head. |
282 |
|
* |
283 |
|
* If no candidates are found and the call was untimed |
284 |
|
* poll/offer, (argument "how" is NOW) return. |
533 |
|
/** |
534 |
|
* Tries to append node s as tail. |
535 |
|
* |
509 |
– |
* @param haveData true if appending in data mode |
536 |
|
* @param s the node to append |
537 |
+ |
* @param haveData true if appending in data mode |
538 |
|
* @return null on failure due to losing race with append in |
539 |
|
* different mode, else s's predecessor, or s itself if no |
540 |
|
* predecessor |
541 |
|
*/ |
542 |
|
private Node tryAppend(Node s, boolean haveData) { |
543 |
< |
for (Node t = tail, p = t;;) { // move p to actual tail and append |
543 |
> |
for (Node t = tail, p = t;;) { // move p to last node and append |
544 |
|
Node n, u; // temps for reads of next & tail |
545 |
|
if (p == null && (p = head) == null) { |
546 |
|
if (casHead(null, s)) |
548 |
|
} |
549 |
|
else if (p.cannotPrecede(haveData)) |
550 |
|
return null; // lost race vs opposite mode |
551 |
< |
else if ((n = p.next) != null) // Not tail; keep traversing |
551 |
> |
else if ((n = p.next) != null) // not last; keep traversing |
552 |
|
p = p != t && t != (u = tail) ? (t = u) : // stale tail |
553 |
|
(p != n) ? n : null; // restart if off list |
554 |
|
else if (!p.casNext(null, s)) |
555 |
|
p = p.next; // re-read on CAS failure |
556 |
|
else { |
557 |
< |
if (p != t) { // Update if slack now >= 2 |
557 |
> |
if (p != t) { // update if slack now >= 2 |
558 |
|
while ((tail != t || !casTail(t, s)) && |
559 |
|
(t = tail) != null && |
560 |
|
(s = t.next) != null && // advance and retry |
568 |
|
/** |
569 |
|
* Spins/yields/blocks until node s is matched or caller gives up. |
570 |
|
* |
571 |
< |
* @param pred the predecessor of s or s or null if none |
571 |
> |
* @param pred the predecessor of s, or s or null if none |
572 |
|
* @param s the waiting node |
573 |
|
* @param e the comparison value for checking match |
574 |
|
* @param how either SYNC or TIMEOUT |
781 |
|
s.forgetContents(); // clear unneeded fields |
782 |
|
/* |
783 |
|
* At any given time, exactly one node on list cannot be |
784 |
< |
* deleted -- the last inserted node. To accommodate this, if |
785 |
< |
* we cannot delete s, we save its predecessor as "cleanMe", |
784 |
> |
* unlinked -- the last inserted node. To accommodate this, if |
785 |
> |
* we cannot unlink s, we save its predecessor as "cleanMe", |
786 |
|
* processing the previously saved version first. Because only |
787 |
|
* one node in the list can have a null next, at least one of |
788 |
|
* node s or the node previously saved can always be |