16 |
|
|
17 |
|
/** |
18 |
|
* A set of variables that together maintain a sum. When updates |
19 |
< |
* (method {@link #add}) are contended across threads, the set of |
20 |
< |
* adders may grow to reduce contention. Method {@link #sum} returns |
21 |
< |
* the current combined total across these adders. This value is |
22 |
< |
* <em>NOT</em> an atomic snapshot (concurrent updates may occur while |
23 |
< |
* the sum is being calculated), and so cannot be used alone for |
24 |
< |
* fine-grained synchronization control. |
19 |
> |
* (method {@link #add}) are contended across threads, this set of |
20 |
> |
* adder variables may grow dynamically to reduce contention. Method |
21 |
> |
* {@link #sum} returns the current combined total across these |
22 |
> |
* adders. This value is <em>NOT</em> an atomic snapshot (concurrent |
23 |
> |
* updates may occur while the sum is being calculated), and so cannot |
24 |
> |
* be used alone for fine-grained synchronization control. |
25 |
|
* |
26 |
|
* <p> This class may be applicable when many threads frequently |
27 |
|
* update a common sum that is used for purposes such as collecting |
31 |
|
* only one thread can ever update the sum, performance may be |
32 |
|
* significantly slower than just updating a local variable. |
33 |
|
* |
34 |
+ |
* <p>A StripedAdder may optionally be constructed with a given |
35 |
+ |
* expected contention level; i.e., the number of threads that are |
36 |
+ |
* expected to concurrently update the sum. Supplying an accurate |
37 |
+ |
* value may improve performance by reducing the need for dynamic |
38 |
+ |
* adjustment. |
39 |
+ |
* |
40 |
|
* @author Doug Lea |
41 |
|
*/ |
42 |
|
public class StripedAdder implements Serializable { |
43 |
|
private static final long serialVersionUID = 7249069246863182397L; |
44 |
|
|
45 |
|
/* |
46 |
< |
* Overview: We maintain a table of AtomicLongs (padded to reduce |
47 |
< |
* false sharing). The table is indexed by per-thread hash codes |
48 |
< |
* that are initialized as random values. The table doubles in |
49 |
< |
* size upon contention (as indicated by failed CASes when |
50 |
< |
* performing add()), but is capped at the nearest power of two >= |
51 |
< |
* #cpus: At that point, contention should be infrequent if each |
52 |
< |
* thread has a unique index; so we instead adjust hash codes to |
53 |
< |
* new random values upon contention rather than expanding. A |
54 |
< |
* single spinlock is used for resizing the table as well as |
46 |
> |
* Overview: We maintain a table of atomic long variables. The |
47 |
> |
* table is indexed by per-thread hash codes that are initialized |
48 |
> |
* to random values. |
49 |
> |
* |
50 |
> |
* The table doubles in size upon contention (as indicated by |
51 |
> |
* failed CASes when performing add()), but is capped at the |
52 |
> |
* nearest power of two >= #CPUS. This reflects the idea that, |
53 |
> |
* when there are more threads than CPUs, then if each thread were |
54 |
> |
* bound to a CPU, there would exist a perfect hash function |
55 |
> |
* mapping threads to slots that eliminates collisions. When we |
56 |
> |
* reach capacity, we search for this mapping by randomly varying |
57 |
> |
* the hash codes of colliding threads. Because search is random, |
58 |
> |
* and failures only become known via CAS failures, convergence |
59 |
> |
* will be slow, and because threads are typically not bound to |
60 |
> |
* CPUs forever, may not occur at all. However, despite these |
61 |
> |
* limitations, observed contention is typically very low in these |
62 |
> |
* cases. |
63 |
> |
* |
64 |
> |
* Table entries are of class Adder; a form of AtomicLong padded |
65 |
> |
* to reduce cache contention on most processors. Padding is |
66 |
> |
* overkill for most Atomics because they are most often |
67 |
> |
* irregularly scattered in memory and thus don't interfere much |
68 |
> |
* with each other. But Atomic objects residing in arrays will |
69 |
> |
* tend to be placed adjacent to each other, and so will most |
70 |
> |
* often share cache lines without this precaution. Except for |
71 |
> |
* slot adders[0], Adders are constructed upon first use, which |
72 |
> |
* further improves per-thread locality and helps reduce (an |
73 |
> |
* already large) footprint. |
74 |
> |
* |
75 |
> |
* A single spinlock is used for resizing the table as well as |
76 |
|
* populating slots with new Adders. Upon lock contention, threads |
77 |
< |
* just try other slots rather than blocking. We guarantee that at |
78 |
< |
* least one slot exists, so retries will eventually find a |
79 |
< |
* candidate Adder. |
77 |
> |
* try other slots rather than blocking. We guarantee that at |
78 |
> |
* least one slot (0) exists, so retries will eventually find a |
79 |
> |
* candidate Adder. During these retries, there is increased |
80 |
> |
* contention and reduced locality, which is still better than |
81 |
> |
* alternatives. |
82 |
|
*/ |
83 |
|
|
84 |
|
/** |
87 |
|
static final int NCPU = Runtime.getRuntime().availableProcessors(); |
88 |
|
|
89 |
|
/** |
90 |
< |
* Version of AtomicLong padded to avoid sharing cache |
62 |
< |
* lines on most processors |
90 |
> |
* Padded version of AtomicLong |
91 |
|
*/ |
92 |
|
static final class Adder extends AtomicLong { |
93 |
|
long p0, p1, p2, p3, p4, p5, p6, p7, p8, p9, pa, pb, pc, pd; |
95 |
|
} |
96 |
|
|
97 |
|
/** |
98 |
< |
* Holder for the thread-local hash code. |
98 |
> |
* Holder for the thread-local hash code. The code starts off with |
99 |
> |
* a given random value, but may be set to a different |
100 |
> |
* pseudo-random value (using a cheaper but adequate xorshift |
101 |
> |
* generator) upon collisions. |
102 |
|
*/ |
103 |
|
static final class HashCode { |
104 |
|
int code; |
124 |
|
static final ThreadHashCode threadHashCode = new ThreadHashCode(); |
125 |
|
|
126 |
|
/** |
127 |
< |
* Table of adders. Initially of size 2; grows to be at most NCPU. |
127 |
> |
* Table of adders. Minimum size 2. Size grows to be at most NCPU. |
128 |
|
*/ |
129 |
|
private transient volatile Adder[] adders; |
130 |
|
|
136 |
|
private final AtomicInteger mutex; |
137 |
|
|
138 |
|
/** |
139 |
< |
* Marsaglia XorShift for rehashing on collisions |
139 |
> |
* Marsaglia XorShift random generator for rehashing on collisions |
140 |
|
*/ |
141 |
|
private static int xorShift(int r) { |
142 |
|
r ^= r << 13; |
145 |
|
} |
146 |
|
|
147 |
|
/** |
148 |
< |
* Creates a new adder with initially zero sum. |
148 |
> |
* Creates a new adder with zero sum. |
149 |
|
*/ |
150 |
|
public StripedAdder() { |
151 |
< |
Adder[] as = new Adder[2]; |
151 |
> |
this(2); |
152 |
> |
} |
153 |
> |
|
154 |
> |
/** |
155 |
> |
* Creates a new adder with zero sum, and with stripes presized |
156 |
> |
* for the given expected contention level. |
157 |
> |
* |
158 |
> |
* @param expectedContention the expected number of threads that |
159 |
> |
* will concurrently update the sum. |
160 |
> |
*/ |
161 |
> |
public StripedAdder(int expectedContention) { |
162 |
> |
int cap = (expectedContention < NCPU) ? expectedContention : NCPU; |
163 |
> |
int size = 2; |
164 |
> |
while (size < cap) |
165 |
> |
size <<= 1; |
166 |
> |
Adder[] as = new Adder[size]; |
167 |
|
as[0] = new Adder(0); // ensure at least one available adder |
168 |
|
this.adders = as; |
169 |
|
this.mutex = new AtomicInteger(); |
207 |
|
mutex.set(0); |
208 |
|
} |
209 |
|
if (created) { |
210 |
< |
hc.code = h; // Use this adder next time |
210 |
> |
hc.code = h; // Use this adder next time |
211 |
|
break; |
212 |
|
} |
213 |
|
} |