root/jsr166/jsr166/src/main/java/util/concurrent/ForkJoinWorkerThread.java

Comparing jsr166/src/main/java/util/concurrent/ForkJoinWorkerThread.java (file contents):
Revision 1.10 by jsr166, Tue Oct 6 19:02:48 2009 UTC vs.
Revision 1.11 by dl, Sun Nov 15 12:26:48 2009 UTC

# Line 737 | Line 737 | public class ForkJoinWorkerThread extend
737       * function of number of idle workers.
738       */
739      final int getEstimatedSurplusTaskCount() {
740 <        // The halving approximates weighting idle vs non-idle workers
741 <        return (sp - base) - (pool.getIdleThreadCount() >>> 1);
740 >        /*
741 >         * The goal here is to provide a very cheap heuristic guide
742 >         * for task partitioning when programmers, frameworks, tools,
743 >         * or languages have little or no idea about task granularity.
744 >         * In essence, by offering this method, we ask users only about
745 >         * tradeoffs in overhead vs expected throughput and its
746 >         * variance, rather than how finely to partition tasks.
747 >         *
748 >         * In a steady state strict (tree-structured) computation,
749 >         * each thread makes available for stealing enough tasks for
750 >         * other threads to remain active. Inductively, if all threads
751 >         * play by the same rules, each thread should make available
752 >         * only a constant number of tasks.
753 >         *
754 >         * The minimum useful constant is just 1. But using a value of
755 >         * 1 would require immediate replenishment upon each steal to
756 >         * maintain enough tasks, which is infeasible.  Further,
757 >         * partitionings/granularities of offered tasks should
758 >         * minimize steal rates, which in general means that threads
759 >         * nearer the top of the computation tree should generate more
760 >         * than those nearer the bottom. In a perfect steady state,
761 >         * each thread is at approximately the same level of the
762 >         * computation tree. However, producing extra tasks amortizes
763 >         * the uncertainty of progress and diffusion assumptions.
764 >         *
765 >         * So, users will want to use values larger, but not much
766 >         * larger than 1 to both smooth over transient shortages and
767 >         * hedge against uneven progress, traded off against the
768 >         * cost of extra task overhead. We leave the user to pick a
769 >         * threshold value to compare with the results of this call to
770 >         * guide decisions, but recommend values such as 3.
771 >         *
772 >         * When all threads are active, it is on average OK to
773 >         * estimate surplus strictly locally. In steady-state, if one
774 >         * thread is maintaining, say, 2 surplus tasks, then so are
775 >         * others. So we can just use estimated queue length (although
776 >         * note that (sp - base) can be an overestimate because of
777 >         * stealers lagging increments of base).
778 >         *
779 >         * However, this strategy alone leads to serious mis-estimates
780 >         * in some non-steady-state conditions (ramp-up, ramp-down,
781 >         * other stalls). We can detect many of these by further
782 >         * considering the number of "idle" threads, which are known to
783 >         * have zero queued tasks. A straight compensation would lead
784 >         * to weighting of the queued task estimate by a function of
785 >         * the proportion of idle threads.  However, we don't want to
786 >         * waste much calculation for the sake of weightings that only
787 >         * apply transiently, so cheapen this by (a) not bothering to
788 >         * weight at all unless there is more than one queued task, and
789 >         * (b) rather than compensating by a factor of (#idle/#active)
790 >         * threads, we just subtract out a function of #idle that is
791 >         * a good enough approximation for conditions near the
792 >         * borderlines for threshold testing.  This errs in the
793 >         * direction of reporting more extreme lack of surplus (as in
794 >         * returning negative values) in cases where users should
795 >         * almost surely be generating tasks anyway.
796 >         */
797 >        int n = sp - base;
798 >        return n > 1 ? n - (pool.getIdleThreadCount() >>> 2) : n;
799      }
800  
801      /**

Diff Legend

- Removed lines
+ Added lines
< Changed lines
> Changed lines
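
Usage note (not part of the revision above): the new comment recommends
comparing the result of this estimate against a small threshold such as 3
when deciding whether to generate more tasks. Below is a minimal sketch of
that pattern, assuming the public wrapper
ForkJoinTask.getSurplusQueuedTaskCount() from the released
java.util.concurrent API, which exposes this per-worker estimate to user
code. The class name SurplusGuidedSum, the leaf size of 1024, and the demo
main are illustrative assumptions, not part of the library.

    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.ForkJoinTask;
    import java.util.concurrent.RecursiveTask;

    /** Illustrative divide-and-conquer sum guided by the surplus estimate. */
    class SurplusGuidedSum extends RecursiveTask<Long> {
        static final int SURPLUS_THRESHOLD = 3; // value suggested in the comment
        static final int LEAF_SIZE = 1024;      // arbitrary sequential cutoff
        final long[] array;
        final int lo, hi;                       // half-open range [lo, hi)

        SurplusGuidedSum(long[] array, int lo, int hi) {
            this.array = array; this.lo = lo; this.hi = hi;
        }

        @Override protected Long compute() {
            // Stop splitting when the range is small or this worker already
            // has enough surplus tasks queued for others to steal.
            if (hi - lo <= LEAF_SIZE ||
                ForkJoinTask.getSurplusQueuedTaskCount() > SURPLUS_THRESHOLD) {
                long sum = 0;
                for (int i = lo; i < hi; ++i)
                    sum += array[i];
                return sum;
            }
            int mid = (lo + hi) >>> 1;
            SurplusGuidedSum right = new SurplusGuidedSum(array, mid, hi);
            right.fork();                       // offer the right half to stealers
            long leftSum = new SurplusGuidedSum(array, lo, mid).compute();
            return leftSum + right.join();
        }

        public static void main(String[] args) {
            long[] data = new long[1 << 20];
            java.util.Arrays.fill(data, 1L);
            long total = new ForkJoinPool()
                .invoke(new SurplusGuidedSum(data, 0, data.length));
            System.out.println(total);          // prints 1048576
        }
    }

For intuition about the idle-thread adjustment in the new code: with 6
queued tasks (sp - base == 6) and 8 idle workers, the method returns
6 - (8 >>> 2) == 4; with a single queued task the idle count is ignored
and 1 is returned unchanged.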