java线程池(三)：ThreadPoolExecutor源码分析

在前面分析了Executors工厂方法类之后，我们来看看AbstractExecutorService的最主要的一种实现类，ThreadpoolExecutor。

1.类的结构及其成员变量

1.类的基本结构

ThreadPoolExecutor类是AbstractExecutorService的一个实现类。其类的主要结构如下所示：

我们可以看看这个类的注释：

/**
 * An {@link ExecutorService} that executes each submitted task using
 * one of possibly several pooled threads, normally configured
 * using {@link Executors} factory methods.
 *
 * <p>Thread pools address two different problems: they usually
 * provide improved performance when executing large numbers of
 * asynchronous tasks, due to reduced per-task invocation overhead,
 * and they provide a means of bounding and managing the resources,
 * including threads, consumed when executing a collection of tasks.
 * Each {@code ThreadPoolExecutor} also maintains some basic
 * statistics, such as the number of completed tasks.
 *
 * <p>To be useful across a wide range of contexts, this class
 * provides many adjustable parameters and extensibility
 * hooks. However, programmers are urged to use the more convenient
 * {@link Executors} factory methods {@link
 * Executors#newCachedThreadPool} (unbounded thread pool, with
 * automatic thread reclamation), {@link Executors#newFixedThreadPool}
 * (fixed size thread pool) and {@link
 * Executors#newSingleThreadExecutor} (single background thread), that
 * preconfigure settings for the most common usage
 * scenarios. Otherwise, use the following guide when manually
 * configuring and tuning this class:
 *
 * <dl>
 *
 * <dt>Core and maximum pool sizes</dt>
 *
 * <dd>A {@code ThreadPoolExecutor} will automatically adjust the
 * pool size (see {@link #getPoolSize})
 * according to the bounds set by
 * corePoolSize (see {@link #getCorePoolSize}) and
 * maximumPoolSize (see {@link #getMaximumPoolSize}).
 *
 * When a new task is submitted in method {@link #execute(Runnable)},
 * and fewer than corePoolSize threads are running, a new thread is
 * created to handle the request, even if other worker threads are
 * idle.  If there are more than corePoolSize but less than
 * maximumPoolSize threads running, a new thread will be created only
 * if the queue is full.  By setting corePoolSize and maximumPoolSize
 * the same, you create a fixed-size thread pool. By setting
 * maximumPoolSize to an essentially unbounded value such as {@code
 * Integer.MAX_VALUE}, you allow the pool to accommodate an arbitrary
 * number of concurrent tasks. Most typically, core and maximum pool
 * sizes are set only upon construction, but they may also be changed
 * dynamically using {@link #setCorePoolSize} and {@link
 * #setMaximumPoolSize}. </dd>
 *
 * <dt>On-demand construction</dt>
 *
 * <dd>By default, even core threads are initially created and
 * started only when new tasks arrive, but this can be overridden
 * dynamically using method {@link #prestartCoreThread} or {@link
 * #prestartAllCoreThreads}.  You probably want to prestart threads if
 * you construct the pool with a non-empty queue. </dd>
 *
 * <dt>Creating new threads</dt>
 *
 * <dd>New threads are created using a {@link ThreadFactory}.  If not
 * otherwise specified, a {@link Executors#defaultThreadFactory} is
 * used, that creates threads to all be in the same {@link
 * ThreadGroup} and with the same {@code NORM_PRIORITY} priority and
 * non-daemon status. By supplying a different ThreadFactory, you can
 * alter the thread's name, thread group, priority, daemon status,
 * etc. If a {@code ThreadFactory} fails to create a thread when asked
 * by returning null from {@code newThread}, the executor will
 * continue, but might not be able to execute any tasks. Threads
 * should possess the "modifyThread" {@code RuntimePermission}. If
 * worker threads or other threads using the pool do not possess this
 * permission, service may be degraded: configuration changes may not
 * take effect in a timely manner, and a shutdown pool may remain in a
 * state in which termination is possible but not completed.</dd>
 *
 * <dt>Keep-alive times</dt>
 *
 * <dd>If the pool currently has more than corePoolSize threads,
 * excess threads will be terminated if they have been idle for more
 * than the keepAliveTime (see {@link #getKeepAliveTime(TimeUnit)}).
 * This provides a means of reducing resource consumption when the
 * pool is not being actively used. If the pool becomes more active
 * later, new threads will be constructed. This parameter can also be
 * changed dynamically using method {@link #setKeepAliveTime(long,
 * TimeUnit)}.  Using a value of {@code Long.MAX_VALUE} {@link
 * TimeUnit#NANOSECONDS} effectively disables idle threads from ever
 * terminating prior to shut down. By default, the keep-alive policy
 * applies only when there are more than corePoolSize threads. But
 * method {@link #allowCoreThreadTimeOut(boolean)} can be used to
 * apply this time-out policy to core threads as well, so long as the
 * keepAliveTime value is non-zero. </dd>
 *
 * <dt>Queuing</dt>
 *
 * <dd>Any {@link BlockingQueue} may be used to transfer and hold
 * submitted tasks.  The use of this queue interacts with pool sizing:
 *
 * <ul>
 *
 * <li> If fewer than corePoolSize threads are running, the Executor
 * always prefers adding a new thread
 * rather than queuing.</li>
 *
 * <li> If corePoolSize or more threads are running, the Executor
 * always prefers queuing a request rather than adding a new
 * thread.</li>
 *
 * <li> If a request cannot be queued, a new thread is created unless
 * this would exceed maximumPoolSize, in which case, the task will be
 * rejected.</li>
 *
 * </ul>
 *
 * There are three general strategies for queuing:
 * <ol>
 *
 * <li> <em> Direct handoffs.</em> A good default choice for a work
 * queue is a {@link SynchronousQueue} that hands off tasks to threads
 * without otherwise holding them. Here, an attempt to queue a task
 * will fail if no threads are immediately available to run it, so a
 * new thread will be constructed. This policy avoids lockups when
 * handling sets of requests that might have internal dependencies.
 * Direct handoffs generally require unbounded maximumPoolSizes to
 * avoid rejection of new submitted tasks. This in turn admits the
 * possibility of unbounded thread growth when commands continue to
 * arrive on average faster than they can be processed.  </li>
 *
 * <li><em> Unbounded queues.</em> Using an unbounded queue (for
 * example a {@link LinkedBlockingQueue} without a predefined
 * capacity) will cause new tasks to wait in the queue when all
 * corePoolSize threads are busy. Thus, no more than corePoolSize
 * threads will ever be created. (And the value of the maximumPoolSize
 * therefore doesn't have any effect.)  This may be appropriate when
 * each task is completely independent of others, so tasks cannot
 * affect each others execution; for example, in a web page server.
 * While this style of queuing can be useful in smoothing out
 * transient bursts of requests, it admits the possibility of
 * unbounded work queue growth when commands continue to arrive on
 * average faster than they can be processed.  </li>
 *
 * <li><em>Bounded queues.</em> A bounded queue (for example, an
 * {@link ArrayBlockingQueue}) helps prevent resource exhaustion when
 * used with finite maximumPoolSizes, but can be more difficult to
 * tune and control.  Queue sizes and maximum pool sizes may be traded
 * off for each other: Using large queues and small pools minimizes
 * CPU usage, OS resources, and context-switching overhead, but can
 * lead to artificially low throughput.  If tasks frequently block (for
 * example if they are I/O bound), a system may be able to schedule
 * time for more threads than you otherwise allow. Use of small queues
 * generally requires larger pool sizes, which keeps CPUs busier but
 * may encounter unacceptable scheduling overhead, which also
 * decreases throughput.  </li>
 *
 * </ol>
 *
 * </dd>
 *
 * <dt>Rejected tasks</dt>
 *
 * <dd>New tasks submitted in method {@link #execute(Runnable)} will be
 * <em>rejected</em> when the Executor has been shut down, and also when
 * the Executor uses finite bounds for both maximum threads and work queue
 * capacity, and is saturated.  In either case, the {@code execute} method
 * invokes the {@link
 * RejectedExecutionHandler#rejectedExecution(Runnable, ThreadPoolExecutor)}
 * method of its {@link RejectedExecutionHandler}.  Four predefined handler
 * policies are provided:
 *
 * <ol>
 *
 * <li> In the default {@link ThreadPoolExecutor.AbortPolicy}, the
 * handler throws a runtime {@link RejectedExecutionException} upon
 * rejection. </li>
 *
 * <li> In {@link ThreadPoolExecutor.CallerRunsPolicy}, the thread
 * that invokes {@code execute} itself runs the task. This provides a
 * simple feedback control mechanism that will slow down the rate that
 * new tasks are submitted. </li>
 *
 * <li> In {@link ThreadPoolExecutor.DiscardPolicy}, a task that
 * cannot be executed is simply dropped.  </li>
 *
 * <li>In {@link ThreadPoolExecutor.DiscardOldestPolicy}, if the
 * executor is not shut down, the task at the head of the work queue
 * is dropped, and then execution is retried (which can fail again,
 * causing this to be repeated.) </li>
 *
 * </ol>
 *
 * It is possible to define and use other kinds of {@link
 * RejectedExecutionHandler} classes. Doing so requires some care
 * especially when policies are designed to work only under particular
 * capacity or queuing policies. </dd>
 *
 * <dt>Hook methods</dt>
 *
 * <dd>This class provides {@code protected} overridable
 * {@link #beforeExecute(Thread, Runnable)} and
 * {@link #afterExecute(Runnable, Throwable)} methods that are called
 * before and after execution of each task.  These can be used to
 * manipulate the execution environment; for example, reinitializing
 * ThreadLocals, gathering statistics, or adding log entries.
 * Additionally, method {@link #terminated} can be overridden to perform
 * any special processing that needs to be done once the Executor has
 * fully terminated.
 *
 * <p>If hook or callback methods throw exceptions, internal worker
 * threads may in turn fail and abruptly terminate.</dd>
 *
 * <dt>Queue maintenance</dt>
 *
 * <dd>Method {@link #getQueue()} allows access to the work queue
 * for purposes of monitoring and debugging.  Use of this method for
 * any other purpose is strongly discouraged.  Two supplied methods,
 * {@link #remove(Runnable)} and {@link #purge} are available to
 * assist in storage reclamation when large numbers of queued tasks
 * become cancelled.</dd>
 *
 * <dt>Finalization</dt>
 *
 * <dd>A pool that is no longer referenced in a program <em>AND</em>
 * has no remaining threads will be {@code shutdown} automatically. If
 * you would like to ensure that unreferenced pools are reclaimed even
 * if users forget to call {@link #shutdown}, then you must arrange
 * that unused threads eventually die, by setting appropriate
 * keep-alive times, using a lower bound of zero core threads and/or
 * setting {@link #allowCoreThreadTimeOut(boolean)}.  </dd>
 *
 * </dl>
 *
 * <p><b>Extension example</b>. Most extensions of this class
 * override one or more of the protected hook methods. For example,
 * here is a subclass that adds a simple pause/resume feature:
 *
 *  <pre> {@code
 * class PausableThreadPoolExecutor extends ThreadPoolExecutor {
 *   private boolean isPaused;
 *   private ReentrantLock pauseLock = new ReentrantLock();
 *   private Condition unpaused = pauseLock.newCondition();
 *
 *   public PausableThreadPoolExecutor(...) { super(...); }
 *
 *   protected void beforeExecute(Thread t, Runnable r) {
 *     super.beforeExecute(t, r);
 *     pauseLock.lock();
 *     try {
 *       while (isPaused) unpaused.await();
 *     } catch (InterruptedException ie) {
 *       t.interrupt();
 *     } finally {
 *       pauseLock.unlock();
 *     }
 *   }
 *
 *   public void pause() {
 *     pauseLock.lock();
 *     try {
 *       isPaused = true;
 *     } finally {
 *       pauseLock.unlock();
 *     }
 *   }
 *
 *   public void resume() {
 *     pauseLock.lock();
 *     try {
 *       isPaused = false;
 *       unpaused.signalAll();
 *     } finally {
 *       pauseLock.unlock();
 *     }
 *   }
 * }}</pre>
 *
 * @since 1.5
 * @author Doug Lea
 */
public class ThreadPoolExecutor extends AbstractExecutorService {

}

其大意为：一个ExecutorService的实现类，它使用一个具有多个线程的池中的一个线程来执行提交的任务，这些线程通常使用工厂方法Executors进行配置。线程池用于解决两个不同的问题，由于减少了每个任务的调用开销，他们通常在执行大量异步任务的时候可提供改进的性能，并且他们提供了一种绑定和管理资源（包括线程）的方法。该资源在执行集合的执行时消耗任务，每个ThreadPoolExecutor还维护了一些基本的统计信息，例如已完成的任务数量。为了在广泛的上下文中有用，该类提供了许多可调整的参数和可扩展的钩子函数。但是，强烈建议程序员使用Executors工厂方法的newCachedThreadPool，该方法的线程池无边界，具有自动的线程回收。newFixedThreadPool（固定大小的线程池）和newSingleThreadExecutor（单个后台线程）。可以为最常见的使用场景预配置设置。否则，在手动配置和调整此类时，请使用以下指南：

核心和最大池大小： ThreadPoolExecutor将根据corePoolSize和maximumPoolSize设置范围自动调整池的大小。请参见getPoolSize。当在方法execute中提交新任务，并且正在运行的线程少于corePoolSize线程时，即使其他工作线程处于空闲状态，也会创建一个新的线程来处理请求。如果运行的线程数大于corePoolSize,但是小于maximumPoolSize。则仅在队列已满的时候才创建线程。通过corePoolSize和maximumPoolSize设置为相同，可以创建为固定大小的线程池。通过将maximumPoolSize设置为本质上不受限制的Integer.MAX_VALUE，则可以允许线程池容纳任意数量的并发任务，通常，核心线程和最大的池大小仅在构造的时候设置。但也可以使用setCorePoolSize和setMaximumPoolSize动态更改。

按需构建：默认情况下，甚至只有在有新任务到达的时候才开始启动核心线程，但是可以使用方法prestartCoreThread或者prestartAllCoreThreads动态的进行覆盖。如果使用非空队列构造线程池，则可能需要预启动线程。

创建新线程：使用ThreadFactory创建新线程，如果没有指定，那么采用Executors的defaultThreadFactory默认方法。该方法创建的全部线程具有相同的NORM_PRIORITY优先级和非守护线程的状态。通过提供不同的ThreadFactiry可以更改线程的名称，线程组、优先级、守护线程状态等。如果通过向newThread返回null时要求ThreadFactory创建线程失败，执行程序将继续，但可能无法执行任何任务。线程应具有modifyThread的RuntimePermission。如果使用该线程池的工作线程或者其他线程不具有此权限，则服务可能会降级，配置更改可能不会即时生效。并且关闭线程池可能保持在可能终止但是没有完成的状态。

Keep-alive时间：如果当前的池中的线程数超过corePoolSize,则多余的线程将在空闲的时间超过keepAliveTime时终止。请参考getKeepAliveTime，当不积极使用线程池时，这提供了减少资源消耗的办法，也可以使用方法setKeepAliveTime动态调整（long，TimeUnit）动态调整此参数。如果使用Long.MAX_VALUE和TimeUnit＃NANOSECONDS，则有效的使用空闲线程不会再线程池关闭之前关闭。默认情况下，仅当corePoolSize线程数多时，保持活动策略才适用。但是方法allowCoreThreadTimeOut也可以用于将此超时策略应用于核心线程，只要keepAliveTime值不为0即可。

排队：任何BlockingQueue均可用于传输和保留提交的任务，此队列的使用与线程池的大小互相影响：如果正在运行的线程少于corePoolSize线程，则执行程序总是添加新的线程来执行任务，而不是排队。如果正在运行corePoolSize或者更多的线程，则执行程序总是喜欢对请求进行排队，而不是添加新线程。如果无法将请求放入队列中，则创建一个新线程，除非该线程超过了maximumPoolSize，在这种情况下，该任务将被拒绝。

排队一般有三种策略：

直接传递：SynchronousQueue是工作队列的一个默认的选择。他可以将任务移交给线程，而不必另外保留。在这里，如果没有立即可用的线程来运行任务，则试图将任务进行排队的尝试将失败，因此需要构造一个新的线程。在处理可能具有内部依赖性的请求集的时候，此策略避免了锁。直接切换通常需要无限制的maximumPoolSizes以避免拒绝提交的新任务。反过来，当平均而言，命令继续以比其处理速度更快的到达时，这可能会带来无限线程增长的可能性。
无界队列：当所有corePoolSize都处于忙的时候，使用无届队列，如没有预定容量的LinkedBlockingQueue。将导致新任务在队列中等待。因此，仅创建corePoolSize线程。maximumPoolSize将没有任何作用。当每个任务完全独立于其他任务的时候，这可能是适当的。因此任务不会影响彼此的执行，例如，在网页服务器中。尽管这种排队方式对于消除短暂的突发请求很有用，但它承认当命令平均继续以比处理速度更快的速度到达时，工作队列会无限增长，这可能会造成OOM。
有界队列：与有限的maximumPoolSizes一起使用时，有界队列如ArrayBlockingQueue，有助于防止资源耗尽，但是调优和控制将会非常困难。队列大小和最大的线程池大小可能会互相折衷。使用大队列和小的池可以最大程度的减少CPU的使用率，操作系统和上下文切换的开销，但是会人为的导致吞吐量下降。如果任务频繁阻塞，如I/O,则系统可能认为你安排的线程调度的时间超出了允许的范围。使用小队列通常需要更大的池的大小，这会使得CPU繁忙，但是可能会遇到无法接受的调度开销，这也会降低吞吐量。

拒绝任务：当执行器关闭时，并且执行器对最大线程数和工作队列容量使用有限范围时，在方法execute提交的新任务将被拒绝。处于饱和。无论那种情况，execute将用RejectedExecutionHandler的RejectedExecutionHandler#rejectedExecution（Runnable，ThreadPoolExecutor）方法。提供了4个预订的处理策略：

默认的拒绝策略ThreadPoolExecutor.AbortPolicy，处理程序在拒绝的时候会抛出RejectedExecutionException异常。
在ThreadPoolExecutor.CallerRunsPolicy中，调用execute本身的线程运行任务。这提供了一种简单的反馈机制，将降低新任务的提交速度。
在ThreadPoolExecutor.DiscardPolicy中，简单的删除了无法执行的任务。
在ThreadPoolExecutor.DiscardOldestPolicy策略中，如果未关闭执行程序，则将丢弃工作队列开头的任务，然后重试执行，该操作可能再次失败，导致重复执行此操作。也可以定义和使用其他类型的RejectedExecutionHandler，但是这样做需要格外小心，尤其在设计策略仅在特定容量或者排队策略下才能工作时。

Hook方法：线程池类提供了protected权限的可重写的beforeExecute和afterExecute方法。这些方法在每个线程的之前前后被调用。这些可以用来对执行环境进行操作，例如，重新初始化ThreadLocals收集统计信息或者添加统计条目，此外，一旦线程完成终止，方法terminated可以被覆盖以执行需要执行的任何特殊处理。如果钩子函数或者回掉方法出现异常，内部工作的线程可能失败并突然终止。

队列维护：方法getQueue允许访问工作队列，以进行监视和调试，强烈建议不要将此方法用于任何其他目的，当取消大量排队任务的时候，可以使用提供的两种方法，remove和purge来帮助回收内存。

Finalization：在一个程序中如果不再对线程池进行引用，或者没有剩余的线程，线程池将自动shutdown。如果即使用户忘记调用shutdown的情况下，如果你想确保回收未引用的线程池，则必须使用0核心线程的下限来设置适当的keepAlive时间，以使未使用的线程最终消亡。通过allowCoreThreadTimeOut设置。扩展示例：此类的大多数扩展都覆盖一个或者多个受保护的hook方法，例如：以下是一个子类，它添加了一个简单的暂停/继续功能：

class PausableThreadPoolExecutor extends ThreadPoolExecutor {
   private boolean isPaused;
    private ReentrantLock pauseLock = new ReentrantLock();
    private Condition unpaused = pauseLock.newCondition();
 
    public PausableThreadPoolExecutor(...) { super(...); }
 
    protected void beforeExecute(Thread t, Runnable r) {
      super.beforeExecute(t, r);
      pauseLock.lock();
      try {
        while (isPaused) unpaused.await();
      } catch (InterruptedException ie) {
        t.interrupt();
      } finally {
        pauseLock.unlock();
      }
    }
 
    public void pause() {
      pauseLock.lock();
      try {
        isPaused = true;
      } finally {
        pauseLock.unlock();
      }
    }
 
    public void resume() {
      pauseLock.lock();
      try {
        isPaused = false;
        unpaused.signalAll();
      } finally {
        pauseLock.unlock();
      }
    }
  }}

1.2 成员变量及常量

在类的内部，还有大段关于线程池状态的注释：

/**
 * The main pool control state, ctl, is an atomic integer packing
 * two conceptual fields
 *   workerCount, indicating the effective number of threads
 *   runState,    indicating whether running, shutting down etc
 *
 * In order to pack them into one int, we limit workerCount to
 * (2^29)-1 (about 500 million) threads rather than (2^31)-1 (2
 * billion) otherwise representable. If this is ever an issue in
 * the future, the variable can be changed to be an AtomicLong,
 * and the shift/mask constants below adjusted. But until the need
 * arises, this code is a bit faster and simpler using an int.
 *
 * The workerCount is the number of workers that have been
 * permitted to start and not permitted to stop.  The value may be
 * transiently different from the actual number of live threads,
 * for example when a ThreadFactory fails to create a thread when
 * asked, and when exiting threads are still performing
 * bookkeeping before terminating. The user-visible pool size is
 * reported as the current size of the workers set.
 *
 * The runState provides the main lifecycle control, taking on values:
 *
 *   RUNNING:  Accept new tasks and process queued tasks
 *   SHUTDOWN: Don't accept new tasks, but process queued tasks
 *   STOP:     Don't accept new tasks, don't process queued tasks,
 *             and interrupt in-progress tasks
 *   TIDYING:  All tasks have terminated, workerCount is zero,
 *             the thread transitioning to state TIDYING
 *             will run the terminated() hook method
 *   TERMINATED: terminated() has completed
 *
 * The numerical order among these values matters, to allow
 * ordered comparisons. The runState monotonically increases over
 * time, but need not hit each state. The transitions are:
 *
 * RUNNING -> SHUTDOWN
 *    On invocation of shutdown(), perhaps implicitly in finalize()
 * (RUNNING or SHUTDOWN) -> STOP
 *    On invocation of shutdownNow()
 * SHUTDOWN -> TIDYING
 *    When both queue and pool are empty
 * STOP -> TIDYING
 *    When pool is empty
 * TIDYING -> TERMINATED
 *    When the terminated() hook method has completed
 *
 * Threads waiting in awaitTermination() will return when the
 * state reaches TERMINATED.
 *
 * Detecting the transition from SHUTDOWN to TIDYING is less
 * straightforward than you'd like because the queue may become
 * empty after non-empty and vice versa during SHUTDOWN state, but
 * we can only terminate if, after seeing that it is empty, we see
 * that workerCount is 0 (which sometimes entails a recheck -- see
 * below).
 */

大意为；主线程池的控制状态，ctl,是一个打包两个概念的字段workerCount的原子整数。以及指示线程有效运行数量的runState。用来指示是否运行和关闭等状态。为了将他们打包到一个int上，我们将workerCount限制为（2 ^ 29 ）-1约为5亿个线程。而不是（2 ^ 31）-1（20亿）个线程。如果将来有问题，可以使用AtomicLong进行替换。并调整shift / mask常数。但是在需要之前，使用int可以使代码更快。更简单。workCount是允许被启动不停止的worker数量，该值可能与活动线程的实际数量有暂时的不同，例如，ThreadFactory在被请求的时候未能创建线程，以及当退出线程任在终止之前执行记账操作等，用户可见的池大小报告为工作集合的当前大小。 runState提供了主生命周期控制，其值为:

RUNNING：接收新任务并处理排队的任务。
SHUTDOWN：不接收新任务，但是处理排队的任务。
STOP：不接收新任务，不处理排队任务，并中断正在进行的任务。
TIDYING：所有任务已终止，workerCount为零，转换到状态TIDYING的线程将运行Terminated（）钩子方法。
TERMINATED：terminated()方法已完成。

这些值之间的数字顺序很重要，可以进行有序的比较。runState随时间单调增加，但不必达到每个状态。可以按如下方式过渡：

RUNNING -> SHUTDOWN：关于shutdown()的调用，可能在finalize()中隐式地调用。
(RUNNING or SHUTDOWN) -> STOP：shutdownNow()相关调用。
STOP -> TIDYING 当pool为空的时候。
TIDYING -> TERMINATED： terminated()被调用的时候。在waittermination()中等待的线程将返回状态到达终止。检测从SHUTDOWN到TIDYING的转换并不像您想要的那样简单，因为在SHUTDOWN状态期间，队列在非空之后可能变为空，反之亦然，但是只有在看到它为空之后才能看到workerCount为0（有时需要重新检查-参见下文）。

1.2.1 ctl

ctl是ThreadPoolExecutor对线程状态runState和线程池workerCount的一个综合字段。将这两个属性打包放置在AtomicInteger上，ctl长度为32位，前面3位用于存放runState，后面29位存放workerCount。因此线程池最大的线程数量位2^29-1。其代码如下：

private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
private static final int COUNT_BITS = Integer.SIZE - 3;
private static final int CAPACITY   = (1 << COUNT_BITS) - 1;

// runState is stored in the high-order bits
private static final int RUNNING    = -1 << COUNT_BITS;
private static final int SHUTDOWN   =  0 << COUNT_BITS;
private static final int STOP       =  1 << COUNT_BITS;
private static final int TIDYING    =  2 << COUNT_BITS;
private static final int TERMINATED =  3 << COUNT_BITS;

// Packing and unpacking ctl
private static int runStateOf(int c)     { return c & ~CAPACITY; }
private static int workerCountOf(int c)  { return c & CAPACITY; }
private static int ctlOf(int rs, int wc) { return rs | wc; }

/*
 * Bit field accessors that don't require unpacking ctl.
 * These depend on the bit layout and on workerCount being never negative.
 */

private static boolean runStateLessThan(int c, int s) {
    return c < s;
}

private static boolean runStateAtLeast(int c, int s) {
    return c >= s;
}

private static boolean isRunning(int c) {
    return c < SHUTDOWN;
}

/**
 * Attempts to CAS-increment the workerCount field of ctl.
 */
private boolean compareAndIncrementWorkerCount(int expect) {
    return ctl.compareAndSet(expect, expect + 1);
}

/**
 * Attempts to CAS-decrement the workerCount field of ctl.
 */
private boolean compareAndDecrementWorkerCount(int expect) {
    return ctl.compareAndSet(expect, expect - 1);
}

/**
 * Decrements the workerCount field of ctl. This is called only on
 * abrupt termination of a thread (see processWorkerExit). Other
 * decrements are performed within getTask.
 */
private void decrementWorkerCount() {
    do {} while (! compareAndDecrementWorkerCount(ctl.get()));
}

我们可以看看各状态的二进制情况：

可以看到，通过位移操作，将最高的三位用于标识runState状态，而后面29位用于存放workerCount. 方法runStateOf和workerCountOf则可以将这两部分数据通过位运算的方式取出来：如下图所示，假定需要计算的c为0110 0000 0000 0000 0000 1110 0110 0000，那么位移过程如下：

这两个方法就可以很快的计算出结果。而ctlOf方法，rs | wc，这就非常容易理解了。

上图就非常方便的将runState和workerCount进行了组合。而根据第一张图可以看到，RUNNING状态为负数，是最小的，这些状态的全部ctl满足如下规则：

RUNNING < SHUTDOWN < STOP < TIDYING < TERMINATED

方法runStateLessThan和runStateAtLeast则是根据上述规则判断线程池当前所处的状态。可以发现的是，比SHUTDOWN小的任何状态都是iRUNNING状态。这也是isRunning方法的原因。此外还提供了基于CAS的增减WorkerCount的方法： compareAndIncrementWorkerCount、compareAndDecrementWorkerCount和decrementWorkerCount。

1.2.2 workQueue

/**
 * The queue used for holding tasks and handing off to worker
 * threads.  We do not require that workQueue.poll() returning
 * null necessarily means that workQueue.isEmpty(), so rely
 * solely on isEmpty to see if the queue is empty (which we must
 * do for example when deciding whether to transition from
 * SHUTDOWN to TIDYING).  This accommodates special-purpose
 * queues such as DelayQueues for which poll() is allowed to
 * return null even if it may later return non-null when delays
 * expire.
 */
private final BlockingQueue<Runnable> workQueue;

workQueue是用于保留任务并移交给工作线程的队列，我们不要指望通过workQueue.poll返回为null来判断队列是否为空，我们仅仅通过workQueue.isEmpty方法来判断队列是否为空，（将SHUTDOWN状态过度到TIDYING状态的时候可以采用这样做。）因为一些特殊的队列如DelayQueues允许poll的时候返回空，即使在延迟后的档期可以非空。这个workQueue依赖于构造函数传入。

1.2.3 workers

/**
 * Set containing all worker threads in pool. Accessed only when
 * holding mainLock.
 */
private final HashSet<Worker> workers = new HashSet<Worker>();

线程池中所有工作线程的集合。访问workers的时候需要获得锁mainLock。这个workers是hashSet。

1.2.4 mainLock & termination

/**
 * Lock held on access to workers set and related bookkeeping.
 * While we could use a concurrent set of some sort, it turns out
 * to be generally preferable to use a lock. Among the reasons is
 * that this serializes interruptIdleWorkers, which avoids
 * unnecessary interrupt storms, especially during shutdown.
 * Otherwise exiting threads would concurrently interrupt those
 * that have not yet interrupted. It also simplifies some of the
 * associated statistics bookkeeping of largestPoolSize etc. We
 * also hold mainLock on shutdown and shutdownNow, for the sake of
 * ensuring workers set is stable while separately checking
 * permission to interrupt and actually interrupting.
 */
private final ReentrantLock mainLock = new ReentrantLock();
    /**
 * Wait condition to support awaitTermination
 */
private final Condition termination = mainLock.newCondition();

调用这个锁的时候需要锁定worker的位置，并进行相关的记录，尽管我们使用某种并发集，但是事实证明，使用锁的方式通常是可取的。原因之一是它可以对interruptIdleWorkers进行序列化。从而可以避免不必要的中断风暴。尤其是在关机期间，否则，退出线程将同时中断哪些尚未中断的线程，它也简化了一些相关的数据。如maximumPoolSize。我们还在shutdown和shutdownNow上保留了mainLock，以确保在单独检查中断和实际中断权限的时候，workerSet是稳定的。 termination是一个wait条件，以支持awaitTermination。

1.2.5 其他成员变量

/**
 * Tracks largest attained pool size. Accessed only under
 * mainLock.
 */
private int largestPoolSize;

/**
 * Counter for completed tasks. Updated only on termination of
 * worker threads. Accessed only under mainLock.
 */
private long completedTaskCount;

/*
 * All user control parameters are declared as volatiles so that
 * ongoing actions are based on freshest values, but without need
 * for locking, since no internal invariants depend on them
 * changing synchronously with respect to other actions.
 */

/**
 * Factory for new threads. All threads are created using this
 * factory (via method addWorker).  All callers must be prepared
 * for addWorker to fail, which may reflect a system or user's
 * policy limiting the number of threads.  Even though it is not
 * treated as an error, failure to create threads may result in
 * new tasks being rejected or existing ones remaining stuck in
 * the queue.
 *
 * We go further and preserve pool invariants even in the face of
 * errors such as OutOfMemoryError, that might be thrown while
 * trying to create threads.  Such errors are rather common due to
 * the need to allocate a native stack in Thread.start, and users
 * will want to perform clean pool shutdown to clean up.  There
 * will likely be enough memory available for the cleanup code to
 * complete without encountering yet another OutOfMemoryError.
 */
private volatile ThreadFactory threadFactory;

/**
 * Handler called when saturated or shutdown in execute.
 */
private volatile RejectedExecutionHandler handler;

/**
 * Timeout in nanoseconds for idle threads waiting for work.
 * Threads use this timeout when there are more than corePoolSize
 * present or if allowCoreThreadTimeOut. Otherwise they wait
 * forever for new work.
 */
private volatile long keepAliveTime;

/**
 * If false (default), core threads stay alive even when idle.
 * If true, core threads use keepAliveTime to time out waiting
 * for work.
 */
private volatile boolean allowCoreThreadTimeOut;

/**
 * Core pool size is the minimum number of workers to keep alive
 * (and not allow to time out etc) unless allowCoreThreadTimeOut
 * is set, in which case the minimum is zero.
 */
private volatile int corePoolSize;

/**
 * Maximum pool size. Note that the actual maximum is internally
 * bounded by CAPACITY.
 */
private volatile int maximumPoolSize;

/**
 * The default rejected execution handler
 */
private static final RejectedExecutionHandler defaultHandler =
    new AbortPolicy();

/**
 * Permission required for callers of shutdown and shutdownNow.
 * We additionally require (see checkShutdownAccess) that callers
 * have permission to actually interrupt threads in the worker set
 * (as governed by Thread.interrupt, which relies on
 * ThreadGroup.checkAccess, which in turn relies on
 * SecurityManager.checkAccess). Shutdowns are attempted only if
 * these checks pass.
 *
 * All actual invocations of Thread.interrupt (see
 * interruptIdleWorkers and interruptWorkers) ignore
 * SecurityExceptions, meaning that the attempted interrupts
 * silently fail. In the case of shutdown, they should not fail
 * unless the SecurityManager has inconsistent policies, sometimes
 * allowing access to a thread and sometimes not. In such cases,
 * failure to actually interrupt threads may disable or delay full
 * termination. Other uses of interruptIdleWorkers are advisory,
 * and failure to actually interrupt will merely delay response to
 * configuration changes so is not handled exceptionally.
 */
private static final RuntimePermission shutdownPerm =
    new RuntimePermission("modifyThread");

/* The context to be used when executing the finalizer, or null. */
private final AccessControlContext acc;

类中还有部分其他的成员变量，整理如下：

变量名	类型	说明
largestPoolSize	int	线程池的大小，需要获得mainLock
completedTaskCount	long	已完成的任务的计数器，在工作线程终止的时候更新，也需要获得mainLock
threadFactory	volatile ThreadFactory	创建线程的工厂方法，需要在构造函数中传入
handler	volatile RejectedExecutionHandler	拒绝策略的调用方法，在线程池饱和或者关闭的时候如果有任务传入就调用
keepAliveTime	volatile long	以纳秒为单位的worker的等待超时时间。当当前大于corePoolSize或者allowCoreThreadTimeOut的时候，线程调用此超时，否则会永远等待新的worker
allowCoreThreadTimeOut	volatile boolean	如果为false，则核心线程即使在空闲的时候也保持活动，否则，核心线程将使用keepAliveTime来超时等待。默认值为false
corePoolSize	volatile int	核心池的大小是维持生存的worker的最小数量，除非设置了allowCoreThreadTimeOut，这种情况下，最小值为0
maximumPoolSize	volatile int	线程池大小的最大值，需要注意的是最大容量在内部是由CAPACITY决定的
defaultHandler	static final RejectedExecutionHandler	默认的拒绝策略：new AbortPolicy();
shutdownPerm	static final RuntimePermission	调用shutdown和shutdownNow所需要的权限，请参阅checkShutdownAccess要求调用者有权实际中断工作程序集中的线程，由Thread.interrupt控制，依赖于ThreadGroup.checkAccess。依次依赖于SecurityManager.checkAccess。仅在这些检查通过之后，才尝试执行shutdown。Thread.interrupt的所有实际调用，都会忽略SecurityExceptions。这意味着尝试中断会以静默的方式失败。除非SecurityManager的策略不一致。否则它们不应失败。在这种情况下，无法真正中断线程可能会禁用或延迟完全终止。建议使用interruptIdleWorkers的其他用途，而实际中断的失败只会延迟对配置更改的响应，因此不会进行特殊处理。

2.重要的内部类

在ThreadPoolExecutor中，内部类有两类，一类是执行线程的Worker，还有一类是拒绝策略。

2.1 Worker

Worker这个类，继承了AbstractQueueSynchronizer。

其代码如下：

/**
 * Class Worker mainly maintains interrupt control state for
 * threads running tasks, along with other minor bookkeeping.
 * This class opportunistically extends AbstractQueuedSynchronizer
 * to simplify acquiring and releasing a lock surrounding each
 * task execution.  This protects against interrupts that are
 * intended to wake up a worker thread waiting for a task from
 * instead interrupting a task being run.  We implement a simple
 * non-reentrant mutual exclusion lock rather than use
 * ReentrantLock because we do not want worker tasks to be able to
 * reacquire the lock when they invoke pool control methods like
 * setCorePoolSize.  Additionally, to suppress interrupts until
 * the thread actually starts running tasks, we initialize lock
 * state to a negative value, and clear it upon start (in
 * runWorker).
 */
private final class Worker
    extends AbstractQueuedSynchronizer
    implements Runnable
{
    /**
     * This class will never be serialized, but we provide a
     * serialVersionUID to suppress a javac warning.
     */
    private static final long serialVersionUID = 6138294804551838833L;

    //当前worker工作的线程，如果factory方法失败则为空
    /** Thread this worker is running in.  Null if factory fails. */
    final Thread thread;
    //需要运行的初始化任务，可能是空
    /** Initial task to run.  Possibly null. */
    Runnable firstTask;
    //线程任务计数器
    /** Per-thread task counter */
    volatile long completedTasks;

    /**
     * Creates with given first task and thread from ThreadFactory.
     * @param firstTask the first task (null if none)
     */
     //从ThreadFactory创建线程给firstTask任务
    Worker(Runnable firstTask) {
       //在运行worker之前禁止中断
        setState(-1); // inhibit interrupts until runWorker
        this.firstTask = firstTask;
        //调用ThreadFactory
        this.thread = getThreadFactory().newThread(this);
    }

    /** Delegates main run loop to outer runWorker  */
    //将main的run委托给外部的runWorker方法。
    public void run() {
        runWorker(this);
    }

    // Lock methods
    //
    // The value 0 represents the unlocked state.
    // The value 1 represents the locked state.
    //0标识未锁定，1标识锁定
    protected boolean isHeldExclusively() {
        return getState() != 0;
    }

    //尝试获得锁
    protected boolean tryAcquire(int unused) {
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;
    }
    //尝试释放锁
    protected boolean tryRelease(int unused) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    public void lock()        { acquire(1); }
    public boolean tryLock()  { return tryAcquire(1); }
    public void unlock()      { release(1); }
    public boolean isLocked() { return isHeldExclusively(); }
    //将所有线程中断
    void interruptIfStarted() {
        Thread t;
        if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
            try {
                t.interrupt();
            } catch (SecurityException ignore) {
            }
        }
    }
}

其注释大意为： worker类主要维护运行任务的线程的中断控制的状态，以及对各种情况进行记录，这个类继承了AbstractQueuedSynchronizer，以简化获取和释放围绕每个任务的锁，这样可以防止中断，这些中断旨在唤醒等待任务的工作线程，而不是中断正在运行的任务。我们实现了一个简单的不可重入锁，而不是使用ReentrantLock,因为我们不希望工作线程在调用诸如setCorePoolSize这样的池控制方法的时候能够重新获得锁。此外，为了在线程实际开始运行之前抑制中断。我们将锁的初始化状态设置为负值，并在启动的时候将其清除。实际上，比较特殊的是Worker继承了AQS，并且实现了Runnable接口，使用firstTask来保存传入的任务，thread则是使用ThreadFactory来创建的线程。这个线程来处理任务。在调用构造方法的时候，将任务传入，通过getThreadFactory().newThread(this);创建一个新线程。这个worker对象在启动的时候会调用其run方法，因为worker实现了Runnable接口，其本身也是一个线程。为什么使用继承AQS而不是使用ReentrantLock呢，关键是在于tryAcquire这个方法是不允许重入的。而ReentrantLock则是允许重入。 lock方法一旦获取的独占锁，则标识当前线程正在执行任务。Worker继承了AQS，因此自身也是一把锁。在执行任务的过程中，不会释放锁。用来保证任务的正常执行。因此，任务在运行的过程中，是不能被中断的。如果Worker不是独占锁，也是空闲状态，则说明这个Worker没有处理任务，可以对其进行中断。线程池在执行shutdown和tryTerminate的时候会对空闲的线程进行中断。interruptIdleWorkers方法会使用tryLock方法来判断线程池中的线程是否是空闲状态。

2.2 拒绝策略类

在ThreadPoolExecutor中，支持的拒绝策略主要有4种，都是通过内部类提供的。分别是CallerRunsPolicy、AbortPolicy、DiscardPolicy、DiscardOldestPolicy。这些类都实现了RejectedExecutionHandler接口。

2.2.1 CallerRunsPolicy

/**
 * A handler for rejected tasks that runs the rejected task
 * directly in the calling thread of the {@code execute} method,
 * unless the executor has been shut down, in which case the task
 * is discarded.
 */
public static class CallerRunsPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code CallerRunsPolicy}.
     */
    public CallerRunsPolicy() { }

    /**
     * Executes task r in the caller's thread, unless the executor
     * has been shut down, in which case the task is discarded.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
       //判断线程池的状态，如果没有关闭，则用提交任务的线程来执行run方法
        if (!e.isShutdown()) {
            r.run();
        }
    }
}

这个拒绝策略采用执行exec的线程来运行任务，除非当前线程池处于关闭状态。这个拒绝策略在正常情况下不会导致任务失败，可以有效的降低生产任务的速度。用户根据场景自行选择。

2.2.2 AbortPolicy

/**
 * A handler for rejected tasks that throws a
 * {@code RejectedExecutionException}.
 */
public static class AbortPolicy implements RejectedExecutionHandler {
    /**
     * Creates an {@code AbortPolicy}.
     */
    public AbortPolicy() { }

    /**
     * Always throws RejectedExecutionException.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     * @throws RejectedExecutionException always
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
      //抛出异常
        throw new RejectedExecutionException("Task " + r.toString() +
                                             " rejected from " +
                                             e.toString());
    }
}

这个拒绝策略会拒绝任务的执行，直接抛出异常。

2.2.3 DiscardPolicy

/**
 * A handler for rejected tasks that silently discards the
 * rejected task.
 */
public static class DiscardPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code DiscardPolicy}.
     */
    public DiscardPolicy() { }

    /**
     * Does nothing, which has the effect of discarding task r.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
    }
}

该拒绝策略会丢弃当前任务，并且什么都不做。

2.2.4 DiscardOldestPolicy

/**
 * A handler for rejected tasks that discards the oldest unhandled
 * request and then retries {@code execute}, unless the executor
 * is shut down, in which case the task is discarded.
 */
public static class DiscardOldestPolicy implements RejectedExecutionHandler {
    /**
     * Creates a {@code DiscardOldestPolicy} for the given executor.
     */
    public DiscardOldestPolicy() { }

    /**
     * Obtains and ignores the next task that the executor
     * would otherwise execute, if one is immediately available,
     * and then retries execution of task r, unless the executor
     * is shut down, in which case task r is instead discarded.
     *
     * @param r the runnable task requested to be executed
     * @param e the executor attempting to execute this task
     */
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
       //如果线程池未关闭，则将线程池任务队列中的旧任务poll丢弃，然后将当前任务调用execute添加到队列。
        if (!e.isShutdown()) {
            e.getQueue().poll();
            e.execute(r);
        }
    }
}

这个拒绝策略将当队列中的旧任务丢弃，之后将当前任务添加到队列，但是这个拒绝策略并不能保证当前任务能执行，还是会有可能被后续的任务继续丢弃。

3. 构造函数

ThreadpoolExecutor提供了4种构造函数，分别对前面可变的7个成员变量进行赋值。

3.1 ThreadPoolExecutor(7参数)

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler) {
     //  corePoolSize必须大于等于0  maximumPoolSize必须大于0   且maximumPoolSize>=corePoolSize ,否则参数非法异常。                
    if (corePoolSize < 0 ||
        maximumPoolSize <= 0 ||
        maximumPoolSize < corePoolSize ||
        keepAliveTime < 0)
        throw new IllegalArgumentException();
    if (workQueue == null || threadFactory == null || handler == null)
        throw new NullPointerException();
    //访问控制权限
    this.acc = System.getSecurityManager() == null ?
            null :
            AccessController.getContext();
    //赋值
    this.corePoolSize = corePoolSize;
    this.maximumPoolSize = maximumPoolSize;
    this.workQueue = workQueue;
    this.keepAliveTime = unit.toNanos(keepAliveTime);
    this.threadFactory = threadFactory;
    this.handler = handler;
}

实际上对应系统的变量只有6个，这是因为，keepAliveTime和unit最终都转换为纳秒数unit.toNanos(keepAliveTime)。

变量	说明
corePoolSize	线程池常驻线程数，如果设置了allowCoreThreadTimeOut，则常驻线程在一定时间之后就会变成0,范围0<=corePoolSize<=maximumPoolSize
maximumPoolSize	线程池允许的最大线程数，其范围0<maximumPoolSize<=(2^29-1)
keepAliveTime	当线程数大于核心线程corePoolSize的时候，这些多余的线程在当前任务终止之后最长的驻留时间。
unit	keepAliveTime的单位，最终以纳秒在系统中生效
workQueue	任务队列，依赖于构造函数传入
threadFactory	产生线程的工厂方法
handler	拒绝策略，如果任务达到任务队列的长度将会触发拒绝策略，拒绝策略有四种

3.2 ThreadPoolExecutor(默认threadFactory)

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          RejectedExecutionHandler handler) {
    this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
         Executors.defaultThreadFactory(), handler);
}

实际上调用的还是3.1中的七参数的构造函数，只是使用了默认的Executors.defaultThreadFactory()做为线程产生的工厂方法。这个方法实际上在Executors中。

public static ThreadFactory defaultThreadFactory() {
    return new DefaultThreadFactory();
}
DefaultThreadFactory() {
    SecurityManager s = System.getSecurityManager();
    group = (s != null) ? s.getThreadGroup() :
                          Thread.currentThread().getThreadGroup();
    namePrefix = "pool-" +
                  poolNumber.getAndIncrement() +
                 "-thread-";
}

3.3 ThreadPoolExecutor(默认defaultHandler)

这个方法将采用默认的拒绝策略：

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory) {
    this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
         threadFactory, defaultHandler);
}

而默认的拒绝策略在前面变量和常量中已经做了说明：

private static final RejectedExecutionHandler defaultHandler =
        new AbortPolicy();

实际上采用的是丢弃当前任务的策略。

3.4 ThreadPoolExecutor(默认defaultThreadFactory和defaultHandler)

那么自然根据上述两个构造函数可以得到：

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue) {
    this(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue,
         Executors.defaultThreadFactory(), defaultHandler);
}

将defaultThreadFactory和defaultHandler都采用默认值。因此这个构造方法是我们自己手动new ThreadPoolExecutor的时候会经常使用的。

4.基本原理

实际上在了解了前面的成员变量以及注释和构造函数之后，ThreadPoolExecutor的基本结构就非常清除了。个人觉得相比concurrentHashMap要简单很多。ThreadpoolExecutor的组成如下：

由一个HashSet为数据结构的workersPool来存放所有的线程。然后将所有Runnable的任务都放在构造函数定义的workQueue中，这个workQueue可以是任意的阻塞队列。总之可以自行定义。那么，当线程池正常启动之后，这个数据结构就开始工作。当外部线程添加一个Runnable的task提交sumbit方法的时候。此时，submit方法中有三个选项：

1.新线程执行：一是判断，当前pool中的线程数量如果小于corePoolSize那么会调用构造函数定义的线程创建的工厂方法来创建一个线程。将这个线程添加到workers中。用这个新线程执行任务，即便works中有线程空闲。

2.加入队列等待：当提交任务的时候，当线程池中workers的数量达到corePoolSize的时候，如果此时workQueue中不满，则将任务存入workQueue中。添加到队列的尾部。

3.如果任务队列已满，而corePoolSize小于maxmumCoreSize，则用构造函数提供的构造方法创建一个新的线程，用这个新线程来执行提交的任务。

4.线程池workPool中的workers在执行完当前任务之后处于空闲的情况下，将从workQueue中取出队首的任务。

5.如果线程池workPool达到maxmumPoolSize，同时workQueue中的任务也存满，达到最大值的时候，此时将触发拒绝策略。执行任务的线程根据上述的4种拒绝策略来执行是否将任务丢弃或者用当前线程执行任务。

6.如果allowCoreThreadTimeOut没有设置，默认为false，则空闲时间超过keepAliveTime的线程将会自动销毁。以保持线程池中的线程数维持在corePoolSize。

7.如果设置了allowCoreThreadTimeOut为true，则所有的线程在时间超过keepAliveTime之后，都会自动销毁。如果线程池空闲，则线程池不会占用任何资源。

以上7点是ThreadPoolExecutor的核心。也是面试过程中经常被问到的部分。结合在第一部分中线程的状态，各状态之间的切换如下图：

5.重要方法

在理解了线程池工作的基本原理之后，现在对线程池的一些常用方法进行分析。

5.1 execute

public void execute(Runnable command) {
    //如果任务为空，抛出异常
    if (command == null)
        throw new NullPointerException();
    /*
     * Proceed in 3 steps:
     *
     * 1. If fewer than corePoolSize threads are running, try to
     * start a new thread with the given command as its first
     * task.  The call to addWorker atomically checks runState and
     * workerCount, and so prevents false alarms that would add
     * threads when it shouldn't, by returning false.
     *
     * 2. If a task can be successfully queued, then we still need
     * to double-check whether we should have added a thread
     * (because existing ones died since last checking) or that
     * the pool shut down since entry into this method. So we
     * recheck state and if necessary roll back the enqueuing if
     * stopped, or start a new thread if there are none.
     *
     * 3. If we cannot queue task, then we try to add a new
     * thread.  If it fails, we know we are shut down or saturated
     * and so reject the task.
     */
     //对于任务的处理，有三个步骤来处理，如果当前工作线程低于corepoolSize，则创建一个新的线程
    int c = ctl.get();
    if (workerCountOf(c) < corePoolSize) {
    //创建新线程执行任务，成功则返回，否则再次获取线程池的状态和线程数
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    //如果线程池处于run状态，且工作队列workQueue可以添加
    if (isRunning(c) && workQueue.offer(command)) {
      //再次获取线程池状态和线程数
        int recheck = ctl.get();
        //判断运行状态，如果不处于运行状态且可以从阻塞队列中删除任务。则执行拒绝策略
        if (! isRunning(recheck) && remove(command))
           //执行拒绝策略
            reject(command);
        //如果线程池中的线程数量为0，则创建线程
        else if (workerCountOf(recheck) == 0)
            addWorker(null, false);
    }
    //如果阻塞队列已满，无法添加，则执行拒绝策略
    else if (!addWorker(command, false))
        reject(command);
}

上述过程可以用流程图表示如下：

5.2 addWorker

private boolean addWorker(Runnable firstTask, boolean core) {
    retry:
    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);

        // Check if queue empty only if necessary.
        //检查线程池状态，如果处于SHUTDOWN的时候firstTask为null和workQueue不为空。或者线程池状态大于SHUTDOWN，这些状态下都不允许创建线程，因此直接返回false。此处相当于double check。
        if (rs >= SHUTDOWN &&
            ! (rs == SHUTDOWN &&
               firstTask == null &&
               ! workQueue.isEmpty()))
            return false;
        //死循环 采用CAS的方式修改count
        for (;;) {
            //获取count 这个方法是采用位运算
            int wc = workerCountOf(c);
            //如果数量大于总体容量2^29-1
            if (wc >= CAPACITY ||
                //此处根据传入的core决定通过corePoolSize还是maxmumPoolSize判断
                wc >= (core ? corePoolSize : maximumPoolSize))
                //上述这些条件如果为真则返回false
                return false;
            //通过cas的方式修改count，如果修改成功，则跳出死循环
            if (compareAndIncrementWorkerCount(c))
                //注意此处break和continue的区别  break是跳出当前的retry而continue则是继续执行
                break retry;
            c = ctl.get();  // Re-read ctl
            if (runStateOf(c) != rs)
                continue retry;
            // else CAS failed due to workerCount change; retry inner loop
        }
    }
    //初始化变量
    boolean workerStarted = false;
    boolean workerAdded = false;
    Worker w = null;
    try {
        //创建一个worker 其task即是firstTask
        w = new Worker(firstTask);
        //线程t指向worker的线程
        final Thread t = w.thread;
        //如果线程不为空
        if (t != null) {
           //获取mainLock
            final ReentrantLock mainLock = this.mainLock;
            mainLock.lock();
            try {
                // Recheck while holding lock.
                // Back out on ThreadFactory failure or if
                // shut down before lock acquired.
                //获取线程池状态
                int rs = runStateOf(ctl.get());
                //如果rs为RUNNING状态或者处于SHUTDOWN的时候，firstTask为空
                if (rs < SHUTDOWN ||
                    (rs == SHUTDOWN && firstTask == null)) {
                    //线程是否处于活动状态  如果线程以及run了则说明该线程已经启动，则会出问题，因为这个worker是新new的，并没有运行线程的run
                    if (t.isAlive()) // precheck that t is startable
                        throw new IllegalThreadStateException();
                    //将worker添加到workers
                    workers.add(w);
                    //计算workers的size
                    int s = workers.size();
                    //判断size是否大于lagestPoolSize,那么修改LargestPoolSize
                    if (s > largestPoolSize)
                        largestPoolSize = s;
                    //worker的添加状态为true
                    workerAdded = true;
                }
            //不要忘记解锁
            } finally {
                mainLock.unlock();
            }
            //如果线程加入pool成功，则启动线程，并修改线程的启动状态为true
            if (workerAdded) {
                t.start();
                workerStarted = true;
            }
        }
    } finally {
        //如果线程启动状态不为true，则判定启动失败，将worker添加到失败的worker中
        if (! workerStarted)
            addWorkerFailed(w);
    }
    //返回worker的启动状态做为线程的执行结果
    return workerStarted;
}

经过这个方法，我们可以看到，实际上mainLock是在向workers的pool中添加和删除队列的时候才会用到。在添加或者删除之前要获取这个锁。另外，本文开始定义了标签语句 retry: ,需要注意的是，通过continue会结束当前的循环重新开始retry。而break则会跳出retry。结束循环。

5.3 runWorker

我们在前面静态内部类部分，对worker进行了分析。那么实际上，worker在被启动之后，是怎么执行的呢？

public void run() {
    runWorker(this);
}

实际上调用的就是这个runWorker方法。

final void runWorker(Worker w) {
    //定义线程为wt
    Thread wt = Thread.currentThread();
    //定义任务为task 
    Runnable task = w.firstTask;
    w.firstTask = null;
    //此处调用unlock，这不是mainLock,这是worker继承了AQS，实际上是AQS中的release方法，还记得所有的worker中被初始化的状态是-1吗？此处调用release方法，就将-1改为了1，这样后面的中断方法就能对处于运行状态的线程进行中断
    w.unlock(); // allow interrupts
    //
    boolean completedAbruptly = true;
    try {
        //如果task不为null,或者task执行getTask也不为空。需要注意的是此处的getTask,是从队列中去获取的。因此，此处的逻辑就是，如果这个worker有任务那么就执行，执行完之后就遍历队列，继续执行任务。
        while (task != null || (task = getTask()) != null) {
            //执行任务的过程中加锁，那么加锁之后就不允许中断了
            w.lock();
            // If pool is stopping, ensure thread is interrupted;
            // if not, ensure thread is not interrupted.  This
            // requires a recheck in second case to deal with
            // shutdownNow race while clearing interrupt
            //判断线程池运行状态，并结合线程的中断状态，之后对线程进行中断
            if ((runStateAtLeast(ctl.get(), STOP) ||
                 (Thread.interrupted() &&
                  runStateAtLeast(ctl.get(), STOP))) &&
                !wt.isInterrupted())
                wt.interrupt();
            try {
                //前置操作
                beforeExecute(wt, task);
                Throwable thrown = null;
                try {
                    //调用run方法开始执行
                    task.run();
                //异常处理
                } catch (RuntimeException x) {
                    thrown = x; throw x;
                } catch (Error x) {
                    thrown = x; throw x;
                } catch (Throwable x) {
                    thrown = x; throw new Error(x);
                } finally {
                    //后置操作 
                    afterExecute(task, thrown);
                }
            } finally {
                task = null;
                w.completedTasks++;
                w.unlock();
            }
        }
        completedAbruptly = false;
    } finally {
        //处理完之后，当线程退出的时候，触发processWorkerExit方法，从workers中移除当前的worker
        processWorkerExit(w, completedAbruptly);
    }
}

5.4 processWorkerExit

我们再看看上述方法提到的线程退出的时候执行的方法

private void processWorkerExit(Worker w, boolean completedAbruptly) {
    if (completedAbruptly) // If abrupt, then workerCount wasn't adjusted
        decrementWorkerCount();
    //获得锁
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        //更新completedTaskCount 每个worker会维护一个completedTasks，当worker退出的时候，更新到线程池中
        completedTaskCount += w.completedTasks;
        //从队列中移除worker
        workers.remove(w);
    } finally {
        //解锁
        mainLock.unlock();
    }
    //尝试线程池是否可以进入terminate
    tryTerminate();
    //获得锁状态和线程数
    int c = ctl.get();
    //判断锁状态
    if (runStateLessThan(c, STOP)) {
        if (!completedAbruptly) {
            int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
            if (min == 0 && ! workQueue.isEmpty())
                min = 1;
            if (workerCountOf(c) >= min)
                return; // replacement not needed
        }
        //添加worker
        addWorker(null, false);
    }
}

5.5 tryTerminate

这个方法是每次线程退出的时候就触发，尝试判断线程池是否可以Terminate

final void tryTerminate() {
    //死循环 
    for (;;) {
        //获得运行状态和线程数，之后进行条件判断，如果添加不满足则退出
        int c = ctl.get();
        if (isRunning(c) ||
            runStateAtLeast(c, TIDYING) ||
            (runStateOf(c) == SHUTDOWN && ! workQueue.isEmpty()))
            return;
        //如果条件满足，但是workercount不为0，则对所有空闲的线程执行中断方法
        if (workerCountOf(c) != 0) { // Eligible to terminate
            interruptIdleWorkers(ONLY_ONE);
            return;
        }
        //获得锁
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            //采用cas进行更新
            if (ctl.compareAndSet(c, ctlOf(TIDYING, 0))) {
                try {
                    terminated();
                } finally {
                    ctl.set(ctlOf(TERMINATED, 0));
                    //这个termination 在awaitTermination中会等待keepAlive的时间。此处将唤醒这些等待的线程
                    termination.signalAll();
                }
                return;
            }
        } finally {
           //解锁
            mainLock.unlock();
        }
        // else retry on failed CAS
    }
}

5.6 awaitTermination

public boolean awaitTermination(long timeout, TimeUnit unit)
    throws InterruptedException {
    //将超时时间转换为纳秒
    long nanos = unit.toNanos(timeout);
    //获得锁
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        //死循环
        for (;;) {
            //判断运行状态，如果已经处于TERMINATED状态则直接退出
            if (runStateAtLeast(ctl.get(), TERMINATED))
                return true;
            //计算的纳秒时间必须大于0
            if (nanos <= 0)
                return false;
            //通过lock的条件变量等待，将当前线程变为TIME_WAITING状态
            nanos = termination.awaitNanos(nanos);
        }
    } finally {
        //解锁
        mainLock.unlock();
    }
}

同上述方法可以看出，mainLock是对worker操作的时候使用的，在添加和删除workers的时候需要获得锁。此外，当调用awaitTermination()方法的时候，对空闲的worker执行WAIT操作也采用lock的条件变量termination来执行。之后当线程进入TERMINATION状态的时候统一唤醒。

5.7 getTask

最后再分析一个关键方法。getTask,也就是前面的worker获取任务的方法。这个方法非常重要。

private Runnable getTask() {
    //是否需要使用超时时间 
    boolean timedOut = false; // Did the last poll() time out?
    //死循环
    for (;;) {
        //获得线程状态和count
        int c = ctl.get();
        int rs = runStateOf(c);

        // Check if queue empty only if necessary.
        //状态判断 当rs大于SHUTDOWN或者STOP状态的时候判断workQueue是否为空
        if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
            //缩减workerCount
            decrementWorkerCount();
            return null;
        }
        //获取wc
        int wc = workerCountOf(c);

        // Are workers subject to culling?
        //判断wc是否大于corePoolSize，决定timed的状态，也就是说，当wc小于corePoolSize的时候，就不考虑阻塞队列的超时
        boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;
        //如果wc大于最大线程或者timed的状态不满足 则continue
        if ((wc > maximumPoolSize || (timed && timedOut))
            && (wc > 1 || workQueue.isEmpty())) {
            //如果cas减少workerCount失败则退出
            if (compareAndDecrementWorkerCount(c))
                return null;
            continue;
            //通过这个循环将wc减少
        } 

        try {
            //如果需要超时获取，则按超时的方式获取任务，这也是阻塞队列的作用，如果有超时时间则会一直阻塞
            Runnable r = timed ?
                workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                workQueue.take();
            //r为从队列中拿到的任务
            if (r != null)
                return r;
            timedOut = true;
            //如果被中断 则设置timedOut为false
        } catch (InterruptedException retry) {
            timedOut = false;
        }
    }
}

实际上通过这个方法的代码可以发现，之所以要使用阻塞队列，原因就在于，这个方法中如果需要通过使用keepAlive的时间，那么此方法就会用pull(timeout)方法来阻塞当前调用线程。这样就能将每个worker阻塞再getTask的过程中。重点需要注意getTask被阻塞的条件：

 allowCoreThreadTimeOut || wc > corePoolSize;

开启核心线程超时或者当前线程数大于核心线程。

5.8 interruptIdleWorkers

如果5.7方法中有线程进入了TIME_WAITING状态。那么如果需要及时使用，那么就需要对这些阻塞的线程进行中断。

private void interruptIdleWorkers(boolean onlyOne) {
    //获得锁
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        //遍历workers
        for (Worker w : workers) {
            Thread t = w.thread;
            //获得AQS的锁，如果能获得AQS的锁且不处于中断状态则进行中断
            if (!t.isInterrupted() && w.tryLock()) {
                try {
                   //调用线程的中断方法
                    t.interrupt();
                } catch (SecurityException ignore) {
                } finally {
                   //worker的AQS解锁
                    w.unlock();
                }
            }
            //是否只执行1次的标识
            if (onlyOne)
                break;
        }
    } finally {
        //解锁
        mainLock.unlock();
    }
}

调用中断的时候，需要回去两层锁，一层是mainLock，另外一层是线程的AQS锁，如果被占用则该线程不会被打断。这样一来就可以明白，再最开始刚创建的worker是不会被打断的，另外处于工作中的线程也不会被打算，只有wait状态的worker才会被打断。而interruptIdleWorkers方法被调用的时机：

tryTerminate
shutdown
setCorePoolSize -> workerCountOf(ctl.get()) > corePoolSize
allowCoreThreadTimeOut
setMaximumPoolSize -> workerCountOf(ctl.get()) > maximumPoolSize
setKeepAliveTime

6.总结

本文对ThreadPoolExecutor线程池的源码进行了分析。相对于ConcurrentHashMap这个类的代码并不是特别复杂。实际上ThreadpoolExecutor由一个hashSet构成的workerPool和一个自定义的阻塞队列workQueue组成。基本结构如下：

worker本身继承了AQS,然后还实现了Runnable接口。之后worker会不断的从任务队列workQueue中去获取任务。有两种获取任务的方法，getTask()。当开启了核心线程keepAlive或者当前线程数大于核心线程corepoolSize的时候，就会触发keepAliveTime的阻塞。被阻塞的线程就可以认为是空闲的线程，这些线程被阻塞队列阻塞住，这也是为什么线程池要用阻塞阻塞队列的原因。不需要被阻塞的线程则可以不断的执行run，源源不断的从工作队列workQueue中获取task执行。由于worker实现AQS，其作用是在于当大部分线程都处于休眠的时候，线程池中活动的线程降低到队列核心线程数以下之后，此时就通过中断的方式唤醒哪些没有工作的线程。在关闭或者调整线程池的时候都适用。而AQS的目的就是在打断的时候，需要获得AQS锁，而新创建的worker和处于正常运行中的worker则会导致获取锁失败，不会被打断。 ReentrantLocak mainLock的作用在于，对workers的操作，增减或者打断都需要获得这个锁才能执行。而这个锁上的条件变量termination的唯一作用是在tryTermination中调用。最后需要注意的是，线程池采用了大量的cas操作，最关键的是将线程的状态和workerSize合并到一个AotmicInteger对象上。通过位运算，最高三位标识状态。因此worker的大小也就是2^29-1。这个位运算的操作也是我们值得借鉴的地方。与HashMap中的位运算操作一样，都是让在看代码的时候觉得，代码还能这样写。特别提升设计能力的地方。

最关键的部分：

5个状态及切换过程
7个操作（基本原理一节）
4个拒绝策略
7个参数（构造函数）
用阻塞队列的原因
两种锁：AQS和mainLock
添加任务、gettask、执行任务、以及中断过程