監視器 (程序同步化)

監視器 (英語：Monitors，也称为监视器) 是一种程序结构，结构内的多个子程序（对象或模块）形成的多个工作线程互斥访问共享資源。這些共享資源一般是硬件或一群變數。管程实现了在一个时间点，最多只有一个线程在执行管程的某个子程序。与那些通过修改数据结构实现互斥访问的并发程序设计相比，管程实现很大程度上简化了程序设计。

管程提供了一种机制，线程可以临时放弃互斥访问，等待某些条件得到满足后，重新获得执行权恢复它的互斥访问。

管程是东尼·霍尔 ^[1] 与泊·派克·漢森 ^[2]提出的，并由泊·派克·漢森首次在并行Pascal中实现。东尼·霍尔证明了這與信号量是等價的。管程在当时也被用于單作業系統环境中的进程間通訊。

在程式語言Concurrent Pascal（英语：Concurrent Pascal），Pascal-Plus，Modula-2，Modula-3，Mesa以及Java中都提供這個功能。

管程实现对共享资源的互斥访问

一個監視器包含:

多个彼此可以交互並共用資源的线程
多个與資源使用有關的變數
一個互斥鎖
一個用來避免竞态条件的不變量

一個監視器的程序在執行一个线程前會先取得互斥鎖，直到完成线程或是线程等待某个條件被满足才會放弃互斥锁。若每個执行中的线程在放弃互斥鎖之前都能保證不變量成立，則所有线程皆不會導致竞态条件成立。

以下這個银行账户的提款／存款事务的監視器是個簡單的例子：

monitor class Account {
  private int balance := 0
  invariant balance >= 0

  public method boolean withdraw(int amount)
     precondition amount >= 0
  {
    if balance < amount then return false
    else { balance := balance - amount ; return true }
  }

  public method deposit(int amount)
     precondition amount >= 0
  {
    balance := balance + amount
  }
}

当一个线程执行管程中的一个子程序时，称为占用(occupy)该管程. 管程的实现确保了在一个时间点，最多只有一个线程占用了该管程。这是管程的互斥锁访问性质。

当线程要调用一个定义在管程中的子程序时，必须等到已经没有其它线程在执行管程中的某个子程序。

在管程的简单实现中，編譯器为每个管程对象自動加入一把私有的互斥锁。该互斥锁初始状态为解锁，在管程的每个公共子程序的入口给该互斥锁加锁，在管程的每个公共子程序的出口给该互斥锁解锁。

這個例子中的不變量是「任何操作執行前 balance 變數必須反映正確的餘額」。一般而言，不變量的條件不被寫在程式中，而在註解中有相關說明，然而Eiffel程序设计语言显式檢查不變量。

條件變數(Condition Variable)

对于许多应用场合，互斥操作是不够用的。线程可能需要等待某个条件 $P$ 为真，才能继续执行。在一个忙碌等待循环中

   while not(  $P$  ) do skip

将会导致所有其它进程都无法进入临界区使得该条件 $P$ 为真，该管程发生死锁.

解决办法是条件变量(condition variables). 概念上，一个条件变量就是一个线程队列(queue), 其中的线程正等待某个条件变为真。每个条件变量 $c$ 关联着一个断言 $P_{c}$ . 当一个线程等待一个条件变量，该线程不算作占用了该管程，因而其它线程可以进入该管程执行，改变管程的状态，通知条件变量 $c$ 其关联的断言 $P_{c}$ 在当前状态下为真.

因此对条件变量存在两种主要操作:

wait c 被一个线程调用，以等待断言 $P_{c}$ 被满足后该线程可恢复执行. 线程挂在该条件变量上等待时，不被认为是占用了管程.
signal c (有时写作notify c)被一个线程调用，以指出断言 $P_{c}$ 现在为真.

在下述例子中, 用管程实现了一个信号量. 一个私有整型变量s需要被互斥访问。管程中定义了子程序“增加”(V)与子程序“减少”(P)，整型变量s不能被减少到小于0; 因此子程序“减少”必须等到该整型变量是正数时才可执行. 使用条件变量sIsPositive与相关联的断言 $P_{sIsPositive}=(s>0)$ .

monitor class Semaphore
{
  private int s := 0
  invariant s >= 0
  private Condition sIsPositive /* associated with s > 0 */

  public method P()
  {
    if s = 0 then wait sIsPositive
    assert s > 0
    s := s - 1
  }

  public method V()
  {
    s := s + 1
    assert s > 0
    signal sIsPositive
  }
}

当一个通知(signal)发给了一个有线程处于等待中的条件变量，则有至少两个线程将要占用该管程: 发出通知的线程与等待该通知的某个线程. 只能有一个线程占用该管程，因此必须做出选择。两种理论体系导致了两种不同的条件变量的实现:

阻塞式条件变量(Blocking condition variables)，把优先级给了被通知的线程.
非阻塞式条件变量(Nonblocking condition variables)，把优先级给了发出通知的线程.

阻塞式条件变量

东尼·霍尔与泊·派克·汉森最早提出的是阻塞式条件变量. 发出通知(signaling)的线程必须等待被通知(signaled)的线程放弃占用管程（或者离开管程，或者等待某个条件变量）。使用阻塞式条件变量的管程被称为霍尔风格(Hoare-style)管程或通知且急迫等待(signal-and-urgent-wait)管程.

霍尔风格管程，有两个条件变量a与b. 根据Buhr et al.

设每个管程对象有两个线程队列

e是入口队列
s是已经发出通知的线程队列.

设对于每个条件变量 $c$ , 有一个线程队列

$c$ .q, 所有等待 $c$ 的线程的队列

这些队列会公平(fair)调度，甚至实现为先进先出.

各个环节实现如下 (规定各个环节彼此是互斥的. 因此restart一个线程，并不会立即执行，直到当前环节完成)

 enter the monitor:
    enter the method
    if the monitor is locked
        add this thread to e
        block this thread
    else
        lock the monitor

 leave the monitor:
    schedule
    return from the method

 wait  $c$  :
    add this thread to  $c$ .q
    schedule
    block this thread

 signal  $c$  :
    if there is a thread waiting on  $c$ .q
        select and remove one such thread t from  $c$ .q
        (t is called "the signaled thread")
        add this thread to s
        restart t
        (so t will occupy the monitor next)
        block this thread

  schedule :
    if there is a thread on s
      select and remove one thread from s and restart it
      (this thread will occupy the monitor next)
    else if there is a thread on e
      select and remove one thread from e and restart it
      (this thread will occupy the monitor next)
    else
      unlock the monitor
      (the monitor will become unoccupied)

schedule子程序选择下一个线程占用管程，如果没有候选的线程则解锁管程.

发出通知的线程转入等待，但会比在线程入口的队列有更高优先权被调度，这称为"通知且急迫等待"。另一种方案是"通知且等待"，不设s队列，发出通知的线程进入e队列等待.

某些实现提供了signal and return操作.

 signal  $c$  and return :
    if there is a thread waiting on  $c$ .q
        select and remove one such thread t from  $c$ .q
        (t is called "the signaled thread")
        restart t
        (so t will occupy the monitor next)
    else
        schedule
    return from the method

如果在每个signal $c$ 的开始处， $P_{c}$ 为真, 那么在wait $c$ 的结尾处 $P_{c}$ 也应为真。这可由契约式设计来表达. 在这些契约中, $I$ 是管程的不变量.

 enter the monitor:
    postcondition  $I$

 leave the monitor:
    precondition  $I$

 wait  $c$  :
    precondition  $I$ 
    modifies the state of the monitor
    postcondition  $P_{c}$  and  $I$

 signal  $c$  :
    precondition  $P_{c}$  and  $I$ 
    modifies the state of the monitor
    postcondition  $I$

 signal  $c$  and return :
    precondition  $P_{c}$  and  $I$

在上述契约中，设定 $I$ and $P_{c}$ 不依赖于任何队列长度.

如果可以查询条件变量所关联的队列上处于等待的线程的数量，可以使用更为复杂的契约。例如，一个有用的契约对，无需不变量就允许管程的占用被传递

 wait  $c$  :
    precondition  $I$ 
    modifies the state of the monitor
    postcondition  $P_{c}$

 signal  $c$ 
    precondition (not empty( $c$ ) and  $P_{c}$ ) or (empty( $c$ ) and  $I$ )
    modifies the state of the monitor
    postcondition  $I$

参见 Howard^[3] and Buhr et al.,^[4]有更多信息。

特别需要注意，断言 $P_{c}$ 完全是由编程者负责，编程者需要在头脑中保持对断言有一致的(consistent)定义。

下例是用阻塞式管程实现一个有界的、线程安全的栈. 即多线程并发访问这个栈时，在任意时刻最多只有一个线程执行push或pop操作。

monitor class SharedStack {
  private const capacity := 10
  private int[capacity] A
  private int size := 0
  invariant 0 <= size and size <= capacity
  private BlockingCondition theStackIsNotEmpty /* associated with 0 < size and size <= capacity */
  private BlockingCondition theStackIsNotFull /* associated with 0 <= size and size < capacity */

  public method push(int value)
  {
    if size = capacity then wait theStackIsNotFull
    assert 0 <= size and size < capacity
    A[size] := value ; size := size + 1
    assert 0 < size and size <= capacity
    signal theStackIsNotEmpty and return
  }

  public method int pop()
  {
    if size = 0 then wait theStackIsNotEmpty
    assert 0 < size and size <= capacity
    size := size - 1 ;
    assert 0 <= size and size < capacity
    signal theStackIsNotFull  and return A[size]
  }
}

非阻塞式条件变量

非阻塞式条件变量 (也称作"Mesa风格"条件变量或"通知且继续"(signal and continue)条件变量), 发出通知的线程并不会失去管程的占用权. 被通知的线程将会被移入管程入口的e队列. 不需要s队列。pthread中的条件变量就是这种非阻塞式：要先显式获得互斥加锁（pthread_mutex_lock），调用pthread_cond_wait时隐式对互斥锁解锁并进入阻塞睡眠，被唤醒后还要再显式获得互斥加锁。

Mesa风格的管程，有两个条件变量a与b

非阻塞式条件变量经常把signal操作称作notify — . 也常用notify all操作把该条件变量关联的队列上所有的线程移入e队列.

各种操作定义如下. (规定各种操作都是互斥的，线程被restart并不会立即执行，直到发起的操作完成)

 enter the monitor:
    enter the method
    if the monitor is locked
      add this thread to e
      block this thread
    else
      lock the monitor

 leave the monitor:
    schedule
    return from the method

 wait  $c$  :
    add this thread to  $c$ .q
    schedule
    block this thread

 notify  $c$  :
    if there is a thread waiting on  $c$ .q
        select and remove one thread t from  $c$ .q
        (t is called "the notified thread")
        move t to e

 notify all  $c$  :
    move all threads waiting on  $c$ .q to e

  schedule :
    if there is a thread on e
      select and remove one thread from e and restart it
    else
      unlock the monitor

一个变种实现，把被通知的(notified)线程移入队列w, 具有比e更高的优先级. 参见 Howard^[3] and Buhr et al.^[4]。

可以把断言 $P_{c}$ 关联于条件变量 $c$ ，因而 $P_{c}$ wait $c$ 返回时期望为真. 但是，这必须确保发出通知的线程结束到被通知的线程恢复执行这一时间段内， $P_{c}$ 保持为真. 这一时间段内可能会有其它线程占用过管程。因此通常必须把每个wait操作用一个循环包裹起来：

  while not(  $P$  ) do wait c

其中 $P$ 是一个条件，强于 $P_{c}$ . 操作notify $c$ 与notify all $c$ 被视为"提示"(hints)某个等待队列的 $P$ 可能为真. 上述循环的每一次迭代，表示失去了一次通知。

一个银行账户的例子，取款线程将等待资金已经完成注入账户之后再执行。

monitor class Account {
  private int balance := 0
  invariant balance >= 0
  private NonblockingCondition balanceMayBeBigEnough

  public method withdraw(int amount)
     precondition amount >= 0
  {
    while balance < amount do wait balanceMayBeBigEnough
    assert balance >= amount
    balance := balance - amount
  }

  public method deposit(int amount)
     precondition amount >= 0
  {
    balance := balance + amount
    notify all balanceMayBeBigEnough
  }
}

在这个例子中，被等待的条件是取款金额大于存款余额，存款线程不可能知道取款金额，因此存款线程发出的通知的意涵是提示所有等待中的取款线程检查它自身的断言是否为真。

隐式条件变量管程

Java风格管程

Java程序设计语言中，每个对象都可以作为一个管程。需要互斥使用的方法必须明确标示关键字synchronized。代码块也可以标示关键字synchronized。

不使用明确的条件变量, Java的这种管程在入口队列之外，使用单独的条件等待队列. 所有等待的线程进入这个队列，所有的notify与notify all操作也施加于这个队列。这种方法已经被其它程序设计语言使用，如C#。

隐式通知

另一种实现方法是忽略signal操作。当一个线程交出管程占用权(returning或者waiting)，所有处于等待状态的线程的断言都被检查，直到发现某个线程的断言为真。在这种系统中，不需要条件变量，但是断言必须明确编码。 wait操作的契约：

 wait  $P$ :
    precondition  $I$ 
    modifies the state of the monitor
    postcondition  $P$  and  $I$

Posix Thread中的条件变量

pthread中，条件变量实际上是一个阻塞线程处于睡眠状态的线程队列。每个条件变量都必须与一个（且建议只能是一个）互斥锁配套使用。一个线程首先获得互斥锁，然后检查或者修改“条件”；如果条件不成立，则调用pthread_cond_wait()，依次实施3个操作：

对当前互斥锁解锁（以便其它线程可以访问或者修改“条件”）
把线程自身阻塞挂起到互斥锁的线程队列中
被其它线程的信号从互斥锁的线程队列唤醒后，对当前配套的互斥锁申请加锁，如果加锁未能成功，则阻塞挂起到当前互斥锁上。pthread_cond_wait() 函数将不返回直到线程获得配套的互斥锁。

线程离开“条件”的临界区时，必须对当前互斥锁执行解锁。

Windows Thread中的条件变量

从Windows Vista开始，Windows Thread用critical section与条件变量配合，二者均为用户态同步对象，不能跨进程使用。使用API函数InitializeConditionVariable创建条件变量；函数SleepConditionVariableCS用于释放临界区并阻塞在条件变量上；函数WakeConditionVariable或 WakeAllConditionVariable唤醒挂在条件变量上的线程。

也可配套使用读写锁（Slim Reader/Writer (SRW) Locks）。

Spurious wakeup

假唤醒是POSIX Threads与Windows API使用条件变量时可能发生的复杂情形。一个挂在条件变量上的线程被signaled，正在等待的条件仍有可能不成立。假唤醒（Spurious wakeup）是指即使没有线程signaled该条件变量，挂在该条件变量上的线程却被唤醒。^[5]因此，应该用while循环包围条件变量等待操作：

/* In any waiting thread: */
while(!buf->full)
	wait(&buf->cond, &buf->lock);

/* In any other thread: */
if(buf->n >= buf->size){
	buf->full = 1;
	signal(&buf->cond);
}

stolen wakeups

被偷走的唤醒是POSIX Threads与Windows API使用条件变量时，线程调用g_cond_signal时，另一个线程已经获取了mutex使得期望的条件不再满足，因此被唤醒的线程面临着条件不成立。^[6]^[7]因此，应该用while循环包围条件变量等待操作.

历史

东尼·霍尔与泊·派克·漢森在1972年形成了管程的构思, 根据他们自己更早的想法与艾兹赫尔·戴克斯特拉的工作. ^[8] 泊·派克·漢森第一个实现了管程。东尼·霍尔发展了理论框架并证明了与信号量等价。

管程不久用于单任务操作系统的进程间通信.

已经支持管程的程序设计语言：

Ada (从 Ada 95 (作为protected objects) 开始)
C# (以及其它使用.NET Framework的程序设计语言)
C++ (从 C++11 开始，详见b:C++/STL/ConditionVariable)
Concurrent Euclid（英语：Concurrent Euclid）
Concurrent Pascal（英语：Concurrent Pascal）
D
Delphi (Delphi 2009及更高版本，使用TObject.Monitor)
Java (使用wait与notify methods)
Mesa
Modula-3
Python（通过threading.Condition^[9]对象）
Ruby
Squeak Smalltalk
Turing（英语：Turing (programming language)）, Turing+（英语：Turing+）与Object-Oriented Turing（英语：Object-Oriented Turing）
μC++（英语：μC++）

许多库已经允许在程序设计语言没有本地支持时构建管程。当库调用时，编程者负责明确表示互斥执行的代码块的开始与结尾. Pthreads就是这样一个库.

参考文献

^ Hoare, C. A. R. (1974), "Monitors: an operating system structuring concept". Comm. A.C.M. 17(10), 549–57. [1]
^ Brinch Hansen, P. (1975). "The programming language Concurrent Pascal". IEEE Trans. Softw. Eng. 2 (June), 199–206.
^ ^3.0 ^3.1 John Howard (1976), "Signaling in monitors". Proceedings of the 2nd International Conference on Software Engineering, 47–52
^ ^4.0 ^4.1 Buhr, P.H; Fortier, M., Coffin, M.H. (1995). "Monitor classification". ACM Computing Surveys (CSUR) 27(1). 63–107. [2]
^ David R. Butenhof《Programming with POSIX Threads》 ISBN 0-201-63392-2: "This means that when you wait on a condition variable, the wait may (occasionally) return when no thread specifically broadcast or signaled that condition variable. Spurious wakeups may sound strange, but on some multiprocessor systems, making condition wakeup completely predictable might substantially slow all condition variable operations. The race conditions that cause spurious wakeups should be considered rare."
^ MSDN:SleepConditionVariableCS function. [2017-02-16]. （原始内容存档于2017-02-16）.
^ GNOME DEVELOPER: Threads Functions. [2017-02-16]. （原始内容存档于2020-08-14）.
^ Brinch Hansen, P. (1993). "Monitors and concurrent Pascal: a personal history", The second ACM SIGPLAN conference on History of programming languages 1–35. Also published in ACM SIGPLAN Notices 28(3), March 1993. [3]
^ threading.Condition （页面存档备份，存于互联网档案馆）

外部链接

"Monitors: An Operating System Structuring Concept （页面存档备份，存于互联网档案馆）" by Charles Antony Richard Hoare
"Signalling in Monitors" by John H. Howard
"Experience with Processes and Monitors in Mesa （页面存档备份，存于互联网档案馆）" by Butler W. Lampson and David D. Redell
pthread_cond_wait （页面存档备份，存于互联网档案馆） - description from the Open Group Base Specifications Issue 6, IEEE Std 1003.1
"Block on a Condition Variable" by Dave Marshall
"Strategies for Implementing POSIX Condition Variables on Win32" by Douglas C. Schmidt and Irfan Pyarali
Condition Variable Routines （页面存档备份，存于互联网档案馆） from the Apache Portable Runtime Library
wxCondition description （页面存档备份，存于互联网档案馆）
boost::condition class description
ZThread Condition Class Reference （页面存档备份，存于互联网档案馆）
Wefts::Condition Class Reference （页面存档备份，存于互联网档案馆）
ACE_Condition Class Template Reference
QWaitCondition Class Reference
Common C++ Conditional Class Reference （页面存档备份，存于互联网档案馆）
at::ConditionalMutex Class Reference （页面存档备份，存于互联网档案馆）
threads::shared （页面存档备份，存于互联网档案馆） - Perl extension for sharing data structures between threads

[1] Hoare, C. A. R. (1974), "Monitors: an operating system structuring concept". Comm. A.C.M. 17(10), 549–57. [1]

[2] Brinch Hansen, P. (1975). "The programming language Concurrent Pascal". IEEE Trans. Softw. Eng. 2 (June), 199–206.

[Howard76-3] 3.0 ^3.1 John Howard (1976), "Signaling in monitors". Proceedings of the 2nd International Conference on Software Engineering, 47–52

[Buhr95-4] 4.0 ^4.1 Buhr, P.H; Fortier, M., Coffin, M.H. (1995). "Monitor classification". ACM Computing Surveys (CSUR) 27(1). 63–107. [2]

[5] David R. Butenhof《Programming with POSIX Threads》 ISBN 0-201-63392-2: "This means that when you wait on a condition variable, the wait may (occasionally) return when no thread specifically broadcast or signaled that condition variable. Spurious wakeups may sound strange, but on some multiprocessor systems, making condition wakeup completely predictable might substantially slow all condition variable operations. The race conditions that cause spurious wakeups should be considered rare."

[6] MSDN:SleepConditionVariableCS function. [2017-02-16]. （原始内容存档于2017-02-16）.

[7] GNOME DEVELOPER: Threads Functions. [2017-02-16]. （原始内容存档于2020-08-14）.

[8] Brinch Hansen, P. (1993). "Monitors and concurrent Pascal: a personal history", The second ACM SIGPLAN conference on History of programming languages 1–35. Also published in ACM SIGPLAN Notices 28(3), March 1993. [3]

[9] threading.Condition （页面存档备份，存于互联网档案馆）

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]