# Gradient

## Definition

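The gradient of a scalar function ${\displaystyle f}$ at a point ${\displaystyle x}$ is the unique vector whose dot product with any vector ${\displaystyle \mathbf {v} }$ equals the directional derivative of ${\displaystyle f}$ along ${\displaystyle \mathbf {v} }$:

${\displaystyle {\big (}\nabla f(x){\big )}\cdot \mathbf {v} =D_{\mathbf {v} }f(x)}$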
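The defining identity can be checked numerically. The following is a minimal sketch, not a reference implementation: the sample field `f`, the point, the direction and the step size `h` are all illustrative assumptions, and both sides are approximated with central differences.

```python
import math

def f(p):
    """Sample scalar field f(x, y) = x**2 * y + sin(y) (an illustrative choice)."""
    x, y = p
    return x**2 * y + math.sin(y)

def numerical_gradient(func, p, h=1e-6):
    """Approximate the gradient of func at p with central differences."""
    grad = []
    for i in range(len(p)):
        forward = list(p)
        backward = list(p)
        forward[i] += h
        backward[i] -= h
        grad.append((func(forward) - func(backward)) / (2 * h))
    return grad

def directional_derivative(func, p, v, h=1e-6):
    """Approximate D_v func(p), the derivative of t -> func(p + t v) at t = 0."""
    forward = [pi + h * vi for pi, vi in zip(p, v)]
    backward = [pi - h * vi for pi, vi in zip(p, v)]
    return (func(forward) - func(backward)) / (2 * h)

point = [1.0, 2.0]
direction = [3.0, -1.0]

grad = numerical_gradient(f, point)
lhs = sum(g * v for g, v in zip(grad, direction))   # (grad f) . v
rhs = directional_derivative(f, point, direction)   # D_v f
print(lhs, rhs)
```

The two printed numbers should agree up to finite-difference error. Note that the direction vector does not need to be a unit vector for the identity to hold.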

### Cartesian coordinates

In three-dimensional Cartesian coordinates, ${\displaystyle \nabla f}$ is expressed as

${\displaystyle \nabla f={\begin{pmatrix}{\frac {\partial f}{\partial x}},{\frac {\partial f}{\partial y}},{\frac {\partial f}{\partial z}}\end{pmatrix}}={\frac {\partial f}{\partial x}}\mathbf {i} +{\frac {\partial f}{\partial y}}\mathbf {j} +{\frac {\partial f}{\partial z}}\mathbf {k} }$

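${\displaystyle \mathbf {i} }$, ${\displaystyle \mathbf {j} }$ and ${\displaystyle \mathbf {k} }$ are the standard unit vectors pointing along the ${\displaystyle x}$, ${\displaystyle y}$ and ${\displaystyle z}$ coordinate directions. (See partial derivative and vector.)

For example, the gradient of the function ${\displaystyle f(x,y,z)=2x+3y^{2}-\sin(z)}$ is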

${\displaystyle \nabla f={\begin{pmatrix}{2},{6y},{-\cos(z)}\end{pmatrix}}=2\mathbf {i} +6y\mathbf {j} -\cos(z)\mathbf {k} }$
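As a quick sanity check of this example, the sketch below compares the analytic gradient with a central-difference approximation. It assumes the function is ${\displaystyle f(x,y,z)=2x+3y^{2}-\sin(z)}$, which is a function whose gradient is the expression above; the evaluation point and step size are arbitrary choices.

```python
import math

def f(x, y, z):
    # Assumed example function; its partial derivatives are 2, 6y and -cos(z).
    return 2 * x + 3 * y**2 - math.sin(z)

def analytic_gradient(x, y, z):
    """Gradient components read off from the formula in the text."""
    return (2.0, 6.0 * y, -math.cos(z))

def central_difference_gradient(x, y, z, h=1e-6):
    """Approximate each partial derivative with a central difference."""
    dfdx = (f(x + h, y, z) - f(x - h, y, z)) / (2 * h)
    dfdy = (f(x, y + h, z) - f(x, y - h, z)) / (2 * h)
    dfdz = (f(x, y, z + h) - f(x, y, z - h)) / (2 * h)
    return (dfdx, dfdy, dfdz)

exact = analytic_gradient(1.0, 2.0, 0.5)
approx = central_difference_gradient(1.0, 2.0, 0.5)
print(exact)
print(approx)
```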

### Cylindrical coordinates

${\displaystyle \nabla f(\rho ,\varphi ,z)={\frac {\partial f}{\partial \rho }}\mathbf {e} _{\rho }+{\frac {1}{\rho }}{\frac {\partial f}{\partial \varphi }}\mathbf {e} _{\varphi }+{\frac {\partial f}{\partial z}}\mathbf {e} _{z}}$

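${\displaystyle \rho }$ is the perpendicular distance from the point P to the z-axis. ${\displaystyle \varphi }$ is the angle between the projection of the line OP onto the xy-plane and the positive x-axis. ${\displaystyle z}$ equals the Cartesian coordinate ${\displaystyle z}$. ${\displaystyle \mathbf {e} _{\rho }}$, ${\displaystyle \mathbf {e} _{\varphi }}$ and ${\displaystyle \mathbf {e} _{z}}$ are unit vectors pointing along the coordinate directions.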
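One way to see that the cylindrical formula describes the same vector as the Cartesian one is to evaluate both for a sample field. The sketch below assumes the illustrative field ${\displaystyle f=xy+z^{2}}$ and the usual relations ${\displaystyle x=\rho \cos \varphi }$, ${\displaystyle y=\rho \sin \varphi }$; the helper names are hypothetical.

```python
import math

def f_cartesian(x, y, z):
    return x * y + z**2

def f_cylindrical(rho, phi, z):
    # Same field expressed through x = rho cos(phi), y = rho sin(phi).
    return f_cartesian(rho * math.cos(phi), rho * math.sin(phi), z)

def cylindrical_gradient(rho, phi, z, h=1e-6):
    """Components along e_rho, e_phi, e_z; note the 1/rho factor on the phi term."""
    d_rho = (f_cylindrical(rho + h, phi, z) - f_cylindrical(rho - h, phi, z)) / (2 * h)
    d_phi = (f_cylindrical(rho, phi + h, z) - f_cylindrical(rho, phi - h, z)) / (2 * h)
    d_z = (f_cylindrical(rho, phi, z + h) - f_cylindrical(rho, phi, z - h)) / (2 * h)
    return d_rho, d_phi / rho, d_z

def to_cartesian(components, phi):
    """Expand in e_rho = (cos phi, sin phi, 0), e_phi = (-sin phi, cos phi, 0), e_z = (0, 0, 1)."""
    g_rho, g_phi, g_z = components
    gx = g_rho * math.cos(phi) - g_phi * math.sin(phi)
    gy = g_rho * math.sin(phi) + g_phi * math.cos(phi)
    return gx, gy, g_z

rho, phi, z = 2.0, 0.7, 1.5
x, y = rho * math.cos(phi), rho * math.sin(phi)

from_cylindrical = to_cartesian(cylindrical_gradient(rho, phi, z), phi)
expected = (y, x, 2 * z)   # Cartesian gradient of x*y + z**2
print(from_cylindrical)
print(expected)
```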

### Spherical coordinates

${\displaystyle \nabla f(r,\theta ,\varphi )={\frac {\partial f}{\partial r}}\mathbf {e} _{r}+{\frac {1}{r}}{\frac {\partial f}{\partial \theta }}\mathbf {e} _{\theta }+{\frac {1}{r\sin \theta }}{\frac {\partial f}{\partial \varphi }}\mathbf {e} _{\varphi }}$
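A short worked example may help fix the convention. It assumes that ${\displaystyle \theta }$ is the polar angle measured from the positive z-axis and ${\displaystyle \varphi }$ is the azimuthal angle, which is the reading consistent with the ${\displaystyle 1/(r\sin \theta )}$ factor above. Take ${\displaystyle f=r\cos \theta }$, which is the Cartesian coordinate ${\displaystyle z}$:

${\displaystyle {\frac {\partial f}{\partial r}}=\cos \theta ,\qquad {\frac {1}{r}}{\frac {\partial f}{\partial \theta }}=-\sin \theta ,\qquad {\frac {1}{r\sin \theta }}{\frac {\partial f}{\partial \varphi }}=0}$

${\displaystyle \nabla f=\cos \theta \,\mathbf {e} _{r}-\sin \theta \,\mathbf {e} _{\theta }=\mathbf {e} _{z}}$

The result is the constant unit vector along the z-axis, as expected for the gradient of ${\displaystyle z}$.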

## Gradient of a real-valued function with respect to vectors and matrices

The gradient operator with respect to an ${\displaystyle n\times 1}$ vector ${\displaystyle {\boldsymbol {x}}}$ is defined as

${\displaystyle \nabla _{\boldsymbol {x}}{\overset {\underset {\mathrm {def} }{}}{=}}\left[{\frac {\partial }{\partial x_{1}}},{\frac {\partial }{\partial x_{2}}},\cdots ,{\frac {\partial }{\partial x_{n}}}\right]^{T}={\frac {\partial }{\partial {\boldsymbol {x}}}}}$

### Gradient with respect to a vector

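The gradient of a real scalar function ${\displaystyle f({\boldsymbol {x}})}$ with respect to the ${\displaystyle n\times 1}$ vector ${\displaystyle {\boldsymbol {x}}}$ is an ${\displaystyle n\times 1}$ column vector, defined as

${\displaystyle \nabla _{\boldsymbol {x}}f({\boldsymbol {x}}){\overset {\underset {\mathrm {def} }{}}{=}}\left[{\frac {\partial f({\boldsymbol {x}})}{\partial x_{1}}},{\frac {\partial f({\boldsymbol {x}})}{\partial x_{2}}},\cdots ,{\frac {\partial f({\boldsymbol {x}})}{\partial x_{n}}}\right]^{T}={\frac {\partial f({\boldsymbol {x}})}{\partial {\boldsymbol {x}}}}}$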

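The gradient of an m-dimensional row-vector function ${\displaystyle {\boldsymbol {f}}({\boldsymbol {x}})=[f_{1}({\boldsymbol {x}}),f_{2}({\boldsymbol {x}}),\cdots ,f_{m}({\boldsymbol {x}})]}$ with respect to an n-dimensional real vector ${\displaystyle {\boldsymbol {x}}}$ is an ${\displaystyle n\times m}$ matrix, defined as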

${\displaystyle \nabla _{\boldsymbol {x}}{\boldsymbol {f}}({\boldsymbol {x}}){\overset {\underset {\mathrm {def} }{}}{=}}{\begin{bmatrix}{\frac {\partial f_{1}({\boldsymbol {x}})}{\partial x_{1}}}&{\frac {\partial f_{2}({\boldsymbol {x}})}{\partial x_{1}}}&\cdots &{\frac {\partial f_{m}({\boldsymbol {x}})}{\partial x_{1}}}\\{\frac {\partial f_{1}({\boldsymbol {x}})}{\partial x_{2}}}&{\frac {\partial f_{2}({\boldsymbol {x}})}{\partial x_{2}}}&\cdots &{\frac {\partial f_{m}({\boldsymbol {x}})}{\partial x_{2}}}\\\vdots &\vdots &\ddots &\vdots \\{\frac {\partial f_{1}({\boldsymbol {x}})}{\partial x_{n}}}&{\frac {\partial f_{2}({\boldsymbol {x}})}{\partial x_{n}}}&\cdots &{\frac {\partial f_{m}({\boldsymbol {x}})}{\partial x_{n}}}\\\end{bmatrix}}={\frac {\partial {\boldsymbol {f}}({\boldsymbol {x}})}{\partial {\boldsymbol {x}}}}}$
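The layout of this matrix (rows indexed by the variables ${\displaystyle x_{i}}$, columns by the components ${\displaystyle f_{j}}$) is easy to get wrong, so a small numerical sketch may help. The function `f` below, with n = 2 and m = 3, and the helper `gradient_matrix` are illustrative assumptions, not part of the source.

```python
import math

def f(x):
    """Row-vector function with m = 3 components of an n = 2 vector (illustrative)."""
    x1, x2 = x
    return [x1 * x2, x1 + x2**2, math.sin(x1)]

def gradient_matrix(func, x, h=1e-6):
    """Return the n x m matrix whose (i, j) entry approximates d f_j / d x_i."""
    rows = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        f_plus, f_minus = func(forward), func(backward)
        # Row i holds the derivatives of every component with respect to x_i.
        rows.append([(p - q) / (2 * h) for p, q in zip(f_plus, f_minus)])
    return rows

x = [1.0, 2.0]
G = gradient_matrix(f, x)

# Analytic entries: row 1 is d/dx1, row 2 is d/dx2.
expected = [[2.0, 1.0, math.cos(1.0)],
            [1.0, 4.0, 0.0]]
for row in G:
    print(row)
```

With this layout the matrix is the transpose of the Jacobian matrix as it is usually written, whose rows are indexed by the components of the function.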

### Gradient with respect to a matrix

The gradient of a real scalar function ${\displaystyle f({\boldsymbol {A}})}$ with respect to an ${\displaystyle m\times n}$ real matrix ${\displaystyle {\boldsymbol {A}}}$ is an ${\displaystyle m\times n}$ matrix, defined as

${\displaystyle \nabla _{\boldsymbol {A}}f({\boldsymbol {A}}){\overset {\underset {\mathrm {def} }{}}{=}}{\begin{bmatrix}{\frac {\partial f({\boldsymbol {A}})}{\partial a_{11}}}&{\frac {\partial f({\boldsymbol {A}})}{\partial a_{12}}}&\cdots &{\frac {\partial f({\boldsymbol {A}})}{\partial a_{1n}}}\\{\frac {\partial f({\boldsymbol {A}})}{\partial a_{21}}}&{\frac {\partial f({\boldsymbol {A}})}{\partial a_{22}}}&\cdots &{\frac {\partial f({\boldsymbol {A}})}{\partial a_{2n}}}\\\vdots &\vdots &\ddots &\vdots \\{\frac {\partial f({\boldsymbol {A}})}{\partial a_{m1}}}&{\frac {\partial f({\boldsymbol {A}})}{\partial a_{m2}}}&\cdots &{\frac {\partial f({\boldsymbol {A}})}{\partial a_{mn}}}\\\end{bmatrix}}={\frac {\partial f({\boldsymbol {A}})}{\partial {\boldsymbol {A}}}}}$
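As an illustration of the entry-by-entry definition, the sketch below takes the assumed example ${\displaystyle f({\boldsymbol {A}})=\sum _{i,j}a_{ij}^{2}}$, whose gradient is ${\displaystyle 2{\boldsymbol {A}}}$ because each entry appears only in its own squared term. Matrices are plain nested lists, and the helper names are hypothetical.

```python
def frobenius_squared(A):
    """f(A) = sum of a_ij**2 over all entries."""
    return sum(a * a for row in A for a in row)

def matrix_gradient(func, A, h=1e-6):
    """Return the m x n matrix whose (i, j) entry approximates d f / d a_ij."""
    m, n = len(A), len(A[0])
    grad = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            plus = [row[:] for row in A]    # copy A, then perturb one entry
            minus = [row[:] for row in A]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (func(plus) - func(minus)) / (2 * h)
    return grad

A = [[1.0, -2.0, 0.5],
     [3.0, 0.0, 4.0]]
G = matrix_gradient(frobenius_squared, A)
for row in G:
    print(row)   # each entry should be close to 2 * a_ij
```

The result has the same shape as ${\displaystyle {\boldsymbol {A}}}$, which is the point of the definition: the ${\displaystyle (i,j)}$ entry of the gradient is the partial derivative with respect to ${\displaystyle a_{ij}}$.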

### Rules

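• Linearity: if ${\displaystyle f({\boldsymbol {A}})}$ and ${\displaystyle g({\boldsymbol {A}})}$ are real scalar functions of the matrix ${\displaystyle {\boldsymbol {A}}}$, and ${\displaystyle c_{1}}$ and ${\displaystyle c_{2}}$ are real constants, then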
${\displaystyle {\frac {\partial [c_{1}f({\boldsymbol {A}})+c_{2}g({\boldsymbol {A}})]}{\partial {\boldsymbol {A}}}}=c_{1}{\frac {\partial f({\boldsymbol {A}})}{\partial {\boldsymbol {A}}}}+c_{2}{\frac {\partial g({\boldsymbol {A}})}{\partial {\boldsymbol {A}}}}}$
• Product rule: if ${\displaystyle f({\boldsymbol {A}})}$, ${\displaystyle g({\boldsymbol {A}})}$ and ${\displaystyle h({\boldsymbol {A}})}$ are real scalar functions of the matrix ${\displaystyle {\boldsymbol {A}}}$, then
${\displaystyle {\frac {\partial f({\boldsymbol {A}})g({\boldsymbol {A}})}{\partial {\boldsymbol {A}}}}=g({\boldsymbol {A}}){\frac {\partial f({\boldsymbol {A}})}{\partial {\boldsymbol {A}}}}+f({\boldsymbol {A}}){\frac {\partial g({\boldsymbol {A}})}{\partial {\boldsymbol {A}}}}}$
${\displaystyle {\frac {\partial f({\boldsymbol {A}})g({\boldsymbol {A}})h({\boldsymbol {A}})}{\partial {\boldsymbol {A}}}}=g({\boldsymbol {A}})h({\boldsymbol {A}}){\frac {\partial f({\boldsymbol {A}})}{\partial {\boldsymbol {A}}}}+f({\boldsymbol {A}})h({\boldsymbol {A}}){\frac {\partial g({\boldsymbol {A}})}{\partial {\boldsymbol {A}}}}+f({\boldsymbol {A}})g({\boldsymbol {A}}){\frac {\partial h({\boldsymbol {A}})}{\partial {\boldsymbol {A}}}}}$
• Quotient rule: if ${\displaystyle g({\boldsymbol {A}})\neq 0}$, then
${\displaystyle {\frac {\partial f({\boldsymbol {A}})/g({\boldsymbol {A}})}{\partial {\boldsymbol {A}}}}={\frac {1}{g({\boldsymbol {A}})^{2}}}\left[g({\boldsymbol {A}}){\frac {\partial f({\boldsymbol {A}})}{\partial {\boldsymbol {A}}}}-f({\boldsymbol {A}}){\frac {\partial g({\boldsymbol {A}})}{\partial {\boldsymbol {A}}}}\right]}$
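• Chain rule: if ${\displaystyle {\boldsymbol {A}}}$ is an ${\displaystyle m\times n}$ matrix, and ${\displaystyle y=f({\boldsymbol {A}})}$ and ${\displaystyle g(y)}$ are real scalar functions of the matrix ${\displaystyle {\boldsymbol {A}}}$ and of the scalar ${\displaystyle y}$ respectively, then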
${\displaystyle {\frac {\partial g(f({\boldsymbol {A}}))}{\partial {\boldsymbol {A}}}}={\frac {dg(y)}{dy}}{\frac {\partial f({\boldsymbol {A}})}{\partial {\boldsymbol {A}}}}}$
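The rules above can be spot-checked numerically. The sketch below tests the two-factor product rule and the chain rule for assumed example functions: ${\displaystyle f({\boldsymbol {A}})=\sum a_{ij}^{2}}$ with gradient ${\displaystyle 2{\boldsymbol {A}}}$, ${\displaystyle g({\boldsymbol {A}})=\sum a_{ij}}$ with an all-ones gradient, and the sine as the outer scalar function of the chain rule. It is named `outer` in the code to avoid clashing with `g`. All names and values are illustrative.

```python
import math

def f(A):
    return sum(a * a for row in A for a in row)   # gradient: 2A

def g(A):
    return sum(a for row in A for a in row)       # gradient: matrix of ones

outer = math.sin          # outer scalar function of the chain rule
outer_prime = math.cos    # its derivative

def matrix_gradient(func, A, h=1e-6):
    """Central-difference approximation of d func / d a_ij for every entry."""
    grad = [[0.0] * len(A[0]) for _ in A]
    for i in range(len(A)):
        for j in range(len(A[0])):
            plus = [row[:] for row in A]
            minus = [row[:] for row in A]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (func(plus) - func(minus)) / (2 * h)
    return grad

def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

A = [[0.5, -1.0],
     [0.2, 0.8]]

# Product rule: d(fg)/dA = g * df/dA + f * dg/dA
product_numeric = matrix_gradient(lambda M: f(M) * g(M), A)
product_rule = [[g(A) * 2 * a + f(A) * 1.0 for a in row] for row in A]

# Chain rule: d outer(f(A))/dA = outer'(f(A)) * df/dA
chain_numeric = matrix_gradient(lambda M: outer(f(M)), A)
chain_rule = [[outer_prime(f(A)) * 2 * a for a in row] for row in A]

print(max_abs_diff(product_numeric, product_rule))
print(max_abs_diff(chain_numeric, chain_rule))
```

Both printed differences should be at the level of finite-difference error. This is a numerical spot check at one matrix, not a proof of the rules.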

## Gradient on manifolds

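On a Riemannian manifold ${\displaystyle M}$, the gradient ${\displaystyle \nabla f}$ of a differentiable function ${\displaystyle f}$ is the vector field such that, for every vector ${\displaystyle \xi }$,

${\displaystyle \langle \nabla f,\xi \rangle :=\xi f}$

where ${\displaystyle \langle \cdot ,\cdot \rangle }$ denotes the inner product (metric) on ${\displaystyle M}$ and ${\displaystyle \xi f}$ is the derivative of ${\displaystyle f}$ in the direction ${\displaystyle \xi }$. In a coordinate chart ${\displaystyle \varphi }$ around a point ${\displaystyle p}$, with ${\displaystyle a_{j}}$ the components of ${\displaystyle \xi }$, this derivative is

${\displaystyle \xi (f\mid _{p}):=\sum _{j}a_{j}({\frac {\partial }{\partial x_{j}}}(f\circ \varphi ^{-1})\mid _{\varphi (p)})}$

In local coordinates, with ${\displaystyle g^{ik}}$ the components of the inverse metric tensor, the gradient takes the form

${\displaystyle \nabla f=\sum _{ik}g^{ik}{\frac {\partial f}{\partial x^{k}}}{\frac {\partial }{\partial x^{i}}}}$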
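As a worked example of the coordinate formula, consider the Euclidean plane in polar coordinates ${\displaystyle (x^{1},x^{2})=(r,\varphi )}$. This example is an illustration and is not taken from the source. Here ${\displaystyle g^{ik}}$ is read as the inverse of the metric, which for the plane is

${\displaystyle (g_{ik})={\begin{pmatrix}1&0\\0&r^{2}\end{pmatrix}},\qquad (g^{ik})={\begin{pmatrix}1&0\\0&r^{-2}\end{pmatrix}}}$

Substituting into the formula gives

${\displaystyle \nabla f={\frac {\partial f}{\partial r}}{\frac {\partial }{\partial r}}+{\frac {1}{r^{2}}}{\frac {\partial f}{\partial \varphi }}{\frac {\partial }{\partial \varphi }}}$

The coordinate vector ${\displaystyle \partial /\partial \varphi }$ has length ${\displaystyle r}$, so the unit vectors are ${\displaystyle \mathbf {e} _{r}=\partial /\partial r}$ and ${\displaystyle \mathbf {e} _{\varphi }={\tfrac {1}{r}}\,\partial /\partial \varphi }$, and therefore

${\displaystyle \nabla f={\frac {\partial f}{\partial r}}\mathbf {e} _{r}+{\frac {1}{r}}{\frac {\partial f}{\partial \varphi }}\mathbf {e} _{\varphi }}$

This matches the first two terms of the cylindrical-coordinate expression, with ${\displaystyle r}$ in place of ${\displaystyle \rho }$.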
