约翰逊-林登斯特劳斯定理

约翰逊-林登斯特劳斯定理(Johnson–Lindenstrauss theorem)，又称约翰逊-林登斯特劳斯引理(Johnson–Lindenstrauss lemma)，是由William Johnson（英语：William_B._Johnson_(mathematician)）和Joram Lindenstrauss（英语：Joram_Lindenstrauss）于1984年提出的一个关于降维的著名定理^[1]，在现代机器学习，尤其是压缩感知、降维、形状分析（英语：Nonlinear_dimensionality_reduction#Manifold_learning_algorithms）和分布学习（英语：Density_estimation）等领域中有很重要的应用^[2]^[3]^[4]^[5]。

这个定理告诉我们，一个高维空间中的点集，可以被线性地镶嵌到低维空间中，同时其空间结构只遭受比较小的形变^[6]。约翰逊-林登斯特劳斯定理的证明，还说明了如何用随机投影法（英语：Random_projection）明确地求出这个变换，所用的算法只需要随机多项式时间^[7]。当然，降维不是免费的：如果要求形变很少的话，代价（英语：trade-off）是被嵌入的低维空间维数不能很低；反之亦然，如果要求将点集嵌入很低维的空间，那么就不能很好地控制结构形变的程度。

因为能将维数下降到样本量的对数阶，更兼所用的变换是线性的、显式的还可以被快速计算，约翰逊-林登斯特劳斯定理是一个力度非常强的结论。

定理陈述

对任何给定的 $\epsilon \in (0,1)$ 以及 $N$ 维欧几里德空间中的 $m$ 个点 $\{x_{1},\ldots ,x_{m}\}$ ，对于任意满足条件 $n>(\epsilon ^{2}/2-\epsilon ^{3}/3)^{-1}\log m$ 的正整数 $n$ ，存在一个线性映射 $f:\mathbb {R} ^{N}\to \mathbb {R} ^{n}$ ，将这 $m$ 个点，从 $\mathbb {R} ^{N}$ (维数可能很高的空间)中映射到 $\mathbb {R} ^{n}$ (低维空间)中，同时“基本上”保持了点集成员两两之间的距离，即：对于任意两个点 $(x_{i},x_{j}):1\leq i<j\leq m$ ，都有

(1-\epsilon )\|x_{i}-x_{j}\|_{2}^{2}\leq \|f(x_{i})-f(x_{j})\|_{2}^{2}\leq (1+\epsilon )\|x_{i}-x_{j}\|_{2}^{2}

更进一步地，这个线性映射 $f$ 还可以在随机多项式时间内求出^[7]。

直观理解

约翰逊-林登斯特劳斯定理揭示了一些关于降维映射深刻事实，其中一些还是违反简单直觉的^[8]。因此，要想直观地理解这个定理，对初学者来说，可能比从数学式子上读懂证明还要难(反而此定理的证明只用到了比较简单的关于投影的随机误差不等式（英语：Concentration of measure）^[9])。举例来说，定理的结论表明，度量形变程度的误差参数 $\epsilon$ 以及低维空间的维数 $n$ 这两个度量降维水准的关键量，均与原始数据所在的空间维数 $N$ 或者原始的 $m$ 个点具体为何种空间结构无关，这是很不平凡的^[9]。

最优性

约翰逊-林登斯特劳斯定理是不能被本质性地改进的^[10]，即：给定任意正整数 $m$ 和误差参数 $\epsilon$ ，存在某个 $N$ 以及 $\mathbb {R} ^{N}$ 中的 $m$ 个点，这个点集“难以降维”——也就是说，对任何一个满足“基本保持点距”要求(即： $(1-\epsilon )\|x_{i}-x_{j}\|_{2}^{2}\leq \|f(x_{i})-f(x_{j})\|_{2}^{2}\leq (1+\epsilon )\|x_{i}-x_{j}\|_{2}^{2}$ 要对任意 $(i,j):1\leq i<j\leq n$ 成立)的线性映射 $f:\mathbb {R} ^{N}\to \mathbb {R} ^{n}$ ，它用来镶嵌高维数据的那个低维空间(即 $\mathbb {R} ^{n}$ )，至少必须具有

n=\Omega \left({\frac {\log m}{\epsilon ^{2}}}\right)

这么多的维数^[11]。

证明提要

定理可以用高年级本科生容易理解的方法证明^[7]^[12]，其思路是证明如下事实：多次独立地重复进行随机投影的试验，每次试验中随机抽取的投影 $f_{0}:\mathbb {R} ^{N}\to \mathbb {R} ^{n}$ 都有至少 $1/n$ 的概率符合定理中对映射 $f$ 全部的要求(显然，验证任何一个 $f_{0}$ 是否符合这些要求只需 $O(n^{2})$ 时间)，因此只要重复该独立实验 $\omega (n)$ 次就能以逼近100%的概率产生至少一个符合要求的映射 $f$ 。

参考文献

^ Johnson, William B; Lindenstrauss, Joram. Extensions of Lipschitz mappings into a Hilbert space. Contemporary mathematics. 1984, 26 (1): 189-206.
^ Devdatt Dubhashi. Johnson-Lindenstrauss, Concentration and applications to Support Vector Machines and Kernels (PDF). simons.berkeley.edu. [2018-11-13]. （原始内容存档 (PDF)于2020-10-27）.
^ Suresh Venkat. When to use the Johnson-Lindenstrauss lemma over SVD?. cstheory.stackexchange.com. [2018-11-13]. （原始内容存档于2020-11-25）.
^ Krahmer, Felix; Ward, Rachel. New and improved Johnson--Lindenstrauss embeddings via the restricted isometry property. SIAM Journal on Mathematical Analysis. 2011, 43 (3): 1269--1281.
^ Akselrod-Ballin, Ayelet; Bock, Davi and Reid, R Clay and Warfield, Simon K. Accelerating image registration with the Johnson--Lindenstrauss lemma: application to imaging 3-D neural ultrastructure with electron microscopy. IEEE transactions on medical imaging. 2011, 30 (7): 1427--1438.
^ Paul Beame. CSE 522: Sublinear (and Streaming) Algorithms Spring 2014 Lecture 10: Johnson-Lindenstrauss Lemmas (PDF). courses.cs.washington.edu. [2018-11-13]. （原始内容存档 (PDF)于2017-04-01）.
^ ^7.0 ^7.1 ^7.2 Dasgupta, Sanjoy; Gupta, Anupam. An elementary proof of a theorem of Johnson and Lindenstrauss. Random Structures & Algorithms. 2003, 22 (1): 60--65.
^ Sariel Har-Peled. The Johnson-Lindenstrauss Lemma (PDF). sarielhp.org. [2018-11-13].
^ ^9.0 ^9.1 Roman Vershynin. High-dimensional probability: An introduction with applications in data science. Cambridge University Press. 2018-08-01 [2018-11-13]. ISBN 9781108415194.
^ Alon, Noga. Problems and results in extremal combinatorics—I. Discrete Mathematics. 2003, 273 (1-3): 31--53.
^ Larsen, Kasper Green; Nelson, Jelani. Optimality of the johnson-lindenstrauss lemma. 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS). IEEE: 633––638. 2017.
^ Michael Mahoney. CS369M: Algorithms for Modern Massive Data Set Analysis Lecture 1: The Johnson-Lindenstrauss Lemma (PDF). cs.stanford.edu. [2018-11-13]. （原始内容存档 (PDF)于2015-12-13）.

[JL_paper-1] Johnson, William B; Lindenstrauss, Joram. Extensions of Lipschitz mappings into a Hilbert space. Contemporary mathematics. 1984, 26 (1): 189-206.

[slides_JL-SVM-Kernels-2] Devdatt Dubhashi. Johnson-Lindenstrauss, Concentration and applications to Support Vector Machines and Kernels (PDF). simons.berkeley.edu. [2018-11-13]. （原始内容存档 (PDF)于2020-10-27）.

[JL-application_1-3] Suresh Venkat. When to use the Johnson-Lindenstrauss lemma over SVD?. cstheory.stackexchange.com. [2018-11-13]. （原始内容存档于2020-11-25）.

[JL-application_2-4] Krahmer, Felix; Ward, Rachel. New and improved Johnson--Lindenstrauss embeddings via the restricted isometry property. SIAM Journal on Mathematical Analysis. 2011, 43 (3): 1269--1281.

[JL-application_3-5] Akselrod-Ballin, Ayelet; Bock, Davi and Reid, R Clay and Warfield, Simon K. Accelerating image registration with the Johnson--Lindenstrauss lemma: application to imaging 3-D neural ultrastructure with electron microscopy. IEEE transactions on medical imaging. 2011, 30 (7): 1427--1438.

[slides_JL-intro-6] Paul Beame. CSE 522: Sublinear (and Streaming) Algorithms Spring 2014 Lecture 10: Johnson-Lindenstrauss Lemmas (PDF). courses.cs.washington.edu. [2018-11-13]. （原始内容存档 (PDF)于2017-04-01）.

[JL_proof-7] 7.0 ^7.1 ^7.2 Dasgupta, Sanjoy; Gupta, Anupam. An elementary proof of a theorem of Johnson and Lindenstrauss. Random Structures & Algorithms. 2003, 22 (1): 60--65.

[JL-understanding-8] Sariel Har-Peled. The Johnson-Lindenstrauss Lemma (PDF). sarielhp.org. [2018-11-13].

[RV-book-9] 9.0 ^9.1 Roman Vershynin. High-dimensional probability: An introduction with applications in data science. Cambridge University Press. 2018-08-01 [2018-11-13]. ISBN 9781108415194.

[Alon_2003-10] Alon, Noga. Problems and results in extremal combinatorics—I. Discrete Mathematics. 2003, 273 (1-3): 31--53.

[Optimality-2017-11] Larsen, Kasper Green; Nelson, Jelani. Optimality of the johnson-lindenstrauss lemma. 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS). IEEE: 633––638. 2017.

[JL_proof2-12] Michael Mahoney. CS369M: Algorithms for Modern Massive Data Set Analysis Lecture 1: The Johnson-Lindenstrauss Lemma (PDF). cs.stanford.edu. [2018-11-13]. （原始内容存档 (PDF)于2015-12-13）.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]