Polar factorization theorem

In optimal transport, a branch of mathematics, the polar factorization of vector fields is a basic result due to Brenier (1987),¹ with antecedents in the work of Knott and Smith (1984)² and Rachev (1985).³ It generalizes several earlier results, including the polar decomposition of real matrices and the rearrangement of real-valued functions.

The theorem

Notation. Denote by $\xi _{\#}\mu$ the image measure (pushforward) of $\mu$ through the map $\xi$.

Definition: Measure preserving map. Let $(X,\mu )$ and $(Y,\nu )$ be probability spaces and $\sigma :X\rightarrow Y$ a measurable map. Then $\sigma$ is said to be measure preserving if and only if $\sigma _{\#}\mu =\nu$, where $\sigma _{\#}\mu$ is the pushforward of $\mu$ by $\sigma$. Spelled out: for every $\nu$-measurable subset $\Omega$ of $Y$, $\sigma ^{-1}(\Omega )$ is $\mu$-measurable, and $\mu (\sigma ^{-1}(\Omega ))=\nu (\Omega )$. The latter is equivalent to:

$\int _{X}(f\circ \sigma )(x)\mu (dx)=\int _{X}(\sigma ^{*}f)(x)\mu (dx)=\int _{Y}f(y)(\sigma _{\#}\mu )(dy)=\int _{Y}f(y)\nu (dy)$

where $f$ is $\nu$ -integrable and $f\circ \sigma$ is $\mu$ -integrable.
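
The integral characterization can be checked numerically. The following is a minimal sketch, not taken from the cited references: it assumes $X=Y=[0,1]$ with $\mu =\nu$ the Lebesgue measure, uses a midpoint rule for the integrals, and compares the doubling map $x\mapsto 2x{\bmod {1}}$ (measure preserving) with the squaring map $x\mapsto x^{2}$ (not measure preserving). The function names and the test function $f(y)=y^{2}$ are illustrative choices.

```python
import numpy as np


def integrate_uniform(g, n=100_000):
    """Midpoint-rule approximation of the integral of g over [0, 1]."""
    x = (np.arange(n) + 0.5) / n
    return float(np.mean(g(x)))


def doubling_map(x):
    """x -> 2x mod 1, which preserves the Lebesgue measure on [0, 1]."""
    return (2.0 * x) % 1.0


def squaring_map(x):
    """x -> x^2, which maps [0, 1] to itself but does not preserve the measure."""
    return x ** 2


def f(y):
    """Illustrative test function; its integral over [0, 1] is 1/3."""
    return y ** 2


# Left-hand side of the identity: integral of f o sigma against mu.
lhs_doubling = integrate_uniform(lambda x: f(doubling_map(x)))
lhs_squaring = integrate_uniform(lambda x: f(squaring_map(x)))
# Right-hand side: integral of f against nu.
rhs = integrate_uniform(f)

print(lhs_doubling, lhs_squaring, rhs)
```

For the doubling map the two sides agree up to discretization error, while for the squaring map the left-hand side approximates $\int _{0}^{1}x^{4}dx=1/5$ rather than $1/3$.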

Theorem. Consider a map $\xi :\Omega \rightarrow R^{d}$ where $\Omega$ is a convex subset of $R^{d}$, and $\mu$ a measure on $\Omega$ which is absolutely continuous with respect to the Lebesgue measure. Assume that $\xi _{\#}\mu$ is also absolutely continuous. Then there exist a convex function $\varphi :\Omega \rightarrow R$ and a map $\sigma :\Omega \rightarrow \Omega$ preserving $\mu$ such that

$\xi =\left(\nabla \varphi \right)\circ \sigma$

In addition, $\nabla \varphi$ and $\sigma$ are uniquely defined almost everywhere.¹,⁴
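
As a simple illustration (an example chosen here, not taken from the cited references), take $d=1$, $\Omega =[0,1]$, $\mu$ the Lebesgue measure, and $\xi (x)=1-x$. Then $\xi _{\#}\mu =\mu$ is absolutely continuous, and one admissible factorization is

$\varphi (x)={\frac {1}{2}}x^{2},\qquad \sigma (x)=1-x,\qquad \left(\nabla \varphi \right)\circ \sigma (x)=\sigma (x)=1-x=\xi (x)$

Here $\varphi$ is convex and $\sigma$ preserves $\mu$, so by the uniqueness statement this is the polar factorization of $\xi$: the gradient part is the identity, and all of $\xi$ is carried by the measure-preserving part.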

Applications and connections

Dimension 1

In dimension 1, when $\mu$ is the Lebesgue measure on the unit interval $\left[0,1\right]$ (that is, the uniform distribution), the result specializes to Ryff's theorem.⁵ In this case the polar factorization reduces to

$\xi \left(t\right)=F_{X}^{-1}\left(\sigma \left(t\right)\right)$

where $F_{X}$ is the cumulative distribution function of the random variable $X=\xi \left(U\right)$, and $U$ has a uniform distribution over $\left[0,1\right]$. The quantile function $F_{X}^{-1}$ is nondecreasing and plays the role of $\nabla \varphi$. $F_{X}$ is assumed to be continuous, and $\sigma \left(t\right)=F_{X}\left(\xi \left(t\right)\right)$ preserves the Lebesgue measure on $\left[0,1\right]$.
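
A discrete analogue makes the one-dimensional statement concrete. The sketch below is an illustration under stated assumptions, not the construction used in the cited references: the uniform measure is replaced by $n$ equally weighted grid points, the quantile function by the sorted values, and $\sigma$ by the rank permutation. The helper name `polar_factorization_1d` and the sample function are hypothetical choices, and the sample values are distinct, which mirrors the continuity assumption on $F_{X}$.

```python
import numpy as np


def polar_factorization_1d(xi_values):
    """Discrete analogue of xi = F^{-1} o sigma for n equally weighted points.

    Returns (quantiles, sigma) where `quantiles` is the nondecreasing
    rearrangement of the values and `sigma` is a permutation of range(n)
    such that xi_values[i] == quantiles[sigma[i]].
    """
    xi_values = np.asarray(xi_values, dtype=float)
    order = np.argsort(xi_values, kind="stable")
    quantiles = xi_values[order]           # monotone part (plays the role of F^{-1})
    sigma = np.empty_like(order)
    sigma[order] = np.arange(len(order))   # rank of each point (measure-preserving part)
    return quantiles, sigma


# Sample a non-monotone function on a midpoint grid of [0, 1].
t = (np.arange(8) + 0.5) / 8
xi = (t - 0.3) ** 2

quantiles, sigma = polar_factorization_1d(xi)

# Composing the monotone part with the permutation recovers xi.
print(np.array_equal(quantiles[sigma], xi))
```

A permutation of equally weighted points preserves the counting measure on the grid, which is the discrete counterpart of $\sigma$ preserving the Lebesgue measure.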

Polar decomposition of matrices

When $\xi$ is a linear map and $\mu$ is the standard Gaussian distribution, the result coincides with the polar decomposition of matrices. Assuming $\xi \left(x\right)=Mx$ where $M$ is an invertible $d\times d$ matrix and taking $\mu$ to be the ${\mathcal {N}}\left(0,I_{d}\right)$ probability measure, the polar factorization reduces to

$M=SO$

where $S$ is a symmetric positive definite matrix, and $O$ an orthogonal matrix. The connection with the polar factorization is $\varphi \left(x\right)=x^{\top }Sx/2$, which is convex with $\nabla \varphi \left(x\right)=Sx$, and $\sigma \left(x\right)=Ox$, which preserves the ${\mathcal {N}}\left(0,I_{d}\right)$ measure.
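
The factors $S$ and $O$ can be computed from a singular value decomposition. The following is a minimal sketch assuming NumPy; the helper name `left_polar_decomposition` and the example matrix are illustrative. Writing $M=U\Sigma V^{\top }$, it sets $S=U\Sigma U^{\top }$ and $O=UV^{\top }$, so that $SO=U\Sigma V^{\top }=M$.

```python
import numpy as np


def left_polar_decomposition(M):
    """Return (S, O) with M = S @ O, S symmetric positive semidefinite, O orthogonal.

    S is positive definite whenever M is invertible.
    """
    U, singular_values, Vt = np.linalg.svd(M)
    S = U @ np.diag(singular_values) @ U.T   # symmetric factor, gradient of x^T S x / 2
    O = U @ Vt                               # orthogonal factor, preserves N(0, I)
    return S, O


# Illustrative invertible matrix.
M = np.array([[2.0, 1.0],
              [0.0, 1.0]])
S, O = left_polar_decomposition(M)

print(np.allclose(S @ O, M))
print(np.allclose(O @ O.T, np.eye(2)))
```

An orthogonal map preserves the standard Gaussian because the density of ${\mathcal {N}}\left(0,I_{d}\right)$ depends on $x$ only through $\Vert x\Vert$.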

Helmholtz decomposition

The result also allows one to recover the Helmholtz decomposition. A smooth vector field $x\mapsto V\left(x\right)$ on $\Omega$ can be written in a unique way as

$V=w+\nabla p$

where $p$ is a smooth real function defined on $\Omega$, unique up to an additive constant, and $w$ is a smooth divergence-free vector field, parallel to the boundary of $\Omega$.

The connection can be seen by assuming $\mu$ is the Lebesgue measure on a compact set $\Omega \subset R^{n}$ and by writing $\xi$ as a perturbation of the identity map

$\xi _{\epsilon }(x)=x+\epsilon V(x)$

where $\epsilon$ is small. The polar decomposition of $\xi _{\epsilon }$ is given by $\xi _{\epsilon }=(\nabla \varphi _{\epsilon })\circ \sigma _{\epsilon }$ . Then, for any test function $f:R^{n}\rightarrow R$ the following holds:

$\int _{\Omega }f(x+\epsilon V(x))dx=\int _{\Omega }f((\nabla \varphi _{\epsilon })\circ \sigma _{\epsilon }\left(x\right))dx=\int _{\Omega }f(\nabla \varphi _{\epsilon }\left(x\right))dx$

where the second equality uses the fact that $\sigma _{\epsilon }$ preserves the Lebesgue measure.

In fact, since $\textstyle \varphi _{0}(x)={\frac {1}{2}}\Vert x\Vert ^{2}$, one can expand $\textstyle \varphi _{\epsilon }(x)={\frac {1}{2}}\Vert x\Vert ^{2}+\epsilon p(x)+O(\epsilon ^{2})$, and therefore $\textstyle \nabla \varphi _{\epsilon }\left(x\right)=x+\epsilon \nabla p(x)+O(\epsilon ^{2})$. Comparing the terms of order $\epsilon$ on both sides of the previous identity gives $\textstyle \int _{\Omega }\left(V(x)-\nabla p(x)\right)\cdot \nabla f(x)\,dx=0$ for any smooth function $f$, which implies that $w\left(x\right)=V(x)-\nabla p(x)$ is divergence-free.¹,⁶
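
The first-order computation behind this identity can be sketched as follows, assuming $f$, $V$ and $p$ are smooth enough to expand under the integral sign and writing $n$ for the outward unit normal on $\partial \Omega$ (a symbol introduced here for the illustration). Expanding both sides of the integral identity in $\epsilon$ gives

$\int _{\Omega }f(x+\epsilon V(x))dx=\int _{\Omega }f(x)dx+\epsilon \int _{\Omega }\nabla f(x)\cdot V(x)dx+O(\epsilon ^{2})$

$\int _{\Omega }f(\nabla \varphi _{\epsilon }(x))dx=\int _{\Omega }f(x)dx+\epsilon \int _{\Omega }\nabla f(x)\cdot \nabla p(x)dx+O(\epsilon ^{2})$

Equating the terms of order $\epsilon$ and integrating by parts with $w=V-\nabla p$ yields

$0=\int _{\Omega }\nabla f(x)\cdot w(x)dx=\int _{\partial \Omega }f\,(w\cdot n)\,dS-\int _{\Omega }f(x)\,(\nabla \cdot w)(x)dx$

Since this holds for every smooth $f$, it forces $\nabla \cdot w=0$ in $\Omega$ and $w\cdot n=0$ on $\partial \Omega$, which is the statement that $w$ is divergence-free and parallel to the boundary.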

References

[1] Brenier, Yann (1991). "Polar factorization and monotone rearrangement of vector‐valued functions". Communications on Pure and Applied Mathematics. 44 (4): 375–417. doi:10.1002/cpa.3160440402. Retrieved 16 April 2021.
[2] Knott, M.; Smith, C. S. (1984). "On the optimal mapping of distributions". Journal of Optimization Theory and Applications. 43: 39–49. doi:10.1007/BF00934745. S2CID 120208956. Retrieved 16 April 2021.
[3] Rachev, Svetlozar T. (1985). "The Monge–Kantorovich mass transference problem and its stochastic applications". Theory of Probability & Its Applications. 29 (4): 647–676. doi:10.1137/1129093. Retrieved 16 April 2021.
[4] Santambrogio, Filippo (2015). Optimal transport for applied mathematicians. New York: Birkhäuser. CiteSeerX 10.1.1.726.35.
[5] Ryff, John V. (1965). "Orbits of L1-Functions Under Doubly Stochastic Transformation". Transactions of the American Mathematical Society. 117: 92–100. doi:10.2307/1994198. JSTOR 1994198. Retrieved 16 April 2021.
[6] Villani, Cédric (2003). Topics in optimal transportation. American Mathematical Society.
