**Proof of the Sampling Method for Visible GGX**
Pål-Kristian Engstad
$$
\newcommand{\a}[1]{\mathbf{#1}}
\newcommand{\d}[1]{\mathbf{\hat#1}}
\newcommand{\vd}[1]{\mathrm{d}{#1}}
\newcommand{\v}[1]{\mathbf{\underline{#1}}}
\newcommand{\l}[1]{\left\|#1\right\|}
\DeclareMathOperator\arctanh{arctanh}
\newcommand{\dp}[2]{#1\bullet #2}
\newcommand{\cp}[2]{\langle\dp{#1}{#2}\rangle}
\newcommand{\ip}[2]{(\dp{#1}{#2})}
\newcommand{\pp}[2]{\left(\dp{#1}{#2}\right)}
\newcommand{\det}[3]{\mathsf{det}\left(#1,\,#2,\,#3\right)}
\newcommand{\abs}[1]{\left|#1\right|}
\newcommand{\jac}[2]{\l{\frac{\partial\;\!{#1}}{\partial\;\!{#2}}}}
$$
# Introduction
In a previous blog [post](./elliptic-stretch.html), we focused on proving and explaining the math behind sampling the GGX distribution. In this
post, we will look at the **Visible** GGX distribution, championed in the supplementary material of Walter et al. [#Walter15] and
further explained, with accompanying source code, by Heitz [#Heitz18].
Unfortunately, neither work proves mathematically that the sampling procedure they propose results in the probability density
function that is needed. In this post, we correct this by doing exactly that.
## vGGX Sampling
While standard GGX sampling has the probability density function $p_\mathsf{GGX}: \mathbb{S}_2 \rightarrow \mathbb{R}_+$, given by
$$p_\mathsf{GGX}(\d{m}) = \begin{cases}D(\d{m}) \cdot (\dp{\d{m}}{\d{n}}),\quad & \text{when}\,\dp{\d{m}}{\d{n}} > 0,\\
0,\quad & \text{otherwise}\end{cases}$$
the goal of **Visible** GGX sampling is to sample more in the direction of $\d{v}$ (weighting by the dot product with $\d{v}$ instead of with $\d{n}$), as well as
to take the factor $G_1(\d{v})$ into account,
$$ p_\mathsf{vGGX}(\d{m}) =
\begin{cases}
D(\d{m}) \cdot (\dp{\d{m}}{\d{v}}) \cdot G_1(\d{v})\; / \; \ip{\d{v}}{\d{n}}, \quad & \text{when}\,\dp{\d{m}}{\d{v}} > 0\,\text{and}\,\dp{\d{v}}{\d{n}} > 0,\\
0, &\text{otherwise}\end{cases}$$
where
$$ G_1(\d{v}) = \frac{1}{1+\Lambda(\d{v})},$$
and
$$ \Lambda(\d{v}) = -\frac{1}{2} + \frac{1}{2}\sqrt{\frac{\alpha_x^2\hat{v}_x^2+\alpha_y^2\hat{v}_y^2}{\hat{v}_z^2} + 1},$$
since this more closely resembles the specular microfacet BRDF
$$ f_r(\d{v}, \d{\omega}) = \frac{F(\d{h}) \; G(\d{v}, \d{\omega}) \; D(\d{h})}{4\ip{\d{v}}{\d{n}}\ip{\d{\omega}}{\d{n}}},$$
defined when $\dp{\d{v}}{\d{n}} > 0$ and $\dp{\d{\omega}}{\d{n}} > 0$, otherwise $0$, and where
$$ \d{h} = \mathsf{normalize}(\d{v}+\d{\omega}),$$
is the half-vector between the view $\d{v}$ and the incident light direction $\d{\omega}$, and
$$ G(\d{v}, \d{\omega}) = \frac{1}{1+\Lambda(\d{v})+\Lambda(\d{\omega})} \approx \frac{1}{1+\Lambda(\d{v})}\cdot\frac{1}{1+\Lambda(\d{\omega})} = G_1(\d{v})\;G_1(\d{\omega}),$$
and $F(\d{h})$ is the Fresnel term.
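To make these definitions concrete, here is a minimal Python sketch of the quantities above, with $\d{n} = (0, 0, 1)$. The helper names, the use of numpy, and the idea of passing the Fresnel term as a callable are my own choices, not part of the referenced works.

```python
import numpy as np

def lambda_smith(w, ax, ay):
    # Lambda(w) for the anisotropic GGX distribution, with n = (0, 0, 1).
    return -0.5 + 0.5 * np.sqrt((ax**2 * w[0]**2 + ay**2 * w[1]**2) / w[2]**2 + 1.0)

def ggx_d(m, ax, ay):
    # D(m) = 1 / (pi * ax * ay * (mx^2/ax^2 + my^2/ay^2 + mz^2)^2).
    return 1.0 / (np.pi * ax * ay * (m[0]**2 / ax**2 + m[1]**2 / ay**2 + m[2]**2) ** 2)

def microfacet_brdf(v, omega, ax, ay, fresnel):
    # f_r(v, omega) = F(h) G(v, omega) D(h) / (4 (v.n)(omega.n)), zero below the horizon.
    if v[2] <= 0.0 or omega[2] <= 0.0:
        return 0.0
    h = v + omega
    h = h / np.linalg.norm(h)
    G = 1.0 / (1.0 + lambda_smith(v, ax, ay) + lambda_smith(omega, ax, ay))
    return fresnel(h) * G * ggx_d(h, ax, ay) / (4.0 * v[2] * omega[2])
```

For example, passing `lambda h: 1.0` as `fresnel` evaluates the BRDF with the Fresnel term left out.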
It is important to note that the probability density function $p_\mathsf{vGGX}$ has a different support (the region where it is non-zero) than $p_\mathsf{GGX}$.
The latter can produce samples with $\dp{\d{v}}{\d{m}} < 0$, which would result in an incident direction
$$ \d{\omega} = \mathsf{reflect}(\d{m}, \d{v}) = 2 \ip{\d{v}}{\d{m}} \d{m} - \d{v} $$
in the opposite hemisphere relative to the geometry normal $\d{n}$, and hence must be discarded.
However, do note that even though the vGGX-sampling method fares better in this regard, it can *still* produce samples that need to be rejected.
Therefore, it is vital to check that
$$ \dp{\d{\omega}}{\d{n}} > 0. $$
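As a minimal sketch of this rejection test (Python with numpy; unit-length vectors and the function names are my own assumptions):

```python
import numpy as np

def reflect(m, v):
    # omega = 2 (v . m) m - v: reflect the view direction v about the micro-facet normal m.
    return 2.0 * np.dot(v, m) * m - v

def sample_is_valid(m, v, n):
    # Reject samples whose reflected (incident) direction ends up below the surface.
    return np.dot(reflect(m, v), n) > 0.0
```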
## Use of the diagonal $\a{A}$ matrix
Walter et al.'s work uses a general $3\times 3$-matrix to represent the shape of the distribution. For our purposes, we work
on the anisotropic GGX distribution, in other words
$$ \a{A} = \mathsf{diag}(\alpha_x, \alpha_y, 1) = \begin{bmatrix}\alpha_x & 0 & 0 \\ 0 & \alpha_y & 0 \\ 0 & 0 & 1\end{bmatrix}.$$
It is worth noting that the transform
$$ \d{m} = \frac{\a{A}\d{s}}{\l{\a{A}\d{s}}}, $$
corresponds to the Elliptic Stretch, since in expanded form this reads
$$ \d{m} = \frac{1}{\sqrt{\alpha_x^2\hat{s}_x^2+\alpha_y^2\hat{s}_y^2+\hat{s}_z^2}}
\begin{bmatrix}\alpha_x\hat{s}_x\\\alpha_y\hat{s}_y\\\hat{s}_z\end{bmatrix} =
\mathsf{normalize}\begin{bmatrix}\alpha_x\hat{s}_x\\\alpha_y\hat{s}_y\\\hat{s}_z\end{bmatrix}
. $$
We also [showed](./ellipsoid-map.html) that the Elliptic Stretch corresponds to applying the Ellipsoid Map followed by
computing the normal. Walter et al. make use of the Ellipsoid Map, but we do not need it.
# The vGGX Sampling Method
The procedure is as follows, see [#Heitz18] for details:
1. Make an $(\d{x}, \d{y}, \d{z})$ frame based on $\d{v}$.
2. Sample a point uniformly on a 2D disk.
3. Squish the disk using $\d{z}$.
4. Project the point on the disk in the direction of $\d{z}$ onto the hemisphere.
5. Perform Elliptic Stretch on the point on the hemisphere.
We can visualize these transformations as
$$ \a{u} \xrightarrow{\textsf{disk}} \a{r} \xrightarrow{\textsf{squish}(\d{v})} \a{p} \xrightarrow{\textsf{project}} \d{s} \xrightarrow{\textsf{stretch}} \d{m}. $$
For each of the steps, we will calculate the Jacobian Determinant, and in the end, combine all of them to calculate the probability density function.
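To make the five steps concrete before diving into the math, here is a minimal Python transcription of the procedure, closely following the listing in [#Heitz18]; the use of numpy and the function name `sample_vggx` are my own choices.

```python
import numpy as np

def sample_vggx(v, alpha_x, alpha_y, u1, u2):
    """Sample a visible micro-facet normal m for a unit view direction v (v_z > 0),
    roughness (alpha_x, alpha_y), and uniform random numbers u1, u2 in [0, 1)."""
    # Step 1: build an orthonormal (x, y, z) frame around the stretched view direction.
    z = np.array([alpha_x * v[0], alpha_y * v[1], v[2]])
    z /= np.linalg.norm(z)
    lensq = z[0] * z[0] + z[1] * z[1]
    x = np.array([-z[1], z[0], 0.0]) / np.sqrt(lensq) if lensq > 0.0 else np.array([1.0, 0.0, 0.0])
    y = np.cross(z, x)
    # Step 2: uniform point on the unit disk (polar mapping).
    r, phi = np.sqrt(u1), 2.0 * np.pi * u2
    s, t = r * np.cos(phi), r * np.sin(phi)
    # Step 3: squish the t-coordinate using z_z.
    zeta = 0.5 * (1.0 + z[2])
    t = (1.0 - zeta) * np.sqrt(1.0 - s * s) + zeta * t
    # Step 4: project the disk point onto the hemisphere along z.
    s_hat = s * x + t * y + np.sqrt(max(0.0, 1.0 - s * s - t * t)) * z
    # Step 5: elliptic stretch back to the original configuration.
    m = np.array([alpha_x * s_hat[0], alpha_y * s_hat[1], max(0.0, s_hat[2])])
    return m / np.linalg.norm(m)
```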
## Make an $(\d{x}, \d{y}, \d{z})$ frame based on $\d{v}$.
The idea is to sample cosine-weighted points on a hemisphere that is pointed in a direction $\d{z}$, instead of $\d{n}$ in standard GGX sampling,
by projecting points on a disk in the $(\d{x}, \d{y})$ plane onto the hemisphere. The distribution of micro-facet normals is still intrinsically
directed in the direction $\d{n}$, but we want to skew the sampling points towards the view-direction, thus we transform it using Elliptic Stretch:
$$ \d{z} = \mathsf{normalize}\begin{bmatrix}\alpha_x \hat{v}_x \\ \alpha_y \hat{v}_y \\ \hat{v}_z\end{bmatrix} = \frac{\a{A}\d{v}}{\l{\a{A}\d{v}}}, $$
then create an orthonormal basis:
$$ \d{x} = \frac{1}{\sqrt{\hat{z}_x^2 + \hat{z}_y^2}}\begin{bmatrix}-\hat{z}_y\\\hat{z}_x\\0\end{bmatrix},$$
$$ \d{y} = \d{z}\times\d{x}.$$
This gives us an orthonormal frame, such that
$$ \d{x} \perp \d{y}, \d{x} \perp \d{z}, \d{y} \perp \d{z},$$
and
$$ \l{\d{x}} = \l{\d{y}} = \l{\d{z}} = 1.$$
Note: If $\hat{z}_x = \hat{z}_y = 0$, then $\hat{z}_z = \pm 1$, thus we can choose $\d{x} = \left[1, 0, 0\right]^\top.$
## Sample a point uniformly on a 2D disk
Uniform 2D disk sampling can be done by uniformly sampling the unit square,
$$ \a{u} = (u_1, u_2) \in \left[0, 1\right\rangle \times \left[0, 1\right\rangle, $$
and then mapping it into the unit disk $\mathbb{D}$, either with the polar mapping
$$ \a{r} = (s, t) = \mathsf{polar}\left(\sqrt{u_1}, 2\pi u_2\right), $$
where
$$ \mathsf{polar}(r, \theta) = (r\cos\theta, r\sin\theta),$$
or with the **concentric mapping** via
$$ a = 2u_1 - 1, \quad b = 2u_2 - 1, $$
$$ \a{r} = (s, t) = \begin{cases}
\mathsf{polar}\left(a, \frac{\pi b}{4 a}\right), & \text{if}\, \abs{a} > \abs{b}, \\
\mathsf{polar}\left(b, \frac{\pi}{2} - \frac{\pi a}{4 b}\right), & \text{if}\, \abs{a} \le \abs{b}\,\text{and}\, b \ne 0, \\
(0, 0), & \text{otherwise (i.e., when}\; a = b = 0\text{).}
\end{cases}$$
In either case, the Jacobian Determinant is
$$ \jac{\a{r}}{\a{u}} = \l{\frac{\partial(s, t)}{\partial(u_1, u_2)}} = \pi.$$
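A minimal Python sketch of the concentric mapping (the function name `concentric_disk` is my own):

```python
import numpy as np

def concentric_disk(u1, u2):
    # Map (u1, u2) in [0, 1)^2 to the unit disk with the concentric mapping.
    a, b = 2.0 * u1 - 1.0, 2.0 * u2 - 1.0
    if a == 0.0 and b == 0.0:
        return 0.0, 0.0
    if abs(a) > abs(b):
        r, theta = a, (np.pi / 4.0) * (b / a)
    else:
        r, theta = b, np.pi / 2.0 - (np.pi / 4.0) * (a / b)
    return r * np.cos(theta), r * np.sin(theta)
```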
## Squish the disk using $\d{z}$
Given $\a{r} = (s, t)$ on the unit disk, we map
$$ (s, t) \mapsto \a{p} = (x, y) = \left(s, \mathsf{lerp}\left(t^{\star}, t, \zeta\right)\right),$$
where
$$ \zeta = \frac{1+\hat{z}_z}{2},$$
$$ t^{\star} = \sqrt{1 - s^2},$$
and
$$ \mathsf{lerp}(a, b, \zeta) = (1-\zeta)a + \zeta b.$$
This map squishes the $y$-value linearly with respect to $\hat{z}_z$, in such a way that when $\hat{z}_z = 0$, the range of $y$ is $\left\langle 0, \sqrt{1-x^2}\right\rangle$
(i.e., a half-disk), whereas when $\hat{z}_z = 1$, the range of $y$ is $\left\langle-\sqrt{1-x^2}, \sqrt{1-x^2}\right\rangle$ (i.e., the complete disk).
Now,
$$
\frac{\partial t^\star}{\partial s} = -\frac{s}{\sqrt{1-s^2}}, \qquad \frac{\partial t^\star}{\partial t} = 0, $$
$$ \frac{\partial t}{\partial s} = 0, \qquad \frac{\partial t}{\partial t} = 1, $$
so since $\zeta$ is independent of $\a{r} = (s, t)$,
$$
\frac{\partial y}{\partial s} =
\frac{\partial}{\partial s} \mathsf{lerp}\left(t^\star, t, \zeta\right) =
\mathsf{lerp}\left(
\frac{\partial t^{\star}}{\partial s},
\frac{\partial t}{\partial s},
\zeta
\right) =
- \frac{s(1-\zeta)}{\sqrt{1-s^2}}
$$
and
$$
\frac{\partial y}{\partial t} =
\frac{\partial}{\partial t} \mathsf{lerp}\left(t^\star, t, \zeta\right) =
\mathsf{lerp}\left(
\frac{\partial t^{\star}}{\partial t},
\frac{\partial t}{\partial t},
\zeta
\right) = \zeta
$$
and
$$ \l{\frac{\partial(x, y)}{\partial(s, t)}} =
\l{\begin{matrix}1 & 0 \\ - \frac{s(1-\zeta)}{\sqrt{1-s^2}} & \zeta\end{matrix}} = \zeta.
$$
In conclusion, the Jacobian Determinant is
$$ \jac{\a{p}}{\a{r}} = \l{\frac{\partial(x, y)}{\partial(s, t)}} = \zeta = \frac{1+\hat{z}_z}{2}.$$
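As a quick, throwaway numerical sanity check of this determinant (Python with numpy; not part of the derivation), central differences reproduce $\zeta$:

```python
import numpy as np

def squish(s, t, zeta):
    # (s, t) on the unit disk -> (x, y), with y = lerp(sqrt(1 - s^2), t, zeta).
    return np.array([s, (1.0 - zeta) * np.sqrt(1.0 - s * s) + zeta * t])

s0, t0, zeta, h = 0.3, -0.2, 0.8, 1e-6
col_s = (squish(s0 + h, t0, zeta) - squish(s0 - h, t0, zeta)) / (2.0 * h)  # d(x, y)/ds
col_t = (squish(s0, t0 + h, zeta) - squish(s0, t0 - h, zeta)) / (2.0 * h)  # d(x, y)/dt
J = np.column_stack([col_s, col_t])
print(abs(np.linalg.det(J)), zeta)  # both ~0.8
```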
## Project the point on the disk in the direction of $\d{z}$ onto the hemisphere.
Project $\a{p} = (x, y) \mapsto (x, y, z)$, where
$$ z = \sqrt{1-x^2-y^2}.$$
The projection to the hemisphere has a Jacobian determinant
$$ J = \frac{1}{z},$$
since the area element of the unit sphere projects onto the $(\d{x}, \d{y})$-plane with a factor of $z$, so mapping in the opposite direction scales the area by $1/z$.
Now, $(x, y, z)$ is a local coordinate system, relative to the $(\d{x}, \d{y}, \d{z})$ frame, so in the frame
relative to the geometry normal $\d{n}$:
$$ \d{s} = x \; \d{x} + y \; \d{y} + z \; \d{z}.$$
Since the frame is orthonormal, this is a *rotation*, and since a rotation is area-preserving, the Jacobian Determinant is
$$ \jac{\d{s}}{\a{p}} = J \cdot 1 = \frac{1}{z}.$$
## Elliptic Stretch
Given $\d{s}$ on the hemisphere, we finally transform $\d{s}$ using Elliptic Stretch:
$$ \d{m} = \mathsf{normalize}\begin{bmatrix}\alpha_x \hat{s}_x\\ \alpha_y \hat{s}_y \\ \hat{s}_z\end{bmatrix} = \frac{\a{A}\d{s}}{\l{\a{A}\d{s}}}, $$
with Jacobian Determinant
$$ \jac{\d{m}}{\d{s}} =
\alpha_x \alpha_y \left( \frac{\hat{m}_x^2}{\alpha_x^2} + \frac{\hat{m}_y^2}{\alpha_y^2} + \hat{m}_z^2 \right)^{3/2}
= \alpha_x \alpha_y \l{\a{A}^{-1}\d{m}}^{3},$$
as seen before in our previous blog [post](./elliptic-stretch.html).
The inverse is:
$$ \d{s} = \frac{1}{\sqrt{\frac{\hat{m}_x^2}{\alpha_x^2} + \frac{\hat{m}_y^2}{\alpha_y^2} + \hat{m}_z^2}} \begin{bmatrix}\frac{\hat{m}_x}{\alpha_x} \\ \frac{\hat{m}_y}{\alpha_y} \\ \hat{m}_z \end{bmatrix} = \frac{\a{A}^{-1}\d{m}}{\l{\a{A}^{-1}\d{m}}}.$$
# Proof of the Probability Density Function
Thus, we have the Total Jacobian Determinant:
$$ \jac{\d{m}}{\a{u}} = \jac{\d{m}}{\d{s}} \jac{\d{s}}{\a{p}} \jac{\a{p}}{\a{r}} \jac{\a{r}}{\a{u}} =
\alpha_x \alpha_y \l{\a{A}^{-1} \d{m}}^3 \cdot \frac{1}{z} \cdot \frac{1+\hat{z}_z}{2} \cdot \pi,$$
where we notice that we still need to express $z$ and $\hat{z}_z$ as functions of $\d{m}$ and $\d{v}$.
## Solving for $\hat{z}_z$
Notice that
$$ \d{z} = \frac{\a{A}\d{v}}{\l{\a{A}\d{v}}}, $$
and that in particular for the $z$-component,
$$ \hat{z}_z = \frac{\hat{v}_z}{\l{\a{A}\d{v}}}, $$
so $\hat{z}_z$ is a function of $\d{v}$, and
$$ \hat{z}_z^{-1} = \frac{1}{\hat{z}_z} = \frac{\l{\a{A}\d{v}}}{\hat{v}_z} =
\frac{\sqrt{\alpha_x^2\hat{v}_x^2+\alpha_y^2\hat{v}_y^2+\hat{v}_z^2}}{\sqrt{\hat{v}_z^2}} =
\sqrt{\frac{\alpha_x^2\hat{v}_x^2+\alpha_y^2\hat{v}_y^2}{\hat{v}_z^2} + 1}.$$
Also, notice how this is related to $\Lambda(\d{v})$ as defined in the introduction:
$$\Lambda(\d{v}) = -\frac{1}{2} + \frac{1}{2}\sqrt{\frac{\alpha_x^2\hat{v}_x^2+\alpha_y^2\hat{v}_y^2}{\hat{v}_z^2} + 1} = - \frac{1}{2} + \frac{1}{2}\hat{z}_z^{-1},$$
and
$$G_1(\d{v}) = \frac{1}{1+\Lambda(\d{v})} = \frac{1}{\frac{1}{2}+\frac{1}{2}\hat{z}_z^{-1}} = \left[\frac{1}{2}\left(1+\hat{z}_z^{-1}\right)\right]^{-1}
=\left[ \frac{1}{2} \cdot \frac{1 + \hat{z}_z}{\hat{z}_z} \right]^{-1} = \frac{2}{1 + \hat{z}_z} \cdot \hat{z}_z,$$
in other words
$$ \frac{1+\hat{z}_z}{2} = \frac{\hat{z}_z}{G_1(\d{v})} = \frac{\hat{v}_z}{G_1(\d{v})\cdot\l{\a{A}\d{v}}}.$$
## Solving for $z$
Notice that $\d{x}, \d{y}, \d{z}$ are orthonormal, so $\dp{\d{x}}{\d{z}} = 0$, $\dp{\d{y}}{\d{z}} = 0$ and $\dp{\d{z}}{\d{z}} = 1$, and
$$\d{s}\bullet\d{z} = \left(x\;\d{x}+y\;\d{y}+z\;\d{z}\right)\bullet\d{z} = x\cdot (\d{x}\bullet\d{z}) + y\cdot (\d{y}\bullet\d{z}) + z\cdot (\d{z}\bullet\d{z}) = z,$$
so, by noting that $\a{a}\bullet\a{b} = \a{a}^\top\a{b}$, that $\left(\a{A}\a{B}\right)^\top = \a{B}^\top\a{A}^\top$, and that $\a{A}^{-\top} = \a{A}^{-1}$ since $\a{A}$ is a diagonal
matrix, we get
$$\begin{align*}
z = \d{s}\bullet\d{z} &= \frac{\a{A}^{-1}\d{m}}{\l{\a{A}^{-1}\d{m}}} \bullet \frac{\a{A}\d{v}}{\l{\a{A}\d{v}}} \\
&= \frac{1}{\l{\a{A}^{-1}\d{m}}\cdot\l{\a{A}\d{v}}} \cdot \left( \a{A}^{-1}\d{m} \bullet \a{A}\d{v} \right) \\
&= \frac{1}{\l{\a{A}^{-1}\d{m}}\cdot\l{\a{A}\d{v}}} \cdot \left( \a{A}^{-1}\d{m} \right)^\top \a{A}\d{v} \\
&= \frac{1}{\l{\a{A}^{-1}\d{m}}\cdot\l{\a{A}\d{v}}} \cdot \d{m}^\top \a{A}^{-\top} \a{A} \d{v} \\
&= \frac{1}{\l{\a{A}^{-1}\d{m}}\cdot\l{\a{A}\d{v}}} \cdot \d{m}^\top \d{v} \\
&= \frac{\dp{\d{m}}{\d{v}}}{\l{\a{A}^{-1}\d{m}}\cdot\l{\a{A}\d{v}}},
\end{align*}$$
or
$$ \frac{1}{z} = \frac{\l{\a{A}^{-1}\d{m}} \cdot \l{\a{A}\d{v}}}{\d{m}\bullet\d{v}}. $$
## Total Jacobian Determinant
Going back to the total Jacobian Determinant, that means that
$$\begin{align*} \jac{\d{m}}{\a{u}} &=
\alpha_x \alpha_y \l{\a{A}^{-1} \d{m}}^3 \cdot
\frac{1}{z} \cdot \frac{1+\hat{z}_z}{2} \cdot \pi \\
&=
\alpha_x \alpha_y \l{\a{A}^{-1}\d{m}}^3 \cdot
\frac{\l{\a{A}^{-1}\d{m}} \cdot \l{\a{A}\d{v}}}{\d{m}\bullet\d{v}}
\cdot \frac{1+\hat{z}_z}{2} \cdot \pi \tag{introducing $1/z$} \\
&=
\alpha_x \alpha_y \l{\a{A}^{-1}\d{m}}^4 \cdot
\frac{\l{\a{A}\d{v}}}{\d{m}\bullet\d{v}}
\cdot \frac{\hat{v}_z}{G_1(\d{v})\cdot\l{\a{A}\d{v}}} \cdot \pi \tag{introducing $(1+\hat{z}_z)/2$} \\
&= \pi\alpha_x\alpha_y \l{\a{A}^{-1}\d{m}}^4 \cdot
\frac{1}{\d{m}\bullet\d{v}} \cdot
\frac{\d{v}\bullet\d{n}}{G_1(\d{v})} \tag{simplify and $\hat{v}_z = \d{v}\bullet\d{n}$}
\end{align*} $$
Finally, since $p_\a{u}(\a{u}) = 1$, we get
$$ p_\mathsf{vGGX}(\d{m}) = p_\a{u}(\a{u})\; \jac{\d{m}}{\a{u}}^{-1} = D(\d{m}) \cdot \left(\d{m}\bullet\d{v}\right) \cdot \frac{G_1(\d{v})}{\d{v}\bullet\d{n}}, $$
where
$$ D(\d{m}) = \frac{1}{\pi\alpha_x\alpha_y\l{\a{A}^{-1}\d{m}}^4} =
\frac{1}{\pi\alpha_x\alpha_y\left(
\frac{\hat{m}_x^2}{\alpha_x^2} +
\frac{\hat{m}_y^2}{\alpha_y^2} +
\hat{m}_z^2\right)^2}, $$
and the proof is complete.
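As an independent numerical cross-check (a standalone Python sketch; the roughness values and view angle are simply the ones used in the visualization section below), integrating $p_\mathsf{vGGX}$ over the upper hemisphere should yield a value very close to $1$:

```python
import numpy as np

# Roughness and view direction matching the visualization (phi_v = 0).
alpha_x, alpha_y = 0.15, 0.5
theta_v = np.radians(75.0)
v = np.array([np.sin(theta_v), 0.0, np.cos(theta_v)])

# Lambda(v) and G1(v), with n = (0, 0, 1).
lam = -0.5 + 0.5 * np.sqrt((alpha_x**2 * v[0]**2 + alpha_y**2 * v[1]**2) / v[2]**2 + 1.0)
G1 = 1.0 / (1.0 + lam)

# Midpoint rule over the upper hemisphere in spherical coordinates.
n_theta, n_phi = 400, 800
theta = (np.arange(n_theta) + 0.5) * (0.5 * np.pi / n_theta)
phi = (np.arange(n_phi) + 0.5) * (2.0 * np.pi / n_phi)
th, ph = np.meshgrid(theta, phi, indexing="ij")
mx, my, mz = np.sin(th) * np.cos(ph), np.sin(th) * np.sin(ph), np.cos(th)

D = 1.0 / (np.pi * alpha_x * alpha_y * (mx**2 / alpha_x**2 + my**2 / alpha_y**2 + mz**2) ** 2)
m_dot_v = np.maximum(0.0, mx * v[0] + my * v[1] + mz * v[2])  # p_vGGX is zero when m.v <= 0
pdf = D * m_dot_v * G1 / v[2]

integral = np.sum(pdf * np.sin(th)) * (0.5 * np.pi / n_theta) * (2.0 * np.pi / n_phi)
print(integral)  # should be close to 1
```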
## Notes
The proof relies on an assumption that Walter et al. do not make, namely that $\a{A}$ is a diagonal matrix of the form $\a{A} = \mathsf{diag}(\alpha_x, \alpha_y, 1)$.
It is future work to see whether this assumption can be relaxed to account for their more general form $\a{A} = \a{S}\a{R}$, where $\a{S}$ is a diagonal matrix and $\a{R}$ is a
rotation matrix.
# Visualization
Below, we see $p_\mathsf{vGGX}$ using $\alpha_x = 0.15, \alpha_y = 0.5$ and $\theta_v = 75^\circ, \phi_v = 0^\circ$, showing how the distribution is skewed towards the view.
(The gray area represents the region of micro-facet normals that, when the view vector is reflected about them, produce an incident light direction in the upper hemisphere.)
![Figure [d1a-top]: Equal area from top](vggx-top-area.png) ![Figure [d1a-side]: Equal area from side](vggx-side-area.png)
![Figure [d1-top]: PDF from top](vggx-top.png) ![Figure [d1-side]: PDF from side](vggx-side.png)
If we change $\phi_v$ to $90^\circ$, we get a bigger effect, since the $y$-direction has a higher roughness value.
![Figure [d2a-top]: Equal area from top](vggx2-top-area.png) ![Figure [d2a-side]: Equal area from side](vggx2-side-area.png)
![Figure [d2-top]: PDF from top](vggx-below-top.png) ![Figure [d2-side]: PDF from side](vggx-below-side.png)
# References
[#Walter15]: WALTER, B., DONG, Z., MARSCHNER, S., AND GREENBERG, D. 2015. The Ellipsoid Normal Distribution Function. Supplemental material of Predicting Appearance from Measured Microgeometry of Metal Surfaces, ACM Trans. Graph. 35, 4, 9:1–9:13. May 2016 updated version. URL: http://pdfs.semanticscholar.org/fc79/f6c7340937c2589cf22165f2c4fc9acb2ba2.pdf.
[#Heitz18]: HEITZ, E., 2018. Sampling the GGX Distribution of Visible Normals, Journal of Computer Graphics Techniques (JCGT), vol. 7, no. 4, 1–13, 2018. URL: http://jcgt.org/published/0007/04/01/