
Saturday, Jan 4, 2024

Ultra Fast Simulation from a Multivariate Normal Distribution

Sampling from distributions over high-dimensional state spaces has recently attracted considerable research effort in the computational statistics and machine learning communities.

Most sampling techniques known so far do not scale to high dimensions and become impractical when the number of data samples is huge (big data).

The Metropolis Adjusted Langevin Algorithm (MALA) is a more general-purpose Markov chain Monte Carlo algorithm for sampling from a differentiable target density.

Introduction


Review of Metropolis-Hastings Algorithm

The Metropolis-Hastings (MH) algorithm is routinely used to generate samples from a target distribution, denoted $p(\mathbf{x})$, that may otherwise be difficult to sample from (Metropolis et al., 1953 [1]; Hastings, 1970 [2]). The MH algorithm works by generating a Markov chain whose stationary distribution is the target distribution. This means that, in the long run, the samples from the Markov chain will look like samples from the target distribution, i.e., we use a Markov chain to generate a sequence of vectors, denoted $(\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_M)$, such that as $M \rightarrow \infty, \mathbf{x}_M \sim p(\mathbf{x})$. The sequential steps involved in the MH algorithm are as follows:

  1. Initialize the vector $\mathbf{x}_0 \in \mathbb{R}^n$.
  2. For $m = 1, 2, \ldots, M,$ to generate the next vector in the chain $\mathbf{x}_m \in \mathbb{R}^n$, sample a candidate vector $\mathbf{x}^*$ from a proposal distribution $q(\mathbf{x}^* | \mathbf{x}_{m - 1})$ which we know how to sample from.
  3. Compute the acceptance probability using the formula: $$A = \min \left\{1, \frac{p(\mathbf{x}^*) q(\mathbf{x}_{m - 1} | \mathbf{x}^*)}{p(\mathbf{x}_{m - 1}) q(\mathbf{x}^* | \mathbf{x}_{m - 1})} \right\}.$$ When $q(\mathbf{x})$ is symmetric, the formula simplifies to: $$A = \min \left\{1, \frac{p(\mathbf{x}^*)}{p(\mathbf{x}_{m - 1})} \right\}.$$
  4. Generate a uniformly distributed random number between 0 and 1, denoted $u$.
  5. The acceptance or rejection decision of the candidate vector $\mathbf{x}^*$ as a new member of the chain can be done using the following criteria: $$\mathbf{x}_m = \begin{cases} \mathbf{x}^*, \quad \quad \; \text{ if }u \leq A, \\ \mathbf{x}_{m - 1}, \quad \text{ otherwise.} \end{cases}$$
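The steps above can be sketched in a few lines of R. This is a minimal illustration, assuming a standard normal target density and a symmetric Gaussian random-walk proposal; the function and variable names are illustrative, not from a particular package:

```r
# Random-walk Metropolis-Hastings sketch for a standard normal target.
# Work on the log scale for numerical stability.
set.seed(1)

log_p <- function(x) -0.5 * sum(x^2)  # log target density (up to a constant)

M <- 10000           # chain length
x <- numeric(M + 1)  # x[1] holds the initial state x_0 = 0
step_size <- 1       # sd of the symmetric random-walk proposal

for (m in 2:(M + 1)) {
  x_star <- rnorm(1, mean = x[m - 1], sd = step_size)  # candidate draw
  log_A  <- min(0, log_p(x_star) - log_p(x[m - 1]))    # symmetric q cancels
  if (log(runif(1)) <= log_A) {
    x[m] <- x_star     # accept the candidate
  } else {
    x[m] <- x[m - 1]   # reject: repeat the previous state
  }
}

mean(x); sd(x)  # close to 0 and 1 for long chains
```

Because the proposal is symmetric, the acceptance ratio reduces to the simplified form above, and working with log densities avoids underflow for high-dimensional targets.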

Implementing a Metropolis-Hastings Algorithm in R

Suppose we want to generate a 2-dimensional (2D) spatial Gaussian random field from an exponential spatial covariance function model on a $50 \times 50$ grid on the unit square similar to the 2D spatial Gaussian random field in Figure 1.

[Figure: sample_Gaussian_random_field.pdf]
Figure 1: Sample 2D spatial Gaussian random field on a 50 $\times$ 50 grid on the unit square generated from an exponential spatial covariance function model.

The values in Figure 1 can be generated with a single line of code using the function rmvn() in the R package mgcv. This is because generating values from the multivariate Gaussian distribution is straightforward: compute a matrix square root of the covariance (e.g., its Cholesky factor), multiply it by a vector of independent standard normal draws, and add the mean. The rmvn() function performs these steps internally and returns the generated values from the multivariate Gaussian distribution.

Another function that also generates samples from the multivariate Gaussian is the function mvrnorm() in the R package MASS.
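Both functions share a similar calling convention. A small hedged example, using an illustrative 2-dimensional mean and covariance rather than the spatial model of Figure 1:

```r
# Drawing from N(mu, Sigma) with mgcv::rmvn() and MASS::mvrnorm().
library(mgcv)
library(MASS)

mu    <- c(0, 0)
Sigma <- matrix(c(1, 0.5,
                  0.5, 1), nrow = 2)  # illustrative covariance

set.seed(1)
z1 <- mgcv::rmvn(1000, mu, Sigma)     # 1000 x 2 matrix of draws
z2 <- MASS::mvrnorm(1000, mu, Sigma)  # same task, different package

colMeans(z1)  # close to mu
cov(z2)       # close to Sigma
```

The two functions differ internally (mvrnorm() uses an eigendecomposition of the covariance), but both return an $n \times p$ matrix of draws whose sample moments approach mu and Sigma as the number of draws grows.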



The exponential spatial covariance function model is parameterized by $\boldsymbol{\theta} = (a, \sigma^2)$, where $a > 0$ and $\sigma^2 > 0$ are the spatial range and variance parameters, respectively. That is, we want to generate $\mathbf{Z}\sim \mathcal{N}_{n} \{\boldsymbol{\mu}, \boldsymbol{\Sigma}(\boldsymbol{\theta})\}$, where $n$ is the number of locations, e.g., $n = 14{,}400$, $\mathbf{Z} = \{ Z (\mathbf{s}_1), \ldots, Z (\mathbf{s}_{n}) \}^{\top} \in \mathbb{R}^{n}$, $Z (\mathbf{s})$ is the observation at location $\mathbf{s} \in \mathbb{R}^{d}$ with dimension $d$, e.g., $d = 2$, $\boldsymbol{\mu} = \mathbf{0} \in \mathbb{R}^{n}$, and $\boldsymbol{\Sigma}(\boldsymbol{\theta}) \in \mathbb{R}^{n \times n}$ is the covariance matrix whose entry at index $(i, j)$ is $\sigma^2 \exp(-\|\mathbf{s}_i - \mathbf{s}_j\| / a)$.
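With these definitions, the covariance matrix can be assembled and a field drawn via its Cholesky factor. A sketch on a small $10 \times 10$ grid (rather than the $50 \times 50$ grid above, to keep it light); the parameter values $a = 0.2$ and $\sigma^2 = 1$ are illustrative:

```r
# Build Sigma(theta) from the exponential covariance
# sigma^2 * exp(-||s_i - s_j|| / a) on a grid over the unit square,
# then draw Z ~ N(0, Sigma) via the lower Cholesky factor.
grid_1d <- seq(0, 1, length.out = 10)
locs    <- as.matrix(expand.grid(grid_1d, grid_1d))  # n = 100 locations
n       <- nrow(locs)

a      <- 0.2  # spatial range parameter
sigma2 <- 1.0  # variance parameter

D     <- as.matrix(dist(locs))   # pairwise Euclidean distances
Sigma <- sigma2 * exp(-D / a)    # exponential covariance matrix

set.seed(1)
L <- t(chol(Sigma))  # lower-triangular Cholesky factor, L %*% t(L) = Sigma
Z <- L %*% rnorm(n)  # one realization of the Gaussian random field
```

This is essentially what rmvn() does internally; the $\mathcal{O}(n^3)$ cost of the Cholesky factorization is what makes direct simulation expensive on large grids.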

1. Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., and Teller, E. (1953). "Equation of state calculations by fast computing machines." The Journal of Chemical Physics, 21(6), 1087-1092. https://doi.org/10.1063/1.1699114.

2. Hastings, W. K. (1970). "Monte Carlo sampling methods using Markov chains and their applications." Biometrika, 57(1), 97-109. https://doi.org/10.1093/biomet/57.1.97.