Metropolis-Hashtings Algorithm is a Markov Chain Monte Carlo (MCMC) method for obtaining a sequence
of random samples which converge to being distributed according to a target probability distribution
for which direct sampling is difficult.

Introduction

This article assumes the reader understands the following:

  • Markov Chains & its properties:
    • ergodic property
    • stationary distribution
    • detailed balanced
  • Monte Carlo Methods

Problem:

  • approximate or sample from target distribution 𝜋

Solution:

  • Markov Chain idea: given an ergodic transition matrix 𝑇 there exists a stationary distribution 𝜋
  • Metropolis Algorithm idea: given a target distribution 𝜋 construct an ergodic transition matrix 𝑇 that will produce 𝜋
    • the ergodic theorem states that sampling from this Markov chain 𝑇 will approximate the target distribution 𝜋

Metropolis Algorithm:

  1. given current state 𝑥, sample next state 𝑥’ from a proposal distribution 𝐐(𝑥 → 𝑥’)
  2. accept next state 𝑥’ with acceptance probability 𝐀(𝑥 → 𝑥’)
    • if accepted, move to 𝑥’
    • otherwise stay at 𝑥
  3. repeat 𝑛 number of times

From the algorithm, the transition function 𝑇 is defined as:

if 𝑥 ≠ 𝑥’:

if 𝑥 = 𝑥’:

construct acceptance probability 𝐀 such that detailed balance holds for 𝑇
(detailed balance implies stationary distribution, and thus the ergodic theorem above applies):

1 by definition of detailed balanced

5 because 𝜋(𝑥) is assumed to be hard to compute because of its normalizing constant 1/𝑍, we can rewrite it as

6 the normalizing constants are cancelled out, rendering 𝐀(𝑥 → 𝑥’) to be easy to compute

Choice of Proposal Distribution 𝐐

𝐐 must be reversible:

  • 𝐐(𝑥’ → 𝑥) > 0 implies 𝐐(𝑥 → 𝑥’) > 0

opposing forces:

  • 𝐐 should be spread out to improve mixing
  • but then acceptance probability will be low, which in turn slows down mixing

Random Walk Metropolis

If 𝐐 is chosen to be symmetric (i.e. 𝐐(𝑥 → 𝑥’) = 𝐐(𝑥’ → 𝑥) for all 𝑥’ and 𝑥), then the acceptance probability 𝐀 becomes:

now:

  • if 𝑥’ has density greater than or equal to 𝑥‘s density (i.e. 𝜋(𝑥’) ≥ 𝜋(𝑥)) then we will always accept 𝑥’
  • if 𝑥’ has density less than 𝑥‘s density (i.e. 𝜋(𝑥’) < 𝜋(𝑥)) then we will always accept 𝑥’ with probability

Independent Metropolis-Hastings Proposal

If 𝐐 is chosen to be independent (i.e. 𝐐(𝑥 → 𝑥’) = 𝐐(𝑥’)), then the acceptance probability becomes:

Resources

0 items under this folder.