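Bayes’ Rule in Probability Form

  • 𝐏(𝑋|𝐸) = 𝐏(𝑋)𝐏(𝐸|𝑋) / 𝐏(𝐸)
  • 𝐏(𝑋|𝐸) = (𝑃𝑟𝑖𝑜𝑟)(𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦) / [(𝑃𝑟𝑖𝑜𝑟)(𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦) + (1 - 𝑃𝑟𝑖𝑜𝑟)(𝐹𝑃𝑅)]
  • 𝐏(𝑋|𝐸1,𝐸2) = 𝐏(𝑋)𝐏(𝐸1,𝐸2|𝑋) / 𝐏(𝐸1,𝐸2)

where 𝑃𝑟𝑖𝑜𝑟 = 𝐏(𝑋), 𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦 = 𝐏(𝐸|𝑋), and 𝐹𝑃𝑅 (false positive rate) = 𝐏(𝐸|¬𝑋)

Bayes’ Rule in Odds Form

  • 𝐎(𝑋|𝐸) = 𝐎(𝑋)[𝐵𝑎𝑦𝑒𝑠 𝐹𝑎𝑐𝑡𝑜𝑟]
  • 𝐎(𝑋|𝐸) = 𝐎(𝑋)[𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦/𝐹𝑃𝑅]
  • 𝐎(𝑋|𝐸1,𝐸2) = 𝐎(𝑋)[𝐵𝑎𝑦𝑒𝑠 𝐹𝑎𝑐𝑡𝑜𝑟 1][𝐵𝑎𝑦𝑒𝑠 𝐹𝑎𝑐𝑡𝑜𝑟 2]

where the odds are 𝐎(𝑋) = 𝐏(𝑋) / (1 - 𝐏(𝑋)); the two-factor product assumes 𝐸1 and 𝐸2 are conditionally independent given 𝑋 and given ¬𝑋 (otherwise the second Bayes factor must itself be conditioned on 𝐸1)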
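
as a minimal sketch, the probability form and the odds form can be checked against each other numerically. The function names and the numbers (prior 0.01, sensitivity 0.90, FPR 0.05) are illustrative assumptions, not values from these notes, and the two-evidence step assumes the two pieces of evidence are conditionally independent:

```python
def to_odds(p):
    """Convert a probability to odds: O = p / (1 - p)."""
    return p / (1.0 - p)


def to_prob(odds):
    """Convert odds back to a probability: p = O / (1 + O)."""
    return odds / (1.0 + odds)


def posterior_probability_form(prior, sensitivity, fpr):
    """P(X|E) = prior*sens / [prior*sens + (1 - prior)*FPR]."""
    numerator = prior * sensitivity
    return numerator / (numerator + (1.0 - prior) * fpr)


def posterior_odds_form(prior, sensitivity, fpr):
    """O(X|E) = O(X) * (sensitivity / FPR)."""
    return to_odds(prior) * (sensitivity / fpr)


# Illustrative (assumed) numbers, not taken from the notes.
prior, sensitivity, fpr = 0.01, 0.90, 0.05

p_post = posterior_probability_form(prior, sensitivity, fpr)
o_post = posterior_odds_form(prior, sensitivity, fpr)

# Second, conditionally independent piece of evidence with the same
# sensitivity and FPR: multiply by a second Bayes factor.
o_post_two = o_post * (sensitivity / fpr)

print(f"probability form:        P(X|E)     = {p_post:.4f}")
print(f"odds form, converted:    P(X|E)     = {to_prob(o_post):.4f}")
print(f"two independent results: P(X|E1,E2) = {to_prob(o_post_two):.4f}")
```

both routes give the same posterior; the odds form simply avoids computing the normalizing denominator until the final conversion back to a probability.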

Bayes’ Theorem - Diagram

[diagram not reproduced: Bayes’ Theorem annotated with the prior and posterior probabilities and the normalization constant]

Bayes’ Theorem - Derivation

the product rule can be written in 2 ways:

  • 𝐏(𝑌,𝑋) = 𝐏(𝑋|𝑌)𝐏(𝑌)
  • 𝐏(𝑌,𝑋) = 𝐏(𝑌|𝑋)𝐏(𝑋)

equating the two right-hand sides and dividing by 𝐏(𝑋) (assumed non-zero), we get Bayes’ Rule/Theorem/Law:

  • 𝐏(𝑌|𝑋) = 𝐏(𝑋|𝑌)𝐏(𝑌) / 𝐏(𝑋)

a more general version conditionalized on some background evidence e:

  • 𝐏(𝑌|𝑋,e) = 𝐏(𝑋|𝑌,e) 𝐏(𝑌|e) / 𝐏(𝑋|e)

Bayes’ Theorem allows us to update our belief in a distribution over one or more query variables 𝑸 = {𝑄1, …, 𝑄𝑛} in the light of new evidence 𝒆 = {𝑒1, …, 𝑒𝑚} (the number of evidence variables 𝑚 need not equal 𝑛):

  • 𝐏(𝑸|𝒆) = 𝐏(𝒆|𝑸)𝐏(𝑸) / 𝐏(𝒆)

expanded form, with the denominator 𝐏(𝒆) written as a sum over all joint values of 𝑸:

  • 𝐏(𝑄1, …, 𝑄𝑛|𝑒1, …, 𝑒𝑚) = [𝐏(𝑒1, …, 𝑒𝑚|𝑄1, …, 𝑄𝑛)𝐏(𝑄1, …, 𝑄𝑛)] / [𝛴𝑄1 … 𝛴𝑄𝑛 𝐏(𝑒1, …, 𝑒𝑚|𝑄1, …, 𝑄𝑛)𝐏(𝑄1, …, 𝑄𝑛)]
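
the expanded form can be sketched for two binary query variables. The prior and likelihood tables below are made-up numbers for illustration, and `joint_posterior` is a hypothetical helper name; the point is only that the denominator is the sum of likelihood × prior over every joint value of (𝑄1, 𝑄2):

```python
# Hypothetical prior P(Q1, Q2) over two binary query variables.
prior = {
    (0, 0): 0.4,
    (0, 1): 0.3,
    (1, 0): 0.2,
    (1, 1): 0.1,
}

# Hypothetical likelihood P(e | Q1, Q2) of one fixed, observed evidence set e.
likelihood = {
    (0, 0): 0.1,
    (0, 1): 0.5,
    (1, 0): 0.2,
    (1, 1): 0.9,
}


def joint_posterior(prior, likelihood):
    """Return (P(Q1, Q2 | e), P(e)) for discrete tables keyed by (q1, q2)."""
    # Numerator of Bayes' Theorem for every joint value of the query variables.
    unnormalized = {q: likelihood[q] * prior[q] for q in prior}
    # Denominator: sum over Q1 and Q2 of likelihood * prior, i.e. P(e).
    evidence = sum(unnormalized.values())
    posterior = {q: value / evidence for q, value in unnormalized.items()}
    return posterior, evidence


posterior, evidence = joint_posterior(prior, likelihood)

print(f"P(e) = {evidence:.4f}")
for q, p in posterior.items():
    print(f"P(Q1={q[0]}, Q2={q[1]} | e) = {p:.4f}")
```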

Terms

  • 𝐏(𝑸): prior of 𝑸
  • 𝐏(𝑸|𝒆): posterior of 𝑸 given 𝒆
  • 𝐏(𝒆|𝑸), also written 𝐋(𝑸|𝒆): likelihood of 𝑸 given 𝒆
  • 𝐏(𝒆): normalization constant; it normalizes 𝐏(𝑸|𝒆) so that the posterior sums to 1 over the values of 𝑸

Posterior vs Prior

note that the prior is a weighted average of the posteriors, weighted over different instantiations of the evidence:

  • 𝐏(𝑸) = 𝛴𝒆∊𝑬𝐏(𝑸|𝑬=𝒆)𝐏(𝑬=𝒆)
  • 𝑝𝑟𝑖𝑜𝑟 = 𝛴𝒆∊𝑬[𝑝𝑜𝑠𝑡𝑒𝑟𝑖𝑜𝑟*𝑤𝑒𝑖𝑔ℎ𝑡]
  • 𝑝𝑟𝑖𝑜𝑟 = 𝛴𝒆∊𝑬[𝑤𝑒𝑖𝑔ℎ𝑡𝑒𝑑-𝑝𝑜𝑠𝑡𝑒𝑟𝑖𝑜𝑟]

thus:

  • if the evidence is LIKELY, i.e. 𝐏(𝑬=𝒆) is large, then:
    • the posterior 𝐏(𝑸|𝑬=𝒆) is a major component of this summation
    • the posterior 𝐏(𝑸|𝑬=𝒆) is probably not too far from the prior 𝐏(𝑸)
  • if the evidence is UNLIKELY, i.e. 𝐏(𝑬=𝒆) is small, then:
    • the posterior 𝐏(𝑸|𝑬=𝒆) is a negligible component of this summation
    • the posterior 𝐏(𝑸|𝑬=𝒆) is not constrained to be similar to the prior 𝐏(𝑸)
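
the weighted-average identity can be checked on a small example. The numbers below are assumptions chosen for illustration (a binary 𝑸 and three possible evidence values "a", "b", "c"). Because every term in the sum is non-negative, each posterior also satisfies 𝐏(𝑸|𝑬=𝒆) ≤ 𝐏(𝑸) / 𝐏(𝑬=𝒆), which is a tight constraint only when the evidence is likely:

```python
# Hypothetical numbers for illustration only.
prior_q = 0.3  # P(Q = true)

# P(E = e | Q) for three possible evidence values.
lik_true = {"a": 0.70, "b": 0.28, "c": 0.02}   # given Q = true
lik_false = {"a": 0.60, "b": 0.30, "c": 0.10}  # given Q = false


def evidence_and_posterior(e):
    """Return (P(E=e), P(Q=true | E=e)) for one evidence value e."""
    joint_true = lik_true[e] * prior_q
    joint_false = lik_false[e] * (1.0 - prior_q)
    p_e = joint_true + joint_false
    return p_e, joint_true / p_e


rows = {e: evidence_and_posterior(e) for e in lik_true}

# prior = sum over e of posterior * weight, with weight = P(E=e)
reconstructed_prior = sum(p_e * post for p_e, post in rows.values())

for e, (p_e, post) in rows.items():
    print(f"E={e}: P(E=e)={p_e:.3f}  posterior={post:.3f}  "
          f"|posterior - prior|={abs(post - prior_q):.3f}")
print(f"weighted average of posteriors = {reconstructed_prior:.3f}")
```

in this example the most likely evidence value leaves the posterior close to the prior, while the least likely one moves it furthest; that pattern is typical but not guaranteed for every distribution.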

Bayes Box - How Posterior Probability is Updated by Prior Probability and Observed Data
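
a Bayes box is commonly laid out as one row per hypothesis with columns for the prior, the likelihood of the observed data, their product, and the normalized posterior. The sketch below follows that layout; the three hypotheses and their numbers are invented for illustration, and `bayes_box` is a hypothetical helper name:

```python
def bayes_box(hypotheses, priors, likelihoods):
    """Build Bayes box rows: (hypothesis, prior, likelihood, product, posterior).

    Also returns the column total of the products, which is P(data).
    """
    products = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(products)  # normalization constant P(data)
    posteriors = [x / total for x in products]
    rows = list(zip(hypotheses, priors, likelihoods, products, posteriors))
    return rows, total


# Hypothetical hypotheses, priors, and likelihoods of the observed data.
rows, total = bayes_box(
    hypotheses=["H1", "H2", "H3"],
    priors=[0.5, 0.3, 0.2],
    likelihoods=[0.1, 0.4, 0.8],
)

print(f"{'hyp':<5}{'prior':>8}{'lik':>8}{'product':>10}{'posterior':>11}")
for name, prior, lik, product, post in rows:
    print(f"{name:<5}{prior:>8.3f}{lik:>8.3f}{product:>10.3f}{post:>11.3f}")
print(f"{'sum':<5}{'':>8}{'':>8}{total:>10.3f}{sum(r[4] for r in rows):>11.3f}")
```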
