Bayes’ Theorem - Diagram

[Diagram: Bayes’ Theorem, annotated with the prior and posterior probabilities and the normalization constant]

Bayes’ Theorem - Derivation

the product rule can be written in 2 ways:

𝐏(𝑌,𝑋) = 𝐏(𝑋|𝑌)𝐏(𝑌)
𝐏(𝑌,𝑋) = 𝐏(𝑌|𝑋)𝐏(𝑋)

equating the two right-hand sides, we get Bayes’ Rule/Theorem/Law:

𝐏(𝑌|𝑋) = 𝐏(𝑋|𝑌)𝐏(𝑌) / 𝐏(𝑋)

a more general version conditionalized on some background evidence e:

𝐏(𝑌|𝑋,e) = 𝐏(𝑋|𝑌,e) 𝐏(𝑌|e) / 𝐏(𝑋|e)
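The derivation can be checked numerically. The sketch below is only an illustration: the joint table `joint` and every helper name are hypothetical, chosen for this example. It computes 𝐏(𝑌|𝑋) directly from the joint distribution and again through Bayes’ Rule, and confirms that the two agree.

```python
from fractions import Fraction

# Hypothetical joint distribution P(Y, X) over two binary variables,
# keyed as (y, x). The numbers are made up for illustration only.
joint = {
    (True, True): Fraction(3, 20),
    (True, False): Fraction(1, 20),
    (False, True): Fraction(4, 20),
    (False, False): Fraction(12, 20),
}

def marginal_y(y):
    """P(Y=y), obtained by summing the joint over X."""
    return sum(p for (yy, _), p in joint.items() if yy == y)

def marginal_x(x):
    """P(X=x), obtained by summing the joint over Y."""
    return sum(p for (_, xx), p in joint.items() if xx == x)

def cond_y_given_x(y, x):
    """P(Y=y | X=x) from the product rule P(Y,X) = P(Y|X)P(X)."""
    return joint[(y, x)] / marginal_x(x)

def cond_x_given_y(x, y):
    """P(X=x | Y=y) from the product rule P(Y,X) = P(X|Y)P(Y)."""
    return joint[(y, x)] / marginal_y(y)

def bayes_y_given_x(y, x):
    """P(Y=y | X=x) via Bayes' Rule: P(X|Y)P(Y) / P(X)."""
    return cond_x_given_y(x, y) * marginal_y(y) / marginal_x(x)

# Both routes give the same conditional probability for every (y, x).
for y in (True, False):
    for x in (True, False):
        assert bayes_y_given_x(y, x) == cond_y_given_x(y, x)

print(bayes_y_given_x(True, True))  # 3/7
```

Exact fractions are used instead of floats so that the equality check is not affected by rounding.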

Bayes’ Theorem allows us to update our belief about a set of query variables 𝑸 = {𝑄₁, …, 𝑄_𝑛}, i.e. a distribution over one or more variables, in the light of new evidence 𝒆 = {𝑒₁, …, 𝑒_𝑚}

𝐏(𝑸|𝒆) = 𝐏(𝒆|𝑸)𝐏(𝑸) / 𝐏(𝒆)

expanded form, with the normalization constant 𝐏(𝒆) written as a sum over all values of 𝑄₁, …, 𝑄_𝑛:

𝐏(𝑄₁, …, 𝑄_𝑛|𝑒₁, …, 𝑒_𝑚) = [𝐏(𝑒₁, …, 𝑒_𝑚|𝑄₁, …, 𝑄_𝑛)𝐏(𝑄₁, …, 𝑄_𝑛)] / [𝛴_{𝑄₁} … 𝛴_{𝑄_𝑛}𝐏(𝑒₁, …, 𝑒_𝑚|𝑄₁, …, 𝑄_𝑛)𝐏(𝑄₁, …, 𝑄_𝑛)]
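As a minimal sketch of the expanded form, the function below multiplies likelihood by prior for each value of a discrete query variable and then divides by the sum of those products, which plays the role of 𝐏(𝒆). The function name `posterior`, the value labels `"a"`, `"b"`, `"c"`, and all numbers are assumptions made up for illustration.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Return (posterior, evidence) for a discrete query variable.

    prior:      dict mapping each value q to P(Q=q)
    likelihood: dict mapping each value q to P(e | Q=q) for the observed e
    """
    # Numerator of Bayes' Rule for every value of Q.
    unnormalized = {q: likelihood[q] * prior[q] for q in prior}
    # Denominator: P(e) = sum over q of P(e | Q=q) P(Q=q).
    evidence = sum(unnormalized.values())
    if evidence == 0:
        raise ValueError("evidence has zero probability under this prior")
    return {q: u / evidence for q, u in unnormalized.items()}, evidence

# Hypothetical three-valued query variable.
prior = {"a": Fraction(1, 2), "b": Fraction(3, 10), "c": Fraction(1, 5)}
likelihood = {"a": Fraction(1, 10), "b": Fraction(2, 5), "c": Fraction(7, 10)}

post, p_e = posterior(prior, likelihood)
print(p_e)   # 31/100
print(post)  # a: 5/31, b: 12/31, c: 14/31
```

Because the denominator is just the sum of the numerators, the posterior sums to 1 by construction, whatever the likelihood values are.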

| Term | Name | Description |
| --- | --- | --- |
| 𝐏(𝑸) | prior of 𝑸 | a probability distribution (or joint distribution) representing your uncertainty over 𝑸 before you have observed any data/evidence |
| 𝐏(𝑸\|𝒆) | posterior of 𝑸 given 𝒆 | a conditional probability distribution representing your uncertainty over 𝑸 after you have observed the data/evidence 𝒆 |
| 𝐏(𝒆\|𝑸), 𝐋(𝑸\|𝒆) | likelihood of 𝑸 given 𝒆 | can be read in 2 ways: as a probability 𝐏(𝒆\|𝑸), a function of 𝒆 with 𝑸 fixed, or as a likelihood 𝐋(𝑸\|𝒆), a function of 𝑸 with 𝒆 fixed; for a given 𝒆, it measures how probable 𝒆 is under each value of 𝑸 |
| 𝐏(𝒆) | probability of evidence; normalization constant/factor | normalizes 𝐏(𝑸\|𝒆) so that the posterior sums to 1 over all values of 𝑸 |

Posterior vs Prior

note that the prior is a weighted average of the posteriors, weighted over different instantiations of the evidence:

𝐏(𝑸) = 𝛴_𝒆∊𝑬𝐏(𝑸|𝑬=𝒆)𝐏(𝑬=𝒆)
𝑝𝑟𝑖𝑜𝑟 = 𝛴_𝒆∊𝑬[𝑝𝑜𝑠𝑡𝑒𝑟𝑖𝑜𝑟*𝑤𝑒𝑖𝑔ℎ𝑡]
𝑝𝑟𝑖𝑜𝑟 = 𝛴_𝒆∊𝑬[𝑤𝑒𝑖𝑔ℎ𝑡𝑒𝑑-𝑝𝑜𝑠𝑡𝑒𝑟𝑖𝑜𝑟]

thus:

if the evidence 𝑬=𝒆 is LIKELY, i.e. 𝐏(𝑬=𝒆) is large, then:
- posterior 𝐏(𝑸|𝑬=𝒆) is a major component in this summation
- posterior 𝐏(𝑸|𝑬=𝒆) is probably not too far from the prior 𝐏(𝑸)
if the evidence 𝑬=𝒆 is UNLIKELY, i.e. 𝐏(𝑬=𝒆) is small, then:
- posterior 𝐏(𝑸|𝑬=𝒆) is a negligible component in this summation
- posterior 𝐏(𝑸|𝑬=𝒆) is not constrained to be similar to the prior 𝐏(𝑸)
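The weighted-average relationship can be illustrated with a small numeric sketch. The joint table below, with one `"common"` and one `"rare"` evidence value, is entirely hypothetical, as are the helper names. It shows the posterior under the likely evidence staying near the prior, the posterior under the unlikely evidence moving far from it, and the weighted sum recovering the prior exactly.

```python
from fractions import Fraction

# Hypothetical joint distribution P(Q, E), keyed as (q, e).
# "common" evidence has probability 95/100, "rare" has 5/100.
joint = {
    (True, "common"): Fraction(26, 100),
    (False, "common"): Fraction(69, 100),
    (True, "rare"): Fraction(4, 100),
    (False, "rare"): Fraction(1, 100),
}

def p_e(e):
    """P(E=e): the weight of this evidence value in the summation."""
    return sum(p for (_, ee), p in joint.items() if ee == e)

def p_q(q):
    """P(Q=q): the prior."""
    return sum(p for (qq, _), p in joint.items() if qq == q)

def posterior(q, e):
    """P(Q=q | E=e)."""
    return joint[(q, e)] / p_e(e)

evidence_values = sorted({e for _, e in joint})

# prior = sum over e of posterior * weight
reconstructed_prior = sum(posterior(True, e) * p_e(e) for e in evidence_values)

print(p_q(True))                  # 3/10
print(posterior(True, "common"))  # 26/95, close to the prior
print(posterior(True, "rare"))    # 4/5, far from the prior
print(reconstructed_prior)        # 3/10
```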

Bayes Box - How Posterior Probability is Updated by Prior Probability and Observed Data

| Bayes’ Rule in Probability Form | Bayes’ Rule in Odds Form |
| --- | --- |
| 𝐏(𝑋\|𝐸) = 𝐏(𝑋)𝐏(𝐸\|𝑋) / 𝐏(𝐸) | 𝐎(𝑋\|𝐸) = 𝐎(𝑋)[𝐵𝑎𝑦𝑒𝑠 𝐹𝑎𝑐𝑡𝑜𝑟] |
| 𝐏(𝑋\|𝐸) = (𝑃𝑟𝑖𝑜𝑟)(𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦) / [(𝑃𝑟𝑖𝑜𝑟)(𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦) + (1 - 𝑃𝑟𝑖𝑜𝑟)(𝐹𝑃𝑅)] | 𝐎(𝑋\|𝐸) = 𝐎(𝑋)[𝑆𝑒𝑛𝑠𝑖𝑡𝑖𝑣𝑖𝑡𝑦/𝐹𝑃𝑅] |
| 𝐏(𝑋\|𝐸₁,𝐸₂) = 𝐏(𝑋)𝐏(𝐸₁,𝐸₂\|𝑋) / 𝐏(𝐸₁,𝐸₂) | 𝐎(𝑋\|𝐸₁,𝐸₂) = 𝐎(𝑋)[𝐵𝑎𝑦𝑒𝑠 𝐹𝑎𝑐𝑡𝑜𝑟 1][𝐵𝑎𝑦𝑒𝑠 𝐹𝑎𝑐𝑡𝑜𝑟 2] |
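The two forms can be compared side by side in a short sketch. The function names and the test characteristics (prior 1/100, sensitivity 9/10, FPR 1/10) are hypothetical values chosen for illustration. Multiplying separate Bayes factors for 𝐸₁ and 𝐸₂ additionally assumes that the two pieces of evidence are conditionally independent given 𝑋 and given not-𝑋, an assumption the table above does not state.

```python
from fractions import Fraction

def posterior_probability(prior, sensitivity, fpr):
    """Probability form: Prior*Sens / [Prior*Sens + (1 - Prior)*FPR]."""
    numerator = prior * sensitivity
    return numerator / (numerator + (1 - prior) * fpr)

def to_odds(p):
    """Convert a probability into odds in favour."""
    return p / (1 - p)

def to_probability(odds):
    """Convert odds in favour back into a probability."""
    return odds / (1 + odds)

def posterior_odds(prior_odds, *bayes_factors):
    """Odds form: prior odds times one Bayes factor per piece of evidence.

    Assumes the pieces of evidence are conditionally independent
    given X and given not-X.
    """
    odds = prior_odds
    for factor in bayes_factors:
        odds *= factor
    return odds

# Hypothetical test characteristics.
prior = Fraction(1, 100)
sensitivity = Fraction(9, 10)
fpr = Fraction(1, 10)
bayes_factor = sensitivity / fpr  # 9

# One positive result: both forms agree.
via_probability = posterior_probability(prior, sensitivity, fpr)
via_odds = to_probability(posterior_odds(to_odds(prior), bayes_factor))
print(via_probability, via_odds)  # 1/12 1/12

# Two positive results from the same kind of test.
two_results = to_probability(
    posterior_odds(to_odds(prior), bayes_factor, bayes_factor)
)
print(two_results)  # 9/20
```

The odds form makes sequential updating a matter of repeated multiplication, whereas the probability form has to be re-normalized after each piece of evidence.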

／var／log marcus chiu

Bayes' Rule／Theorem／Law (Prior - Posterior - Distribution - Likelihood - Probability of Evidence)
