| Bayes’ Rule in Probability Form | Bayes’ Rule in Odds Form |
|---|---|
| 𝐏(𝑋\|𝐸) = 𝐏(𝑋)𝐏(𝐸\|𝑋) / 𝐏(𝐸) | 𝐎(𝑋\|𝐸) = 𝐎(𝑋)[𝐵𝑎𝑦𝑒𝑠 𝐹𝑎𝑐𝑡𝑜𝑟] |
| 𝐏(𝑋\|𝐸1,𝐸2) = 𝐏(𝑋)𝐏(𝐸1,𝐸2\|𝑋) / 𝐏(𝐸1,𝐸2) | 𝐎(𝑋\|𝐸1,𝐸2) = 𝐎(𝑋)[𝐵𝑎𝑦𝑒𝑠 𝐹𝑎𝑐𝑡𝑜𝑟 1][𝐵𝑎𝑦𝑒𝑠 𝐹𝑎𝑐𝑡𝑜𝑟 2] |

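
The two forms of the rule can be compared numerically. The sketch below is illustrative only: the numbers are hypothetical, the helper names (`prob_to_odds`, `odds_to_prob`, `bayes_factor`) are not from the notes, and the Bayes factor is taken to be the usual likelihood ratio 𝐏(𝐸|𝑋) / 𝐏(𝐸|¬𝑋). The two-evidence update additionally assumes 𝐸1 and 𝐸2 are conditionally independent given 𝑋 and given ¬𝑋; without that assumption the second factor would have to be conditioned on 𝐸1.

```python
from fractions import Fraction

def prob_to_odds(p):
    """Convert a probability P(X) into odds O(X) = P(X) / P(not X)."""
    return p / (1 - p)

def odds_to_prob(o):
    """Convert odds O(X) back into a probability P(X) = O / (1 + O)."""
    return o / (1 + o)

def bayes_factor(p_e_given_x, p_e_given_not_x):
    """Likelihood ratio P(E|X) / P(E|not X)."""
    return p_e_given_x / p_e_given_not_x

# Hypothetical numbers, chosen only for illustration.
prior = Fraction(1, 100)            # P(X)
p_e_given_x = Fraction(9, 10)       # P(E|X)
p_e_given_not_x = Fraction(1, 10)   # P(E|not X)

# Probability form: P(X|E) = P(X) P(E|X) / P(E)
p_e = p_e_given_x * prior + p_e_given_not_x * (1 - prior)
posterior_prob_form = prior * p_e_given_x / p_e

# Odds form: O(X|E) = O(X) * [Bayes factor]
factor = bayes_factor(p_e_given_x, p_e_given_not_x)
posterior_odds_form = odds_to_prob(prob_to_odds(prior) * factor)

# Two pieces of evidence with the same factor each,
# assuming conditional independence (see the note above).
two_step = odds_to_prob(prob_to_odds(prior) * factor * factor)

print(posterior_prob_form, posterior_odds_form, two_step)
```

Both forms give the same posterior; the odds form avoids computing 𝐏(𝐸) explicitly, which is why successive pieces of evidence reduce to repeated multiplication.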
Bayes’ Theorem - Diagram
[Figure: bayes'-theorem.jpg, with prior and posterior probabilities and normalization constant]
Bayes’ Theorem - Derivation
the product rule can be written in 2 ways:
- 𝐏(𝑌,𝑋) = 𝐏(𝑋|𝑌)𝐏(𝑌)
- 𝐏(𝑌,𝑋) = 𝐏(𝑌|𝑋)𝐏(𝑋)
equating the two right-hand sides, we get Bayes’ Rule/Theorem/Law:
- 𝐏(𝑌|𝑋) = 𝐏(𝑋|𝑌)𝐏(𝑌) / 𝐏(𝑋)
a more general version conditionalized on some background evidence e:
- 𝐏(𝑌|𝑋,e) = 𝐏(𝑋|𝑌,e) 𝐏(𝑌|e) / 𝐏(𝑋|e)
Bayes’ Theorem allows us to update our belief in a distribution over one or more variables 𝑸 = {𝑄1, …, 𝑄𝑛} in the light of new evidence 𝒆 = {𝑒1, …, 𝑒𝑚}:
- 𝐏(𝑸|𝒆) = 𝐏(𝒆|𝑸)𝐏(𝑸) / 𝐏(𝒆)
expanded form, where the denominator sums over all joint values of 𝑄1, …, 𝑄𝑛:
- 𝐏(𝑄1, …, 𝑄𝑛|𝑒1, …, 𝑒𝑚) = [𝐏(𝑒1, …, 𝑒𝑚|𝑄1, …, 𝑄𝑛)𝐏(𝑄1, …, 𝑄𝑛)] / [𝛴𝑄1 … 𝛴𝑄𝑛𝐏(𝑒1, …, 𝑒𝑚|𝑄1, …, 𝑄𝑛)𝐏(𝑄1, …, 𝑄𝑛)]
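| Term | Name | Description |
|---|---|---|
| 𝐏(𝑸) | prior | belief in 𝑸 before the evidence 𝒆 is observed |
| 𝐏(𝑸\|𝒆) | posterior | belief in 𝑸 after the evidence 𝒆 is observed |
| 𝐏(𝒆\|𝑸) | likelihood | probability of the evidence 𝒆 given 𝑸 |
| 𝐏(𝒆) | normalization constant | probability of the evidence 𝒆, summed over all values of 𝑸 |
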
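
As a rough illustration of how these four terms fit together for a single discrete variable, the sketch below computes a posterior by multiplying prior and likelihood and then dividing by their sum. The function name `posterior_from` and the rain / wet-grass numbers are hypothetical and serve only as an example.

```python
from fractions import Fraction

def posterior_from(prior, likelihood):
    """Return (posterior, evidence) for one observed piece of evidence e.

    prior:      dict mapping each value q to P(Q=q)
    likelihood: dict mapping each value q to P(e | Q=q)
    """
    # Numerator of Bayes' theorem for every value of Q.
    unnormalized = {q: likelihood[q] * prior[q] for q in prior}
    # Normalization constant P(e): sum of the numerator over all values of Q.
    evidence = sum(unnormalized.values())
    if evidence == 0:
        raise ValueError("evidence has zero probability under this prior")
    posterior = {q: v / evidence for q, v in unnormalized.items()}
    return posterior, evidence

# Hypothetical example: Q is the weather, e is "the grass is wet".
prior = {"rain": Fraction(1, 5), "dry": Fraction(4, 5)}
likelihood = {"rain": Fraction(9, 10), "dry": Fraction(1, 10)}

posterior, evidence = posterior_from(prior, likelihood)
print(posterior, evidence)
```

Because the same sum appears in the denominator for every value of 𝑸, the posterior always sums to 1, which is what "normalization constant" refers to.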
Posterior vs Prior
note that the prior is a weighted average of the posteriors, weighted over different instantiations of the evidence:
- 𝐏(𝑸) = 𝛴𝒆∊𝑬𝐏(𝑸|𝑬=𝒆)𝐏(𝑬=𝒆)
- 𝑝𝑟𝑖𝑜𝑟 = 𝛴𝒆∊𝑬[𝑝𝑜𝑠𝑡𝑒𝑟𝑖𝑜𝑟*𝑤𝑒𝑖𝑔ℎ𝑡]
- 𝑝𝑟𝑖𝑜𝑟 = 𝛴𝒆∊𝑬[𝑤𝑒𝑖𝑔ℎ𝑡𝑒𝑑-𝑝𝑜𝑠𝑡𝑒𝑟𝑖𝑜𝑟]
thus:
- if the evidence 𝑬=𝒆 is LIKELY (𝐏(𝑬=𝒆) is high), then:
  - the weighted posterior 𝐏(𝑸|𝑬=𝒆)𝐏(𝑬=𝒆) is a major component in this summation
  - posterior 𝐏(𝑸|𝑬=𝒆) is probably not too far from the prior 𝐏(𝑸)
- if the evidence 𝑬=𝒆 is UNLIKELY (𝐏(𝑬=𝒆) is low), then:
  - the weighted posterior 𝐏(𝑸|𝑬=𝒆)𝐏(𝑬=𝒆) is a negligible component in this summation
  - posterior 𝐏(𝑸|𝑬=𝒆) is not constrained to be similar to the prior 𝐏(𝑸)
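
The weighted-average identity can be checked on a small joint distribution. The sketch below uses a hypothetical joint table over a binary 𝑸 and an evidence variable 𝑬 with one common and one rare value; the names `joint`, `p_evidence` and `p_posterior` are illustrative, not from the notes.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(Q, E); the four entries sum to 1.
joint = {
    ("q", "common"): F(297, 1000),
    ("not_q", "common"): F(693, 1000),
    ("q", "rare"): F(9, 1000),
    ("not_q", "rare"): F(1, 1000),
}
evidence_values = ["common", "rare"]

def p_evidence(e):
    """Weight P(E=e): marginalize the joint over Q."""
    return sum(p for (_, e2), p in joint.items() if e2 == e)

def p_posterior(q, e):
    """Posterior P(Q=q | E=e) = P(Q=q, E=e) / P(E=e)."""
    return joint[(q, e)] / p_evidence(e)

# Prior P(Q=q): marginalize the joint over E.
prior_q = sum(joint[("q", e)] for e in evidence_values)

# Weighted average of the posteriors, weighted by P(E=e).
weighted_sum = sum(p_posterior("q", e) * p_evidence(e) for e in evidence_values)

for e in evidence_values:
    print(e, "weight:", p_evidence(e), "posterior:", p_posterior("q", e))
print("prior:", prior_q, "weighted sum:", weighted_sum)
```

With these numbers the likely evidence yields a posterior (3/10) close to the prior (153/500), while the unlikely evidence yields a posterior (9/10) far from it, yet the weighted sum still equals the prior.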