- $Y$ - set of possible classes
- $X$ - vector of input attributes $X_i$
- $X_i$ - the input attribute of $X$ at index $i$ (discrete or continuous)
- $y$ - class value
- $x$ - vector of input attribute values $x_i$
- $x_i$ - the input attribute value of $x$ at index $i$
Probability Rule
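- $P(Y=y \mid X=x) = \dfrac{P(Y=y)\,P(X=x \mid Y=y)}{P(X=x)}$
- $P(Y=y \mid X=x) \propto P(Y=y)\,P(X=x \mid Y=y)$, since $P(X=x)$ is the same for every $y$
Conditional independence of $A$ and $B$ given $Y$ states that $P(AB \mid Y) = P(A \mid BY)\,P(B \mid Y) = P(A \mid Y)\,P(B \mid Y)$. Assuming the attributes $X_i$ are conditionally independent given the class:
- $P(Y=y \mid X=x) \propto P(Y=y) \prod_{x_i \in x} P(X_i=x_i \mid Y=y)$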
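Therefore, given input values $x$, we calculate the unnormalized $P(Y=y \mid X=x)$ for each $y$ in $Y$, and the class with the highest value is assigned to $x$ as its prediction $\hat{y}$:
- $\hat{y} \leftarrow \arg\max_{y} \left[ P(Y=y) \prod_{x_i \in x} P(X_i=x_i \mid Y=y) \right]$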
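The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `predict`, the dictionary layout of `priors` and `likelihoods`, and the toy probabilities are all assumptions made for the example. It sums logarithms instead of multiplying probabilities, which leaves the argmax unchanged and avoids numeric underflow when there are many attributes.

```python
import math

def predict(x, priors, likelihoods):
    """Return the class y maximizing P(Y=y) * prod_i P(X_i=x_i | Y=y).

    Hypothetical data layout, chosen for this sketch:
      priors:      {y: P(Y=y)}
      likelihoods: {y: [ {value: P(X_i=value | Y=y)} for each attribute i ]}
    All probabilities are assumed to be non-zero (e.g. already smoothed),
    otherwise math.log would raise an error.
    """
    best_class, best_score = None, -math.inf
    for y, prior in priors.items():
        # Log of the unnormalized posterior; P(X=x) is the same for every y.
        score = math.log(prior)
        for i, x_i in enumerate(x):
            score += math.log(likelihoods[y][i][x_i])
        if score > best_score:
            best_class, best_score = y, score
    return best_class

# Toy, made-up probabilities: two classes, two discrete attributes.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": [{"yes": 0.8, "no": 0.2}, {"short": 0.7, "long": 0.3}],
    "ham":  [{"yes": 0.1, "no": 0.9}, {"short": 0.4, "long": 0.6}],
}
print(predict(("yes", "short"), priors, likelihoods))
```

For the input `("yes", "short")` the unnormalized scores are 0.4 * 0.8 * 0.7 = 0.224 for `spam` and 0.6 * 0.1 * 0.4 = 0.024 for `ham`, so `spam` is chosen.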
Learning From Training Set
- estimate $P(Y=y)$ for each possible $y$
- estimate $P(X_i=x_i \mid Y=y)$ for each attribute $X_i$, each of its values $x_i$, and each possible $y$
Estimating $P(Y=y)$
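- $P(Y=y) = \mathrm{count}(Y=y) \,/\, \text{total-training-examples}$
- this estimate may equal 0 (when no training example has class $y$), so we should smooth it with a constant $k > 0$ ($k = 1$ gives Laplace smoothing):
- $P(Y=y) = [\mathrm{count}(Y=y) + k] \,/\, [\text{total-training-examples} + (k \cdot \text{num-classes-of-}Y)]$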
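A minimal sketch of the smoothed prior estimate follows. The function name `estimate_priors`, its parameters, and the toy labels are assumptions made for illustration; `k` plays the role of the smoothing constant in the formula above.

```python
from collections import Counter

def estimate_priors(labels, classes, k=1.0):
    """Smoothed estimate of P(Y=y) for every y in `classes`.

    labels:  class label of each training example
    classes: all possible classes (may include classes unseen in `labels`)
    k:       smoothing constant; k=0 gives the unsmoothed relative frequency
    """
    counts = Counter(labels)                 # count(Y=y); missing keys count as 0
    denom = len(labels) + k * len(classes)   # total examples + k * number of classes
    return {y: (counts[y] + k) / denom for y in classes}

# Toy data: class "c" never appears in the training labels.
labels = ["a", "a", "a", "b"]
priors = estimate_priors(labels, classes=["a", "b", "c"], k=1.0)
print(priors)
```

With `k=1.0` the unseen class `c` receives probability 1/7 instead of 0, and the three estimates still sum to 1.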
Estimating $P(X_i=x_i \mid Y=y)$
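When input feature $X_i$ is a discrete variable (Bernoulli or multinoulli):
- $P(X_i=x_i \mid Y=y) = \mathrm{count}(X_i=x_i, Y=y) \,/\, \mathrm{count}(Y=y)$
- this estimate may equal 0 (when value $x_i$ never occurs together with class $y$ in the training set), so we should smooth it:
- $P(X_i=x_i \mid Y=y) = [\mathrm{count}(X_i=x_i, Y=y) + k] \,/\, [\mathrm{count}(Y=y) + (k \cdot \text{num-classes-of-}X_i)]$, where num-classes-of-$X_i$ is the number of distinct values $X_i$ can take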
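The smoothed estimate for a discrete attribute can be sketched the same way. The function name `estimate_discrete_likelihood`, the `domain` parameter (the list of values the attribute can take), and the toy data are assumptions made for this example.

```python
from collections import Counter

def estimate_discrete_likelihood(values, labels, y, domain, k=1.0):
    """Smoothed estimate of P(X_i=v | Y=y) for every v in `domain`.

    values: value of attribute X_i in each training example
    labels: class label of each training example (same order as `values`)
    domain: all values X_i can take
    k:      smoothing constant
    """
    # Keep only the attribute values from examples whose class is y.
    in_class = [v for v, label in zip(values, labels) if label == y]
    counts = Counter(in_class)                 # count(X_i=v, Y=y)
    denom = len(in_class) + k * len(domain)    # count(Y=y) + k * number of values
    return {v: (counts[v] + k) / denom for v in domain}

# Toy data: "green" never occurs together with class "a".
values = ["red", "red", "blue", "green"]
labels = ["a", "a", "a", "b"]
p = estimate_discrete_likelihood(values, labels, "a", ["red", "blue", "green"])
print(p)
```

Here class `a` has three examples, so with `k=1.0` the denominator is 3 + 3 = 6 and `green` gets 1/6 rather than 0.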
When input feature $X_i$ is a continuous variable with a Gaussian distribution:
- $P(X_i=x_i \mid Y=y) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-(x_i-\mu)^2 / (2\sigma^2)}$
- $\mu = \dfrac{1}{\mathrm{count}(Y=y)} \sum x_i$, summed over the training examples whose class $Y = y$
- $\text{variance} = \dfrac{1}{\mathrm{count}(Y=y) - 1} \sum (x_i - \mu)^2$, summed over the training examples whose class $Y = y$
- $\sigma = \sqrt{\text{variance}}$
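The Gaussian case can be sketched as two small functions: one fits the mean and standard deviation for a given class, the other evaluates the density. The names `fit_gaussian` and `gaussian_likelihood` and the toy data are assumptions made for illustration. The sketch uses the sample variance with the count minus 1 in the denominator, as in the formulas above, so it needs at least two training examples of the class.

```python
import math

def fit_gaussian(values, labels, y):
    """Return (mu, sigma) of attribute X_i over training examples with class y."""
    xs = [v for v, label in zip(values, labels) if label == y]
    n = len(xs)  # count(Y=y); must be at least 2 for the sample variance
    mu = sum(xs) / n
    variance = sum((v - mu) ** 2 for v in xs) / (n - 1)
    return mu, math.sqrt(variance)

def gaussian_likelihood(x_i, mu, sigma):
    """Gaussian density at x_i, used as P(X_i=x_i | Y=y)."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x_i - mu) ** 2) / (2.0 * sigma ** 2))

# Toy data: three examples of class "a", one of class "b".
values = [1.0, 2.0, 3.0, 10.0]
labels = ["a", "a", "a", "b"]
mu, sigma = fit_gaussian(values, labels, "a")
density = gaussian_likelihood(2.0, mu, sigma)
print(mu, sigma, density)
```

For class `a` the values 1, 2, 3 give a mean of 2 and a sample variance of (1 + 0 + 1) / 2 = 1, so the density at the mean is 1 / sqrt(2 * pi), roughly 0.399.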