Given a sentence of words $\{w_1, \dots, w_n\}$, we want to assign PoS tags $\{t_1, \dots, t_n\}$, one to each word, such that the probability $P(t_1, \dots, t_n \mid w_1, \dots, w_n)$ is the highest:
- $\hat{t}_1, \dots, \hat{t}_n = \operatorname{argmax}_{t_1, \dots, t_n} [P(t_1, \dots, t_n \mid w_1, \dots, w_n)]$
- $\hat{t}_1, \dots, \hat{t}_n = \operatorname{argmax}_{t_1, \dots, t_n} [P(w_1, \dots, w_n \mid t_1, \dots, t_n) P(t_1, \dots, t_n) / P(w_1, \dots, w_n)]$ # via Bayes' Rule
- $\hat{t}_1, \dots, \hat{t}_n = \operatorname{argmax}_{t_1, \dots, t_n} [P(w_1, \dots, w_n \mid t_1, \dots, t_n) P(t_1, \dots, t_n)]$ # $P(w_1, \dots, w_n)$ is a constant w.r.t. the argmax values
- $\hat{t}_1, \dots, \hat{t}_n = \operatorname{argmax}_{t_1, \dots, t_n} [\prod_{1 \le i \le n} P(w_i \mid t_i) \cdot P(t_1) \cdot [\prod_{2 \le i \le n} P(t_i \mid t_{i-1})]]$
  - $P(w_1, \dots, w_n \mid t_1, \dots, t_n) \approx \prod_{1 \le i \le n} P(w_i \mid t_i)$ # $w_i$ is conditionally independent of all else when given $t_i$
  - $P(t_1, \dots, t_n) \approx P(t_1) \cdot [\prod_{2 \le i \le n} P(t_i \mid t_{i-1})]$ # $t_i$ is conditionally independent of all else when given $t_{i-1}$
- $\hat{t}_1, \dots, \hat{t}_n = \operatorname{argmax}_{t_1, \dots, t_n} [P(t_1) P(w_1 \mid t_1) \cdot [\prod_{2 \le i \le n} P(t_i \mid t_{i-1}) P(w_i \mid t_i)]]$
With respect to Hidden Markov Models (HMM):
- $P(t_i \mid t_{i-1})$ - are transition probabilities (in our case tag transition probabilities)
- $P(w_i \mid t_i)$ - are emission probabilities (in our case word emission probabilities)
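To make the final factorization concrete, here is a minimal Python sketch that scores a tag sequence as $P(t_1) P(w_1 \mid t_1) \cdot \prod_{2 \le i \le n} P(t_i \mid t_{i-1}) P(w_i \mid t_i)$ and takes the argmax by enumerating every possible tag sequence. The three-tag tagset, the vocabulary, all probability values, and the names `initial`, `transition`, `emission`, `joint_score` and `best_tags` are hypothetical and chosen only for illustration. Brute-force enumeration is exponential in the sentence length, so this is a way to see what the argmax computes rather than a practical decoder.

```python
from itertools import product

# Hypothetical toy parameters (illustrative values, not estimated from a corpus).
initial = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}  # P(t_1)
transition = {  # P(t_i | t_{i-1}), outer key is the previous tag
    "DET": {"DET": 0.0, "NOUN": 0.9, "VERB": 0.1},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
emission = {  # P(w_i | t_i), outer key is the tag; unseen words get probability 0
    "DET": {"the": 0.7},
    "NOUN": {"dog": 0.4, "barks": 0.1},
    "VERB": {"barks": 0.5, "dog": 0.05},
}


def joint_score(words, tags):
    """Return P(t_1) P(w_1|t_1) * prod_{i>=2} P(t_i|t_{i-1}) P(w_i|t_i).

    Assumes a non-empty sentence and len(words) == len(tags).
    """
    score = initial[tags[0]] * emission[tags[0]].get(words[0], 0.0)
    for i in range(1, len(words)):
        score *= transition[tags[i - 1]][tags[i]]
        score *= emission[tags[i]].get(words[i], 0.0)
    return score


def best_tags(words):
    """Brute-force argmax over all |tagset|^n candidate tag sequences."""
    candidates = product(initial, repeat=len(words))
    return max(candidates, key=lambda tags: joint_score(words, tags))


print(best_tags(["the", "dog", "barks"]))  # ('DET', 'NOUN', 'VERB')
```

With these toy numbers the winning sequence has score 0.6 · 0.7 · 0.9 · 0.4 · 0.6 · 0.5 = 0.04536. For real sentences the same argmax is normally computed with dynamic programming over the tag lattice instead of enumeration.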
Learning Transition & Emission Probabilities From Training Corpus
see: Learning/Training Section of Hidden Markov Models (HMM)
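As a rough illustration of what learning from a tagged corpus can look like, the sketch below estimates $P(t_1)$, $P(t_i \mid t_{i-1})$ and $P(w_i \mid t_i)$ by relative frequency, i.e. by counting and normalizing. This is an assumption about the method: the referenced HMM section may describe a different or smoothed estimator. The function name `estimate_hmm` and the tiny corpus are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_hmm(tagged_sentences):
    """Relative-frequency estimates from sentences given as [(word, tag), ...].

    Returns (initial, transition, emission) as nested dicts of probabilities.
    No smoothing is applied, so unseen events simply have no entry.
    """
    initial_counts = Counter()               # count of t_1
    transition_counts = defaultdict(Counter)  # count of (t_{i-1} -> t_i)
    emission_counts = defaultdict(Counter)    # count of (t_i -> w_i)

    for sentence in tagged_sentences:
        prev_tag = None
        for word, tag in sentence:
            emission_counts[tag][word] += 1
            if prev_tag is None:
                initial_counts[tag] += 1
            else:
                transition_counts[prev_tag][tag] += 1
            prev_tag = tag

    def normalize(counter):
        total = sum(counter.values())
        return {key: count / total for key, count in counter.items()}

    initial = normalize(initial_counts)
    transition = {tag: normalize(c) for tag, c in transition_counts.items()}
    emission = {tag: normalize(c) for tag, c in emission_counts.items()}
    return initial, transition, emission


# Hypothetical three-sentence training corpus.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("bark", "VERB")],
]
initial, transition, emission = estimate_hmm(corpus)
```

On this corpus two of the three sentences start with `DET`, so `initial["DET"]` is 2/3, and every `NOUN` is followed by a `VERB`, so `transition["NOUN"]["VERB"]` is 1.0. Because nothing is smoothed, a word never seen with a tag gets no probability mass, which is why practical taggers usually add some form of smoothing.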