Statistical-Based PoS Tagging - Using Hidden Markov Model (HMM)

Given a sentence of words {𝑤₁, …, 𝑤_𝑛}, we want to assign a PoS tag to each word, giving a tag sequence {𝑡₁, …, 𝑡_𝑛}, such that the probability 𝐏(𝑡₁, …, 𝑡_𝑛|𝑤₁, …, 𝑤_𝑛) is maximized:

𝑡̂₁, …, 𝑡̂_𝑛= 𝑎𝑟𝑔𝑚𝑎𝑥_{𝑡₁, …, 𝑡_𝑛}[𝐏(𝑡₁, …, 𝑡_𝑛|𝑤₁, …, 𝑤_𝑛)]
𝑡̂₁, …, 𝑡̂_𝑛= 𝑎𝑟𝑔𝑚𝑎𝑥_{𝑡₁, …, 𝑡_𝑛}[𝐏(𝑤₁, …, 𝑤_𝑛|𝑡₁, …, 𝑡_𝑛)𝐏(𝑡₁, …, 𝑡_𝑛)/𝐏(𝑤₁, …, 𝑤_𝑛)] # via Bayes Rule
𝑡̂₁, …, 𝑡̂_𝑛= 𝑎𝑟𝑔𝑚𝑎𝑥_{𝑡₁, …, 𝑡_𝑛}[𝐏(𝑤₁, …, 𝑤_𝑛|𝑡₁, …, 𝑡_𝑛)𝐏(𝑡₁, …, 𝑡_𝑛)] # 𝐏(𝑤₁, …, 𝑤_𝑛) does not depend on the tags, so dropping it leaves the argmax unchanged
𝑡̂₁, …, 𝑡̂_𝑛= 𝑎𝑟𝑔𝑚𝑎𝑥_{𝑡₁, …, 𝑡_𝑛}[𝛱_{1≤𝑖≤𝑛}𝐏(𝑤_𝑖|𝑡_𝑖) · 𝐏(𝑡₁)·[𝛱_{2≤𝑖≤𝑛}𝐏(𝑡_𝑖|𝑡_{𝑖-1})]] # applying the two independence assumptions listed next
- 𝐏(𝑤₁, …, 𝑤_𝑛|𝑡₁, …, 𝑡_𝑛) ≈ 𝛱_{1≤𝑖≤𝑛}𝐏(𝑤_𝑖|𝑡_𝑖) # 𝑤_𝑖 is conditionally independent of everything else given 𝑡_𝑖
- 𝐏(𝑡₁, …, 𝑡_𝑛) ≈ 𝐏(𝑡₁)·[𝛱_{2≤𝑖≤𝑛}𝐏(𝑡_𝑖|𝑡_{𝑖-1})] # 𝑡_𝑖 is conditionally independent of everything else given 𝑡_{𝑖-1} (first-order Markov assumption)
𝑡̂₁, …, 𝑡̂_𝑛= 𝑎𝑟𝑔𝑚𝑎𝑥_{𝑡₁, …, 𝑡_𝑛}[𝐏(𝑡₁)𝐏(𝑤₁|𝑡₁) · [𝛱_{2≤𝑖≤𝑛}𝐏(𝑡_𝑖|𝑡_{𝑖-1})𝐏(𝑤_𝑖|𝑡_𝑖)]] # regrouping the factors by position 𝑖
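
As a sanity check on the final expression, the argmax can be evaluated literally by scoring every possible tag sequence. The sketch below does that for a three-word sentence. The tag set, the probability tables, and the names `sequence_score` and `brute_force_tag` are made up for illustration and do not come from any corpus. Enumeration costs |tags|^n evaluations, so this is only viable for tiny inputs.

```python
from itertools import product

# Toy model: every number below is invented for illustration.
TAGS = ("DET", "NOUN", "VERB")
INITIAL = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}            # P(t_1)
TRANSITION = {                                               # P(t_i | t_{i-1})
    "DET":  {"DET": 0.05, "NOUN": 0.90, "VERB": 0.05},
    "NOUN": {"DET": 0.10, "NOUN": 0.30, "VERB": 0.60},
    "VERB": {"DET": 0.50, "NOUN": 0.30, "VERB": 0.20},
}
EMISSION = {                                                 # P(w_i | t_i)
    "DET":  {"the": 0.90, "dog": 0.05, "barks": 0.05},
    "NOUN": {"the": 0.05, "dog": 0.70, "barks": 0.25},
    "VERB": {"the": 0.05, "dog": 0.15, "barks": 0.80},
}


def sequence_score(words, tags):
    """P(t_1) P(w_1|t_1) * prod_{i>=2} P(t_i|t_{i-1}) P(w_i|t_i), for a non-empty sentence."""
    score = INITIAL[tags[0]] * EMISSION[tags[0]].get(words[0], 0.0)
    for i in range(1, len(words)):
        score *= TRANSITION[tags[i - 1]][tags[i]]
        score *= EMISSION[tags[i]].get(words[i], 0.0)  # unseen word -> probability 0
    return score


def brute_force_tag(words):
    """Return the tag tuple with the highest score, trying every combination."""
    candidates = product(TAGS, repeat=len(words))
    return max(candidates, key=lambda tags: sequence_score(words, tags))


words = ["the", "dog", "barks"]
best_tags = brute_force_tag(words)
best_score = sequence_score(words, best_tags)
print(best_tags, best_score)
```

With these toy tables the winning sequence is DET, NOUN, VERB, whose score is 0.6·0.9 · 0.9·0.7 · 0.6·0.8 = 0.163296.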

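In Hidden Markov Model (HMM) terms, the tags are the hidden states and the words are the observations:

𝐏(𝑡₁) - is the initial state probability (in our case the probability that a sentence starts with tag 𝑡₁)
𝐏(𝑡_𝑖|𝑡_{𝑖-1}) - are transition probabilities (in our case tag transition probabilities)
𝐏(𝑤_𝑖|𝑡_𝑖) - are emission probabilities (in our case word emission probabilities)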
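
Trying every tag sequence grows exponentially with sentence length. The usual way to compute the same argmax in an HMM is the Viterbi algorithm, which this section does not describe. The following is a minimal sketch of it, assuming the initial, transition, and emission probabilities are stored as plain dicts. The function name `viterbi` and all toy numbers are illustrative assumptions.

```python
import math


def viterbi(words, tags, initial, transition, emission):
    """Most probable tag path for a non-empty sentence, computed in log space."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[i][t]: best log-score of any tag sequence for words[:i+1] that ends in tag t
    best = [{t: log(initial[t]) + log(emission[t].get(words[0], 0.0)) for t in tags}]
    back = []  # back[i-1][t]: the previous tag on that best sequence
    for word in words[1:]:
        scores, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[-1][p] + log(transition[p][t]))
            scores[t] = (best[-1][prev] + log(transition[prev][t])
                         + log(emission[t].get(word, 0.0)))
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)

    # Follow the back-pointers from the best final tag to recover the path.
    last = max(tags, key=lambda t: best[-1][t])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


# Toy model: every number below is invented for illustration.
tags = ("DET", "NOUN", "VERB")
initial = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
transition = {
    "DET":  {"DET": 0.05, "NOUN": 0.90, "VERB": 0.05},
    "NOUN": {"DET": 0.10, "NOUN": 0.30, "VERB": 0.60},
    "VERB": {"DET": 0.50, "NOUN": 0.30, "VERB": 0.20},
}
emission = {
    "DET":  {"the": 0.90, "dog": 0.05, "barks": 0.05},
    "NOUN": {"the": 0.05, "dog": 0.70, "barks": 0.25},
    "VERB": {"the": 0.05, "dog": 0.15, "barks": 0.80},
}

path, log_prob = viterbi(["the", "dog", "barks"], tags, initial, transition, emission)
print(path, math.exp(log_prob))
```

Working with log-probabilities turns the product into a sum, which avoids numeric underflow on long sentences. The cost is proportional to n·|tags|² instead of |tags|^n.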

Learning Transition & Emission Probabilities From Training Corpus

see: Learning/Training Section of Hidden Markov Models (HMM)
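
The referenced section is not reproduced here. As a rough sketch of what supervised training can look like, the block below assumes a hand-tagged corpus and uses plain relative-frequency (maximum-likelihood) estimates: 𝐏(𝑡_𝑖|𝑡_{𝑖-1}) ≈ count(𝑡_{𝑖-1} followed by 𝑡_𝑖) / count(𝑡_{𝑖-1} followed by any tag) and 𝐏(𝑤_𝑖|𝑡_𝑖) ≈ count(𝑡_𝑖 tagging 𝑤_𝑖) / count(𝑡_𝑖). The function name `estimate_hmm` and the three-sentence corpus are hypothetical, and no smoothing is applied, so unseen words and transitions get probability zero.

```python
from collections import Counter, defaultdict


def estimate_hmm(tagged_sentences):
    """Relative-frequency estimates from sentences given as lists of (word, tag) pairs."""
    initial_counts = Counter()               # how often each tag starts a sentence
    transition_counts = defaultdict(Counter)  # transition_counts[prev_tag][tag]
    emission_counts = defaultdict(Counter)    # emission_counts[tag][word]

    for sentence in tagged_sentences:
        prev_tag = None
        for word, tag in sentence:
            emission_counts[tag][word] += 1
            if prev_tag is None:
                initial_counts[tag] += 1
            else:
                transition_counts[prev_tag][tag] += 1
            prev_tag = tag

    def normalize(counter):
        total = sum(counter.values())
        return {key: count / total for key, count in counter.items()}

    initial = normalize(initial_counts)
    transition = {t: normalize(c) for t, c in transition_counts.items()}
    emission = {t: normalize(c) for t, c in emission_counts.items()}
    return initial, transition, emission


# Hypothetical toy corpus, for illustration only.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("bark", "VERB")],
]

initial, transition, emission = estimate_hmm(corpus)
print(initial)
print(transition)
print(emission)
```

On a real corpus some form of smoothing is normally needed so that a single unseen word does not zero out every candidate tag sequence.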

／var／log marcus chiu
