Rule-Based PoS Tagging
  • a type of Part-of-Speech (PoS) Tagging that uses a set of rules to choose the tag of each word
  • the rules cannot be learned automatically from training data; they must be created manually

Process

given:

  • dictionary of words with their corresponding PoS Tags (a single word may have multiple tags)
  • set of rules that selectively remove PoS Tags
  • input sentence to be PoS Tagged

steps:

  • according to the dictionary, assign each word of the input sentence all possible PoS Tags
  • according to the set of rules, remove tags from the words of the sentence, until each word has exactly one PoS Tag
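
The two steps above can be sketched in Python as follows. This is a minimal sketch, not a reference implementation. The names `tag`, `Rule`, `lexicon` and `no_verb_after_determiner` are made up for illustration, the toy rule is hypothetical, and the sketch assumes every word of the sentence is present in the dictionary (stored in lowercase).

```python
from typing import Callable, Dict, List, Set

# A rule inspects the words and their candidate tag sets, removes tags
# in place, and returns True if it removed at least one tag.
Rule = Callable[[List[str], List[Set[str]]], bool]


def tag(sentence: List[str],
        lexicon: Dict[str, Set[str]],
        rules: List[Rule]) -> List[Set[str]]:
    # Step 1: give each word every tag the dictionary lists for it.
    # Assumption: every word is in the lexicon (KeyError otherwise).
    candidates = [set(lexicon[word.lower()]) for word in sentence]

    # Step 2: keep applying the rules until none of them removes a tag.
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule(sentence, candidates):
                changed = True
    return candidates


# Hypothetical toy rule: directly after an unambiguous determiner,
# drop the VB reading if the word has another tag left.
def no_verb_after_determiner(words: List[str],
                             candidates: List[Set[str]]) -> bool:
    removed = False
    for i in range(1, len(candidates)):
        if (candidates[i - 1] == {"DT"}
                and "VB" in candidates[i]
                and len(candidates[i]) > 1):
            candidates[i].discard("VB")
            removed = True
    return removed


lexicon = {"the": {"DT"}, "bill": {"VB", "NN"}}
result = tag(["the", "bill"], lexicon, [no_verb_after_determiner])
print([sorted(tags) for tags in result])  # [['DT'], ['NN']]
```

Note that the loop stops as soon as no rule fires, so a word can stay ambiguous if the rule set does not cover its context.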

Example

given input sentence:

  • She promised to back the bill

according to the dictionary, we assign each word all of its possible PoS Tags:

She    promised    to    back    the    bill
PRP    VBN         TO    NN      DT     VB
       VBD               RB             NN
                         JJ
                         VB

according to the set of rules, we then remove tags

suppose the set contains a rule like the following:

  • Eliminate VBN if VBD is an option when VBN|VBD follows “<start> PRP”

since "promised" directly follows a sentence-initial PRP and can be either VBN or VBD, we remove VBN from under the word "promised", like so:

She    promised    to    back    the    bill
PRP    VBD         TO    NN      DT     VB
                         RB             NN
                         JJ
                         VB

we continue applying rules in this way until each word has exactly one PoS Tag
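
As an illustration, the VBN rule from this example could be written as a small Python function. This is only a sketch of one possible reading of the rule: the function name `eliminate_vbn_after_initial_prp` and the representation of candidates as a list of tag sets are assumptions made for illustration, and "<start> PRP" is taken to mean that the first word of the sentence is unambiguously PRP.

```python
from typing import List, Set


def eliminate_vbn_after_initial_prp(words: List[str],
                                    candidates: List[Set[str]]) -> bool:
    """Remove VBN from the second word if the sentence starts with PRP
    and the second word could be either VBN or VBD.

    Returns True if a tag was removed. `words` is not needed by this
    particular rule; it is kept so that rules which look at the words
    themselves could share the same shape.
    """
    if len(candidates) < 2:
        return False
    first, second = candidates[0], candidates[1]
    if first == {"PRP"} and {"VBN", "VBD"} <= second:
        second.discard("VBN")
        return True
    return False


words = ["She", "promised", "to", "back", "the", "bill"]
# Candidate tags as assigned from the dictionary in the example.
candidates = [
    {"PRP"},
    {"VBN", "VBD"},
    {"TO"},
    {"NN", "RB", "JJ", "VB"},
    {"DT"},
    {"VB", "NN"},
]

fired = eliminate_vbn_after_initial_prp(words, candidates)
for word, tags in zip(words, candidates):
    print(word, sorted(tags))
```

After this one rule, "promised" is left with VBD only, while "back" and "bill" are still ambiguous and need further rules.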