Top-Down Search/Parsing

Bottom-Up Search/Parsing

since we’re trying to find trees rooted with an S, why not start with the rules that give us an S, then work our way down to the terminal words

we also want trees that cover the input word. So start with trees that link up words in the right way, then work our way up to form larger trees

only searches for trees that can be answers

only forms trees consistent with the input words

suggests trees that are not consistent with any of the words

suggest trees that make no sense globally

rules are applied from left to right

rules are applied from right to left

  • S
  • NP VP
  • NAME VP
  • John VP
  • John V NP
  • John ate NP
  • John ate ART N
  • John ate the N
  • John ate the cat
  • John ate the cat
  • NAME ate the cat
  • NAME V the cat
  • NAME V ART cat
  • NAME V ART N
  • NP V ART N
  • NP V NP
  • NP VP
  • S

Constituency Parsing - Control

  • in both cases we left out how to keep track of the search space and how to make choices:
    • which node to try to expand next (DFS vs BFS)
    • which grammar rule to use to expand the node
  • one approach is called back tracking: make a choice, if it doesn’t work then back up and make a different choice

Constituency Parsing - With Backtracking

example backtracking

Given

Top-Down DFS Search

sentence:

  • 1 The 2 old 3 man 4 cried 5

lexicon:

  • the: ART
  • old: ADJ, N
  • man: N, V
  • cried: V

grammar:

  • S → NP VP
  • NP → ART N
  • NP → ART ADJ N
  • VP → V
  • VP → V NP

CURRENT-STATE | BACKUP-STATES

1. S 1 |
2. NP VP 1 |
3. ART N VP 1 | ART ADJ N VP 1
4. N VP 2 | ART ADJ N VP 1
5. VP 3 | ART ADJ N VP 1
6. V 3 | ART ADJ N VP 1 | V NP 3
7. 4 | ART ADJ N VP 1 | V NP 3
8. V NP 3 | ART ADJ N VP 1
9. NP 4 | ART ADJ N VP 1
10. ART N 4 | ART ADJ N VP 1 | ART ADJ N 4
11. ART ADJ N 4 | ART ADJ N VP 1
12. ART ADJ N VP 1 |
13. ADJ N VP 2 |
14. N VP 3 |
15. VP 4 |
16. V 4 | V NP 4
17. 5 |

Traversal Tree

top-down-traversal-tree.drawio

Problems

even with best filtering, backtracking methods are doomed because of two inter-related problems:

  • ambiguity -
  • shared sub-problems - no matter what kind of search (top-down or bottom-up or mixed):
    • we don’t want to redo work we have already done
    • unfortunately, naive backtracking will lead to duplicated work
Ambiguity