|
formal grammar (grammar) |
|
|---|---|
|
formal language/expressions |
|
Formal Grammars - Formal Definition
a formal grammar G is defined as the tuple (N, Σ, P, S):
-
N - a finite set of nonterminal symbols, that is disjoint with the strings formed from G
-
Σ - a finite set of terminal symbols that is disjoint from N
-
P - (the grammar) a finite set of production rules, each rule of the form:
- (Σ ∪ N)* N (Σ ∪ N)* → (Σ ∪ N)*
where:that is, each production rule maps from one string of symbols to another, where the first string (the “head”) contains an arbitrary number of symbols provided at least one of them is a nonterminal. In the case that the second string (the “body”) consists solely of the empty string—i.e., that it contains no symbols at all—it may be denoted with a special notation (often Λ, e, or ϵ) in order to avoid confusion.
- (Σ ∪ N)* N (Σ ∪ N)* → (Σ ∪ N)*
-
-
- * is the Kleene star operator
- ∪ denotes set union
-
-
S - a distinguished symbol in N that is the start symbol, also called the sentence symbol
Formal Grammars - Example
Click here to expand...
Consider the grammar G where:
- N = {𝑆, 𝐵}
- Σ = {𝑎, 𝑏, 𝑐}
- P consists of the following production rules:
- 𝑆 → 𝑎𝐵𝑆𝑐
- 𝑆 → 𝑎𝑏𝑐
- 𝐵𝑎 → 𝑎𝐵
- 𝐵𝑏 → 𝑏𝑏
- S = 𝑆
This grammar defines the language L(G) = {𝑎𝑛𝑏𝑛𝑐𝑛 | 𝑛 ≥ 1}. Thus, the language is the set of strings that consist of 1 or more 𝑎‘s, followed by the same number of 𝑏‘s, followed by the same number of 𝑐‘s.
Formal Grammars - Mathematical Constructs
Click here to expand...
given a formal grammar G = (N, Σ, P, S):
- the binary relation(pronounced as “G derives in one step”) on strings x and y in (Σ ∪ N)* is defined
List indent undo
[!list-indent-undo] [!indent]
- the binary relation(pronounced as “G derives in zero or more steps”) is defined as the reflexive transitive closure of
List indent undo
List indent undo
- sentential form - can be derived in a finite number of steps from the start symbol S; that is, a sentential form is a member of
List indent undo
- sentence - a sentential form that contains no nonterminal symbols
- L(G)
(pronounced as “language of G”) - is defined as all those sentences that can be derived in a finite number of steps from the start symbol S; that is, the setList indent undo
Formal Grammars - Chomsky Hierarchy
Click here to expand...
Link to originalChomsky Hierarchy
- is a containment hierarchy of classes of grammars
- used to understand the complexity of Human Languages
syntax definitions:
- 𝐴,𝐵 - non-terminals
- 𝑎,𝑏,𝑐 - terminals
- 𝛼,𝛿,𝛽 - strings of terminals and/or nonterminals (MAYBE empty)
- 𝛾 - a string of terminals and/or nonterminals (MUST be nonempty)
- 𝜖 - an empty string (i.e. the string of length 0)
Chomsky Type
- Formal Grammar
- Formal Language/Expressions
Automaton Accepting Formal Grammar/Language
Grammar Production Rules
Example Languages
Type-4
- Finite Enumerable Grammar (FEG)
- Finite Enumerable Language (FEL)
Memory Lookup
are defined by rules of the forms:
- 𝑆 → 𝑠𝑡𝑟𝑖𝑛𝑔-𝑜𝑓-𝑡𝑒𝑟𝑚𝑖𝑛𝑎𝑙𝑠
grammar rules G:
- 𝑆 → 𝑎𝑏𝑐
- 𝑆 → 𝑎𝑏
language of G:
- 𝐿 = {𝑎𝑏𝑐, 𝑎𝑏}
Type-3
- Regular Grammar (RG)
- Regular Language (RL)
regular grammar is EITHER left or right regular grammar:
- right regular grammar - is defined by rules of the forms:
- 𝐴 → 𝑎
- 𝐴 → 𝑎𝐵
- 𝐴 → 𝜖
- left regular grammar - is defined by rules of the forms:
- 𝐴 → 𝑎
- 𝐴 → 𝐵𝑎
- 𝐴 → 𝜖
grammar rules G:
- 𝐴 → 𝑎𝐵
- 𝐵 → 𝑏𝐵
- 𝐵 → 𝑐
language of G:
- 𝐿 = { 𝑎𝑏𝑛𝑐 | 𝑛 ≥ 0 }
Type-2
- Context Free Grammar (CFG)
- Context Free Language (CFL)
defined by rules of the form:
- 𝐴 → 𝛼
grammar rules G:
- 𝐴 → 𝑎𝐴𝑏
- 𝐴 → 𝜖
language of G:
- 𝐿 = { 𝑎𝑛𝑏𝑛| 𝑛 > 0 }
Type-1
- Context Sensitive Grammar (CSG)
- Context Sensitive Language (CSL)
defined by rules of the form:
- 𝛼𝐴𝛽 → 𝛼𝛾𝛽
- 𝐴 → 𝜖
the rule 𝐴 → 𝜖 is allowed if 𝐴 does not appear on the right side of any rule
grammar rules G:
- 𝑆 → 𝑎𝐵𝑆𝑐
- 𝑆 → 𝑎𝑏𝑐
- 𝐵𝑎 → 𝑎𝐵
- 𝐵𝑏 → 𝑏𝑏
language of G:
- L(G) = { 𝑎𝑛𝑏𝑛𝑐𝑛| 𝑛 > 0 }
Type-0.5
- Recursive Grammar (RG)
- Recursive Language (RL)
recursive grammar if it contains production rules that are recursive, meaning that expanding a non-terminal according to these rules can eventually lead to a string that includes the same non-terminal again. Otherwise, it is called a non-recursive grammar
Type-0
- Recursively Enumerable Grammar (REG)
- Recursively Enumerable Language (REL)
defined by rules of the form:
- 𝛼𝐴𝛽 → 𝛿
Every regular language is context-free, every context-free language is context-sensitive, every context-sensitive language is recursive, and every recursive language is recursively enumerable
Formal Grammars - Chomsky Hierarchy Extended
Click here to expand...
Link to originalChomsky hierarchy extended - an extension to Chomsky hierarchy
- Context-free grammar (CFG) –
- Constraint grammar (CG) –
- Definite clause grammar (DCG) –
- Functional unification grammar (FUG) –
- Generalized phrase structure grammar (GPSG) –
- Head-driven phrase structure grammar (HPSG) –
- Lexical functional grammar (LFG) –
- Probabilistic context-free grammar (PCFG) – another name for stochastic context-free grammar.
- Stochastic context-free grammar (SCFG) –
- Systemic functional grammar (SFG) –
- Tree-adjoining grammar (TAG) –




