Context Free Grammars

Context-free grammars (CFGs) are used to describe context-free languages. A context-free grammar is a set of recursive rules used to generate patterns of strings. A context-free grammar can describe all regular languages and more, but they cannot describe all possible languages.

Context-free grammars are studied in fields of theoretical computer science, compiler design, and linguistics. CFG’s are used to describe programming languages and parser programs in compilers can be generated automatically from context-free grammars.

Two parse trees that describe CFGs that generate the string

Two parse trees that describe CFGs that generate the string "x + y * z". Source: Context-free grammar wikipedia page.

Contents

Context Free Grammars

Context-free grammars can generate context-free languages. They do this by taking a set of variables which are defined recursively, in terms of one another, by a set of production rules. Context-free grammars are named as such because any of the production rules in the grammar can be applied regardless of context—it does not depend on any other symbols that may or may not be around a given symbol that is having a rule applied to it.

For comparison, a context-sensitive grammar can have production rules where both the left-hand and right-hand sides may be surrounded by a context of terminal and nonterminal symbols.

Formal Definition

A context-free grammar can be described by a four-element tuple \((V, \Sigma, R, S),\) where

A way to condense production rules is as follows:

and translate them into a single line: S \(\rightarrow\) ( ) | SS | (S) | \(\epsilon,\) where \(\epsilon \) is an empty string.

Strings that end in 0 Strings that have an odd number of 1’s Strings that have an equal number of 1’s and 0’s Strings that end in 1 Strings that have an even number of 1’s and 0’s

What does the following context-free grammar describe?

S is the start symbol and the set of nonterminal symbols is \(\\). Here are the production rules:

Using the example above, show the steps of deriving the following expression: \((4 + 5) * (2 - 6)\). Note, there are many ways to do this, but the solution below should give you enough guidance to check if your derivation works.

Show Answer

\(\rightarrow\) 4 (using rule 1)
\(\rightarrow\) 5 (using rule 1)
\(\rightarrow\) 4 + 5 (using rule 3)
\(\rightarrow\) (4 + 5) (using rule 2)
\(\rightarrow\) 2 (using rule 1)
\(\rightarrow\) 6 (using rule 1)
\(\rightarrow\) 2 - 6 (using rule 4)
\(\rightarrow\) (2 - 6) (using rule 2)
\(\rightarrow\) (4 + 5) * (2 - 6) (using rule 5)

Context-free grammars can be modeled as parse trees. The nodes of the tree represent the symbols and the edges represent the use of production rules. The leaves of the tree are the end result (terminal symbols) that make up the string the grammar is generating with that particular sequence of symbols and production rules.

x + y * x - z * y / x + x ( x + y ) * x - z * y / ( x + x ) ( x + y ) + x - z + y - ( x + x ) x + y + z * y + x * x

What expression does this parse tree generate?

The conext-free grammar wikipedia page

The conext-free grammar wikipedia page

The parse trees below represent two ways to generate the string "a + a - a" with the grammar

\[A \rightarrow A + A | A − A | a.\]

Example of an ambiguous grammar—one that can have multiple ways of generating the same string

Example of an ambiguous grammar—one that can have multiple ways of generating the same string [2]

Because this grammar can be implemented with multiple parse trees to get the same resulting string, this is said to be ambiguous.

Relationship with other Computation Models

A context-free grammar can be generated by pushdown automata just as regular languages can be generated by finite state machines. Since all regular languages can be generated by CFGs, all regular languages can too be generated by pushdown automata.

Any language that can be generated using regular expressions can be generated by a context-free grammar.

The way to do this is to take the regular language, determine its finite state machine and write production rules that follow the transition functions.

See Also