Computer Science, asked by priyanshumishra6905, 1 year ago

There are two fundamental approaches to parsing. What are they?

Answers

Answered by Nnd130905

Parsing is a process to determine how a string might be derived using productions (rewrite rules) of a given grammar. It can be used to check whether or not a string belongs to a given language. When a statement written in a programming language is input, it is parsed by a compiler to check whether or not it is syntactically correct and to extract components if it is correct. Finding an efficient parser is a nontrivial problem and a great deal of research has been conducted on parser design.

In this note basic parsing techniques are introduced with examples and some of the problems involved in parsing are discussed together with brief explanations of some of the solutions to those problems

Two basic approaches to parsing are top-down parsing and bottom-up parsing. In the top-down approach, a parser tries to derive the given string from the start symbol by rewriting nonterminals one by one using productions. The nonterminal on the left hand side of a production is replaced by it right hand side in the string being parsed. In the bottom-up approach, a parser tries to reduce the given string to the start symbol step by step using productions. The right hand side of a production found in the string being parsed is replaced by its left hand side.

Let us see how string aababaa might be parsed by these two approaches for the following grammar as an example:

$S \rightarrow aSa \mid bSb \mid a \mid b$

This grammar generates the palindromes of odd lengths.

Top-down approach proceeds as follows:

(1) The start symbol $S$ is pushed into the stack without reading any input symbol.

(2) $S$ is popped and $aSa$ is pushed without reading any input symbol.

(3) As the first a in the input is read, a at the top of the stack is popped.

(4) $S$ is popped and $aSa$ is pushed without reading any input symbol.

(5) As the second a in the input is read, a at the top of the stack is popped.

(6) $S$ is popped and $bSb$ is pushed without reading any input symbol.

(7) As the first b in the input is read, b at the top of the stack is popped.

(8) $S$ is popped and $a$ is pushed without reading any input symbol.

(9) As the unread input symbols are read, abaa in the stack is popped one by one.

Since the stack is empty when the entire input string is read, the string is found to be in the language.

If we use configuration without state, that is, (unread portion of input, stack contents), this top-down parsing can be expressed as follows:

$(aababaa, Z_{0}) \vdash (aababaa, SZ_{0}) \vdash (aababaa, aSaZ_{0})

\vdash (ababaa, SaZ_{0}) $

$(ababaa, aSaaZ_{0}) \vdash (babaa, SaaZ_{0}) \vdash (babaa, bSbaaZ_{0}) \vdash

(abaa, SbaaZ_{0}) $

$(abaa, abaaZ_{0}) \vdash (baa, baaZ_{0}) \vdash (aa,aaZ_{0}) \vdash (a,aZ_{0}) \vdash (\Lambda, Z_{0}) $

In general a PDA for top-down parsing has the following four types of transitions: (a) For each production, pop the nonterminal on the left hand side of the production at the top of the stack and push its right hand side string;

(b) Pop the stack if the top of the stack matches the input symbol being read;

(d) Go to the final state if the entire input has been read and the stack is empty.

Bottom-up approach proceeds as follows:

(1) The string aababaa is read into the stack one by one from left until the middle a is reached.

(2) The middle a is replaced by $S$ at the top of the stack without reading any input symbol.

(3) The second b is read and pushed into the stack.

(4) $bSb$ at the top of the stack is replaced by $S$.

(5) The fourth a is read and pushed into the stack.

(6) $aSa$ at the top of the stack is replaced by $S$.

(7) The last a is read and pushed into the stack.

(8) $aSa$ at the top of the stack is replaced by $S$.

Since the stack has $S$ when the entire input string is read, the string is found to be in the language.

If we use configuration without state, this bottom-up parsing can be expressed as follows:

$(aababaa, Z_{0}) \vdash (ababaa, aZ_{0}) \vdash (babaa, aaZ_{0})

\vdash (abaa, baaZ_{0}) $

$(baa, abaaZ_{0}) \vdash (baa, SbaaZ_{0}) \vdash (aa, bSbaaZ_{0}) \vdash

(aa, SaaZ_{0}) $

$(a, aSaaZ_{0}) \vdash (a, SaZ_{0}) \vdash (\Lambda, aSaZ_{0}) \vdash (\Lambda, SZ_{0}) $

Note that the rightmost symbol on the right hand side of a production appears at the top of the stack.

Previous Question

Next Question