CSCI 435: Compiler Construction
Assignment 7

Overview

Write a PREDICTIVE (no backtracking), recursive descent parser for the grammar below. EBNF is used, so be careful to note whether a symbol is a metasymbol or not. The six valid metasymbols are the following: {, }, [, ], (, and ). A symbol is NOT a metasymbol if it is in single quotes (e.g., '[' and '}'). All the tokens are single characters, so the scanner is trivial. NAME your scanner function "getToken" (see Required Functions below).

In a comment above EVERY parsing method, show the rewritten GRAMMAR RULE that it parses. Use the STANDARD techniques we studied to remove left recursion and perform left factoring.

M  →   { S } '$'

S  →   I | W | A | P | C | G

I  →   '[' E ? { S } : { S } ']' | '[' E ? { S } ']'

W  →   '{' E ? { S } '}'

A  →   id = E ;

P  →   < E ;

C  →   < ( 'B' | 'T' | 'N' ) ;

G  →   > id ;

E  →   E + T | E - T | T

T  →   T * U | T / U | T % U | U

U  →   F ^ U | F

F  →   '(' E ')' | id | num

id   →   a | b | ... | z
num  →   0 | 1 | ... | 9

Grammar Key

Non-terminalMeaning
MMain Function
SStatement
IIf-Then-[Else]
WWhile
AAssignment
PPrint Integer
CPrint Character
GGet Integer
EExpression
TTerm
UUp-raised Term
FFactor

Terminals B, T, and N, stand for blank, tab, and newline, respectively. An identifier (id) must be a SINGLE LOWERCASE letter. A number (num) must be a SINGLE DIGIT. A valid program must end in '$'.

Input Specification

Input a string from the user (one character at a time). Use whitespace as a delimiter (" \t\r\n").

Output Specification

Indicate whether the string is valid or not. Detect an error and output an error message indicating ALL of the following:

Use the following error format:
Error in "<function>" on line <lineNum>:
  expected '<expectedToken>' but got '<currentToken>'. 

If the string is valid output "Valid!".

Sample Input/Output 1
[ a ?
    b  = 5 ;
]  $

Valid!
Sample Input/Output 2
> x;
> y;

< (x * y)
    + 3
    % (y ^ 2);

{ x &
    x = x - 1;
}
$
  
Error in "W" on line 8:
  expected '?' but got '&'. 

Required Functions

// Read and return a character token.
// Skip whitespace and keep track of line number. 
char
getToken ();
  
// Check if the current token matches "expectedToken".
// If so, update the current token.
// If not, call the "error" function. 
// "function" is the function which called "match".
void
match (const string& function, char expectedToken);

// Output an error message using the appropriate format.
// "function" is the function in which the error occurred. 
void
error (const string& function, const string& expected);

Submission

Submit your parser file.

Hints

The part of the grammar involving "{ S }" means "zero or more statements". In handling such a construct, you need to know when to stop calling function S, the one that handles a statement. There are two methods to do this: call S as long as the next token is one that can start a statement. The other (equivalent) method is to keep calling S until the next token is one that may end a sequence of statements.

See the RecursiveDescent directory for examples and some test cases.


Gary M. Zoppetti, Ph.D.