CSCI 435: Compiler Construction
Assignment 5

Overview

Instead of using Flex to generate a C- lexer, write a Lexer class by hand. As before, have your driver repeatedly invoke your lexer and print the tokens in a NEAT tabular format. Report any invalid input, along with its line and column numbers (number both lines and columns starting at 1). It is OK if the column numbers are SLIGHTLY off.

You could code your lexer using the DFA approach we discussed in class (and the one your author uses for his TINY getToken function; see Fig 2.10 in the text for the DFA). Alternatively, you could code your lexer in a more ad hoc fashion, which, if done with care, will result in cleaner code. For example, code for member function "getToken" (which replaces "yylex") could be SIMILAR to

  Token
  getToken ()
  {
    getChar ();
    // Eat whitespace, but keep track of line and column #'s
    eatWhitespace ();
    ...
    switch (nextChar)
    {
    case EOF:
      return Token (END_OF_FILE);

    case ';':
      return Token (SEMI, ";");

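    case '-':
      // Read one character ahead to tell "-" apart from "--"
      getChar ();
      if (nextChar != '-')
      {
        // Not part of this token, so put it back for the next call
        ungetChar ();
        return Token (MINUS, "-");
      }
      return Token (DECREMENT, "--");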

    ...
    }
    Token token;
    if (isalpha (nextChar))
      lexId (token);
    ...
    return token;
  }
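
The fragment above relies on "getChar" and "ungetChar" keeping the line and column numbers up to date. One way this bookkeeping might look is sketched below. The class "CharReader", its members, and the function "demoCharReader" are hypothetical names chosen for illustration; they are not part of the provided Lexer.h, and unlike the fragment above this "getChar" returns the character instead of storing it in "nextChar". The sketch assumes at most one character of pushback.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical helper (NOT part of Lexer.h): wraps an input stream and
// tracks the 1-based line and column of the next unread character.
class CharReader
{
public:
  explicit CharReader (std::istream& in)
    : m_in (in)
  {
  }

  // Read one character (or EOF) and advance the position.
  int
  getChar ()
  {
    // Remember the position so a single ungetChar can restore it
    m_prevLine = m_line;
    m_prevColumn = m_column;
    int c = m_in.get ();
    m_consumed = (c != std::istream::traits_type::eof ());
    if (c == '\n')
    {
      ++m_line;
      m_column = 1;
    }
    else if (m_consumed)
      ++m_column;
    return c;
  }

  // Push back the most recently read character and restore the position.
  // Does nothing if the last read hit EOF or was already pushed back.
  void
  ungetChar ()
  {
    if (!m_consumed)
      return;
    m_in.unget ();
    m_line = m_prevLine;
    m_column = m_prevColumn;
    m_consumed = false;
  }

  int line () const { return m_line; }
  int column () const { return m_column; }

private:
  std::istream& m_in;
  int m_line = 1;
  int m_column = 1;
  int m_prevLine = 1;
  int m_prevColumn = 1;
  bool m_consumed = false;
};

// Small usage demonstration on the two-line input "a\nb"
inline void
demoCharReader ()
{
  std::istringstream in ("a\nb");
  CharReader reader (in);
  assert (reader.line () == 1 && reader.column () == 1);
  assert (reader.getChar () == 'a');
  assert (reader.column () == 2);
  assert (reader.getChar () == '\n');
  assert (reader.line () == 2 && reader.column () == 1);
  reader.ungetChar ();
  assert (reader.line () == 1 && reader.column () == 2);
}
```

With this design, a token's starting position would be recorded by calling "line" and "column" just before reading the token's first character.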

Use the provided header file "Lexer.h" as a basis for your Lexer class. Use "Lexer.cc" as the implementation file. You may modify the class to suit your needs. However, do NOT change the definitions of "TokenType" and "Token".

Input Specification

Take the name of a C- program as a command-line argument. If no argument is specified, read from standard input. Use ".cm" as the extension for C- source files.
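
A minimal sketch of choosing between a file and standard input follows. The function name "selectInput" and its parameters are hypothetical, not something the assignment prescribes; the sketch assumes the lexer can read from any "std::istream".

```cpp
#include <cassert>
#include <fstream>
#include <iostream>

// Hypothetical helper: returns the stream the lexer should read from.
// With no command-line argument it returns std::cin; otherwise it opens
// argv[1] using the caller-owned "file" and returns that stream.
std::istream&
selectInput (int argc, char* argv[], std::ifstream& file)
{
  assert (argv != nullptr);
  if (argc < 2)
    return std::cin;
  file.open (argv[1]);
  // If the open failed, the returned stream is in a failed state;
  // the caller is expected to test it before lexing
  return file;
}
```

In the driver, the "std::ifstream" object would be declared in "main" so that it outlives the returned reference, and the result would be tested (for example with "if (!in)") so that a missing file can be reported before any tokens are printed.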

Output Specification

Use the same specifications as in Assignment 4. For the input "LexerTest.cm", the output should match "LexerOutput.txt".

Submission

Submit your driver file ("CMinus.cc"), "Lexer.cc", header file(s), and Makefile. ENSURE your Makefile is named "Makefile" and that it properly builds your program, and that "make clean" removes the executable and all generated files.

Hints

Handle keywords similarly to identifiers. When an identifier is recognized, look it up in a keyword map. If it is found, return the corresponding keyword token; otherwise, return an identifier token.
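
The lookup described in this hint might be sketched as follows. The enumeration "SketchTokenType", the function "classifyWord", and "demoClassifyWord" are stand-ins invented for illustration; a real solution would use the enumerators from the unchanged "TokenType" in Lexer.h. The keyword list shown is the textbook's C- set and may differ from the one required for this course.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Stand-in token types for illustration only (NOT the real TokenType)
enum SketchTokenType
{
  SK_ID, SK_ELSE, SK_IF, SK_INT, SK_RETURN, SK_VOID, SK_WHILE
};

// Given a lexeme already scanned as an identifier, return the keyword
// token type if it is a keyword, and the identifier type otherwise.
SketchTokenType
classifyWord (const std::string& lexeme)
{
  // Built once, on the first call
  static const std::unordered_map<std::string, SketchTokenType> keywords = {
    { "else", SK_ELSE },
    { "if", SK_IF },
    { "int", SK_INT },
    { "return", SK_RETURN },
    { "void", SK_VOID },
    { "while", SK_WHILE }
  };
  auto it = keywords.find (lexeme);
  return it != keywords.end () ? it->second : SK_ID;
}

// Small usage demonstration
inline void
demoClassifyWord ()
{
  assert (classifyWord ("while") == SK_WHILE);
  assert (classifyWord ("whilex") == SK_ID);  // longer word: identifier
  assert (classifyWord ("While") == SK_ID);   // lookup is case-sensitive
}
```

Scanning the whole word first and classifying it afterwards means the lexer needs no separate character-by-character logic for each keyword.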


Gary M. Zoppetti, Ph.D.