COMP15 Project #4: RPN evaluation

COMP 15 Project #4: RPN evaluation

COMP15 Spring 20XX
Due: Friday, March 11, 6:00PM

Overview

Reverse Polish Notation (RPN) was devised by Polish logician Jan Lukasiewicz as a way to represent arbitrary arithmetic expressions without the use of parentheses. RPN is also called "postfix notation," because the operators are written after the operands to which the operators are applied. We are already familiar with the idea of RPN (postfix), having used a form of RPN to evaluate stack-based expressions in the last assignment. But, just for the sake of review, the ordinary infix expression:

    (4 + 6) * 14

when rewritten in RPN is:

    4 6 + 14 *

In this assignment, we will use PostScript-style operator names (e.g., "add" instead of "+") for the arithmetic operators. So, we would write this expression as:

    4 6 add 14 mul ;

in PostScript-style RPN.

The goal of this project is to construct a program to read a series of RPN expressions from a file and evaluate them. This project reuses the evaluation stack and array stack classes from project #3. You will need to develop a new class, LexicalAnalyzer, that reads and tokenizes the RPN expressions from the input file. The main program calls the lexical analyzer to obtain tokens, and after determining the token type, will:

Push operands on the evaluation stack,
Perform a stack operation by:
- Calling the appropriate method function to perform an arithmetic operation,
- Calling the appropriate method function to show the current top of stack (the result) or the entire contents of the stack.

Each expression is independent and the evaluation stack must be cleared after an expression is evaluated.

Objectives

The objectives for this project are:

To construct a program to read and evaluate RPN expressions.
To construct a reusable lexical analysis module.
To develop skill with C strings, pointers and character manipulations.
To gain further experience with software reuse.

The lexical analysis module will be reused in projects #5 and #6.

What needs to be done

Several files are provided:

The file lexan.h is the declaration of the LexicalAnalyzer class and its interface.
The file lexan.cpp is the implementation of the LexicalAnalyzer class.
The file rpnmain.cpp is the main program needed to read and evaluate RPN expressions.
The makefile builds everything and creates two executable programs:
- Program lexantest performs basic testing of the lexical analyzer.
- Program rpneval reads RPN expressions from a file and uses an EvalStack to evaluate them.
The files testfile1, testfile2, and testfile3 are read by lexantest.
The file rpn contains a few sample RPN expressions.

These files are available in the directory /g/15/class/project4. You will need to copy these files to your own working directory and use them as a starting point for development. You must also reuse the evaluation stack and array stack classes that you developed in project #3.

You will need to implement the LexicalAnalyzer class (lexan.cpp.) An informal interface specification for this module is given below. The lexical analyzer recognizes five kinds of tokens:

    Token         Meaning
    ------------  -----------------------------------------------
    UNDEFINED     Undefined token (unrecognizable input)
    END_OF_INPUT  End of input has been reached
    DELIMITER     A single punctuation character (e.g., operator)
    IDENTIFIER    A name (e.g., an operator name)
    NUMBER        A number (e.g., an operand)

The lexical analyzer should detect the end-of-input condition and return the token END_OF_INPUT. The evaluator part of the program must detect END_OF_INPUT and terminate cleanly.

An IDENTIFIER is a letter followed by zero or more alphanumeric characters. An alphanumeric character is either a letter or a digit. Thus, an IDENTIFIER is terminated by any character which is not an alphanumeric.

A DELIMITER is a punctuation character. The evaluator part of the program only uses the delimiter characters '=' and ';'. You may ignore any other delimiters. This should simplify error handling.

A NUMBER is

A minus sign followed by one or more digits, or
A digit character followed by zero or more digits.

You must convert the digit string to its internal binary representation. Use the function strtod in the library cstdlib to perform this conversion. The function strtod takes a pointer to a digit string as its first argument and returns the binary floating point value of the digit string:

    #include <cstdlib>
    ...
    int value ;
    char number[64] ;
    ...
    value = strtod(number, (char **)NULL) ;

You can find out more information about strtod or any other standard C function by using the man command, e.g., man strtod.

Here are more details on the syntax of the RPN expressions. An RPN expression consists of operands and operators. Operands are decimal integer numbers (positive and negative.) The operators are:

    Operator   Meaning
    --------   ------------------------------------------------
      add      Addition
      sub      Subtraction
      mul      Multiplication
      div      Division
      neg      Arithmetic negative
      abs      Absolute value
      sqroot   Square root
      =        Show contents of stack (no pop!)
      ;        Show top of stack and clear; ends the expression

The order of operands for non-commutative operations (div, mod) is the same as the evaluation stack. The '=' operator may appear anywhere in an expression and displays the current contents of the stack without affecting the stack contents in any way. Thus, the '=' operator is handy for debugging problems with expressions. The ';' operator terminates an expression and separates the current expression from the next expression in the file. It shows the top of the stack (the final result) and clears the stack to prepare for the next expression to be evaluated. Here are two example expressions:

    10 20 30 add add ;
    10 30 30 40 = add add mul ;

and sample output after evaluation:

    60
    40 30 30 10 <>
    1000

Here are some additional requirements to be satisfied:

Your program must take a single command line argument which is the name of the file containing the RPN expressions.
You must use the supplied makefile to build your solution. Do not modify this makefile.
You should thoroughly and systematically test the lexical analyzer as you will be reusing this module in later assignments.

Add comments to the source code to describe your approach and solution. Don't forget to fill out any header comments with your name, section and e-mail address.

On-line resources

As usual, the class notes have vital information for development. All course notes are available through the "life line" URL:

    http://www.eecs.tufts.edu/comp/15/schedule.html

You will also need information on the cctype and cstring libraries at:

    http://www.eecs.tufts.edu/comp/15/cref/cref.html

The cctype library includes functions such as isalpha, isalnum, isdigit, ispunct, and isspace to easily classify characters. Appendix D.2 of the textbook briefly describes the cctype and cstring libraries as well.

The on-line quick reference on C++ file streams has information on file input:

    http://www.eecs.tufts.edu/comp/15/cref/file.html

Sample code is given for reading characters from a file -- feel free to borrow and adapt the sample code as needed. Remember, it will be easier to build your lexical analyzer if you peek ahead in the input stream with the peek() method. Chapter 5 of the textbook also discusses C++ file I/O.

Submitting your work

Use the provide system to submit your finished program:

    /local/bin/provide comp15 p4 lexan.h lexan.cpp ...

while logged into comp15.eecs.tufts.edu. The files to be submitted are:

    array_stack.h    Declaration of stack interface
    array_stack.cpp  Definition of stack
    eval_stack.h     Declaration of evaluation stack interface
    eval_stack.cpp   Definition of evaluation stack
    lexan.h          Declaration of the lexical analysis interface
    lexan.cpp        Implementation of the lexical analyzer
    rpnmain.cpp      Evaluation driver
    makefile         makefile to build everything

Be sure to build and test your program on comp15.cs.tufts.edu before submitting your work. This system is the target build and test platform for all course projects. (And hey, have you experienced the load on the Sun servers lately? :-)

Extra credit

Here are two additional requirements for extra credit:

Detect and recover from ill-formed RPN expressions (e.g., underflow.)
Extend the lexical analyzer to handle floating point numbers with fractional parts and exponents.

Informal interface specification: LexicalAnalyzer

Function: LexicalAnalyzer
  Purpose: Constructor for lexical analyzer objects; Initializes
    private data members
  Parameters:        None
  Returns:           Nothing
  Error conditions:  None

Function: LexicalAnalyzer::open
  Purpose: Opens the specified input file and advances to the first token
  Parameters:
    filename: Name of the input file
  Returns:
    True if open was successful, false on failure
  Error conditions:
    Unable to open input file

Function: LexicalAnalyzer::close
  Purpose: Closes the input file
  Parameters:       None
  Returns:          Nothing
  Error conditions: None

Function: LexicalAnalyzer::advance
  Purpose: Advances to the next token; reads and recognizes
    the next token from the input file stream
  Parameters:
    None
  Returns:
    True if scan was successful; False if end-of-input has already
    been scanned and an attempt was made to advance past the end-of-input
  Error conditions:
    None

Function: LexicalAnalyzer::matches
  Purpose: Compares the current lexeme with a pattern string
  Parameters:
    pattern: Pattern string to be compared
  Returns:
    True if the current lexeme and the pattern string are the same;
    otherwise, false
  Error conditions:
    None

Function: LexicalAnalyzer::getToken
  Purpose: Returns the type of the current token
  Parameters:
    None
  Returns:
    Returns the type of the current token; one of the values of the
    enumeration type TokenType
  Error conditions:
    None

Function: LexicalAnalyzer::isDelimiter
  Purpose: Tests if current token is a delimiter token
  Parameters:
    None
  Returns:
    True if the current token is a delimiter, else false
  Error conditions:
    None

Function: LexicalAnalyzer::isIdentifier
  Purpose: Tests if current token is an identifier token
  Parameters:
    None
  Returns:
    True if the current token is an identifier, else false
  Error conditions:
    None

Function: LexicalAnalyzer::isNumber
  Purpose: Tests if current token is a number token
  Parameters:
    None
  Returns:
    True if the current token is a number, else false
  Error conditions:
    None

Function: LexicalAnalyzer::isEOI
  Purpose: Tests if current token is the end-of-input token
  Parameters:
    None
  Returns:
    True if the current token is the end-of-input, else false
  Error conditions:
    None

Function: LexicalAnalyzer::getDelimiter
  Purpose: Copies string value of lexeme to the destination string
  Parameters:
    destination: Destination for lexeme value; destination string
      must be big enough to hold the lexeme
  Returns:
    Nothing
  Error conditions:
    None

Function: LexicalAnalyzer::getIdentifier
  Purpose: Copies string value of lexeme to the destination string
  Parameters:
    destination: Destination for lexeme value; destination string
      must be big enough to hold the lexeme
  Returns:
    Nothing
  Error conditions:
    None

Function: LexicalAnalyzer::getNumber
  Purpose: Converts lexeme to internal binary representation
    and return its value; Note: getNumber used the library
    function "strtod" to do the conversion
  Parameters:
    None
  Returns:
    Floating point (double) value of the current numeric lexeme
  Error conditions:
    None

Function: LexicalAnalyzer::printToken
  Purpose: Displays useful debugging information about the current token;
    Displays token type, lexeme, etc.
  Parameters:       None
  Returns:          Nothing
  Error conditions: None