COMP15 Spring 20XX
Due: Friday, March 11, 6:00PM
Reverse Polish Notation (RPN) was devised by Polish logician Jan Lukasiewicz as a way to represent arbitrary arithmetic expressions without the use of parentheses. RPN is also called "postfix notation," because the operators are written after the operands to which the operators are applied. We are already familiar with the idea of RPN (postfix), having used a form of RPN to evaluate stack-based expressions in the last assignment. But, just for the sake of review, the ordinary infix expression:
(4 + 6) * 14
when rewritten in RPN is:
4 6 + 14 *
In this assignment, we will use PostScript-style operator names (e.g., "add"
instead of "+") for the arithmetic operators. So, we would write this
expression as:
4 6 add 14 mul ;
in PostScript-style RPN.
The goal of this project is to construct a program to read a series of RPN expressions from a file and evaluate them. This project reuses the evaluation stack and array stack classes from project #3. You will need to develop a new class, LexicalAnalyzer, that reads and tokenizes the RPN expressions from the input file. The main program calls the lexical analyzer to obtain tokens, and after determining the token type, will:
The objectives for this project are:
Several files are provided:
You will need to implement the LexicalAnalyzer class (lexan.cpp.) An informal interface specification for this module is given below. The lexical analyzer recognizes five kinds of tokens:
Token Meaning
------------ -----------------------------------------------
UNDEFINED Undefined token (unrecognizable input)
END_OF_INPUT End of input has been reached
DELIMITER A single punctuation character (e.g., operator)
IDENTIFIER A name (e.g., an operator name)
NUMBER A number (e.g., an operand)
The lexical analyzer should detect the end-of-input condition and
return the token END_OF_INPUT. The evaluator part of the
program must detect END_OF_INPUT and terminate cleanly.
An IDENTIFIER is a letter followed by zero or more alphanumeric characters. An alphanumeric character is either a letter or a digit. Thus, an IDENTIFIER is terminated by any character which is not an alphanumeric.
A DELIMITER is a punctuation character. The evaluator part of the program only uses the delimiter characters '=' and ';'. You may ignore any other delimiters. This should simplify error handling.
A NUMBER is
#include <cstdlib>
...
int value ;
char number[64] ;
...
value = strtod(number, (char **)NULL) ;
You can find out more information about strtod or any other standard
C function by using the man command, e.g., man strtod.
Here are more details on the syntax of the RPN expressions. An RPN expression consists of operands and operators. Operands are decimal integer numbers (positive and negative.) The operators are:
Operator Meaning
-------- ------------------------------------------------
add Addition
sub Subtraction
mul Multiplication
div Division
neg Arithmetic negative
abs Absolute value
sqroot Square root
= Show contents of stack (no pop!)
; Show top of stack and clear; ends the expression
The order of operands for non-commutative operations (div,
mod) is the same as the evaluation stack. The '=' operator
may appear anywhere in an expression and displays the current contents
of the stack without affecting the stack contents in any way. Thus, the '='
operator is handy for debugging problems with expressions. The ';'
operator terminates an expression and separates the current expression
from the next expression in the file. It shows the top of the stack (the
final result) and clears the stack to prepare for the next expression
to be evaluated. Here are two example expressions:
10 20 30 add add ;
10 30 30 40 = add add mul ;
and sample output after evaluation:
60
40 30 30 10 <>
1000
Here are some additional requirements to be satisfied:
As usual, the class notes have vital information for development. All course notes are available through the "life line" URL:
http://www.eecs.tufts.edu/comp/15/schedule.html
You will also need information on the cctype and cstring
libraries at:
http://www.eecs.tufts.edu/comp/15/cref/cref.html
The cctype library includes functions such as isalpha,
isalnum, isdigit, ispunct, and isspace
to easily classify characters. Appendix D.2 of the textbook briefly
describes the cctype and cstring libraries as well.
The on-line quick reference on C++ file streams has information on file input:
http://www.eecs.tufts.edu/comp/15/cref/file.html
Sample code is given for reading characters from a file --
feel free to borrow and adapt the sample code as needed. Remember, it
will be easier to build your lexical analyzer if you peek ahead in the
input stream with the peek() method. Chapter 5 of the textbook
also discusses C++ file I/O.
Use the provide system to submit your finished program:
/local/bin/provide comp15 p4 lexan.h lexan.cpp ...
while logged into comp15.eecs.tufts.edu.
The files to be submitted are:
array_stack.h Declaration of stack interface
array_stack.cpp Definition of stack
eval_stack.h Declaration of evaluation stack interface
eval_stack.cpp Definition of evaluation stack
lexan.h Declaration of the lexical analysis interface
lexan.cpp Implementation of the lexical analyzer
rpnmain.cpp Evaluation driver
makefile makefile to build everything
Be sure to build and test your program on comp15.cs.tufts.edu
before submitting your work. This system is the target build and test
platform for all course projects. (And hey, have you experienced
the load on the Sun servers lately? :-)
Here are two additional requirements for extra credit:
Function: LexicalAnalyzer
Purpose: Constructor for lexical analyzer objects; Initializes
private data members
Parameters: None
Returns: Nothing
Error conditions: None
Function: LexicalAnalyzer::open
Purpose: Opens the specified input file and advances to the first token
Parameters:
filename: Name of the input file
Returns:
True if open was successful, false on failure
Error conditions:
Unable to open input file
Function: LexicalAnalyzer::close
Purpose: Closes the input file
Parameters: None
Returns: Nothing
Error conditions: None
Function: LexicalAnalyzer::advance
Purpose: Advances to the next token; reads and recognizes
the next token from the input file stream
Parameters:
None
Returns:
True if scan was successful; False if end-of-input has already
been scanned and an attempt was made to advance past the end-of-input
Error conditions:
None
Function: LexicalAnalyzer::matches
Purpose: Compares the current lexeme with a pattern string
Parameters:
pattern: Pattern string to be compared
Returns:
True if the current lexeme and the pattern string are the same;
otherwise, false
Error conditions:
None
Function: LexicalAnalyzer::getToken
Purpose: Returns the type of the current token
Parameters:
None
Returns:
Returns the type of the current token; one of the values of the
enumeration type TokenType
Error conditions:
None
Function: LexicalAnalyzer::isDelimiter
Purpose: Tests if current token is a delimiter token
Parameters:
None
Returns:
True if the current token is a delimiter, else false
Error conditions:
None
Function: LexicalAnalyzer::isIdentifier
Purpose: Tests if current token is an identifier token
Parameters:
None
Returns:
True if the current token is an identifier, else false
Error conditions:
None
Function: LexicalAnalyzer::isNumber
Purpose: Tests if current token is a number token
Parameters:
None
Returns:
True if the current token is a number, else false
Error conditions:
None
Function: LexicalAnalyzer::isEOI
Purpose: Tests if current token is the end-of-input token
Parameters:
None
Returns:
True if the current token is the end-of-input, else false
Error conditions:
None
Function: LexicalAnalyzer::getDelimiter
Purpose: Copies string value of lexeme to the destination string
Parameters:
destination: Destination for lexeme value; destination string
must be big enough to hold the lexeme
Returns:
Nothing
Error conditions:
None
Function: LexicalAnalyzer::getIdentifier
Purpose: Copies string value of lexeme to the destination string
Parameters:
destination: Destination for lexeme value; destination string
must be big enough to hold the lexeme
Returns:
Nothing
Error conditions:
None
Function: LexicalAnalyzer::getNumber
Purpose: Converts lexeme to internal binary representation
and return its value; Note: getNumber used the library
function "strtod" to do the conversion
Parameters:
None
Returns:
Floating point (double) value of the current numeric lexeme
Error conditions:
None
Function: LexicalAnalyzer::printToken
Purpose: Displays useful debugging information about the current token;
Displays token type, lexeme, etc.
Parameters: None
Returns: Nothing
Error conditions: None
Copyright © 2004-2013 Paul J. Drongowski