COMP 15 Project #6: Build and traverse expression tree

COMP15 Spring 20XX
Due: Monday, April 11, 11:00PM

Overview

A compiler reads and parses a source language program in order to generate executable machine code. Often, the source language program is translated to an intermediate representation which is analyzed and optimized before machine instructions are generated.

The goal of this project is a program, rpntree, to read and parse Reverse Polish Notation (RPN) arithmetic expressions, build a binary tree intermediate representation of each expression, and traverse each expression tree to print an expression in prefix, infix and postfix forms as selected by the user on the command line. Expressions are printed to a file.

Objectives

The objectives for this project are:

The command line interface

When the program rpntree is launched, the command line options and arguments specify:

The command line interface for rpntree adheres to the following usage synopsis:
    rpntree [-prefix|-infix|-postfix] rpnfile outfile
The program name rpntree is followed by one or more options: -prefix, -infix, -postfix. The name of the input RPN expression file follows the print options on the command line. The name of the input file is followed by the name of the expression output file.

The rpntree program must check several command line and user interface conditions before expression processing is undertaken:

When an error is detected, the program must display a helpful error message followed by a synposis for rpntree usage:
    Usage: rpntree [-prefix|-infix|-postfix] rpnfile outfile
The program should then exit and return the status code EXIT_FAILURE. When rpntree runs successfully, it must return the status code EXIT_SUCCESS.

RPN expression syntax

RPN expressions in the input file differ from the syntax used in earlier assignments. The delimiter/operator '=' (equal sign) has been eliminated. The delimiter ';' (semicolon) is not treated as an operator. The delimiter ';' terminates an RPN expression and it separates RPN expressions in the input file. An RPN expression is free-format and may be split across multiple input lines.

RPN expressions have two kinds of operators: monadic and dyadic. A monadic operator takes one operand and a dyadic operator takes two operands. Monadic operators are:

    abs       Absolute value
    neg       Arithmetic negation
    sqroot    Square root
Dyadic operators are:
    add       Addition          '+'
    sub       Subtraction       '-'
    mul       Multiplication    '*'
    div       Division          '/'
When an expression is printed in infix form, the corresponding single character equivalent for dyadic operators must be used in place of add, sub, etc. to improve readability.

Operands in RPN expressions may be either numeric or symbolic. Numeric operands are positive or negative integer numbers. Symbolic operands are C-like identifiers.

Expression output format

Expressions printed to the output file must adhere to the rules given in this section. Each output expression is printed on a separate line and must begin with a label that identifies the output format, i.e.,

    Prefix:           For prefix output expressions
    Infix:            For infix output expressions
    Postfix:          For postfix output expressions
If an input file contains more than one RPN expression, then the output expressions for a given input expression should appear together in the the output file. (The program rpntree only needs to build and process one expression at a time and then discard the expression tree when done with it.) A blank line should separate each group of output expressions for readability.

Prefix form should always appear before infix form and infix form should always appear before postfix form. The order is not sensitive to the order of the options in the command line. (Let's keep it simple!)

Here's an example to illustrate. Assume that the input file "expr5" contains:

    -10 -20 -30 add add;aaa bbb ccc add add ;
    xxx -37 zzz
    add add ;
When rpntree is run using the command:
    rpntree -prefix -infix -postfix expr5 out5
the output file "out5" contains:
    Prefix: (add -10 (add -20 -30))
    Infix: (-10 + (-20 + -30))
    Postfix: -10 -20 -30 add add 

    Prefix: (add aaa (add bbb ccc))
    Infix: (aaa + (bbb + ccc))
    Postfix: aaa bbb ccc add add 

    Prefix: (add xxx (add -37 zzz))
    Infix: (xxx + (-37 + zzz))
    Postfix: xxx -37 zzz add add 
Note that a blank line separates each "group" of output expressions.

Each print format has some additional rules:

The terminating semicolon is not shown.

Where to start

Several files are provided to help you get started:

All files are available in the directory /g/15/class/project6.

What needs to be done

You will need to complete the main module, rpntree.cpp. The main function must:

You should use the file rpnmain.cpp from project #4 as a rough guide/starting point.

The expression tree object is created, maintained and traversed by the ExprTree class. You must implement this class using the class declaration in expr_tree.h and in accordance with the informal specification given below.

In ExprTree, you will need to write three public functions -- printPrefix, printInfix and printPostfix -- which traverse the expression tree in preorder, inorder and postorder, respectively. Each traversal function should display a node such that the resulting output will produce the expression tree in prefix, infix and postfix notation. Please pay attention to formatting and try to make the resulting output as readable as possible. The public functions call inner, private functions to recursively walk the tree in the appropriate order.

The expression tree class uses a stack object to keep track of intermediate RPN subexpressions. The stack object is an instance of the Stack class. You must implement this class using the class declaration in expr_stack.h and in accordance with the informal specification given below.

Both ExprTree and Stack use objects from the ExprNode class. The declaration and definition of this class are given in the files expr_node.h and expr_node.cpp. You may not need to modify these files, but you will need to read through them to understand the ExprNode class.

Here are some additional considerations and requirements to be satisfied:

Don't forget to fill out any header comments with your name, section and e-mail address.

Submitting your work

Use the provide system to submit your finished program:

    /local/bin/provide comp15 p6 makefile expr_node.h ...
The files to be submitted are:
    expr_node.h       Declaration of ExprNode interface
    expr_node.cpp     Definition of ExprNode
    expr_stack.h      Declaration of Stack interface
    expr_stack.cpp    Definition of Stack
    expr_tree.h       Declaration of ExprTree interface
    expr_tree.cpp     Definition of ExprTree
    lexan.h           Declaration of the LexicalAnalyzer interface
    lexan.cpp         Definition of the LexicalAnalyzer class
    rpntree.cpp       Driver to handle the CLI and parse RPN expressions
    makefile          makefile to build everything
Be sure you are completely satisfied with your work before submitting! Build and test your program on comp15.cs.tufts.edu before submitting your work. This system is the target build and test platform for all course projects. Projects which do not build correctly receive a score of zero.

Informal interface specification: ExprNode

An ExprNode represents a node in an expression tree. A node may stand for either: an operator taking one operand (MONADIC_OPERATOR_NODE), an operator taking two operands (DYADIC_OPERATOR_NODE), a numeric operand (NUMBER_NODE), or a symbolic operand (IDENTIFIER_NODE.)

Function: ExprNode()
Purpose: Constructor that creates and initializes a new
  ExprNode object. Initially, the ExprNode is an UNDEFINED_NODE,
  the symbol values are undefined (NULL) and the left and right
  subtrees are empty (NULL.)
Parameters:
  None
Returns:
  Nothing
Error conditions:
  None

Function: debugPrint()
Purpose: Displays the information in an ExprNode in a
  format suitable for debugging.
Parameters:
  None
Returns:
  Nothing
Error conditions:
  None

Informal interface specification: Stack

A Stack object is used to hold pointers to ExprNode objects as an RPN expression is being parsed.

Function: Stack()
  Purpose: 
    Constructor that initializes the Stack object to its
    empty state
  Parameters:
    None
  Returns:
    Nothing
  Error conditions:
    None

Function: ~Stack()
  Purpose: 
    Deallocates (deletes) any items in the stack
  Parameters:
    None
  Returns:
    Nothing
  Error conditions:
    None

Function: clear()
  Purpose: 
    Deallocates (deletes) any items in the stack and
    resets it to its empty state
  Parameters:
    None
  Returns:
    Nothing
  Error conditions:
    None

Function: push(PtrExprNode node)
  Purpose: 
    Push a pointer to an ExprNode on the stack
  Parameters:
    node: Pointer to be pushed
  Returns:
    True if push was successful; false on failure (error condition)
  Error conditions:
    Stack overflow; Prints error message to cerr on error

Function: pop()
  Purpose: 
    Pops the pointer to an ExprNode on top of the stack
    and returns it
  Parameters:
    None
  Returns:
    Pointer to ExprNode on top of stack; NULL on failure
    (error condition)
  Error conditions:
    Stack underflow; Prints error message to cerr on error

Function: getTop()
  Purpose: 
    Returns the pointer to ExprNode on top of the stack
    without popping it
  Parameters:
    None
  Returns:
    Pointer to ExprNode on top of stack; NULL on failure
    (error condition)
  Error conditions:
    Stack is empty; Prints error message to cerr on error

Function: isEmpty()
  Purpose: 
    Detect if stack is empty or not
  Parameters:
    None
  Returns:
    True if the stack is empty; otherwise, false
  Error conditions:
    None

Function: getDepth()
  Purpose: 
    Determine the number of items on the stack
  Parameters:
    None
  Returns:
    The number of items on the stack
  Error conditions:
    None

Function: showTop()
  Purpose: 
    Display information about the ExprNode on top
    of the stack. Calls debugPrint method of ExprNode
    to display node information. Displays "<>" when the
    stack is empty.
  Parameters:
    None
  Returns:
    Nothing
  Error conditions:
    None

Function: showAll()
  Purpose: 
    Display information about the ExprNode objects on
    the stack. Calls debugPrint of ExprNode to display
    information for each node from top to bottom.
    Displays "<>" to represent the bottom of the stack.
  Parameters:
    None
  Returns:
    Nothing
  Error conditions:
    None

Informal interface specification: ExprTree

Function: ExprTree()
  Purpose: 
    Constructor creates an empty expression tree
  Parameters:
    None
  Returns:
    Nothing
  Error conditions:
    None

Function: ~ExprTree()
  Purpose: 
    Destructor discards (releases) storage
  Parameters:
    None
  Returns:
    Nothing
  Error conditions:
    None

Function: start()
  Purpose: 
    Start processing an RPN expression. This function is
    called when starting a new RPN expression in order
    to build a new expression tree. Clears the node stack.
  Parameters:
    None
  Returns:
    Nothing
  Error conditions:
    None

Function: finish()
  Purpose: 
    Finish processing an RPN expression. This function is
    called after an RPN expression has been completely
    parsed. Node stack should contain exactly one node.
    Pop the pointer to the node and make the node the root
    of the expression tree. Displays error messages to cerr
    when necessary.
  Parameters:
    None
  Returns:
    Nothing
  Error conditions:
    Stack is empty or does not contain exactly one node

Function: monadicOperator(char *op)
  Purpose: 
    Process a monadic operator. Create a new operator
    node and initialize it. Pop node from stack and make
    that node the right expression subtree. Push new
    operator node.
  Parameters:
    op: Pointer to operator name (lexeme)
  Returns:
    True if successful; false if stack is empty (no right
    subexpression)
  Error conditions:
    Stack is empty

Function: dyadicOperator(char *op)
  Purpose: 
    Process a dyadic operator. Create a new operator
    node and initialize it. Pop two nodes from stack and
    makes them the left and right expression subtrees.
    Push new operator node.
  Parameters:
    op: Pointer to operator name (lexeme)
  Returns:
    True if successful; false if stack is empty or has
    only one node in it (no right or left subexpression)
  Error conditions:
    Stack is empty or has only one node in it

Function: number(double num)
  Purpose: 
    Process a numeric operand. Create a new numeric
    node and initialize it. Push it on the stack.
  Parameters:
    num: Floating point value of numeric operand
  Returns:
    True if successful; false on failure
  Error conditions:
    Stack overflow

Function: identifier(char *id)
  Purpose: 
    Process a symbolic operand. Create a new symbolic
    operand node and initialize it. Push it on the stack.
  Parameters:
    id: Pointer to operand name (lexeme)
  Returns:
    True if successful; false on failure
  Error conditions:
    Stack overflow

Function: printPrefix(ofstream &out)
  Purpose: 
    Print expression tree in prefix form. Invoke
    a preorder tree traversal.
  Parameters:
    out: Output file stream (reference parameter)
  Returns:
    Nothing
  Error conditions:
    None

Function: printInfix(ofstream &out)
  Purpose: 
    Print expression tree in infix form. Invoke
    a inorder tree traversal.
  Parameters:
    out: Output file stream (reference parameter)
  Returns:
    Nothing
  Error conditions:
    None

Function: printPostfix(ofstream &out)
  Purpose: 
    Print expression tree in postfix form. Invoke
    a postorder tree traversal.
  Parameters:
    out: Output file stream (reference parameter)
  Returns:
    Nothing
  Error conditions:
    None

Copyright © 2004-2013 Paul J. Drongowski