In this practical, you will learn to reshape trees.
Your work will be assessed on the basis of the correctness of the functions you implement. Your code should have the structure described in the section Practicalities, and be placed in the file CS201/Prac4/Optimise.ML under your home directory. The assignment of marks to the various components of the exercise is given, as comments, in the signature OptimiseSig. This weighting is not intended to reflect the relative difficulty of implementing the various functions; it is designed to ensure that those who do a reasonable job on the heavily weighted functions get a reasonable mark, while those who seek an excellent mark have to work for it.
The revised deadline for this practical is 6.00pm, Friday 15th April.
Seldom does a programmer have the luxury of starting from a clean slate. In this practical, you will modify, and improve on, an existing system. Code for this system is given, and documented, in an appendix to this document. However, you don’t need to understand the implementation details of the code provided in order to complete the practical.
In this practical we use syntax trees to represent algebraic expressions. The existing system provides an implementation of the abstract syntax of expressions, and functions to compile, and execute, code to evaluate an expression. It uses a stack-based evaluator; “compilation” is accomplished by a post-order traversal of the syntax tree.
The stack code produced for an expression may be unnecessarily inefficient. Evaluating the stack code for an expression may require a deeper or shallower stack, depending on the way the expression is written. Expressions involving only constants could be evaluated, once and for all, at compile time; our code generator instead produces code to evaluate them at run time. Algebraic manipulation of the expression, before compilation, could lead to better code.
Your task is to apply simple algebraic transformations to the syntax tree, before passing it to the compiler, in order to optimise the code produced. You should perform four optimisations, in turn: reshaping, constant amalgamation, constant elimination, re-ordering. These are described individually below.
Before you start coding your solutions, you should make sure you understand what is required. To consolidate your understanding, draw diagrams of the trees involved, for some simple examples.
You should write a function reshape : Expn -> Expn to implement this optimisation. Your function should apply one of the left-rotations¹:
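As a minimal sketch of the idea, assume the left-rotations are the associativity rewrites x + (y + z) → (x + y) + z and x × (y × z) → (x × y) × z, and assume an abstract syntax with constructors named Const, Id, Plus and Times. Both the constructor names and the exact rule set are assumptions made for illustration; the provided structures may differ.

```sml
(* Hypothetical abstract syntax; constructor names are assumptions. *)
datatype Expn = Const of int
              | Id of string
              | Plus of Expn * Expn
              | Times of Expn * Expn

(* Rotate left while the right child has the same operator as the node,
   then reshape the two subtrees. Each rotation shrinks the right child,
   so the recursion terminates. *)
fun reshape (Plus (x, Plus (y, z)))   = reshape (Plus (Plus (x, y), z))
  | reshape (Times (x, Times (y, z))) = reshape (Times (Times (x, y), z))
  | reshape (Plus (x, y))             = Plus (reshape x, reshape y)
  | reshape (Times (x, y))            = Times (reshape x, reshape y)
  | reshape e                         = e

(* a + (b + (c + d)) becomes ((a + b) + c) + d *)
val rightNested = Plus (Id "a", Plus (Id "b", Plus (Id "c", Id "d")))
val leftNested  = reshape rightNested
```

Drawing the trees for rightNested and leftNested is a quick way to check that each step is a single rotation about the root.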
Our next two optimisations cooperate to combine and eliminate constants. Since constants may be separated in the tree, as in the example just given, we may have to collect them together before performing any arithmetic. Your second optimisation should be a function that collects together and amalgamates constants.
You should write a function amalgam : Expn -> Expn. Your implementation should assume that sequences of additions or multiplications are associated to the left; later you will use the optimisation of the previous section to reshape the tree before applying this function. The following rules², applied top-down, will amalgamate multiple constants, occurring on the right-hand side of an operator, in a sequence of multiplications, and push the product down the tree:
The commutative laws for addition and multiplication
You should write a function rightHeight : Expn -> int to compute the right-height of a syntax tree, and another reorder : Expn -> Expn that applies the transformations given above, bottom-up, whenever the right-height of x is less than the right-height of y. This means that you should recursively apply the transformation to the two subtrees before seeing if you need to adjust a node.
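The sketch below illustrates the bottom-up strategy. It rests on two assumptions that are not confirmed by the text here: that the right-height of a leaf is 0 and that of a node is the larger of the left subtree's right-height and one more than the right subtree's, and that the transformations are the commutative swaps x + y → y + x and x × y → y × x. The constructor names are likewise hypothetical.

```sml
(* Hypothetical abstract syntax; constructor names are assumptions. *)
datatype Expn = Const of int
              | Id of string
              | Plus of Expn * Expn
              | Times of Expn * Expn

(* ASSUMED definition of right-height: leaves have height 0; at a node,
   going right costs one extra level, going left costs nothing. *)
fun rightHeight (Plus (x, y))  = Int.max (rightHeight x, 1 + rightHeight y)
  | rightHeight (Times (x, y)) = Int.max (rightHeight x, 1 + rightHeight y)
  | rightHeight _              = 0

(* Swap the operands when the left one has the smaller right-height. *)
fun swapIfNeeded (x, y) =
  if rightHeight x < rightHeight y then (y, x) else (x, y)

(* Bottom-up: reorder both subtrees first, then adjust this node. *)
fun reorder (Plus (x, y))  = Plus (swapIfNeeded (reorder x, reorder y))
  | reorder (Times (x, y)) = Times (swapIfNeeded (reorder x, reorder y))
  | reorder e              = e

val before = Plus (Id "a", Plus (Id "b", Plus (Id "c", Id "d")))
val after  = reorder before
```

Under the assumed definition, rightHeight before is 3 and rightHeight after is 1. The sketch recomputes right-heights at every node, which is simple but repeats work on large trees.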
Finally, you should combine your optimisations into a single transformation, using the ML infix operator o for function composition:
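Note that (f o g) x means f (g x), so the optimisation to be applied first must appear rightmost in the composition. The sketch below makes the order visible using stand-in passes that merely append their name to a log; the real passes have type Expn -> Expn, and the name eliminate is a placeholder, not taken from OptimiseSig.

```sml
(* Stand-in pass: records its name so the order of application can be seen. *)
fun pass name log = log @ [name]

val reshape   = pass "reshape"
val amalgam   = pass "amalgam"
val eliminate = pass "eliminate"   (* hypothetical name for constant elimination *)
val reorder   = pass "reorder"

(* The rightmost function is applied first. *)
val optimise = reorder o eliminate o amalgam o reshape

val order = optimise []
```

Here order evaluates to ["reshape", "amalgam", "eliminate", "reorder"], matching the sequence in which the four optimisations are to be performed.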
If you run ML with the command ml prac4, the identifiers ++ and ** will be set up as infix, with their usual precedences. But, in order to make it easy to exercise your optimisations, they have been made right-associative. All the structures documented in the appendix are predefined in the prac4 database.
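To see what right-associativity means for the trees you type in, the following sketch declares ++ and ** with the standard ML precedences of + and * (6 and 7). It assumes, purely for illustration, that they build nodes of a datatype with constructors Plus and Times; the declarations in the prac4 database may be set up differently.

```sml
(* Hypothetical abstract syntax; constructor names are assumptions. *)
datatype Expn = Const of int
              | Id of string
              | Plus of Expn * Expn
              | Times of Expn * Expn

(* infixr makes an operator associate to the right. *)
infixr 6 ++
infixr 7 **
fun x ++ y = Plus (x, y)
fun x ** y = Times (x, y)

(* Parsed as a ++ (b ++ c): the tree leans to the right. *)
val sum = Id "a" ++ Id "b" ++ Id "c"

(* ** binds tighter than ++, as with * and +. *)
val mixed = Id "a" ++ Id "b" ** Id "c"
```

Expressions typed this way are right-nested by default, which gives a reshaping function something to do.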
As usual, a signature, OptimiseSig, (see Figure 1) has been given for the code you are asked to write. You should place your code in a structure Optimise:OptimiseSig in a file Prac4/Optimise.ML. Any functions you have been unable to implement should be replaced by dummies of the correct type.
This appendix documents the code provided for the practical. The information provided here goes beyond what you will need to complete the practical, but it should be of general interest.
The code provided has four main components:
These components are used by the structure TopLevel to implement a simple, but fairly powerful, expression evaluator, that will compile a list of declarations.
The abstract syntax provides for algebraic expressions in + and ×, with integer constants, and arbitrary strings as identifiers.
The values associated with identifiers will be stored in a data structure called the environment. The signature EnvironmentSig (see Figure 2) provides an interface to the structure Environment, which uses an association list as its underlying data structure: a (string*int) list of pairs, each consisting of a string and the associated integer value. This implementation has, intentionally, been made transparent, in order that you can see the effects of declarations as they are made.
We will use the value empty to represent a new environment, the function enter to add new bindings to an environment, and the function lookup to find the value associated with a given string.
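An association-list environment of this kind might look roughly as follows. The argument order of enter and lookup, and the exception raised for an unbound identifier, are assumptions made for this sketch; EnvironmentSig is the authority on the real interface.

```sml
(* A transparent environment: just a list of (name, value) pairs. *)
type Env = (string * int) list

exception Unbound of string   (* hypothetical: raised when a name is missing *)

val empty : Env = []

(* New bindings go at the front, so they shadow older ones. *)
fun enter (name, value) (env : Env) : Env = (name, value) :: env

(* Return the value of the first pair whose name matches. *)
fun lookup name ([] : Env) = raise Unbound name
  | lookup name ((n, v) :: rest) =
      if n = name then v else lookup name rest

val env = enter ("x", 2) (enter ("x", 1) (enter ("y", 5) empty))
```

Because the representation is an ordinary list, ML prints env in full, here as [("x", 2), ("x", 1), ("y", 5)], and lookup "x" env finds the most recent binding, 2.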
The structure Machine, given in Figure 3, provides a model for a stack-based evaluator. The machine has four actions: we can push a literal, or the value of an identifier, onto the stack, or apply one of the arithmetic operations, + and ×, to the top two elements of the stack.
Code for the evaluator consists of a list of actions. The function run is constructed by specifying the state transition corresponding to the execution of each Action. The components of the state are: args, an argument stack; and code, a list of actions. The environment, needed by the machine to look up the values of identifiers, is passed as a parameter, env.
To execute a given code, we perform each of the actions in turn, starting with an empty stack. Running code compiled from a syntax tree should leave a single value on the stack. This value is returned as the result.
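The state transitions described above can be sketched as below. This is not the code of Figure 3: the constructor names, the exception Stuck and the local lookup helper are all invented for illustration, and the environment is taken to be a plain association list.

```sml
(* Hypothetical names for the four actions. *)
datatype Action = PushLit of int | PushId of string | Add | Mul

exception Stuck   (* hypothetical: raised on a malformed program or unbound name *)

(* Minimal association-list lookup, standing in for the Environment structure. *)
fun lookup [] _ = raise Stuck
  | lookup ((n, v) :: rest) name = if n = name then v else lookup rest name

(* The state is (args, code); each clause is one state transition. *)
fun run env code =
  let
    fun step (args, []) = args
      | step (args, PushLit n :: code) = step (n :: args, code)
      | step (args, PushId s :: code)  = step (lookup env s :: args, code)
      | step (y :: x :: args, Add :: code) = step (x + y :: args, code)
      | step (y :: x :: args, Mul :: code) = step (x * y :: args, code)
      | step _ = raise Stuck
  in
    (* Well-formed code leaves exactly one value on the stack. *)
    case step ([], code) of
        [result] => result
      | _ => raise Stuck
  end

(* (2 + x) * 4 with x bound to 3 *)
val answer = run [("x", 3)] [PushLit 2, PushId "x", Add, PushLit 4, Mul]
```

With x bound to 3 the stack passes through [2], [3, 2], [5], [4, 5] and [20], so answer is 20.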
The function, code, produces the stack code for a given expression. It is based on the post-order traversal of a binary tree described in the notes.
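A post-order code generator emits the code for the left operand, then the right operand, then the operator. The sketch below uses hypothetical constructor names for both expressions and actions, and adds a helper, depth (not part of the provided system), that measures the deepest stack a code sequence needs. It shows why the shape of the tree matters.

```sml
(* Hypothetical abstract syntax and actions; all names are assumptions. *)
datatype Expn = Const of int
              | Id of string
              | Plus of Expn * Expn
              | Times of Expn * Expn

datatype Action = PushLit of int | PushId of string | Add | Mul

(* Post-order traversal: left operand, right operand, then the operator. *)
fun code (Const n)      = [PushLit n]
  | code (Id s)         = [PushId s]
  | code (Plus (x, y))  = code x @ code y @ [Add]
  | code (Times (x, y)) = code x @ code y @ [Mul]

(* Illustrative helper: the maximum stack depth reached while running acts.
   A push adds one element; an operator replaces two elements by one. *)
fun depth acts =
  let
    fun go (_, deepest, []) = deepest
      | go (cur, deepest, a :: rest) =
          let
            val cur' = case a of
                           PushLit _ => cur + 1
                         | PushId _  => cur + 1
                         | _         => cur - 1
          in
            go (cur', Int.max (deepest, cur'), rest)
          end
  in
    go (0, 0, acts)
  end

val rightNested = Plus (Id "a", Plus (Id "b", Plus (Id "c", Id "d")))
val leftNested  = Plus (Plus (Plus (Id "a", Id "b"), Id "c"), Id "d")
```

Here depth (code rightNested) is 4, because all four operands are pushed before any addition happens, while depth (code leftNested) is 2.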
The structure TopLevel uses the evaluator to compile and run a sequence of declarations. As an example of its use, consider the following ML code
Running this produces the output
Compare this with the following ML code:
which produces the final response
> val it = (24, 5, 20) : int * int * int
Interaction with the ML system generates an environment in which future declarations are evaluated, just like the compile function provided by the structure TopLevel. The similarity goes much deeper; by choosing the appropriate environment for evaluating the code for an expression, we can implement let expressions and functions. The body of a let expression is evaluated in an environment including the bindings generated by the local declarations, but these are not added to the top-level environment. When we apply a function, the body is evaluated in an environment which binds the formal parameters to the values of the actual parameters. Taking these ideas a little further would allow us to implement recursive functions, and curried functions. But we are already well away from the substance of the practical.