µOCCAM

CS3 Individual Programming Project
2000-2001

Phase 2

The main outcome of this second part of the project should be a working system that can compile and run µOCCAM programs. Most of the framework for this is provided for you: a parallel abstract machine, symbol table, debugger, and other utilities. Your job is to put in your lexer and parser from Phase 1, then extend them to check the semantics of µOCCAM programs and generate code for the abstract machine.

Getting started

As in Phase 1, you start with an extremely poor, but working, µOCCAM compiler. In this case the compiler recognises very few programs, and always writes out code that terminates immediately. You can improve on this.

The first things to do are as follows.

Fetch the source files with cp -r /home/cs3/ipp/source ~/ipp2
Protect the directory with chmod og-rwx ~/ipp2
Build the compiler with cd ~/ipp2; make

Check that it runs: type ./occam and look for an error message like this:

./occam: Unexpected command line arguments.
Usage is: ./occam [-prctd] [-w time] [-s num] sourcefile [inputfile]

Now sit down and read the µOCCAM manual from beginning to end. You won't understand it all at first sight, but you do need to go through it at least once, so that you can find your way around it during the project.

There are several compiler source files, but you can leave most of them untouched. The ones to put your own code in are lexer.l, lexer.h, parser.y, and parser.h. You might also need to adjust Makefile.

Because we are not aiming to produce an especially sophisticated compiler, it is enough to put all the program-checking and code-generating directly into the semantic actions of the parser. There will be no separate abstract syntax tree to traverse and analyse.

What to do next

As the project progresses, it is important that you have a clear plan of what you are going to do, and adapt the plan depending on how well things go. Make sure that you always keep a copy of your best version so far: you don't want to be caught at the deadline with code that doesn't compile.

Here is one possible outline of how to proceed with the project. You should add your own, more detailed, checkpoints to this. Don't attempt to do too much all at once; take small safe steps.

Fit your lexer and parser from Phase 1 into the µOCCAM compiler framework. There are two main connection points.
- The lexer should read from a file, as indicated on the occam command line. To do this, the lexer code needs to provide a function void initialise_lexer(FILE* input) that will be given a pointer to the correct input file.
- The parser should deposit its output in the global variable program_code. Code for the abstract machine is represented by a linked list of instructions, and Section 5.6 of the manual describes functions build_ccode and build_icode for building these lists.
The minimal lexer.l and parser.y files provided illustrate how to do this.
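
As a rough illustration of the first connection point: a scanner generated by lex or flex reads from the global FILE *yyin, so initialise_lexer can be very small. The sketch below is not the framework's own code. It declares yyin itself only so that the fragment compiles on its own; in lexer.l the generated scanner already provides that variable, and the minimal lexer.l supplied with the framework remains the reference.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the global that the generated scanner defines.
   Assumption for this sketch only: do not declare it yourself in lexer.l. */
FILE *yyin = NULL;

/* Point the lexer at the file chosen on the occam command line. */
void initialise_lexer(FILE *input)
{
    assert(input != NULL);  /* the caller is assumed to have opened the file */
    yyin = input;           /* all later tokens are read from this stream */
}
```

In the real lexer.l a function like this would sit in the user code section, after the rules.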
Modify your parser so that it keeps track of variable declarations, using the symbol table; don't worry for the moment about stack offsets. Your parser can then check that every variable is declared before it is used, and that variables are always used with the correct type. As well as putting names into the symbol table, remember to take them out when leaving the scope of a declaration. All this applies for declaring and using procedures too.

You can find the functions to manage the symbol table and scope of declarations in Chapter 2 of the manual. Read through all of this: there are lots of hints in it about how to do things.
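
To make the idea of entering and leaving scopes concrete, here is a toy scoped table. It is only a sketch: every name in it (toy_declare, toy_lookup, toy_open_scope, toy_close_scope, and the toy_type values) is hypothetical and is not the interface described in Chapter 2, which is what your parser should actually call.

```c
#include <assert.h>
#include <string.h>

/* Illustrative kinds of name; the real set is defined by the manual. */
enum toy_type { TOY_NONE, TOY_INT, TOY_CHAN, TOY_PROC };

struct toy_entry {
    const char *name;
    enum toy_type type;
};

#define TOY_MAX 64
static struct toy_entry toy_table[TOY_MAX];
static int toy_count = 0;

/* Entering a scope: remember how many names are currently visible. */
int toy_open_scope(void)
{
    return toy_count;
}

/* Leaving a scope: forget every name declared since the mark. */
void toy_close_scope(int mark)
{
    assert(mark >= 0 && mark <= toy_count);
    toy_count = mark;
}

/* Record a declaration in the current scope. */
void toy_declare(const char *name, enum toy_type type)
{
    assert(toy_count < TOY_MAX);
    toy_table[toy_count].name = name;
    toy_table[toy_count].type = type;
    toy_count++;
}

/* Search innermost declarations first; TOY_NONE means "not declared". */
enum toy_type toy_lookup(const char *name)
{
    for (int i = toy_count - 1; i >= 0; i--) {
        if (strcmp(toy_table[i].name, name) == 0)
            return toy_table[i].type;
    }
    return TOY_NONE;
}
```

A semantic action for a declaration would call something like toy_declare, an action for a use of a name would call toy_lookup and report an error on TOY_NONE or on a mismatched type, and the action that ends a block would call toy_close_scope.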

Do not worry about the things that cannot feasibly be checked at compile time. For example, µOCCAM requires that only one process at a time should ever try to send on a channel, and only one can be waiting to receive. You are not expected to detect this during compilation.

Once semantic checking is under way, it is time to start generating abstract machine code. Remember that you should build this into the semantic actions of the parser, so that each construction of the language puts together code fragments from its component parts. Read Chapter 5 and Section 2.3 carefully; these explain the abstract machine and how it uses the stack.
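
The idea of assembling code from the fragments of component parts can be sketched with a plain linked list. This is an illustration only: toy_instr, toy_single, toy_join, and toy_length are hypothetical names, the integer opcodes are placeholders, and the real instruction lists should be built with the functions from Section 5.6 of the manual.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical instruction node; the real representation is in the manual. */
struct toy_instr {
    int opcode;
    struct toy_instr *next;
};

/* Build a fragment containing a single instruction. */
struct toy_instr *toy_single(int opcode)
{
    struct toy_instr *node = malloc(sizeof *node);
    assert(node != NULL);
    node->opcode = opcode;
    node->next = NULL;
    return node;
}

/* Join two fragments: the code for `first`, then the code for `second`.
   Either fragment may be NULL, meaning "no code". */
struct toy_instr *toy_join(struct toy_instr *first, struct toy_instr *second)
{
    if (first == NULL)
        return second;
    struct toy_instr *tail = first;
    while (tail->next != NULL)
        tail = tail->next;
    tail->next = second;
    return first;
}

/* Count the instructions in a fragment. */
int toy_length(const struct toy_instr *code)
{
    int n = 0;
    for (; code != NULL; code = code->next)
        n++;
    return n;
}
```

In a parser action for a sequence of two processes, the value of the rule would be the join of the two component values, so the complete program's code emerges at the top-level rule.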

The first task is to work out where variables are going to be stored on the stack, and insert code to keep track of this. You can then move on to actual programs. Don't aim to do everything at once, but deal with different parts of the language piece by piece: sequential code, parallel code, procedures. When running occam there are flags for printing the code sequence, and for starting a simple debugger to step through execution; see Sections 6.1 and 5.10.
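
One simple way to keep track of stack positions is a counter that advances at each declaration and is wound back when the scope ends. The sketch below rests on assumptions that may not match the abstract machine: that every variable occupies exactly one slot, and that offsets count upwards from zero. The names toy_alloc_slot, toy_mark, and toy_release are hypothetical. Check Section 2.3 of the manual for the actual stack layout.

```c
#include <assert.h>

/* Next free stack slot. Assumption: one slot per variable, starting at 0. */
static int toy_next_offset = 0;

/* Reserve a slot for a newly declared variable and return its offset. */
int toy_alloc_slot(void)
{
    return toy_next_offset++;
}

/* Remember the current stack height on entry to a scope. */
int toy_mark(void)
{
    return toy_next_offset;
}

/* On leaving the scope, make its slots available for reuse. */
void toy_release(int mark)
{
    assert(mark >= 0 && mark <= toy_next_offset);
    toy_next_offset = mark;
}
```

The offset returned at declaration time would be stored alongside the name in the symbol table, so that later uses of the variable can generate code that refers to the right slot.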

Testing

As with Phase 1, one of the most important parts of the project is to keep track of how you are doing. Test your code all the time, with as many tests as possible. Don't assume that because it passed a test last week, it will still pass it this week. Test to make sure that new changes don't break old code.

You will need to set up the ipp program again.

Enable the test program with setpath /home/cs3/ipp/bin
Put the source for your compiler in directory ipp2/
Put some sample µOCCAM programs in directory ipptests/
Run the tests with ipp -2 (or just ipp2). See what it prints out, and check the detailed ipplog.

More information is provided on the ipp test script page.

There is a catalogue of examples provided by students. During this stage of the practical you must contribute at least one test program to it.

Submission

To submit your code, execute the ipp2submit command. This will hand in all text files in your ~/ipp2/ directory. You can submit as many times as you like -- only the last version will be kept, with a record of the time of submission. You must submit your work by 1pm on Friday 16 February.

Here are the necessary steps, in detail.

You should probably begin by taking a backup copy of all your code: cp -r ~/ipp2 ~/ipp2backup
Make sure that ~/ipp2/ contains all the text files needed to recreate your occam program.
Enable the submission program with setpath /home/cs3/ipp/bin
Run it with ipp2submit. This will ask you to confirm the list of text files to submit. If you had already made an earlier submission, the program will ask for confirmation to overwrite that. Finally, if submission is successful then you will be sent an email saying which files were handed in.

The submission program will give errors if there is no ~/ipp2/ directory, or if it appears to be readable by anyone other than yourself. If it fails for any other reason, please email me with a copy of the error message.

The submission program will be disabled once the deadline has passed. One day later it will be re-enabled for late submissions. Students who fail to hand in any source code before the deadline may make a late submission, up to 1pm on Friday 23 February. Late submissions will have their mark reduced by one-third, following the guidelines in the CS3 handbook and the advice of the CS3 course organiser.

A few students have been granted extensions on medical grounds. They should follow the same procedure, but will not receive any grade penalty. Please note that such extensions can only be arranged before the deadline, and with the support of Directors of Studies.

Assessment

Once submitted, your source code will be placed in a directory of its own. The make command will be executed in that directory to build your compiler. The resulting executable occam will be run once with each of a batch of test µOCCAM files, and its behaviour recorded and compared with that expected.

Credit for the practical is assessed over a range of areas, including the following.

Contributing at least one sensible test program to the catalogue.
Presenting source code that compiles, executes, and makes some improvement on the minimal µOCCAM compiler provided.
Code that is clear and comprehensible.
The performance of the submitted compiler on a range of test programs.

The last item is the most substantial.

Separately from the assessment, your submitted files may undergo a certain amount of analysis to discount plagiarism.