\chapter{The Parser and Code Generator}

This chapter contains only a minimal parser. Like the lexer, it ignores
almost every part of the language, and it generates no meaningful code.
It does, however, interact properly with the rest of the \OCCAM\
compiler, so it makes a suitable basis for improvement.

\section{Parser files}
\label{sec:parserfiles}

Only one function is exported by the parser.

@o code/parser.h
@{
#ifndef _PARSER_H
#define _PARSER_H
extern int yyparse(void);
#endif
@| yyparse @}

The parser description is in the standard Bison format.

@o code/parser.y
@{
%{
@< Parser C declarations @>
%}
@< Parser Bison declarations @>
%%
@< Parser grammar rules @>
%%
@< Parser user code @>
@}

We put a suitable note of dependencies in the Makefile, and record the
object file it will generate.

@d Makefile dependencies
@{
parser.tab.o: occam.h pool.h interpreter.h symbol_table.h lexer.h parser.h
parser.tab.c parser.tab.h : parser.y
	bison --defines parser.y
@}

@d Occam object files @{ parser.tab.o @}

\section{Declarations}

The only declarations required by C are the included header files.

@d Parser C declarations
@{
@< Standard \OCCAM\ includes @>
#include "pool.h"
#include "symbol_table.h"
#include "occam.h"
#include "interpreter.h"
#include "parser.h"
@}

Bison needs to know what tokens to expect.

@d Parser Bison declarations
@{
%token NL
%token STOP
%token SEQ
@}

\section{Grammar}

The start symbol is \verb|program|, which consists of a single process.

@d Parser grammar rules
@{
program : @< Initialise symbol table for parser @>
          process
          @< Build code for a trivial program @>
        ;
@}

A process can either be a command or some more complex construction.
In this tiny parser, \verb|STOP| is the only command and \verb|SEQ| the
only constructor.

@d Parser grammar rules
@{
process: STOP NL
       | sequence
       ;

sequence: SEQ NL processes
        ;

processes: /* no processes at all */
         | process processes   /* one or more processes */
         ;
@}

Notice the blank entry representing $\varepsilon$; a list of processes
might include none at all.

Before the program is even started, the interpreter initialises
processes to handle \verb|stdin| and \verb|stdout|. We therefore have to
tell the symbol table that these things are already on the stack.

@d Initialise symbol table for parser
@{
{
  insert_name (display, 2, DISPLAY_T);
  insert_name (chanblock, CHANNEL_BLOCK_SIZE, CHAN_BLOCK_T);
  insert_name (find_name("stdin" ), 1, CHAN_T);
  insert_name (chanblock, CHANNEL_BLOCK_SIZE, CHAN_BLOCK_T);
  insert_name (find_name("stdout"), 1, CHAN_T);
}
@}

Rather than generate real code while parsing, this minimal solution
writes only the ``stop now'' code that goes at the end.

@d Build code for a trivial program
@{
{
  program_code = build_ccode(exit_scope_code(), Stop, END);
}
@}

The call to \verb|exit_scope_code| generates appropriate code to
deallocate one level of scoped values from the stack; these will be the
channels for \verb|stdin| and \verb|stdout| mentioned above. The
variable \verb|program_code| is where the parser must put the code
sequence for the whole program. See Section~\ref{codemanip} for some
information on how to construct code sequences.

\section{User code}

The only additional user code required is that for error reporting.

@d Parser user code
@{
int yyerror (char* s)
{
  /* Bison's own message in s is not used. */
  error ("Parse error", "Unrecognized input.\n");
  return 0;
}
@}
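
To see which token sequences the grammar above accepts without running
Bison, the rules can be mirrored by a small hand-written recogniser.
The following is only a sketch and is not part of the compiler: the
token codes \verb|T_END|, \verb|T_NL|, \verb|T_STOP| and \verb|T_SEQ|,
the \verb|struct cursor| type and the functions \verb|take| and
\verb|recognise| are all hypothetical names invented for the
illustration, and the token array stands in for the lexer. The sketch
lets a \verb|SEQ| absorb every process that follows it, which
corresponds to Bison's default of preferring a shift when the lookahead
leaves it a choice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes; the real ones come from parser.tab.h. */
enum token { T_END = 0, T_NL, T_STOP, T_SEQ };

/* Position in a T_END-terminated token array (stands in for the lexer). */
struct cursor { const enum token *toks; size_t pos; };

/* Consume the next token if it is t; never called with T_END. */
static int take(struct cursor *c, enum token t)
{
    if (c->toks[c->pos] != t) return 0;
    c->pos++;
    return 1;
}

static int parse_process(struct cursor *c);

/* processes: (empty) | process processes */
static int parse_processes(struct cursor *c)
{
    while (c->toks[c->pos] == T_STOP || c->toks[c->pos] == T_SEQ)
        if (!parse_process(c)) return 0;
    return 1;
}

/* process: STOP NL | SEQ NL processes */
static int parse_process(struct cursor *c)
{
    if (take(c, T_STOP)) return take(c, T_NL);
    if (take(c, T_SEQ))  return take(c, T_NL) && parse_processes(c);
    return 0;
}

/* program: a single process, then end of input. Returns 1 if accepted. */
int recognise(const enum token *toks)
{
    struct cursor c;
    assert(toks != NULL);
    c.toks = toks;
    c.pos = 0;
    return parse_process(&c) && c.toks[c.pos] == T_END;
}
```

Under these assumptions, \verb|recognise| accepts a lone \verb|STOP|
line and a \verb|SEQ| with any number of following processes (including
none), but rejects two \verb|STOP| lines in a row at the top level,
because \verb|program| is a single process.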