\chapter{The Parser and Code Generator}

This chapter contains only a minimal parser.  Like the lexer, it ignores
almost every part of the language; it also generates no meaningful code.  It
does, however, interact properly with the rest of the \OCCAM\ compiler, so it
makes a suitable basis for improvement.

\section{Parser files}
\label{sec:parserfiles}

Only one function is exported by the parser.
@o code/parser.h @{
#ifndef PARSER_H
#define PARSER_H
extern int yyparse(void);
#endif
@| yyparse @}
The parser description follows the standard Bison layout of four parts: C
declarations, Bison declarations, grammar rules and user code.
@o code/parser.y @{
%{
@< Parser C declarations @>
%}
@< Parser Bison declarations @>
%%
@< Parser grammar rules @>
%%
@< Parser user code @>
@}
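We record the parser's dependencies in the Makefile, together with the object
file it will generate.  With \verb|--defines|, Bison writes the token
definitions to \verb|parser.tab.h| alongside \verb|parser.tab.c|.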
@d Makefile dependencies @{
parser.tab.o: occam.h pool.h interpreter.h symbol_table.h lexer.h parser.h
parser.tab.c parser.tab.h : parser.y
	bison --defines parser.y
@}
@d Occam object files @{ parser.tab.o @}


\section{Declarations}

The only declarations required by C are the included header files.
@d Parser C declarations @{
@< Standard \OCCAM\ includes @>

#include "pool.h"
#include "symbol_table.h"
#include "occam.h"
#include "interpreter.h"
#include "parser.h"
@}
Bison needs to know what tokens to expect.
@d Parser Bison declarations @{
%token NL
%token STOP
%token SEQ
@}

\section{Grammar}

The start symbol is \verb|program|, which consists of a single process.
@d Parser grammar rules @{

program : 
@< Initialise symbol table for parser @>
  process
@< Build code for a trivial program @>
  ;
@}
A process can either be a command or some more complex construction.  In this
tiny parser, \verb|STOP| is the only command and \verb|SEQ| the only
constructor.
@d Parser grammar rules @{
process: STOP NL | sequence;

sequence: SEQ NL processes;

processes:                    /* no processes at all */
         | process processes; /* one or more processes */
@}
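Notice the blank entry representing $\varepsilon$; a list of processes might
include none at all.  Because none of the tokens marks where a nested
\verb|SEQ| ends, the grammar is ambiguous: Bison reports shift/reduce
conflicts and resolves them by shifting, so a nested \verb|SEQ| absorbs every
process that follows it.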
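
To make the shape of these rules concrete, the following is a minimal
hand-written recogniser for the same three-token grammar.  It is only a
sketch, not the code Bison generates: the names \verb|TOK_END|,
\verb|struct cursor|, \verb|take| and \verb|recognise_program| are
hypothetical, and the token values are arbitrary rather than those Bison
assigns.

```c
#include <stddef.h>

/* Token codes mirroring the three %token declarations; TOK_END marks the
   end of input.  The numeric values are arbitrary (an assumption). */
enum token { TOK_END, TOK_NL, TOK_STOP, TOK_SEQ };

/* Cursor over a token array terminated by TOK_END (hypothetical helper). */
struct cursor { const enum token *toks; size_t pos; };

static enum token peek(const struct cursor *c) { return c->toks[c->pos]; }

/* Consume the next token if it equals t; return 1 on success, 0 otherwise. */
static int take(struct cursor *c, enum token t)
{
    if (peek(c) != t) return 0;
    c->pos++;
    return 1;
}

static int parse_process(struct cursor *c);

/* processes: (empty) | process processes
   Written as a loop: continue while the next token can start a process. */
static int parse_processes(struct cursor *c)
{
    while (peek(c) == TOK_STOP || peek(c) == TOK_SEQ)
        if (!parse_process(c)) return 0;
    return 1;
}

/* process: STOP NL | SEQ NL processes */
static int parse_process(struct cursor *c)
{
    if (take(c, TOK_STOP)) return take(c, TOK_NL);
    if (take(c, TOK_SEQ))  return take(c, TOK_NL) && parse_processes(c);
    return 0;
}

/* program: a single process, followed by the end of input. */
int recognise_program(const enum token *toks)
{
    struct cursor c = { toks, 0 };
    return parse_process(&c) && peek(&c) == TOK_END;
}
```

The loop in \verb|parse_processes| is greedy, so a nested \verb|SEQ| claims
all the processes after it; an empty \verb|SEQ| is accepted because the loop
may run zero times.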

Before the program is even started, the interpreter initialises processes to
handle \verb|stdin| and \verb|stdout|.  We therefore have to tell the symbol
table that the display, the two channel blocks and the two channels are
already on the stack.
@d Initialise symbol table for parser @{
    { insert_name (display, 2, DISPLAY_T);         
      insert_name (chanblock,  CHANNEL_BLOCK_SIZE, CHAN_BLOCK_T);
      insert_name (find_name("stdin" ), 1, CHAN_T);
      insert_name (chanblock,  CHANNEL_BLOCK_SIZE, CHAN_BLOCK_T);
      insert_name (find_name("stdout"), 1, CHAN_T); }
@}
Rather than actually generate real code while parsing, this minimal solution
writes only the ``stop now'' code to go at the end.
@d Build code for a trivial program @{
    { program_code = build_ccode(exit_scope_code(), Stop, END); }
@}
The call to \verb|exit_scope_code| generates appropriate code to deallocate
one level of scoped values from the stack --- these will be the channels for
\verb|stdin| and \verb|stdout| mentioned above.
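
The bookkeeping behind this can be pictured with a small sketch.  It is not
the compiler's symbol table: \verb|struct scope|, \verb|scope_insert|,
\verb|scope_offset| and \verb|scope_exit| are hypothetical names, the value
of \verb|CHANNEL_BLOCK_WORDS| is invented, and it assumes that the second
argument of \verb|insert_name| is a size in stack words and that offsets grow
upwards from the start of the scope.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAMES 32
#define CHANNEL_BLOCK_WORDS 4  /* assumed; not the real CHANNEL_BLOCK_SIZE */

/* One entry per name: its size in stack words and its offset in the scope. */
struct entry { const char *name; int size; int offset; };

/* A single scope level: the names it holds and the words they occupy. */
struct scope { struct entry entries[MAX_NAMES]; int count; int words; };

/* Record a name occupying `size` words; return its offset, or -1 if full. */
int scope_insert(struct scope *s, const char *name, int size)
{
    struct entry *e;
    if (s->count >= MAX_NAMES) return -1;
    e = &s->entries[s->count++];
    e->name = name;
    e->size = size;
    e->offset = s->words;
    s->words += size;
    return e->offset;
}

/* Look a name up, newest first; return its offset, or -1 if unknown. */
int scope_offset(const struct scope *s, const char *name)
{
    int i;
    for (i = s->count - 1; i >= 0; i--)
        if (strcmp(s->entries[i].name, name) == 0)
            return s->entries[i].offset;
    return -1;
}

/* Leave the scope: report how many words to deallocate, then forget
   every name that was declared in it. */
int scope_exit(struct scope *s)
{
    int words = s->words;
    s->count = 0;
    s->words = 0;
    return words;
}
```

Under these assumptions, registering the five pre-allocated items and then
leaving the scope yields the number of words that the deallocation code would
have to release.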

The variable \verb|program_code| is where the parser must put the code
sequence for the whole program.  See Section~\ref{codemanip} for some
information on how to construct code sequences.


\section{User code}

The only additional user code required is that for error reporting.
@d Parser user code @{
int yyerror (char* s)
{
  /* Bison's own message in s is not used; a fixed one is reported. */
  error ("Parse error", "Unrecognized input.\n");
  return 0;
}
@}
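
A natural first improvement is to pass on the message that Bison supplies and
to say where the error occurred.  The sketch below shows one way this might
look; \verb|format_parse_error| and the line counter \verb|current_line| are
hypothetical, and it assumes that the lexer would advance the counter on each
\verb|NL|.

```c
#include <stdio.h>

/* Hypothetical line counter; assumed to be advanced by the lexer on NL. */
static int current_line = 1;

/* Format a diagnostic into buf, writing at most n bytes including the
   terminating NUL.  A null message falls back to a fixed text.
   Returns whatever snprintf returns. */
int format_parse_error(char *buf, size_t n, const char *bison_msg)
{
    return snprintf(buf, n, "Parse error (line %d): %s\n", current_line,
                    bison_msg ? bison_msg : "unrecognised input");
}
```

The resulting string could then be handed to the compiler's own reporting
routine in place of the fixed text.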