The MU5 Instruction Set

MU5 was the fifth large scale computer system designed and built at the University of Manchester [1]. Design work started in the mid 1960s and MU5 became operational in the early 1970s. The design of the ICL 2900 Series of computers owed much to MU5, though ICL did not formally acknowledge this until 1998.

During the design of the MU5 processor the prime target was fast, efficient processing of high-level language programs in an interactive environment. This requirement led to the need for an instruction set which satisfied the following conditions

Generation of efficient code by compilers must be easy.
Programs must be compact.
The instruction set must allow a pipeline organisation of the processor leading to a fast execution rate.
Information on the nature of operands should be available to allow optimal buffering of operands.

Algol and FORTRAN were taken as being typical of two major classes of programming languages, distinguished by the use of recursive routines (or procedures) and dynamic storage allocation in the former, and non-recursive routines and compile time storage allocation in the latter, but having the following features in common

Each routine has local working space consisting of named variables of integer (fixed-point) and real (floating-point) types, and structured data sets (arrays, for example) which are also named.
A routine may refer to non-local names either of an enclosing routine or in a workspace common to all routines.
Statements involve an arbitrary number of operands which may be real names, array elements, function calls or constants.

In order to reconcile item (c) with conditions (1) and (4), it was decided that there would be an address form corresponding to each of the different operand forms. However, it was felt that to have more than one such operand specification per instructions would conflict with conditions (2) and (3) (although subsequent simulation experiments showed that this was not necessarily true), and the use of addressable fast registers was also rejected. There were two arguments against registers. The first was the desire to simplify the software by eliminating the need for optimising compilers. The second was the desire to avoid the need to dump and restore register values during procedure entry and exit and process changes. The alternative arrangement proposed for MU5 was a small associatively addressed Name Store, described under Storage Hierarchies.

The choice of instruction format was therefore limited to zero or one-address. A zero-address system was rejected on two main grounds. Firstly, the hand-coded library procedures which would be used for input/output handling, mathematical functions, etc., almost all required more code in zero-address than in one-address instructions. This was because the main calculation, the address calculations and control counting tended to interfere with each other on the stack. Secondly, execution of statements such as

A := B + C

would actually run more slowly on a stacking machine with the proposed hardware organisation for MU5. This statement would be coded in one-address and zero-address instructions as

One-address	Zero-address
`ACC = B`	`STACK B`
`ACC + C`	`STACK C`
	`ADD`
`ACC => A`	`STORE A`

If the operands A, B and C were held in main store, then operand accessing would dominate the execution times of these sequences, and both would be approximately the same. However, using the Name Store, the access times to these operands and to the stack would be the same, and the zero-address sequence would involve six stack accesses (the ADD involves two reads and a write) in addition to the operand accesses.

The instruction format eventually chosen for MU5 represents a merger of single address and stacking machine concepts. All the arithmetic and logical operations involve the use of an accumulator and an operand specified in the instruction, but there is a variant of the load order (*=) which causes the accumulator to be stacked before being re-loaded. In addition, there is an address form (STACK) which unstacks the last stacked quantity. Thus the expression

RESULT := (A + B) * ((C + D)/(E + F))

would compile into

ACC = A ACC + B ACC *= C ACC + D ACC *= E ACC + F ACC /: STACK ACC * STACK ACC => RESULT

The function /: is REVERSE DIVIDE; if the operand to the left of an operator is stacked, it subsequently appears as the right hand side of a machine function (as the dividend in this case). Thus, for the non-commutative operations - and /, the reverse operations have to be provided. In pure stacking machines all subtractions and divisons are implicitly reverse operations.

The instruction set is summarised in the figure. There were two basic formats, one for computational functions and one for organisational functions (control transfers, base register manipulations, etc.). They were distinguished by the type of function bits and used four and six bits respectively to specify the function. The remaining nine or seven bits specified the primary operand. The k field indicated the kind of operand, and determined how the n field was interpreted. Where the k field indicated the Extended Operand Specification, the n field was interpreted as Kn', where K indicated the kind of operand, and the instruction was extended by an additional 16 bits in the case of a name (N), or by 16, 32, or 64 bits in the case of a literal. The n' field indicated the size and type of the literal or the base to which N was to be added.

The formal segmentation of the virtual address space, and the unique identification of store words belonging to different processes were achieved by splitting the address into three separate fields (thus producing 3-dimensional addressing), a 4-bit Process Number, a 14-bit Segment Number, and a 16-bit address defining a 32-bit word within a segment. Scalar variables were normally held in a single segment of a process at addresses formed by adding the 6 or 16-bit name in the instruction to the contents of the Name Base register (NB). The name segment number and the process number (both held in special registers in the processor) were concatenated with these addresses to form the full virtual addresses required for store accessing. The value held in NB was unique to the name space associated with each procedure, and was altered at each procedure change. Access to names of other procedures and to common variables was made using the Extra Name Base (XNB), while using zero as base allowed absolute addressing into the Name Segment (to access the Name Base values of different procedures, for example). The Stack Front register (SF) pointed to the area immediately beyond the name space of the current procedure, so that stacked variables were contained within the same address space as the names. SF was automatically incremented or decremented during the execution of a stacking function or a STACK operand access respectively.

Access to array elements was achieved through a descriptor mechanism. When a data structure element is specified by the k field, the corresponding primary operand is a 64-bit descriptor. This descriptor was sent to a D-unit within the processor, together with the contents of the B register if the k field specified modification. The D-unit interpreted the descriptor and made a secondary store access for the element. It was this detachment of secondary addresses from instructions, and the use of names within instructions, which allowed the full 30-bit virtual address space available to a process to be referenced by 16-bit instructions.

There were two main types of descriptor

String descriptor: Ts (8 bits) Length (24 bits) Origin (32 bits)

Vector descriptor: Tv (8 bits) Bound (24 bits) Origin (32 bits)

String descriptors describe strings of bytes. The length field specifies the number of bytes, and the origin specifies the byte address of the first byte. Strings are normally used in conjunction with string processing (Store-to-Store) orders (which may be two-address operations involving the use of two descriptors). These orders provide facilities for the manipulations of data records in languages such as COBOL, PL/1 and Algol 68 (move, compare, logically combine, etc.).

The size of the elements in a vector is specified within the type bits (Tv) and may be between 1 and 64 bits. If the operand access is modified, the modifier must be greater than or equal to zero and less than the bound. This check was carried out automatically in MU5 by hardware with virtually no time penalty, and removed the need for the costly software bound check which is often required, particularly during program development. The array bound problem (i.e. the problem of accesses to array elements reading or writing data outside the bounds of the array) persists to this day and has been the cause of numerous security lapses in operating systems. Descriptor accessing mechanisms are discussed in more detail in the section on Vector Processors.