Access Decoupling

In a decoupled access/execute (DAE) architecture, accessing memory and computing on the values fetched from memory can be thought of as semi-independent micro-threads implemented on separate asynchronous units. In ACRI-1 terminology, the Address Unit (AU) computes addresses and issues memory requests. The Data Unit (DU) receives data from memory, computes new values, and sends new data back to memory. The AU and DU each execute their own program, containing only the instructions specific to that unit. The only constraint on the AU and DU programs is that the order in which operand addresses are issued by the AU must match precisely the order in which operands are used by the DU.
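
The split can be illustrated with a minimal sketch for the loop c[i] = a[i] + b[i]. This is not ACRI-1 code: the names `au_program`, `du_program` and `run`, the tuple format of the requests, and the unbounded queues are all assumptions made for illustration. For simplicity the AU is run to completion before the DU starts, which is legal only because the queues here never fill.

```python
from collections import deque

def au_program(n, base_a, base_b, base_c):
    """Hypothetical AU program: issues addresses only, never touches data."""
    for i in range(n):
        yield ("load", 0, base_a + i)      # operand for load queue 0
        yield ("load", 1, base_b + i)      # operand for load queue 1
        yield ("store", None, base_c + i)  # address for the result

def du_program(n, load_queues, store_data):
    """Hypothetical DU program: consumes operands in the order the AU issued them."""
    for _ in range(n):
        x = load_queues[0].popleft()
        y = load_queues[1].popleft()
        store_data.append(x + y)

def run(memory, n, base_a, base_b, base_c):
    load_queues = (deque(), deque())
    store_addrs = deque()
    store_data = deque()
    # The "memory system": services each AU request and buffers the result.
    for kind, queue_id, addr in au_program(n, base_a, base_b, base_c):
        if kind == "load":
            load_queues[queue_id].append(memory[addr])
        else:
            store_addrs.append(addr)
    du_program(n, load_queues, store_data)
    # Pair each store address with the store data produced by the DU.
    while store_addrs:
        memory[store_addrs.popleft()] = store_data.popleft()
    return memory
```

Neither program refers to the other; they agree only through the order of queue entries, which is the ordering constraint stated above.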

The AU tags each memory fetch request with the location in a load queue within the DU where the incoming data will be buffered. This tag permits the physical memory system to process requests in any order; the DU load queue re-orders them as they arrive, ensuring that the AU-DU ordering constraint mentioned earlier is always satisfied. In the ACRI-1 architecture there are two independent load paths to memory, and two independent load queues in the DU.
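
A tagged load queue of this kind can be sketched as a small circular buffer. The class below is an assumption-laden model, not the ACRI-1 design: the names `LoadQueue`, `allocate`, `fill`, `head_ready` and `pop` are hypothetical, and out-of-order memory is imitated by shuffling the outstanding requests.

```python
import random

class LoadQueue:
    """Hypothetical DU load queue: slots are allocated in AU issue order,
    filled in any order, and read strictly from the head."""

    def __init__(self, size):
        self.slots = [None] * size
        self.valid = [False] * size
        self.head = 0    # next slot the DU will read
        self.tail = 0    # next slot the AU will allocate
        self.count = 0   # slots currently allocated

    def allocate(self):
        """Called at issue time; the returned slot index is the request tag."""
        if self.count == len(self.slots):
            raise RuntimeError("load queue full: the AU would stall")
        tag = self.tail
        self.tail = (self.tail + 1) % len(self.slots)
        self.count += 1
        return tag

    def fill(self, tag, value):
        """Called when memory returns data, in whatever order it likes."""
        self.slots[tag] = value
        self.valid[tag] = True

    def head_ready(self):
        return self.count > 0 and self.valid[self.head]

    def pop(self):
        """Called by the DU; only valid when head_ready() is true."""
        value = self.slots[self.head]
        self.valid[self.head] = False
        self.head = (self.head + 1) % len(self.slots)
        self.count -= 1
        return value

def demo(memory, addrs, seed=0):
    queue = LoadQueue(len(addrs))
    requests = [(queue.allocate(), addr) for addr in addrs]
    random.Random(seed).shuffle(requests)   # memory replies out of order
    received = []
    for tag, addr in requests:
        queue.fill(tag, memory[addr])
        while queue.head_ready():           # DU drains whatever is in order
            received.append(queue.pop())
    return received
```

Whatever the shuffle, `demo` returns the values in the order of `addrs`, because the DU can only ever read the head slot.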

The AU is optimised to implement the most common operations on induction variables. Thus, it has a simple integer instruction set and instruction modes that combine an induction-variable update with the issue of a memory address. In a single instruction, an induction variable can be incremented by a constant (or by the contents of a register), and the result can be written back to the induction variable as well as sent to memory as a load or store address. In the ACRI-1 architecture, two load addresses and one store address can be computed and sent to memory in each cycle.
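
The combined update-and-issue operation can be modelled in a few lines. The function names, register names and request labels below are hypothetical; the sketch assumes only what the text states, namely that the incremented value is both written back and used as the address. Each induction variable therefore starts one step below its base address.

```python
def au_update_and_issue(regs, ivar, step, kind, requests):
    """Model of one AU instruction: ivar += step, then issue the result as an address.

    `step` is either an integer constant or the name of a register in `regs`.
    """
    increment = regs[step] if isinstance(step, str) else step
    result = regs[ivar] + increment
    regs[ivar] = result               # write back to the induction variable
    requests.append((kind, result))   # send the same value to memory
    return result

def generate(n, base_a, base_b, base_c, stride=1):
    """Issue two load addresses and one store address per modelled cycle."""
    regs = {
        "pa": base_a - stride,  # pre-biased: the first increment yields base_a
        "pb": base_b - stride,
        "pc": base_c - stride,
        "s": stride,
    }
    cycles = []
    for _ in range(n):
        issued = []
        au_update_and_issue(regs, "pa", "s", "load0", issued)    # register step
        au_update_and_issue(regs, "pb", "s", "load1", issued)    # register step
        au_update_and_issue(regs, "pc", stride, "store", issued) # constant step
        cycles.append(issued)
    return cycles
```

For example, `generate(2, 100, 200, 300, stride=4)` issues addresses 100, 200 and 300 in the first cycle and 104, 204 and 304 in the second.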

The memory of the ACRI-1 system is highly interleaved to provide the required bandwidth of two loads and one store per cycle (per processor). In addition, the bank selection logic implements a pseudo-random hashing function [8][7] to ensure an even spread of addresses across banks, even in the presence of strides that would cause serious performance problems in traditional vector machines.
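
The effect of hashed bank selection can be seen in a small experiment. The hash actually used in the ACRI-1 is the one described in [8][7] and is not reproduced here; the sketch below substitutes a simple XOR fold of the address bits, and the bank count of 16 is likewise an assumption chosen for illustration.

```python
from collections import Counter

NUM_BANKS = 16   # assumed bank count, a power of two
BANK_BITS = 4    # log2(NUM_BANKS)

def bank_modulo(addr):
    """Conventional interleaving: the low address bits select the bank."""
    return addr % NUM_BANKS

def bank_xor_fold(addr):
    """Stand-in hash: XOR together successive BANK_BITS-wide address fields."""
    bank = 0
    while addr:
        bank ^= addr & (NUM_BANKS - 1)
        addr >>= BANK_BITS
    return bank

def bank_histogram(addrs, select):
    """Count how many of the given addresses fall in each bank."""
    return Counter(select(a) for a in addrs)
```

With a stride equal to the number of banks, for example `addrs = [16 * k for k in range(64)]`, modulo selection sends all 64 accesses to bank 0, while the XOR fold places exactly 4 in each of the 16 banks. A fold this simple has bad strides of its own (stride 17 is one), which is one reason to prefer a stronger pseudo-random function.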

ships@dcs.ed.ac.uk
Wed Mar 1 16:43:22 GMT 1995