Control Decoupling


In the ACRI-1 architecture, control transfers are partitioned into two groups: those which implement leaf-level loop control (leaf-level loops are those without internal cycles), and those which implement all other control transfers. The AU and DU can each execute simple looping constructs, which permits the compiler to target leaf-level loop control directly onto the AU and DU. All remaining control transfers are executed by a third unit, the Control Unit (CU). In effect, the CU sequences the program through its flow graph, dispatching leaf-level loops intact to the AU and the DU.
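The partitioning described above can be sketched in a few lines of Python. This is an illustrative model only, not ACRI-1 code: the `Loop` class, the `run_on_cu` function and the loop names are hypothetical, and "without internal cycles" is read here as "containing no nested loop".

```python
from dataclasses import dataclass, field


@dataclass
class Loop:
    """A loop in a hypothetical loop-nest tree; `inner` holds nested loops."""
    name: str
    trip_count: int
    inner: list = field(default_factory=list)

    def is_leaf(self):
        # Leaf-level: the loop body contains no nested loop.
        return not self.inner


def run_on_cu(loop, dispatched):
    """Model the CU: keep non-leaf control, hand leaf loops over intact."""
    if loop.is_leaf():
        # The whole loop, trip count included, is dispatched in one step.
        dispatched.append((loop.name, loop.trip_count))
        return
    for _ in range(loop.trip_count):  # outer-loop control stays on the CU
        for child in loop.inner:
            run_on_cu(child, dispatched)


# Outer loop "i" runs twice; its body holds two leaf loops, "j" and "k".
nest = Loop("i", 2, [Loop("j", 100), Loop("k", 50)])
dispatched = []
run_on_cu(nest, dispatched)
print(dispatched)
```

In this model the CU iterates only over the outer loop `i`; each leaf loop reaches the AU and DU as a single unit rather than as 100 or 50 separate branch decisions.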

Control-decoupled architectures share some similarities with vector processors, in which a scalar unit dispatches vector instructions to a vector load pipeline and vector arithmetic pipelines. However, the differences are significant. Firstly, the body of the leaf loop on the AU and the DU is derived directly from the source code, without any need to vectorize it. Secondly, the compiler's partitioning of code between units is driven by data dependencies, not by which instructions can or cannot be vectorized. Thirdly, there is a high degree of asynchrony between the three units, which permits the CU, for example, to enqueue loop dispatch blocks for the AU and DU well in advance of their execution. The CU is, in many ways, a natural (prefix) extension of the virtual pipeline connecting the AU to the DU through memory.
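The asynchrony between the three units can be pictured with a pair of queues. The sketch below is a simplification under stated assumptions: the queue names, the `cu_dispatch` and `unit_step` helpers, and the idea that both units receive an identical block are all hypothetical. In a real machine the AU and DU would receive different code for the same loop.

```python
from collections import deque

au_queue, du_queue = deque(), deque()  # hypothetical dispatch queues
log = []


def cu_dispatch(loop_name, trip_count):
    # The CU enqueues a dispatch block for each unit and moves on at once.
    block = (loop_name, trip_count)
    au_queue.append(block)
    du_queue.append(block)
    log.append(f"CU enqueued {loop_name}")


def unit_step(unit, queue):
    # A unit takes its next dispatch block whenever it becomes free.
    name, trips = queue.popleft()
    log.append(f"{unit} ran {name} x{trips}")


# The CU runs ahead, enqueuing both leaf loops before either one executes.
for name, trips in [("j", 100), ("k", 50)]:
    cu_dispatch(name, trips)

# The AU then runs ahead of the DU; both lag behind the CU.
unit_step("AU", au_queue)
unit_step("AU", au_queue)
unit_step("DU", du_queue)
unit_step("DU", du_queue)

for line in log:
    print(line)
```

The ordering in `log` (CU first, then AU, then DU) mirrors the description of the CU as a prefix stage of the pipeline that links the AU to the DU.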

ships@dcs.ed.ac.uk
Wed Mar 1 16:43:22 GMT 1995