System architecture

The prototype connection machine, known as CM-1, was manufactured by Thinking Machines Corporation (TMC) with the primary design goal of testing the principles of connection machine architecture. The CM-1 is only one of many possible implementations of a connection machine; it could be argued that the DAP is also a connection machine.

The system-level architecture of CM-1 is illustrated in the figure, in which the similarity to the DAP (and most other SIMD array processors) is clearly visible. The array of processing elements (PEs), each comprising a simple boolean processor and some local memory, is seen by the host machine simply as an extended region of memory. The host computer directs the connection machine to execute the parallel portions of code; in this respect it differs from the DAP, which has an instruction processor built into the array unit. The CM-1 host broadcasts a sequence of instructions to the array micro-controller, which interprets each host instruction and broadcasts a corresponding sequence of micro-instructions to the array of PEs.
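The host, micro-controller and PE relationship described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the host instruction "ADD", the micro-operations "CLRC" and "FULLADD", the memory layout and all function names are hypothetical, chosen for illustration, and are not the actual CM-1 instruction set.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PE:
    """One processing element: a one-bit boolean processor, a carry flag
    and a small local bit memory (layout is an illustrative assumption)."""
    memory: List[int]
    carry: int = 0

    def execute(self, op):
        # Hypothetical micro-instructions, not the real CM-1 micro-code.
        if op[0] == "CLRC":
            self.carry = 0
        elif op[0] == "FULLADD":
            _, a, b, dst = op
            total = self.memory[a] + self.memory[b] + self.carry
            self.memory[dst] = total & 1   # sum bit
            self.carry = total >> 1        # carry out
        else:
            raise ValueError(f"unknown micro-instruction: {op[0]}")


def expand(host_instr):
    """Micro-controller: turn one host instruction into micro-instructions."""
    name, a_base, b_base, dst_base, width = host_instr
    if name != "ADD":
        raise ValueError(f"unknown host instruction: {name}")
    ops = [("CLRC",)]
    for i in range(width):  # bit-serial add, least significant bit first
        ops.append(("FULLADD", a_base + i, b_base + i, dst_base + i))
    return ops


def broadcast(pes, host_program):
    """Send every micro-instruction to every PE (conceptually in lockstep)."""
    for instr in host_program:
        for op in expand(instr):
            for pe in pes:
                pe.execute(op)


def to_bits(value, width):
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))


def run_add_demo(pairs, width=8):
    """Load one (x, y) pair per PE, broadcast a single ADD, read the sums."""
    pes = [PE(to_bits(x, width) + to_bits(y, width) + [0] * width)
           for x, y in pairs]
    broadcast(pes, [("ADD", 0, width, 2 * width, width)])
    return [from_bits(pe.memory[2 * width:3 * width]) for pe in pes]
```

Calling run_add_demo with a list of integer pairs issues one host instruction, which the simulated micro-controller expands into nine micro-instructions for 8-bit fields; every PE applies them to its own local data, and sums wider than the field wrap around modulo 2^width.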

The processor-memory cells, like those of the DAP, are so small and slow that individually they cannot perform meaningful computations. In CM-1, running CM-Lisp, these cells are linked together in data-dependent patterns called active data structures. Low-level operations on active data structures can be evaluated in parallel by the boolean processors acting in concert, each on its local segment of the structure. This is how connection machines exploit parallelism and sustain high processing rates.

Network structure

An important feature of a connection machine is its support for programmable links between PEs. In the DAP, when one processor communicates with its northern neighbour, all processors must communicate with their northern neighbours, or not at all. This is because the DAP has a static square-mesh communication network, which supports only eight routing functions. Communication in the CM-1 is significantly more powerful: each group of sixteen processing elements shares a link into a packet-switched binary 12-cube network, and each PE also has individual connections to a DAP-like grid (known as the North-East-West-South, or NEWS, grid). Essentially, this means that every PE can compute the address of the PE to which it wants to send a message and then use the 12-cube network to route the message in logarithmic time, that is, in O(log N) hops, where N is the number of PEs. A two-dimensional grid, by comparison, routes messages in O(√N) time. A dynamic binary k-cube network, where k = log₂N, supports the full set of N^N source-to-destination mappings (of which N! are permutations), and in the case of CM-1 this produces a quoted worst-case bandwidth of ~3.2 × 10⁷ bits/s and a best-case bandwidth of ~1.0 × 10⁹ bits/s.
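The difference between logarithmic and O(√N) routing can be made concrete with a short sketch. Sixteen PEs per router and a binary 12-cube imply 2^12 = 4,096 routers and 16 × 4,096 = 65,536 PEs. Everything else below is an assumption made for illustration: the function names are hypothetical, the mapping of a PE address to its router by integer division is a guess at a plausible layout, dimension-order routing stands in for whatever path selection the CM-1 router actually uses, and the grid is modelled as a 256 × 256 mesh without wraparound.

```python
def router_of(pe_address, pes_per_router=16):
    """Router (cube node) serving a PE; the address layout is an assumption."""
    return pe_address // pes_per_router


def hypercube_route(src, dst, k=12):
    """Dimension-order path between two nodes of a binary k-cube.

    Each hop flips one address bit in which the current node differs
    from the destination, working from the lowest dimension upwards.
    """
    path = [src]
    node = src
    for dim in range(k):
        if (node ^ dst) & (1 << dim):
            node ^= 1 << dim
            path.append(node)
    return path


def hypercube_hops(src, dst):
    """Hop count equals the Hamming distance between the two addresses."""
    return bin(src ^ dst).count("1")


def grid_hops(src, dst, side):
    """Manhattan distance on a side x side mesh, row-major, no wraparound."""
    src_row, src_col = divmod(src, side)
    dst_row, dst_col = divmod(dst, side)
    return abs(src_row - dst_row) + abs(src_col - dst_col)
```

Under these assumptions, a message between the two most distant routers of the 12-cube, hypercube_hops(0, 4095), needs 12 hops, whereas a message between opposite corners of the 256 × 256 grid, grid_hops(0, 65535, 256), needs 510 nearest-neighbour steps. The sketch ignores contention for the shared router links, which is one reason a quoted worst-case bandwidth can be far below the best case.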