System architecture
The prototype connection machine, known as CM-1, was manufactured by
Thinking Machines Corporation (TMC), with the primary design goal
of testing the principles of connection machine architecture. The
CM-1 is only one of many possible implementations of a connection
machine; it could be argued that the DAP is also a connection machine.
The system-level architecture of CM-1 is illustrated in the figure,
wherein the similarity to the DAP (and most other SIMD array
processors) is clearly visible. The array of processing elements (PEs),
each comprising a simple boolean processor and some local memory, is
seen by the host machine simply as an extended region of memory. The
host computer directs the connection machine to execute the parallel
portions of code, and in this respect the CM-1 differs from the DAP,
which has an instruction processor built into the array unit. The CM-1 host
broadcasts a sequence of instructions to the array micro-controller,
which interprets the instructions and, for each received host
instruction, broadcasts an appropriate sequence of micro-instructions
to the array of PEs.
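
The host / micro-controller / PE relationship described above can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions: the host instruction name "ADD", the micro-instruction
names "clear_carry" and "add_bit", the 4-bit field width and the memory
layout are all hypothetical and are not taken from the CM-1 instruction
set. The point is simply that one host instruction expands into a
sequence of one-bit micro-instructions, each of which is executed by
every PE on its own local memory.

```python
WIDTH = 4  # illustrative field width in bits (an assumption)


class PE:
    """One processing element: a little local memory and a carry flag."""

    def __init__(self, a, b):
        # Fields are stored least significant bit first.
        self.mem = {
            "a": [(a >> i) & 1 for i in range(WIDTH)],
            "b": [(b >> i) & 1 for i in range(WIDTH)],
            "c": [0] * WIDTH,
        }
        self.carry = 0

    def step(self, micro):
        """Execute one broadcast micro-instruction on local memory."""
        op, i = micro
        if op == "clear_carry":
            self.carry = 0
        elif op == "add_bit":
            x, y = self.mem["a"][i], self.mem["b"][i]
            self.mem["c"][i] = x ^ y ^ self.carry            # sum bit
            self.carry = (x & y) | (self.carry & (x ^ y))    # carry out


def expand(host_instruction):
    """Micro-controller: expand one host instruction into micro-instructions."""
    if host_instruction == "ADD":
        return [("clear_carry", None)] + [("add_bit", i) for i in range(WIDTH)]
    raise ValueError(f"unknown host instruction: {host_instruction}")


def run(host_program, pes):
    """Host: send each instruction; every PE receives every micro-instruction."""
    for instruction in host_program:
        for micro in expand(instruction):
            for pe in pes:  # conceptually simultaneous across the array
                pe.step(micro)


def value(pe, field):
    """Read a field of a PE's memory back as an integer."""
    return sum(bit << i for i, bit in enumerate(pe.mem[field]))


pes = [PE(a, b) for a, b in [(1, 2), (3, 4), (7, 8), (5, 5)]]
run(["ADD"], pes)
results = [value(pe, "c") for pe in pes]
```

Here a single host instruction becomes five micro-instructions, and
results holds the four sums computed by the four PEs in lock-step.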
The processor-memory cells, like those of the DAP, are so small and
slow that individually they cannot perform meaningful computations. In
CM-1, running CM-Lisp, these cells are linked together in
data-dependent patterns called active data structures.
Low-level operations on active data structures can be evaluated in
parallel by the low-level boolean processors acting in concert on
their local segments of those structures. This is how connection
machines exploit parallelism and sustain high processing rates.
Network structure
An important feature of a connection machine is its support for
programmable links between PEs. In the DAP, when one processor
communicates with its Northern neighbour all processors must
communicate with their Northern neighbour, or not at all. This is
because the DAP has a static square-mesh communication network, which
only supports eight routing functions. Communication in the CM-1 is
significantly more powerful than this, since each group of sixteen
processing elements shares a link into a packet-switched binary 12-cube
network, as well as having individual connections to a DAP-like grid
(known as the North-East-West-South, or NEWS grid). Essentially this
means that all PEs can compute the address of a PE to which they want
to send a message, and then use the 12-cube network to route the
message in logarithmic time. A two-dimensional grid routes messages in
O(√N) time, where N is the number of PEs. A full set of N^N
permutations is supported by a dynamic binary k-cube network, where
k = log2(N), and in the case of CM-1 this produces a quoted worst-case
bandwidth of ~3.2 x 10^7 bits/s and a best-case bandwidth of
~1.0 x 10^9 bits/s.
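
The difference between logarithmic and O(√N) routing can be made
concrete with a small Python sketch. It assumes textbook dimension-order
routing (correct one differing address bit per hop), which is not
necessarily the algorithm used by the CM-1 router, and it assumes a
square grid without wraparound. The function names and the PE numbering
scheme are hypothetical. The PE count is derived from the figures in the
text: a 12-cube has 2^12 nodes, and sixteen PEs share each node.

```python
import math

K = 12                # cube dimensions, from the text
PES_PER_ROUTER = 16   # PEs sharing one cube link, from the text


def router_of(pe):
    """Cube node serving a PE, assuming PEs are numbered consecutively
    within each group of sixteen (a hypothetical numbering)."""
    return pe // PES_PER_ROUTER


def cube_route(src, dst, k=K):
    """Nodes visited from src to dst in a binary k-cube, flipping one
    differing address bit per hop, lowest dimension first."""
    path = [src]
    node = src
    for dim in range(k):
        bit = 1 << dim
        if (node ^ dst) & bit:
            node ^= bit
            path.append(node)
    return path


def grid_hops(src, dst, side):
    """Hops between two PEs on a side x side mesh without wraparound."""
    src_row, src_col = divmod(src, side)
    dst_row, dst_col = divmod(dst, side)
    return abs(src_row - dst_row) + abs(src_col - dst_col)


n_pes = (1 << K) * PES_PER_ROUTER    # 4096 nodes x 16 PEs
side = math.isqrt(n_pes)             # edge length of the square grid

# Worst cases: opposite corners of the cube and of the grid.
worst_cube = len(cube_route(0, (1 << K) - 1)) - 1
worst_grid = grid_hops(0, n_pes - 1, side)
```

Under these assumptions the cube needs at most k hops between any two
nodes, while the corner-to-corner path on the grid grows with the side
length, which is the contrast the paragraph above draws.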