The architecture of the NCUBE processor was similar in many respects to that of the T800, except that it had eleven links rather than four. Internally the NCUBE processor contained a 32-bit integer ALU with shifter, 16 general-purpose registers, 13 special-purpose registers, a 64-bit IEEE standard floating-point unit, an instruction cache, a memory interface and eleven bi-directional serial channels. Ten of these channels connected a processor to its neighbours in a hypercube of up to 1024 processors; the eleventh link on each processor was used to support distributed I/O.
The processor was organised as a four-stage pipeline and was able to execute simple register-to-register instructions at a peak rate of one every 200 ns. Unconditional branch instructions (branching within the cache) took 500 ns and conditional branches took 600 ns. However, the floating-point performance of the NCUBE was at most 0.5 MFLOPS per processor compared with 1.5 MFLOPS for a 20 MHz IMS T800.
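The timings above convert directly into instruction rates. The following is a minimal sketch of that arithmetic; the function names and the instruction mix in the final helper are illustrative assumptions, not measurements from the NCUBE, and the model ignores memory stalls and cache misses.

```python
# Timings and ratings quoted in the text.
REG_TO_REG_NS = 200       # simple register-to-register instruction
UNCOND_BRANCH_NS = 500    # unconditional branch within the cache
COND_BRANCH_NS = 600      # conditional branch
NCUBE_MFLOPS = 0.5        # peak per processor
T800_MFLOPS = 1.5         # 20 MHz IMS T800


def rate_mips(ns_per_instruction):
    """Millions of instructions per second for a given time per instruction."""
    return 1000 / ns_per_instruction


def mean_instruction_ns(branch_fraction, cond_share=0.5):
    """Average time per instruction for a hypothetical instruction mix.

    branch_fraction: assumed fraction of instructions that are branches.
    cond_share: assumed fraction of those branches that are conditional.
    All non-branch instructions are assumed to take the 200 ns peak time.
    """
    branch_ns = cond_share * COND_BRANCH_NS + (1 - cond_share) * UNCOND_BRANCH_NS
    return (1 - branch_fraction) * REG_TO_REG_NS + branch_fraction * branch_ns


peak_mips = rate_mips(REG_TO_REG_NS)          # peak integer rate
flops_ratio = T800_MFLOPS / NCUBE_MFLOPS      # T800 advantage in floating point
```

Under these assumptions the 200 ns figure corresponds to a peak of 5 MIPS, and the T800's floating-point rating is three times that of the NCUBE. Any realistic mix containing branches lowers the sustained integer rate below the peak.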
Perhaps the most impressive feature of the NCUBE was its physically compact construction. The small amount of memory in each processing element meant that 64 processing elements could be accommodated on a single printed circuit board, and hence a full 1024-element machine fitted in a single rack on just 16 boards. Consequently, all communication signal paths were less than 24 inches long.
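The packaging figures and the link count fit together arithmetically. The sketch below assumes the processors are wired as a binary hypercube, in which two nodes are neighbours when their addresses differ in exactly one bit; the function names are illustrative and do not come from any NCUBE software.

```python
NODES_PER_BOARD = 64
BOARDS_PER_RACK = 16
LINKS_PER_PROCESSOR = 11


def hypercube_dimension(nodes):
    """Return d such that nodes == 2**d, or raise if nodes is not a power of two."""
    dim = nodes.bit_length() - 1
    if nodes < 1 or (1 << dim) != nodes:
        raise ValueError("node count must be a power of two")
    return dim


def neighbours(node, dim):
    """Addresses of the dim neighbours of a node: flip each address bit in turn."""
    return [node ^ (1 << k) for k in range(dim)]


total_nodes = NODES_PER_BOARD * BOARDS_PER_RACK      # processors in one rack
dim = hypercube_dimension(total_nodes)               # links needed per node
spare_links = LINKS_PER_PROCESSOR - dim              # links left over for I/O
```

With 64 × 16 = 1024 processors the hypercube has ten dimensions, so ten of the eleven links go to neighbouring processors and one remains, which matches the single link the text assigns to distributed I/O.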