III. General Purpose Image Processing
Architectures and Machines
The computer
architectures has been classified by Flynn
(1966) into four major categories:
·
SISD (Single Instruction stream Single
Data stream): This is the serial computer architecture.
·
MISD (Multiple Instruction stream
Single Data stream): This involves several processors, which execute different
instructions on the same datum. This category is hypothetical and it is
concerned as impractical.
·
SIMD (Single Instruction stream
Multiple Data stream): Several processors execute the same instruction on
different data.
·
MIMD (Multiple Instruction stream
Multiple Data stream): Several processors execute different instructions on
different data.
Figure 4. Decomposition of a 9
9-pixel structuring element into 9 3
3-pixel structuring elements.
In image
processing, due to the vast amount of data, parallelism is utilized. The
parallel architectures can be grouped into two major categories, synchronous
and MIMD (Duncan
1990). However this grouping is not unique and overlapping
between the categories may be found in the literature (see e.g. (Jonker
1992; Petkov 1993)). According to (Duncan
1990) the categorization is as follows:
·
Synchronous
·
Vector processors
·
SIMD
·
Systolic architectures
·
MIMD
·
Distributed memory
·
Shared memory
Vector processors
utilize multiple processing elements coupled on a multiple bus. These elements
are capable of operating both logical and arithmetic operations on either
vector or scalar numbers. The Delft Image Processor (Gerritsen and Aardema
1981; Jonker 1992) (DIP), shown in Fig. 5, can be characterized as a
vector processor. This processor is capable of generating image and
neighborhood points, it can handle floating point numbers and it is able to
perform recursive operations.
Figure 5. The Delft Image Processor (DIP).
Figure 6. SIMD architecture.
In SIMD
architecture a control unit is used, that provides a single instruction to all
the processing elements concurrently, as shown in Fig. 6. The processing
elements (PEs) execute the instruction on data, which is stored in local
memory. SIMD is the suitable architecture for fast low-level image
processing (Gerritsen and
Aardema 1981; Jonker 1992; Reeves
1984), where mathematical morphology belongs. Examples of
SIMD machines are the CLIP4 (Duff
1982; Preston et al. 1979) the CLIP7A (Fountain
et al. 1988) the AIS-5000 parallel processor (Schmitt and Wilson
1988) and the PAPRICA system (Broggi et al. 1998; Broggi et al.
1997). The processing element of CLIP4 and its network
topology are shown in Figs. 7a and 7b, respectively. As it is shown in this
figure each processing element communicates with its eight nearest neighbors.
For cellular logic operations the neighborhood can be accessed in parallel, as
it happens in all the architectures based on cellular automata. However it is
possible to perform recursive neighborhood operations, as well, via feedback
obtained by the carry flip-flop. The CLIP7 chip is a successor of CLIP4. This
is supplied with more memory per processor, improved image I/O speeds,
increased local control and multi-bit structure.
(a)
(b)
Figure 7.The CLIP4 image processor: (a) block diagram of
the processing element and (b) network topology.
The AIS-5000
system consists of an SIMD array parallel processor, a general-purpose
microprocessor and high-speed I/O hardware for the parallel processor. The
parallel array can contain up to 1024 programmable PEs with local memory and
connections with two neighbors PEs, as shown in Fig. 8a. The PEs are
programmable and can perform several operations, such as Boolean, arithmetic
and neighborhood operations. The implementation of such a neighborhood
operation, named erode_nbr, is shown in Fig. 8b. The PE is programmed via the
truth table, which is given in hexadecimal number. An example of a truth table,
the corresponding hexadecimal number and the structuring element are provided
in Fig. 8c.
(a)
(b)
(c)
Figure 8.
(a) The
AIS-5000 parallel array topology and (b) implementation of binary erosion on
the AIS-5000 system (c) a structuring element and the corresponding truth table
for binary erosion.
The PAPRICA system
has been designed and realized as a massively parallel morphological
coprocessor. It based on a hierarchical morphological computational model.
PAPRICA was designed to be attached on general-purpose machines. It comprises
five major functional parts (Fig. 9a). These are the frame grabber (grabs 512
512 images with 8 bit/pixel resolution at video
rate), the program memory, the image memory, the processor array and the
control unit. The main part ofPAPRICA system is its processor array in which
fits exactly the input image. This array consists of two different type
internal registers: the morphological registers (MOR) and the logical register
(LOR), as shown in Fig. 9b (PAPRICA-3 version with 256 PEs). The MOR consist of
5-bit shift-registers, which allow to perform morphological operations on an
enhanced 3
3, cross-like, neighborhood as the one shown in Fig.
9c. The LOR are used to perform temporary storage, as well as for temporary
storage.
(a)
(b)
(c)
Figure 9. (a) Block diagram of the PAPRICA system, (b)
the PE topology of PAPRICA-3 and (c) the enhanced 3
3neighborhood.
Systolic
architectures or systolic arrays are the most widely used architectures for
low-level image processing. A systolic array is designed to embody a certain
algorithm in hardware. The PEs of a systolic architecture are rigid, i.e. they
are not programmable. The data is pipelined in rhythmic fashion from the memory
to the network of the PEs and it returns back to the memory. The network
topology of a systolic array, denoting the data flow path is shown in Fig. 10.
The synchronization of data is achieved through a global clock, which controls
all the PEs. Implementations of systolic arrays for morphological image
processing are analytically described in the next section.
Figure 10.Systolic array network topology and data flow
path.
The MIMD
architectures are discriminated into distributed-memory architectures (ring,
mesh, pyramid, hypercube) and shared-memory architectures (Hwang and Briggs
1987). The distributed memory architectures consist of
several processors interconnected in various ways. Each processor communicates
with its local memory as well as with the adjacent processors. Fig. 11a depicts
the ring topology architecture. This topology is appropriate for a limited
number of nodes and it is suitable for algorithms, which do not demand vast
data exchange. A more sophisticated topology is the mesh topology (Rosenfeld
1983), shown in Fig. 11b. This can be applied into
algorithms involved with matrixes, such as 2-D image processing. However, many
low and intermediate level image processing algorithms require comparison of
pixels which are distant located; in this topology the communication between
two nodes not belonging in the same neighbor is slow. An extension of the mesh
topology is the pyramid topology. This consists of overlapping meshes with
successively lower resolution, as shown in Fig. 11c.
(a) (b) (c)
Figure 11.Distributed-memory network topologies: (a)
ring, (b) mesh and (c) pyramid.
The pyramid
topology is suitable for the implementation of low and intermediate level image
processing; especially those that are dominated by data transaction between
spatial distance pixels. Another more complicated topology is the hypercube
topology (Frieder
1990). According to this architecture the nodes are located
on the corners of an n dimension cube. A machine of dimension d,
built on this architecture, has 2d nodes, each being
connected to its d neighbor nodes. Figs. 12a, 12b, 12c and 12d depict
the topology of a one-dimensional (1-D), a two-dimensional (2-D), a
three-dimensional (3-D) and a four-dimensional (4-D) hypercube respectively. A
paradigm of a hypercube structure `(enhanced hypercube) for machine vision
applications is Proteus (Haralick
et al. 1995). The topology of Proteus architecture is shown in
Fig. 13. It is a hierarchical re-configurable interconnection network with its
highest level being a circuit switched enhanced hypercube. The Proteus system
is capable of performing morphological image processing tasks such as dilation,
erosion, closing and opening. It accepts the data via multiple links at a rate
up to 640MBps.
Figure 12. Hypercube network topologies: (a) 1-D, (b)
2-D, (c) 3-D and (d) 4-D.
Figure 13. The Proteus enhanced hypercube topology.
In shared memory
architectures the PEs are controlled by a common memory. Generally speaking
there are two ways of interfacing the PEs with memory. The simple one consists
of a fast bus on which the PEs and the memory are tied Fig. 14a. Such machines
typically consist of up to 30 PEs. The more sophisticated way comprises a
multi-layer interface network, which is responsible to interconnect the PEs
with the pieces of memory Fig. 14b. The main advantage of the machines built on
with shared memory architecture is that they are easily implemented. These
machines though are rather used in high-level, than in low-level image
processing. Although such machines are applicable to parallel image processing,
where each PE is devoted to a certain particle of the image, they are not as
sufficient as SIMD. Due to the synchronization that should be applied on the
PEs, the system becomes time-consuming. However, many structures adopt cash
memory in each PE, in order to enhance their speed.
(a)
(b)
Figure 14.Shared memory architectures: (a) fast bus and
(b) multi-layer interface network.
References
Abbott L, Haralick R M and Zhuang X (1988) Pipeline Architectures
for Morphologic Image Analysis, Machine Vision and Applications, 1
(1): 23-40.
Andreadis I, Gasteratos A and Tsalides P (1996) An ASIC for Fast
Grey-Scale Dilation, Microprocessors and Microsystems, 20 (2):
89-95.
Bloch I and Maitre H (1995) Fuzzy Mathematical Morphologies: A
Comparative Study, Pattern Recognition, 28 (9): 1341-1387.
Botha E C and Casasent D P (1989) Applications of Optical
Morphological Transformations, Optical Engineering, 28 (5):
501-505.
Broggi A, Conte G, Gregoretti F, Sansoe C, Passerone R and Reyneri L
M (1998) Design and Implementation of the PAPRICA Parallel Architecture, The
Journal of VLSI Signal Processing, 19 (1): 5-18.
Broggi A, Conte G, Gregoretti F, Sansoe C and Reyneri L M (1997) The
Evolution of the PAPRICA System, Computer-Aided Engineering Journal, 4
(2): 114-136.
Camps O I, Kanungo T and Haralick R M (1996) Grey-Scale Structuring
Element Decomposition, IEEE Transactions on Image Processing, 5
(1): 111-120.
Casasent D (1994) General-Purpose Optical Pattern Recognition Image
Processors, Proceedings of the IEEE, 82 (11): 1724-1734.
Chen C and Yang D L (1995) Realisation of Morphological Operations, IEE
Proceedings: Circuits Devices and Systems, 142 (6): 364-368.
Chen K (1989) Bit-Serial Realizations of a Class of Nonlinear
Filters Based on Positive Boolean Functions, IEEE Transactions on Circuits
and Systems, 36: 785-794.
Diamantaras I and Kung S Y (1997) A Linear Systolic Array for
Real-Time Morphological Image Processing, Journal of VLSI Signal Processing,
17 (1): 43-55.
Diamantaras I, Zimerman K H and Kung S Y (1990) "Integrated
Fast Implementation of Mathematical Morphology Operations in Image
Processing", IEEE International Symposium on Circuits and Systems,
New Orleans, pp 1442-1445.
Dougherty E R and Astola J (1994): Introduction to Non-linear
Image Processing, SPIE, Bellingham, Washington.
Duff M J B (1982): CLIP4, in Fu K S and Ichikawa T, eds., Special
Computer Architectures for Pattern Processing, CRC Press, Boca Raton,
Florida.
Duff M J B (1988) "Some Constrains on the Limitations of Image
Processing Computer Architectures", IARP Workshop on CV - Special
Hardware and Industrial Application, Tokyo, Japan, pp 1-5.
Duncan R (1990) A Survey of Parallel Computer Architectures, IEEE
Computer, 23: 5-16.
Fitch J P (1987) Software and VLSI Algorithm for Generalized Ranked
Order Filtering, IEEE Transactions on Circuits and Systems, CAS-34
(5): 553-559.
Fitch J P, Coyle E J and Gallagher Jr. N C (1984) Median Filtering
by Threshold Decomposition, IEEE Transactions on Acoustics Speech and Signal
Processing, ASSP-32 (6): 1183-1188.
Fitch J P, Coyle E J and Gallagher Jr. N C (1985) Threshold
Decomposition of Multidimensional Ranked Order Operations, IEEE Transactions
on Circuits and Systems, CS-32 (5): 445 450.
Flynn M J (1966) Very High Speed Computing Systems, Proceedings
of the IEEE, 56 (12): 1901-1909.
Fountain T J, Matthews K N and Duff M J B (1988) The CLIP7A Image
Processor, IEEE Transactions on Pattern Analysis and Machine Intelligence,
10 (3): 310-319.
Frieder O (1990) Multiprocessor Algorithms for Relational-Database
Operator on Hypercube Systems, IEEE Computer, 23: 13-28.
Gader P D (1991) Separable Decompositions and Approximations of
Greyscale Morphological Templates, CVGIP: Image Understanding, 53
(3): 288-296.
Gasteratos A and Andreadis I (1999): Soft Mathematical
Morphology: Extensions, Algorithms and Implementations, in Hawkes P W, ed.,
Advances in Imaging and Electron Physics, 110, Academic Press,
pp. 63-99.
Gasteratos A and Andreadis I (2000) Non-Linear Image Processing in
Hardware, Pattern Recognition, Special Issue on
Mathematical Morphology, 33 (6): 1013-1021.
Gasteratos A, Andreadis I and Tsalides P (1996) Improvement of the
Majority Gate Algorithm for Grey-Scale Dilation/Erosion, Electronics Letters,
32 (9): 806-807.
Gasteratos A, Andreadis I and Tsalides P (1997a) Extension And Very
Large Scale Integration Implementation Of The Majority-Gate Algorithm For
Gray-Scale Morphological Operations, Optical Engineering, 36 (3):
857-861.
Gasteratos A, Andreadis I and Tsalides P (1997b) Realization of
Rank-Order Filters Based on Majority Gate, Pattern Recognition, 30
(9): 1571-1576.
Gasteratos A, Andreadis I and Tsalides P (1998a) Fuzzy Soft
Mathematical Morphology, IEE Proceedings: Vision Image and Signal Processing,
145 (1): 40-49.
Gasteratos A, Andreadis I and Tsalides P (1998b) Realisation of Soft
Morphological Filters, IEE Proceedings - Circuits Devices and Systems, 145
(3): 201-206.
Gasteratos A, Andreadis I and Tsalides P (1998c) "Soft
Morphological Structuring Element Decomposition", International
Symposium on Mathematical Morphology '98, Amsterdam, The Netherlands, pp
407-413.
Gerritsen F and Aardema L G (1981) Design and Use of DIP-1: A Fast
Flexible and Dynamicaly Microprogrammable Image Processor, Pattern
Recognition, 14 (1): 1-6.
Giardina C R and Dougherty E R (1988): Morphological Methods in
Image and Signal Processing, Prentice-Hall, Englewood Cliffs, New Jersey.
Haralick R M and Shapiro L G (1992): Computer and Robot Vision,
Addison-Wesley, New York.
Haralick R M, Somani A K, Wittenbrink C, Johnson R, Cooper K,
Shapiro L G, Phillips I T, Hwang J-N, Cheung W, Yao Y H, ChenC-H, Yang L,
Daugherty B, Loberski B, Loving K, Miller T, Parkins L and Soos S (1995)
Proteus: a reconfigurable computational network for computer vision, Machine
Vision and Applications, 8 (2): 85-100.
Haralick R M, Sternberg S R and Zhuang X (1987) Image Analysis Using
Mathematical Morphology, IEEE Transactions on Pattern Analysis and Machine
Intelligence, 9 (4): 532-550.
Heijmans H J A M and Roerdink
J B T M, eds., (1998): Mathematical
Morphology and its Applications to Image and Signal Processing, Kluwer
Academic Publishers, Dordrecht-Boston-London.
Hwang K and Briggs F A (1987): Computer Architecture and Parallel
Processing, McGraw-Hill, Singapore.
Jonker P P (1992): Morphological Image Processing : Architecture
and VLSI Design, Kluwer Academic Publishers, Amsterdam, The Netherlands.
Klein J C and Serra J (1977) The Texture Analyzer, Journal of
Microscopy, 95: 349-356.
Ko S J, Morales A and Lee K H (1995) A Fast Implementation Algorithm
and a Bit Serial Realization Method for Greyscale Morphological Opening and
Closing, IEEE Transactions on Signal Processing, 43 (12):
3058-3061.
Kojima S and Miyakawa T (1996) One-Dimensional Processing
Architecture for Gray-Scale Morphology, Systems and Computers in Japan, 27
(12): 1-9.
Koskinen L, Astola J and Neuvo Y (1991) "Soft Morphological
Filters", Proceedings of SPIE Symposium on Image Algebra and
Morphological Image Processing, 262-270.
Kuosmanen P and Astola J (1995) Soft Morphological Filtering, Journal
of Mathematical Imaging and Vision, 5 (3): 231-262.
Lin R and Wong E K (1992) Logic Gate Implementation for Gray-Scale
Morphology, Pattern Recognition Letters, 13: 481-487.
Luck L and Chakrabaty C (1995) A Digit-Serial Architecture For
Gray-Scale Morphological Filtering, IEEE Transactions on Image Processing,
4 (3): 387-391.
Mano M M (1991): Digital Design, Prentice-Hall, Englewood
Cliffs, NJ.
Maragos P, Schafer R W and Butt M A, eds., (1996): Mathematical
Morphology and its Applications to Image and Signal Processing, Kluwer
Academic Publishers, Dordrecht-Boston-London.
Park H and Chin R T (1994) Optimal Decomposition of Convex
Morphological Structuring Elements for 4-Connected Parallel Processors, IEEE
Transactions on Pattern Analysis and Machine Intelligence, 16 (3):
304-313.
Park H and Chin R T (1995) Decomposition Of Arbitrarily Shaped
Morphological Structuring Elements, IEEE Transactions on Pattern Analysis
and Machine Intelligence, 17 (1): 2-5.
Petkov N (1993): Systolic Parallel Processing, North-Holland,
Amsterdam, The Netherlands.
Pitas I and Venetsanopoulos A N (1990): Nonlinear Digital
Filters: Principles and Applications, Kluwer Academic Publishers, Boston,
Massachusetts, U.S.A.
Preston K J, Duff M J B, Levialdi S, Norgen P and Toriwaki J I
(1979) Basics of Cellular Logic With Some Applications in Medical Image
Processing, Proceedings of the IEEE, 67: 826-857.
Pu C C and Shih F Y (1995) Threshold Decomposition of Grey-Scale
Soft Morphology into Binary Soft Morphology, CVGIP-Graphical Models and
Image Processing, 57 (6): 522-526.
Reeves A P (1984) Parallel Computer Architectures for Image
Processing, Computer Vision, Graphics and Image Processing, 25
(1): 68-88.
Robin F, Renaudin M, Privat G and v.d. Bossche N (1996) Functionally
Asynchronous Array Processor for Morphological Filtering of Greyscale Images, IEE
Proceedings: Computers and Digital Techniques, 143 (5): 273-281.
Rosenfeld A (1983) Parallel Image Processing Using Cellular Arrays, IEEE
Computer, 16: 14-20.
Schmitt L A and Wilson S S (1988) The AIS-5000 Parallel Processor, IEEE
Transactions on Pattern Analysis and Machine Intelligence, 10 (3):
320-330.
Serra J (1982): Image Analysis and Mathematical Morphology,
Academic Press, London.
Serra J (1986) Introduction to Mathematical Morphology, Computer
Vision, Graphics, and Image Processing, 35 (3): 283-305.
Serra J, ed., (1989): Image Analysis and Mathematical Morphology:
Theoretical Advances, Academic Press, London.
Serra J and Soille P, eds., (1994): Mathematical Morphology and
its Applications to Image Processing, Kluwer Academic Publishers,
Dordrecht-Boston-London.
Shih F Y and Mitchell O R (1989) Threshold Decomposition of Gray-Scale
Morphology into Binary Morphology, IEEE Transactions on Pattern Analysis and
Machine Intelligence, 11 (1): 31-42.
Shih F Y and Mitchell O R (1991) Decomposition Of Grey-Scale
Morphological Structuring Elements, Pattern Recognition, 24 (3):
195-203.
Shinha D and Dougherty E R (1992) Fuzzy Mathematical Morphology, Journal
of Visual Communication and Image Representation, 3 (3): 286-302.
Singh B V and Siddiqi M U (1996) A Simplified Algorithm for
Approximate Separable Decomposition of Morphological Templates, Pattern
Recognition, 29 (9): 1519-1522.
Siskos S, Vlassis S and Pitas I (1998) Analog Implementation of Fast
Min-Max Filtering, IEEE Transactions on Circuits and Systems II: Analog and
Digital Signal Processing, 45 (7): 913-918.
Soille P (1999): Morphological Image Analysis: Principles and
Applications, Springer-Verlag, Berlin, Germany.
Urahama K and Nagao T (1995) Direct Analog Rank Filtering, IEEE
Transactions on Circuits and Systems I: Fundamental Theory and Applications,
42 (7): 385-388.
Vlassis S, Siskos S and Pitas I (1998) "Analog Implementation
of an Order Statistics Filter", 9th Mediterranean Electrotechnical
Conference, MELECON 98, 649 -653.
Wendt P D, Coyle E J and Gallagher Jr. N C (1986) Stack Filters, IEEE
Transactions on Acoustics Speech and Signal Processing, ASSP-34:
898-911.
Xu J (1996) Morphological Decomposition of 2-D Binary Shapes into
Simpler Shape Parts, Pattern Recognition Letters, 17 (7):759-769.
Zhuang X and Haralick R M (1986) Morphological Structuring Element Decomposition,
Computer Vision, Graphics and Image Processing, 35 (3): 370-382.