Computer Vision in Postal Automation

Giovani Garibotto
Elsag Bailey - TELEROBOT
Genova, Italy

Introduction

Mail sorting and postal automation have always been a major application area for image processing and pattern recognition techniques.

After early developments in the first half of this century, postal mechanization grew considerably in the sixties and seventies, driven primarily by the initiatives of the national postal administrations. The need for common standards in mail distribution forced international cooperation and, more recently, the removal of national barriers in the provision of mail services has opened worldwide competition among the most qualified Information Technology companies.

Since the mid-seventies, the escalating use of fax and data transfer, and more recently of e-mail, has led to predictions that within 20 years relatively few people would communicate by letter. In spite of such predictions, mail volume in the USA grew by another 200 per cent, to reach a level of about 177 billion pieces a year.

Postal Mechanization and Mail Processing

Mail handling is a very labour-intensive process, and labour costs have been increasing over the last three decades. In addition to the cost factor, the sorting process requires a considerable level of knowledge.

Mail has to be sorted for a large number of destinations. For important destinations like large cities direct bundles are formed, but mail for small villages is combined into bundles and dispatched to regional sorting centers for further inward sorting. Most Postal Administrations tend to introduce new postcodes containing information that can be used throughout the entire mail-handling process. As an example, we may refer to the figures provided by the Royal PTT in the Netherlands [1] as its service objectives for the near future:

98 per cent of mail items smaller than 380 x 265 x 32 mm will be sorted automatically.
Larger items will be handled by so-called non-standard sorting machines.
As many addresses as possible will be read automatically using OCR systems (over 90% of all items are expected to be read in this way).
Sorting will take place down to the level of an individual postman's delivery route.
Parcels will be handled in separate, newly constructed infrastructures built according to a standardised design.

The flow chart in fig. 1 shows a network architecture proposed to manage all information in a uniform way and to share the appropriate resources for the mail sorting process.

Today, large-volume mailers increasingly demand faster, more reliable service and customized products. They want day-certain delivery, shipment and piece tracking, and an electronic data interface. Moreover, the most important administrations now compete with express mail service companies like DHL, UPS and FedEx, as well as with newspapers, telecommunications companies, and alternative delivery services.

A key factor for success is the integration of the service in a network, connecting most of the plants with each other and with transportation suppliers, mailer plants and postal customers.

Standardization of tools and processing interfaces is another fundamental requirement for next-generation machines.

1.1 Mail Processing; statement of the problem and application requirements.

Automatic reading is required for all the address fields that the carrier needs to reach the final destination. The current mail flow, starting from the collection of all mail items at the first office, is roughly sketched in fig. 2. All items are singulated (culling), oriented and packed together (facing and cancelling), and sorted according to the post code.

At the destination (delivery) office, two further sorting processes are carried out in order to obtain the final carrier sequencing of the mail.

Letter processing

The basic components are briefly described below.

A Culler Facer Canceller (CFC) machine is commonly used as a preprocessor. It also provides image capture, so that image processing can proceed while the physical mail is being transported through the center.
An OCR machine reads addresses written on the face of the mail and prints a fluorescent bar code on the mail item.
Should it be unable to recognize the address, it captures the mail image and sends it for on-line video coding.
If the mail cannot be resolved by OCR or on-line coding, the image is sent for off-line video coding.
An off-line OCR and video coding system is used to process mail images from the CFC.
A Bar Code Sorter (BCS) machine can presort mail whose images have been resolved during either off-line video coding or off-line OCR. It can also presort letters pre-printed with a bar code by bulk mailers.
A Delivery Bar Code Sorter (DBCS) operates a two-pass sorting process, sorting bar-coded letters to postman delivery routes in the first pass and to delivery point sequence in the second pass.
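
The two-pass principle of the DBCS can be illustrated with a short sketch. It is only a software analogy of the description above, not a model of any real machine: the tuple layout (letter identifier, route number, stop number) and the function name are assumptions made for the example.

```python
from collections import defaultdict


def two_pass_sort(letters):
    """letters: list of (letter_id, route, stop) tuples decoded from bar codes.

    Returns {route: [letter_id, ...]} with each route in delivery order.
    """
    # Pass 1: distribute letters into one bin per delivery route.
    route_bins = defaultdict(list)
    for letter_id, route, stop in letters:
        route_bins[route].append((stop, letter_id))

    # Pass 2: within each route, order letters by delivery point (stop number).
    # sorted() is stable, so letters for the same stop keep their feed order.
    sequenced = {}
    for route, items in route_bins.items():
        ordered = sorted(items, key=lambda item: item[0])
        sequenced[route] = [letter_id for _, letter_id in ordered]
    return sequenced


# Hypothetical feed: letter id, route, stop on that route.
mail = [("a", 2, 5), ("b", 1, 3), ("c", 2, 1), ("d", 1, 3), ("e", 1, 1)]
result = two_pass_sort(mail)
print(result[1])  # ['e', 'b', 'd']
print(result[2])  # ['c', 'a']
```

After the second pass each route bin can be handed to the postman already in walking order, which is the purpose of carrier sequencing.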

Appendix A describes the functional architecture of a typical letter reading process.

Main functions involving computer Vision

The main function in postal automation that involves Computer Vision is address reading and interpretation. In this sense it belongs to the basic perceptual functions of biological vision. Nevertheless, due to the inherently 2D nature of the problem, its computer implementation relies heavily on basic technologies such as image processing and pattern recognition. At present, the most challenging tasks performed by such Vision technologies in postal automation are cursive handwriting recognition, flats handling and reading, grey level and colour image processing, improved man-machine interaction, and robotic material handling.

2.1 Handwritten cursive recognition

The new frontiers in handwriting recognition make extensive use of context to achieve free address reading, using large vocabularies and grammar constraints as well as heuristics. A first objective, already implemented in the most advanced installations, is to read the last line, including the ZIP code, city name and state, and to integrate this information in order to improve the reliability of the system and minimize the use of off-line video coding. Next-generation machines will also be able to read the full handwritten address line, with street name and civic number, in order to support the final postman's delivery. Postal addresses can now be read with an average correct reading rate of about 40% on the global address, including street and home address, with an error rate of about 1%.

2.2 Flats sorting machines

Flats reading is the most challenging objective for postal sorting machines, beyond more conventional letter handling. There are different categories of flats to be handled. One class includes large A4-size envelopes carrying a varying amount of additional printed information (sender and destination address, advertisement messages, stamps and mail class service information). Another class consists of journals, newspapers and catalogs, with or without plastic cover. Moreover, the flat category often includes small parcels with a maximum thickness of about 40 mm.

One of the main problems comes from plastic covers, which prevent the input vision sensor from acquiring a sharp, well-contrasted image. Improvements in the acquisition process (both in resolution and dynamic range), as well as adaptive grey level image processing tools, are key factors in solving this problem. Appendix B describes a processing scheme and the main current problems in flat sorting machines.

2.3 Man-machine interaction

There is considerable room for improvement in man-machine interaction tools for video coding, which are an increasingly important component of the system, especially given the new trend towards more off-line and remote coding by system operators. Ergonomic problems, resolution of the video display, and quick and easy panning and scrolling of the image on the screen are priority issues. Longer-term problems such as tracking of the human eye and simplified data entry should also be taken into account in the new generation of postal automation systems.

2.4 Parcel classification

Huge and complex machines are currently used for 3D parcel sorting. State of the art sorting equipment can handle some 250,000 items per day.

At present parcel processing is highly labour-intensive: parcels are introduced manually by human operators, and during this input stage a preliminary classification is already performed in terms of size and shape (tubes, regular and irregular packets, etc.). The parcels are also labelled with an ID code label, and the operators place each parcel on the sorter tray with the label side facing up. At the input stage an overhead scanner automatically reads the label and assigns the destination information to the sorter tray.

During the last 10 years the most important Postal Administrations have carried out a significant research effort to automate the parcel input stage and reduce the cost associated with this low-skilled work. The problem here is clearly a 3D object recognition task, as well as a classification into three main categories (tubes and cylindrical items, thick and polyhedral parcels, irregular shapes). Quite interesting results using Computer Vision technology were achieved in the late eighties, with prototype systems that used active laser light sensors to recover the 3D shape of the parcels and allow pre-sorting of the mail items.

The extremely limited success of this technology in practical applications is primarily due to the following reasons:

Reliability problems and uncompetitive input speed.
The range of adaptability required of the Vision system in order to find and read the address label on the parcel, if no a-priori manual ID-label application is performed.
Classification of the parcels is often insufficient to automate the input stage: a human operator would still be required to place the ID label and orient the parcel with the label facing up.

In fact, successful use of this technology may be possible only with a deep reorganization of the whole handling and sorting approach, and this may be the reason for the limited success of automatic parcel classification.

2.5 Material handling and station loading/unloading

One of the most labour-intensive tasks in a large mail distribution center is the transport of mail items between different machines and from/to the input/output stages of the center. Recent studies have shown that about 80% of this transport is carried out using trolleys pushed by human operators, and just 20% is handled by electric trucks (again mainly driven by human operators).

The technologies available to solve this problem of transport between different working cells (intercell service) are:

Electric trucks, often used to tow a convoy of passive trolleys, with evident problems of manoeuvring space and the need for a human driver.
Rail transport systems, with the well-known disadvantages of fixed installations and no flexibility in the management of the mail distribution center.
Transport rollers, which are efficient for point-to-point service but are again strongly limited by space occupancy and lack of flexibility.
AGVs, using either conventional inductive guidance or, in new-generation machines, lasers for self-orientation. The main limitation is their inability to interact with human-driven machines, which are always operating in the same area.

Clearly, the real solution is a new generation of mobile robots with advanced sensing capabilities, able to navigate freely, avoid static and moving obstacles, and negotiate crossing points with other vehicles, both automatic and manually driven. In this domain Computer Vision should play a fundamental role in giving flexibility and intelligence to the robotic logistic system.

3) A projection of future trends and growth rates

The previous section has already outlined the main directions of research and investment aimed at improving the performance of mail processing systems in the fields of handwriting recognition, address block location, parcel processing, etc. Moreover, there is a great effort to improve the efficiency of the service through distributed architectures that provide remote and wider access to the processing resources (both geographically and logically).

In the following we focus on emerging services, which represent the new frontiers of competition for system providers seeking to enhance the level of service and increase the added value for the final customer.

The mail process has been considered primarily as an end-to-end paper mail service. Over the last few years, the rapid growth of computer and communication technology has led to a corresponding rapid growth of end-to-end electronic mail, especially in the business sector. On the other hand, the integration of different technologies can provide excellent opportunities for new postal services such as hybrid mail, an example of an electronic-to-paper mail service. In this case large mailers can produce and forward messages in electronic form and use the distributed postal network for the printing and delivery of the mail.

An example of a new product in the paper-to-electronic category is Order Reply Mail. This service allows the interception of all letters or postcards addressed to a specific customer (e.g. a mail order company), the automatic electronic reading of their content (handwritten information), its translation into a form suitable for computer processing, and its decoding and delivery to the customer over data transmission networks. As the content printed or handwritten on the mail piece increases, the reading process has grown from pure address recognition to full document analysis and interpretation, stamp and mail class identification, etc.

4. Conclusions

Despite the superficial nature of the analysis carried out in this report, it is possible to conclude that the most relevant contributions to the improvement of postal services are expected from real-time, large-format image processing technology and from pattern recognition, including high-level context analysis and processing. Indeed, Mail Automation is implemented through the integration of different disciplines, involving almost all Information Technologies.

Computer Vision, in the classical definition accepted within the ECVnet community as an intelligent tool to deal with 3D scenes, is definitely an enabling technology in this sector, mainly focused on supporting robotic applications in handling and logistics services. As such, the exploitation effort of the Computer Vision community should be directed at finding solutions for mail processing, primarily for:

3D object recognition and classification (parcel sorting).
Pose recognition for standard tray handling in loading/unloading stations.
Typical issues of Vision-based autonomous navigation (self-localization, obstacle detection, docking), to provide the required flexibility and free-navigation capability in crowded environments for inter-cell transport logistic services.

References

[1] Postal Technology International '96, UK & Int. Press, ISSN 1362-5209.

[2] Proceedings of the Third International Conference on Document Analysis and Recognition, IEEE Computer Society Press, Aug. 1995.

[3] Proceedings of the Advanced Technology Conference, USPS, 1992.

[4] B.Belkacem, "Une Application Industrielle de Reconnaissance d'addresses", 4eme Colloque National sur l'Ecrit et le Document, CNED'96, Nantes, July 1996.

Appendix A

Functional architecture of a postal address reader

Schematically, a system for address reading may be described as a data compression system that reduces the raw data, acquired at 8 pixel/mm and 256 grey levels (about 2 Mbytes of data), to the few bytes corresponding to the content of the postal destination. It consists essentially of 3 modules:

Acquisition and Image processing
Segmentation and recognition of individual characters in the mail piece
Context analysis and address recognition.

Acquisition and Image processing

The objective is to compress the grey level image into a binary image. This is a fundamental step of the process, since any information lost at this stage cannot be recovered later. Moreover, the system should be adaptive: some postal items are well contrasted, but others have poor contrast and are hard to read.
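
To make the need for adaptivity concrete, the sketch below binarizes an image with a local mean threshold and compares it with a fixed global threshold. This is one common adaptive technique chosen for illustration, not necessarily the method used in the readers discussed here; the function name, the window radius and the offset are assumptions.

```python
def adaptive_binarize(img, radius=1, offset=10):
    """img: list of rows of grey levels (0..255). Returns rows of 0/1, 1 = ink.

    A pixel is marked as ink when it is darker than the mean of its
    (2*radius+1) x (2*radius+1) neighbourhood by more than `offset` levels.
    """
    h, w = len(img), len(img[0])
    # Integral image with a zero border: ii[y][x] = sum of img[0:y][0:x].
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum

    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        for x in range(w):
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            total = ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]
            mean = total / ((y1 - y0) * (x1 - x0))
            out[y][x] = 1 if img[y][x] < mean - offset else 0
    return out


# Toy 3x3 items: one dark stroke pixel on a bright and on a dull background.
well_contrasted = [[200, 200, 200], [200, 100, 200], [200, 200, 200]]
poorly_contrasted = [[120, 120, 120], [120, 90, 120], [120, 120, 120]]

adaptive_high = adaptive_binarize(well_contrasted)
adaptive_low = adaptive_binarize(poorly_contrasted)
# A fixed global threshold of 128 turns the whole dull item into "ink".
global_low = [[1 if p < 128 else 0 for p in row] for row in poorly_contrasted]
```

In this toy case the local threshold isolates the single stroke pixel in both items, while the fixed threshold loses it on the poorly contrasted one, and that information cannot be recovered by later stages.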

A further strong requirement is real-time processing, due to the high speed at which the mail passes in front of the input sensor (about 17 letters/second).
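
A back-of-the-envelope calculation shows what these figures imply for the acquisition stage. The resolution, grey level depth and letter rate are taken from the text; the scanned area (roughly a C5 envelope, 229 x 162 mm) is an assumption chosen only for illustration.

```python
PIXELS_PER_MM = 8        # from the text
BYTES_PER_PIXEL = 1      # 256 grey levels fit in one byte
LETTERS_PER_SECOND = 17  # from the text

# Assumed scan area, for illustration only (roughly a C5 envelope).
WIDTH_MM, HEIGHT_MM = 229, 162

width_px = WIDTH_MM * PIXELS_PER_MM    # 1832
height_px = HEIGHT_MM * PIXELS_PER_MM  # 1296
bytes_per_image = width_px * height_px * BYTES_PER_PIXEL
bytes_per_second = bytes_per_image * LETTERS_PER_SECOND

print(f"{bytes_per_image / 1e6:.2f} MB per image")    # 2.37 MB per image
print(f"{bytes_per_second / 1e6:.1f} MB/s sustained")  # 40.4 MB/s sustained
```

Under this assumption the result agrees with the figure of about 2 Mbytes per item, and the binarization stage has to sustain a raw input rate of roughly 40 Mbytes per second.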

Individual character recognition

The input of this module is the bit-map of the single letter image. Processing consists of the following steps.

Localization

The objective of this processing step is to identify:

text lines (for both handwritten and typewritten text)
other geometric or information features to be detected on the mail item (stamps, codes, etc.)

This stage is a classical binary preprocessing step (using clustering techniques and morphological processing tools).
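
As a minimal illustration of text line localization on a binary image, the sketch below uses a horizontal projection profile. This is a deliberately simpler stand-in for the clustering and morphological tools mentioned above, and it assumes roughly horizontal, non-overlapping lines; the function name and the `min_ink` parameter are hypothetical.

```python
def find_text_lines(binary, min_ink=1):
    """Return (top, bottom) row ranges, bottom exclusive, that contain ink.

    binary: list of rows of 0/1 values, 1 = ink.
    """
    lines, start = [], None
    for y, row in enumerate(binary):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                  # a new text band begins
        elif not has_ink and start is not None:
            lines.append((start, y))   # the band ended on the previous row
            start = None
    if start is not None:              # band reaching the bottom edge
        lines.append((start, len(binary)))
    return lines


page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
lines = find_text_lines(page)
print(lines)  # [(1, 3), (4, 5)]
```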

Segmentation

From the localized block of text, the individual characters must be segmented for the subsequent recognition step. The major problem here is the correct segmentation of touching characters, especially in handwritten text (both numerals and alphabetic characters).
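
The difficulty with touching characters can be seen with the simplest possible segmenter, which cuts a text line wherever a column contains no ink. The sketch is illustrative only (the function name and the toy bit-maps are assumptions) and is not the segmentation method of any particular reader.

```python
def segment_columns(line_img):
    """Split a binary text line into (left, right) column spans with ink.

    line_img: list of rows of 0/1 values; right bound is exclusive.
    """
    width = len(line_img[0])
    profile = [sum(row[x] for row in line_img) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(profile):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


separated = [
    [1, 1, 0, 1, 1, 0, 1],
    [1, 0, 0, 0, 1, 0, 1],
]
touching = [
    [1, 1, 0, 1, 1, 1, 1],  # a stroke bridges the last two characters
    [1, 0, 0, 0, 1, 0, 1],
]
print(segment_columns(separated))  # [(0, 2), (3, 5), (6, 7)]
print(segment_columns(touching))   # [(0, 2), (3, 7)]
```

A single bridging stroke merges two characters into one span, which is why practical systems need more elaborate segmentation, or recognition schemes that do not depend on a single segmentation hypothesis.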

Character recognition

The literature on character recognition is extremely wide and rich (see [2]), including feature-based statistical approaches, a variety of pattern matching schemes, and combinations of neural network techniques.

To improve the performance of handwritten character recognition it is quite common to use the following schemes:

to maintain multiple hypotheses until the end of the process, avoiding early pruning of the decision tree.

to combine, in parallel or in sequence, different (possibly uncorrelated) character recognition techniques (statistical, neural, etc.)

to apply a combination of character recognition methods to pairs of consecutive characters, rather than to the individual segmented ones.
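
The first two schemes can be sketched together: two recognizers are combined in parallel and several ranked hypotheses are passed on instead of a single decision. The averaging (sum) rule, the score dictionaries and all names are assumptions made for the example; real systems may use other combination rules.

```python
def combine_scores(scores_a, scores_b, top_k=3):
    """Average two classifiers' class scores and keep the top_k candidates.

    scores_a, scores_b: dicts mapping a character label to a score in [0, 1].
    Returns a list of (label, combined_score), best first.
    """
    labels = set(scores_a) | set(scores_b)
    combined = {
        label: 0.5 * (scores_a.get(label, 0.0) + scores_b.get(label, 0.0))
        for label in labels
    }
    # Sort by decreasing score; ties are broken by label for determinism.
    ranked = sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Hypothetical outputs of two recognizers for the same handwritten digit.
statistical = {"1": 0.55, "7": 0.40, "4": 0.05}
neural = {"7": 0.60, "1": 0.30, "2": 0.10}

hypotheses = combine_scores(statistical, neural, top_k=2)
print([label for label, _ in hypotheses])  # ['7', '1']
```

Here the statistical recognizer alone would have chosen "1". Keeping both "7" and "1" alive lets the context analysis module make the final decision.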

Context analysis module.

This module is relevant not only for the obvious objective of minimizing the address error rate, but also because the original address often contains writing errors (in more than 10% of cases in the U.K.).

This module takes into account the following information:

a catalog which describes the whole spectrum of expected addresses (including some possible errors)

some coding rules which describe how the address is supposed to be arranged.

Different solutions are usually implemented for typewritten or handwritten context analysis, since different fields with different content are typically involved.
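
A minimal sketch of catalog-based correction is given below. It matches a recognized string against a small catalog using the Levenshtein edit distance and rejects the item when the match is too distant or ambiguous. The catalog entries, the distance limit and the function names are assumptions for illustration, and coding rules are not modelled.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def match_catalog(read_text, catalog, max_distance=2):
    """Return the closest catalog entry, or None to reject the item.

    The item is rejected when the best match is farther than max_distance
    or when two entries are equally close (ambiguous reading).
    """
    scored = sorted((edit_distance(read_text, entry), entry) for entry in catalog)
    best_distance, best_entry = scored[0]
    if best_distance > max_distance:
        return None
    if len(scored) > 1 and scored[1][0] == best_distance:
        return None
    return best_entry


catalog = ["GENOVA", "GENOLA", "VERONA", "NAPOLI"]  # hypothetical city catalog
print(match_catalog("GEN0VA", catalog))  # GENOVA (digit 0 read for letter O)
print(match_catalog("GENO?A", catalog))  # None (GENOVA and GENOLA tie)
```

The rejected item would then go to video coding, which trades a lower reading rate for a lower error rate.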

Ultimately, the performance achieved with context analysis is significantly better than that of a simple code reader, as shown in the following table [4]:

Text type     Method            Reading rate   Rejection rate
Typewritten   code reader       72%            1.6%
Typewritten   context address   92%            0.5%
Handwritten   code reader       62%            1.8%
Handwritten   context address   69%            0.9%

Appendix B

Address Block Location in Flat sorting.

This section describes the essential features of a module for address block location, in flat sorting machines.

Input data

They consist of grey level images of 2000 x 2000 pixels or more. The visual criteria and the information content of such mail pieces are briefly summarized below.

The address block to look for is composed of dark ink characters on a lighter background (either a white label or the gray colour of the envelope).

The format and size of the characters are arbitrary and cannot be established a priori, especially for handwritten addresses. The characters are, however, smaller than other printed material present on the flat.

The address lines do not have a fixed known direction, although typewritten text is mostly horizontal or vertical (except for free labels inserted into plastic envelopes).

The most critical noise source is creases or folds on the surface (especially with plastic wrapping). Other useless information to be discarded includes headlines, patches of text, photos, graphics, etc., in a large variety of colours.

Address Block Location: a short description of the process.

1) Preprocessing and noise removal.

The objective of this stage is image enhancement: removing input noise and minimizing the effect of interference (such as light reflections, creases and other overlapping noise structures).

2) Multiresolution Region of Interest analysis.

Data reduction is the primary objective, together with a more efficient data representation from which candidate regions for the address block can be found. The detection of regular and repetitive patterns, local measures of density and frequency, and blob analysis are common tools at this stage of the process.
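
One of the tools named above, a local density measure at reduced resolution, can be sketched as follows. Each cell of the coarse map holds the fraction of ink pixels in a block of the full-resolution binary image, and cells with an intermediate density are kept as text-like candidates. The block size and the density limits are assumptions chosen for the toy example.

```python
def density_map(binary, block=2):
    """Downsample a binary image: each cell is the ink fraction of a block."""
    h, w = len(binary), len(binary[0])
    rows = []
    for by in range(0, h, block):
        row = []
        for bx in range(0, w, block):
            cells = [binary[y][x]
                     for y in range(by, min(by + block, h))
                     for x in range(bx, min(bx + block, w))]
            row.append(sum(cells) / len(cells))
        rows.append(row)
    return rows


def candidate_cells(dmap, low=0.2, high=0.8):
    """Cells with text-like density: neither empty nor solid."""
    return [(r, c) for r, row in enumerate(dmap)
            for c, d in enumerate(row) if low <= d <= high]


image = [
    [0, 0, 1, 1],  # top right: solid patch (e.g. a dark graphic)
    [0, 0, 1, 1],
    [1, 0, 0, 0],  # bottom left: sparse strokes (text-like)
    [0, 1, 0, 0],
]
dmap = density_map(image, block=2)
print(dmap)                   # [[0.0, 1.0], [0.5, 0.0]]
print(candidate_cells(dmap))  # [(1, 0)]
```

On a 2000 x 2000 pixel flat, a block of 16 x 16 pixels would reduce the data to be searched by a factor of 256, which is the data reduction this stage aims at.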

3) Segmentation of block candidates

Geometrical constraints are commonly used to segment and isolate blocks and rank them by similarity to an address prototype model. The number of candidate lines (horizontal or vertical), their alignment (left or right), and the size and shape of the block are used as discriminating features.
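
The ranking step can be illustrated with a toy scoring function over the discriminating features listed above. The prototype used here (3 to 6 left-aligned lines, width between 1.5 and 4 times the height) and the equal weighting are purely hypothetical; a real system would tune such a model on sample mail.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    n_lines: int
    left_aligned: bool
    width_mm: float
    height_mm: float


def address_score(block):
    """Heuristic similarity to an assumed address prototype, in [0, 1]."""
    lines_ok = 1.0 if 3 <= block.n_lines <= 6 else 0.0
    align_ok = 1.0 if block.left_aligned else 0.0
    aspect = block.width_mm / block.height_mm
    shape_ok = 1.0 if 1.5 <= aspect <= 4.0 else 0.0
    return (lines_ok + align_ok + shape_ok) / 3.0


# Hypothetical candidate blocks found on one flat.
candidates = [
    Block("headline", 1, False, 180.0, 20.0),
    Block("label", 4, True, 80.0, 30.0),
    Block("advert", 12, True, 60.0, 120.0),
]
ranked = sorted(candidates, key=address_score, reverse=True)
print([b.name for b in ranked])  # ['label', 'advert', 'headline']
```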

4) Context analysis

Topological constraints as well as heuristic criteria are used to classify the detected blocks and decide on their arrangement on the mail piece. It is worth recalling that these stages of the process are usually carried out at a lower resolution, at which text can no longer be recognized and interpreted.

Many research results on this subject, as well as on other topics of mail process automation, can be found in the annual proceedings of the USPS (United States Postal Service) [3].

The scheme above describes a traditional forward processing approach, but the availability of increasingly powerful processing units now allows the exploitation of feedback information and a better tuning of the processing parameters, adapted to the results obtained.
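
A feedback scheme of this kind can be sketched as a loop that re-runs the forward chain with different parameters until the result is trusted. The `recognise` callable, the list of thresholds and the confidence limit are all hypothetical placeholders standing in for the complete processing chain.

```python
def read_with_feedback(image, recognise, thresholds=(128, 96, 160),
                       min_confidence=0.8):
    """Try successive binarization thresholds until the reader is confident.

    recognise(image, threshold) is a hypothetical callable that returns
    (text, confidence). Returns the best (text, confidence, threshold) seen.
    """
    best = ("", 0.0, None)
    for threshold in thresholds:
        text, confidence = recognise(image, threshold)
        if confidence > best[1]:
            best = (text, confidence, threshold)
        if confidence >= min_confidence:
            break  # good enough: stop re-processing this item
    return best


# Toy stand-in: pretends that a lower threshold suits this faint image.
def fake_recognise(image, threshold):
    if threshold < 100:
        return ("16100 GENOVA", 0.9)
    return ("161?? GEN?VA", 0.4)


result = read_with_feedback(None, fake_recognise)
print(result)  # ('16100 GENOVA', 0.9, 96)
```

The number of retries has to stay small, since each extra pass consumes part of the real-time budget available per mail item.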