Computer Vision in Postal Automation
Giovani Garibotto
Elsag Bailey - TELEROBOT
Genova, Italy
Contents
- Postal Mechanization and Mail Processing
- Main functions involving Computer Vision
- Handwritten cursive recognition
- Man-machine interaction
- Material Handling and station loading/unloading
- Acquisition and Image processing
- Individual character recognition
- Localization
- Segmentation
- Character recognition
- Context analysis module.
- Input data
- Address Block Location: a short description of the process.
Introduction
Mail sorting and postal automation have always represented a main area of
application for Image processing and Pattern Recognition techniques.
After the early developments in the first half of this century, postal
mechanization grew considerably in the sixties and seventies, driven primarily by
the initiatives of the different national postal administrations. The need for
common standards in mail distribution forced international cooperation and, more
recently, the removal of national barriers in the provision of mail services
has opened a worldwide competition involving the most qualified companies
in Information Technology.
Since the mid-seventies the escalating use of faxes and data transfer,
and more recently of e-mail, has led to predictions that within 20 years relatively
few people would communicate by letter. In spite of such predictions, mail volume
in the USA grew by another 200 per cent, reaching a level of about 177 billion pieces
a year.
Mail handling is a very labour-intensive process, and labour costs have been
increasing during the last three decades. In addition to the cost factor, the
knowledge level required for the sorting process is considerable.
Mail has to be sorted for a large number of destinations. For important
destinations like large cities direct bundles are formed, but for small
villages mail is combined into bundles and dispatched to regional sorting
centers for further inward sorting. The tendency of most Postal Administrations
is to introduce new postcodes containing information which can be used
for the entire mail-handling process. As an example, we may refer to
the figures provided by the Royal PTT in the Netherlands [1] as its service
objectives for the near future:
- 98 per cent of mail items smaller than 380 x 265 x 32 mm will be sorted
automatically.
- Larger items will be handled by so-called non-standard sorting machines.
- As many addresses as possible will be read automatically using OCR systems
(over 90% of all items are expected to be read in this way)
- Sorting will take place down to the level of an individual postman's
delivery route.
- Parcels will be handled in separate infrastructures which will be newly
constructed according to standardised design.
The flow chart of fig. 1 shows a network architecture proposed to manage all
information in a uniform way and to share the appropriate resources for the mail
sorting process.
Today large-volume mailers increasingly demand faster, more
reliable service and customized products. They want day-certain delivery,
shipment and piece tracking, and an electronic data interface. Moreover, the
most important administrations now compete with express mail service companies
like DHL, UPS and FedEx, as well as with newspapers, telecommunications companies
and alternative delivery services.
A key factor for success is the integration of the service in a network,
connecting most of the plants with each other, with transportation suppliers,
mailer plants and postal customers.
Standardization of tools and processing interfaces is another fundamental
requirement for the next generation machines.
1.1 Mail Processing; statement of the problem and application requirements.
Automatic reading is required for all the address fields that the carrier needs
to reach the final destination. The current mail flow is roughly sketched in
fig. 2, starting from the collection of all mail items at the first office. All
items are singulated (culling), oriented and packed together (facing and
cancelling), and then sorted according to the post code.
At the delivery (destination) office, two further sorting processes are
carried out in order to obtain the final carrier sequencing of the mail.
Letter processing
A brief description of the basic components follows.
- A Culler Facer Canceller (CFC) machine is commonly used as a preprocessor.
It also captures the mail image, so that image processing can take place while
the physical mail is being transported through the center.
- An OCR machine reads addresses written on the face of the mail and prints
a fluorescent bar code on the mail item.
- Should it be unable to recognize the address, it will capture the mail
image and send it for on-line video coding.
- If the mail cannot be resolved by OCR or on-line coding, the image will be
sent for off-line video coding.
- An off-line OCR and the video coding system will be used to process mail
images from the CFC.
- A Bar Code Sorter (BCS) machine can presort mail whose images have been
resolved during either off-line video coding or off-line OCR. It can also
presort letters pre-printed with a bar code by bulk mailers.
- A Delivery Bar Code Sorter (DBCS) operates a two-pass sorting process,
sorting bar-coded letters to postman delivery routes in the first pass and to
delivery point sequence in the second pass.
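The two-pass principle of the DBCS can be illustrated with a minimal sketch.
It assumes, for illustration only, that each bar code has already been decoded
into a hypothetical "route" number and a "stop" number giving the position of
the delivery point along that route; neither field name comes from a real
machine interface.

```python
from collections import defaultdict


def first_pass(letters):
    """Pass 1: distribute bar-coded letters into one bin per delivery route."""
    bins = defaultdict(list)
    for letter in letters:
        bins[letter["route"]].append(letter)
    return dict(bins)


def second_pass(route_bin):
    """Pass 2: order one route's letters by delivery point sequence."""
    # sorted() is stable, so letters for the same stop keep their input order
    return sorted(route_bin, key=lambda letter: letter["stop"])


# Illustrative data: the "route" and "stop" fields are assumed, not real codes
letters = [
    {"id": "a", "route": 2, "stop": 5},
    {"id": "b", "route": 1, "stop": 3},
    {"id": "c", "route": 2, "stop": 1},
    {"id": "d", "route": 1, "stop": 1},
    {"id": "e", "route": 2, "stop": 5},
]
sequenced = {route: second_pass(b) for route, b in first_pass(letters).items()}
```

Each postman then receives the letters of one route already arranged in
walking order.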
Appendix A describes the functional
architecture of a typical letter reading process.
The main function in postal automation that involves Computer Vision is
address reading and interpretation. In this sense it belongs to the
basic perceptual functions of biological vision. Nevertheless, due to the
inherent 2D nature of the problem, its computer implementation is strongly based
on basic technologies such as image processing and pattern recognition. At present,
the most challenging tasks performed by such Vision technologies in postal
automation are cursive handwriting recognition, flats handling and reading,
grey level and colour image processing, improved man-machine interaction, and
robotic material handling.
2.1 Handwritten cursive recognition
The new frontiers in handwriting recognition make extensive use of context
to achieve free address reading, using large vocabularies and grammar
constraints as well as heuristics. A first objective, already
implemented in the most advanced installations, consists in reading the last line,
including the ZIP code, city name and state, and integrating such information in
order to improve the reliability of the system and minimize the use of off-line
video coding. Next generation machines will also be able to read
the full handwritten address line, with street name and house number, in order
to support the final postman's delivery. Postal addresses can now be read with
average rates of about 40% correct reading on the global address, including
street and home address, with an error rate of about 1%.
2.2 Flats sorting machines
Flats reading represents the most challenging objective for postal sorting
machines, besides more conventional letter manipulation. There are different
categories of flats to be handled. One class includes large A4 size envelopes
with a varying amount of additional information printed on them (sender and
destination address, advertisement messages, stamps and mail class service
information). Another class consists of journals, newspapers and catalogs with
or without plastic cover. Moreover, small parcels with a maximum thickness of
about 40 mm are often included in the flat category.
One of the main problems is the plastic cover, which
prevents a sharp and well-contrasted acquisition by the input vision sensor.
Improvements in the acquisition process (both in resolution and dynamic range)
as well as adaptive grey level image processing tools are key factors in
solving this problem. Appendix B describes a processing scheme and the current main
problems in flat sorting machines.
2.3 Man-machine interaction
There is considerable room for improvement in man-machine interaction tools for
video coding, which represents an increasingly important component of the system,
especially given the new trend to increase off-line and remote coding by system
operators. Ergonomics, resolution of the video display, and quick and easy
pan and scroll of the image on the screen are priority issues. Longer-term
problems, such as tracking of the human eye and simplified data
entry, should also be taken into account in the new generation of postal
automation systems.
2.4 Parcel classification
Huge and complex machines are currently used for 3D parcel sorting. State of
the art sorting equipment can handle some 250,000 items per day.
At present parcel processing is highly labour intensive: parcels are
introduced manually by human operators, and during this input stage a preliminary
classification is already performed in terms of size and shape (tubes, regular
and irregular packets, etc.). The parcels are also labelled with an ID code label,
and the operators place each parcel on the sorter tray with the label side
facing up. At the input stage an overhead scanner is installed to automatically
read the label and assign the destination information to the sorter tray.
During the last 10 years the most important Postal Administrations have carried
out a significant research effort to automate the parcel input
stage and reduce the cost associated with this low-skilled work. The
problem here is clearly a 3D object recognition task, together with a
classification into three main categories (tubes and cylindrical items,
thick and polyhedral parcels, irregular shapes). Quite interesting results
using Computer Vision technology were achieved in the late eighties, with
the realization of prototype systems which made use of active light laser
sensors to recover the 3D shape of the parcels and allow a pre-sorting of the
mail items.
The extremely limited success of this technology in practical applications is
primarily due to the following reasons:
- Reliability problems and uncompetitive input speed.
- The range of adaptability required of the Vision system in order to find and
read the address label on the parcel, if no a-priori manual ID-label
application is performed.
- The classification of the parcels is often insufficient to automate the
input stage. A human operator would be required anyway to place the ID-label
and orient the parcel with the label facing up.
Successful use of this technology may be possible only with a deep
reorganization of the whole handling and sorting approach, which may explain
the limited success of automatic parcel classification.
2.5 Material handling and station loading/unloading
One of the most labour-intensive tasks in a large mail distribution
center is the transport of mail items between different machines and from/to
the input/output stage of the center. Recent studies have
shown that about 80% of this transport is carried out with
trolleys pushed by human operators and just 20% is managed by
electric trucks (again mainly driven by human operators).
The available technologies for transportation between
different working cells (intercell service) are:
- Electric trucks, often used to tow a convoy of passive trolleys, with
evident problems of manoeuvring space and the need for a human driver.
- Rail transport systems, with the well-known disadvantages of fixed
installations and no flexibility in the management of the mail distribution
center.
- Transport rollers, which are efficient for point-to-point service, but again
with strong limitations due to space occupancy and lack of flexibility.
- AGVs, either with conventional inductive guidance or, in new
generation machines, with laser-based self-orientation. Their main limitation is
the inability to interact with the human-driven machines which
operate in the same area all the time.
New generation mobile robots, with advanced sensing
capabilities, able to navigate freely, avoid static and moving obstacles, and
negotiate crossing points with other vehicles, both automatic and manually
driven, represent the real solution. In this domain Computer Vision should play
a fundamental role in giving flexibility and intelligence to the robotic logistic
system.
3) A projection of future trends and growth rates
The previous section has already outlined the main directions of research and
investment aimed at improving the performance of mail processing systems in the
fields of handwriting recognition, address block location, parcel processing,
etc. Moreover, there is a great effort to improve the efficiency of the service
through distributed architectures which provide remote
and wider access to the processing resources (both
geographically and logically).
In the following we try to focus on new emerging services which represent the
new frontiers for competition and system providers in order to enhance the
level of service provided and increase the added value to the final customer.
The mail process has been primarily considered as an end-to-end paper mail
service. Over the last few years, the rapid growth of computer and
communication technology has led to a corresponding rapid growth of
end-to-end electronic mail, especially in the business sector. On the other hand,
the integration of different technologies can provide excellent opportunities
for new postal services such as hybrid mail, an example of an electronic-to-paper
mail service. In this case large mailers can produce and forward messages in
electronic form and use the distributed postal network for the printing and
delivery of the mail.
An example of a new product in the paper-to-electronic category is Order
Reply Mail. This service allows the interception of all letters or postcards
addressed to a specific customer (e.g. a mail order company), the automatic
electronic reading of their content (handwritten information), its translation
into a form suitable for computer processing, and its decoding and delivery to
the customer over data transmission networks. As the content
printed or handwritten on the mail piece increases, the reading process has
expanded from pure address recognition to full document analysis and
interpretation, stamp and mail class identification, etc.
4. Conclusions
Although the analysis carried out in this report is only a brief overview, it is
possible to conclude that the most relevant contributions to the improvement of
postal services are expected from real-time, large format Image processing
technology and from Pattern Recognition, including high level context analysis
and processing. Mail Automation is in fact implemented through the integration of
different disciplines, involving almost all Information technologies.
Computer Vision, in the classical definition accepted within the ECVnet
community as an intelligent tool to deal with 3D scenes, is definitely an
enabling technology in this sector, mainly focused on supporting robotic
applications in handling and logistics services. As such, the
exploitation effort of the Computer Vision community should be directed at
finding solutions for Mail processing, primarily for:
- 3D object recognition and classification (parcel sorting).
- Pose recognition for standard tray handling in loading/unloading stations.
- Typical issues of Vision-based autonomous navigation (self-localization,
obstacle detection, docking) to provide the required flexibility and
free-navigation capability in crowded environments, for inter-cell
transportation logistic services.
References
[1] Postal Technology International '96, UK & Int. Press, ISSN 1362-5209.
[2] Proceedings of the Third International Conference on Document Analysis and
Recognition, IEEE Computer Society Press, Aug. 1995.
[3] Proceedings of the Advanced Technology Conference, USPS, 1992.
[4] B. Belkacem, "Une Application Industrielle de Reconnaissance d'addresses",
4eme Colloque National sur l'Ecrit et le Document, CNED'96, Nantes, July
1996.
Appendix A
Functional Architecture of a postal address reader. Schematically, a system
for address reading may be described as a system for data compression, from the
raw data acquired at 8 pixel/mm and 256 grey levels (about 2 Mbytes of
data) down to a few bytes corresponding to the content of the postal
destination. It consists essentially of 3 modules:
- Acquisition and Image processing
- Segmentation and recognition of individual characters in the mail piece
- Context analysis and address recognition.
The objective of the acquisition and image processing module is the compression
of the grey level image to a binary image.
This is a fundamental step of the process, since any
information lost at this stage cannot be recovered later. Moreover, the
system should be adaptive: some postal items are
well contrasted, but others have poor contrast and limited reading
possibilities.
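As a minimal sketch of such an adaptive grey-to-binary conversion, the
following example thresholds each pixel against the mean of its local
neighbourhood, so that well-contrasted and poorly contrasted items are treated
alike. The local-mean rule, the window size and the offset are assumptions
chosen for illustration; the report does not specify which binarization method
is used in practice.

```python
def adaptive_binarize(grey, window=3, offset=10):
    """Return a binary image (1 = ink) using a local-mean threshold.

    grey: list of rows of grey levels in 0..255 (0 = black).
    A pixel is marked as ink when it is darker than the mean of its
    window x window neighbourhood by more than `offset`.
    """
    h, w = len(grey), len(grey[0])
    # Integral image with one extra row and column of zeros
    integ = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += grey[y][x]
            integ[y + 1][x + 1] = integ[y][x + 1] + row_sum
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            total = integ[y1][x1] - integ[y0][x1] - integ[y1][x0] + integ[y0][x0]
            mean = total / ((y1 - y0) * (x1 - x0))
            out[y][x] = 1 if grey[y][x] < mean - offset else 0
    return out


# The same vertical stroke on a bright and on a dark, low-contrast background
bright = [[200, 200, 50, 200, 200] for _ in range(5)]
dark = [[90, 90, 60, 90, 90] for _ in range(5)]
```

With these two test images the stroke is recovered in both cases, whereas a
single global threshold of 100 would turn the whole dark image into ink.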
A further strong requirement is real-time processing, due to the high speed at
which the mail passes in front of the input sensor (about 17
letters/second).
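The figures quoted in this appendix (8 pixel/mm, 256 grey levels, about 17
letters/second) give an idea of the data rate involved. The short calculation
below assumes a letter of 235 x 120 mm, a size chosen only for illustration,
and one byte per pixel.

```python
PIXELS_PER_MM = 8        # acquisition resolution quoted in the text
BYTES_PER_PIXEL = 1      # 256 grey levels fit in one byte
LETTERS_PER_SECOND = 17  # transport speed quoted in the text


def image_bytes(width_mm, height_mm):
    """Size in bytes of the raw grey level image of one mail item."""
    width_px = width_mm * PIXELS_PER_MM
    height_px = height_mm * PIXELS_PER_MM
    return width_px * height_px * BYTES_PER_PIXEL


# Assumed letter size, for illustration only
per_letter = image_bytes(235, 120)
per_second = per_letter * LETTERS_PER_SECOND
print(per_letter, per_second)
```

Under these assumptions a single image is about 1.8 Mbytes, in line with the
figure of about 2 Mbytes quoted above, and the sensor delivers about 31 Mbytes
per second.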
The input of the segmentation and recognition module is the bit-map of the
single letter image. The module consists of the following steps: localization,
segmentation and character recognition.
The objective of the localization step is to identify:
- Text lines (for both handwritten and typewritten text).
- Other geometric or information features to be detected in the mail item
(stamp, codes, etc.).
This stage is a classical binary preprocessing step (using clustering
techniques and morphological processing tools).
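As a simple illustration of text line localization on a binary image, the
sketch below uses a horizontal projection profile: rows that contain ink are
grouped into lines. This is a deliberately simpler technique than the
clustering and morphological tools mentioned above, and it assumes that the
text is horizontal and already deskewed.

```python
def find_text_lines(binary, min_ink=1):
    """Return (top, bottom) row ranges of text lines, bottom exclusive.

    binary: list of rows, 1 = ink, 0 = background.
    A row belongs to a text line when it holds at least `min_ink` ink pixels.
    """
    lines, start = [], None
    for y, row in enumerate(binary):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                    # a new line begins
        elif not has_ink and start is not None:
            lines.append((start, y))     # the current line ends
            start = None
    if start is not None:                # line touching the bottom edge
        lines.append((start, len(binary)))
    return lines


page = [
    [0, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 0],
    [0, 1, 0],
]
text_lines = find_text_lines(page)
```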
From the localized block of text it is necessary to segment the individual
characters for the subsequent recognition. The major problem here is the
correct segmentation of touching characters, especially in handwritten text
(both numerals and alphabetic characters).
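The difficulty with touching characters can be seen in a minimal sketch. The
example below cuts a binary text line at empty columns and then splits any
segment wider than a plausible character at its thinnest column. Both the
vertical projection approach and the "max_width" parameter are assumptions
made for illustration; practical systems for handwritten text need far more
robust strategies.

```python
def segment_characters(line, max_width):
    """Return (left, right) column ranges, right exclusive, one per character.

    line: binary image of one text line as a list of rows (1 = ink).
    max_width: widest plausible single character in pixels (must be >= 1).
    """
    width = len(line[0])
    profile = [sum(row[x] for row in line) for x in range(width)]

    # Step 1: cut wherever a column contains no ink at all.
    segments, start = [], None
    for x, ink in enumerate(profile + [0]):  # trailing 0 closes the last run
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            segments.append((start, x))
            start = None

    # Step 2: a run wider than max_width is assumed to hold touching
    # characters and is split at its thinnest interior column.
    result, stack = [], segments[::-1]
    while stack:
        left, right = stack.pop()
        if right - left <= max_width:
            result.append((left, right))
            continue
        cut = min(range(left + 1, right), key=lambda x: profile[x])
        stack.append((cut, right))
        stack.append((left, cut))
    return result


# Three characters; the second and third touch through a one-pixel link
text_line = [
    [0, 1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1, 1, 1, 1],
    [0, 1, 1, 0, 1, 1, 0, 1, 1],
]
characters = segment_characters(text_line, max_width=3)
```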
The literature on character recognition is extremely wide and rich (see [2]),
including feature-based statistical approaches, a variety of pattern
matching schemes and combinations of neural network techniques.
To improve the performance of handwritten character recognition it is quite
common to use the following schemes:
- Managing multiple hypotheses until the end of the process, avoiding an early
pruning of the decision tree.
- Combining, in parallel or in sequence, different (possibly
uncorrelated) character recognition techniques (statistical, neural, etc.).
- Applying a combination of character recognition methods to pairs of
consecutive characters, rather than to the individual segmented ones.
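The first two schemes can be sketched as follows. Two recognizers, represented
here only by invented score tables, are combined by simple averaging, and the
best few string hypotheses are kept alive instead of deciding each character
on its own. The averaging rule, the beam width and all scores are assumptions
for illustration and are not taken from any specific system.

```python
import heapq


def combine(scores_a, scores_b):
    """Average the class scores of two recognizers (dicts: label -> score)."""
    labels = set(scores_a) | set(scores_b)
    return {c: 0.5 * (scores_a.get(c, 0.0) + scores_b.get(c, 0.0)) for c in labels}


def top_hypotheses(per_char_scores, beam=3):
    """Keep the `beam` best string hypotheses, best first.

    per_char_scores: one dict (label -> score) per character position.
    Returns a list of (score, text) pairs, where score is the product of the
    character scores.
    """
    beams = [(1.0, "")]
    for scores in per_char_scores:
        candidates = [(p * s, text + c) for p, text in beams for c, s in scores.items()]
        beams = heapq.nlargest(beam, candidates)
    return beams


# Invented scores of two recognizers for a two-digit code
digit1 = combine({"1": 0.7, "7": 0.3}, {"1": 0.5, "7": 0.5})
digit2 = combine({"0": 0.6, "6": 0.4}, {"0": 0.2, "6": 0.8})
hypotheses = top_hypotheses([digit1, digit2], beam=3)
```

The alternative readings that survive in the list can then be confirmed or
rejected by the context analysis, rather than being discarded too early.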
The context analysis module is relevant not only for the obvious objective of
minimizing the error rate on the address, but also because there are often
writing errors in the original address (more than 10% in the U.K.).
This module takes into account the following information:
- A catalog which describes the whole spectrum of expected addresses
(including some possible errors).
- Coding rules which describe how the address is supposed to be
arranged.
Different solutions are usually implemented for typewritten or handwritten
context analysis, since different fields with different content are typically
involved.
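A minimal sketch of how a catalog can correct recognition or writing errors is
given below. The recognized string is compared with every catalog entry using
the edit (Levenshtein) distance and is accepted only if there is a single
closest entry. The choice of edit distance, the "max_errors" limit and the
small catalog are assumptions for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def match_catalog(read, catalog, max_errors=2):
    """Return the unique closest entry of a non-empty catalog, or None.

    None means rejection: either nothing is close enough, or the two best
    entries are equally close and the item should go to video coding.
    """
    ranked = sorted((edit_distance(read, entry), entry) for entry in catalog)
    best_dist, best = ranked[0]
    if best_dist > max_errors:
        return None
    if len(ranked) > 1 and ranked[1][0] == best_dist:
        return None
    return best


catalog = ["GENOVA", "GENOLA", "VERONA"]  # illustrative catalog of city names
corrected = match_catalog("GEN0VA", catalog)  # digit 0 read instead of O
ambiguous = match_catalog("GENOXA", catalog)  # equally close to two entries
```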
Ultimately the performance achieved with context analysis is
significantly better than the result obtainable from a simple code reader, as
shown in the following table [4]:
                                 Reading rate   Rejection rate
Typewritten   code reader            72%            1.6%
              context address        92%            0.5%
Handwritten   code reader            62%            1.8%
              context address        69%            0.9%
Appendix B
Address Block Location in Flat sorting.
This section describes the essential features of a module for address block
location, in flat sorting machines.
The input data consist of grey level images of size 2000 x 2000 pixels or more.
Visual criteria and the information content of such mail pieces are briefly
summarized below:
- The address block to look for is composed of dark ink characters on a lighter
background (either a white label or the grey colour of the envelope).
- The format and size of the characters are arbitrary and cannot be established
a-priori, especially for handwritten addresses. In any case they are smaller
than other printed material present on the flat.
- The address lines do not have a fixed known direction, although typewritten
text is mostly horizontal or vertical (except for loose labels inserted into
plastic envelopes).
- The most critical noise source comes from creases or folds on the surface
(especially for plastic wrapping). Other useless information to be discarded
includes headlines, patches of text, photos, graphics, etc., with a large
variety of colours.
1) Preprocessing and noise removal.
The objective of this stage is image enhancement, to remove input noise and
minimize the effect of interference (such as light reflections, creases and
other overlapping noise structures).
2) Multiresolution Region of Interest analysis.
The primary objectives are data reduction and a more
efficient data representation, in order to find candidate regions for the
address block. The detection of regular and repetitive patterns, local measures
of density and frequency, and blob analysis are common tools used at this stage
of the process.
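A local density measure at reduced resolution can be sketched as follows. The
binary image is divided into square blocks, the fraction of ink pixels is
computed for each block, and blocks whose density falls in a text-like range
are kept as candidates. The block size and the two density limits are
assumptions for illustration: nearly empty blocks are taken to be background
and nearly full ones to be photos or solid graphics.

```python
def density_map(binary, block):
    """Reduce a binary image (1 = ink) to per-block ink densities in [0, 1]."""
    h, w = len(binary), len(binary[0])
    rows = []
    for y in range(0, h, block):
        row = []
        for x in range(0, w, block):
            cells = [binary[j][i]
                     for j in range(y, min(y + block, h))
                     for i in range(x, min(x + block, w))]
            row.append(sum(cells) / len(cells))
        rows.append(row)
    return rows


def candidate_blocks(density, low=0.1, high=0.6):
    """Return (row, col) of the coarse cells whose density looks text-like."""
    return [(r, c) for r, row in enumerate(density)
            for c, d in enumerate(row) if low <= d <= high]


image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
]
coarse = density_map(image, block=2)
candidates = candidate_blocks(coarse)
```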
3) Segmentation of block candidates
Geometrical constraints are commonly used to segment and isolate some blocks
and rank them on the basis of similarity measures with respect to an address
prototype model. The number of candidate lines (horizontal or vertical), their
alignment (left or right), and the size and shape of the block are used as
discriminating features.
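The ranking of block candidates against a prototype can be sketched with the
features listed above. The prototype values, the feature names and the penalty
formula below are all hypothetical and serve only to show the principle; a real
system would tune them on representative mail.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    n_lines: int        # number of candidate text lines
    left_aligned: bool  # lines share a common left margin
    width_mm: float
    height_mm: float


# Hypothetical prototype of an address block; the values are illustrative only
PROTOTYPE = Block("prototype", n_lines=4, left_aligned=True,
                  width_mm=80.0, height_mm=30.0)


def similarity(block, proto=PROTOTYPE):
    """Score in (0, 1]; 1.0 means the same features as the prototype."""
    penalty = abs(block.n_lines - proto.n_lines) / proto.n_lines
    penalty += abs(block.width_mm - proto.width_mm) / proto.width_mm
    penalty += abs(block.height_mm - proto.height_mm) / proto.height_mm
    penalty += 0.0 if block.left_aligned == proto.left_aligned else 1.0
    return 1.0 / (1.0 + penalty)


def rank(blocks):
    """Order block candidates from most to least address-like."""
    return sorted(blocks, key=similarity, reverse=True)


blocks = [
    Block("headline", 1, False, 180.0, 20.0),
    Block("sender", 3, True, 60.0, 20.0),
    Block("address", 4, True, 80.0, 30.0),
]
ranking = [b.name for b in rank(blocks)]
```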
4) Context analysis
Topological constraints as well as heuristic criteria are used to classify the
detected blocks and decide on their arrangement on the mail piece. It is
worth noting that these stages of the process are usually carried out
at lower resolution, where text can no longer be recognized and interpreted.
Many research results on this subject, as well as on other topics of mail
process automation, can be found in the annual proceedings of the USPS, United
States Postal Service [3].
The scheme above describes a traditional forward processing
approach, but the present availability of increasingly powerful processing
units allows the exploitation of feedback information and a better tuning of
the processing parameters, adapted to the available results.
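One possible form of such feedback is sketched below: when the recognition
result is not confident enough, an earlier stage is repeated with a different
parameter. The two stage functions, the list of offsets and the confidence
threshold are hypothetical; the stand-in functions exist only so that the
sketch runs on its own.

```python
def read_with_feedback(image, binarize, recognize,
                       offsets=(10, 5, 20), min_confidence=0.8):
    """Retry binarization with other offsets until recognition is confident.

    binarize(image, offset) and recognize(binary) -> (text, confidence) are
    supplied by the caller. Returns (confidence, text, offset) of the best try.
    """
    best = (0.0, None, None)
    for offset in offsets:
        text, confidence = recognize(binarize(image, offset))
        if confidence > best[0]:
            best = (confidence, text, offset)
        if confidence >= min_confidence:
            break  # good enough: stop tuning
    return best


# Stand-ins so that the sketch runs; a real system would plug in its own stages
def fake_binarize(image, offset):
    return offset  # pretend the binary image is fully described by the offset


def fake_recognize(binary):
    confidence = {10: 0.5, 5: 0.9, 20: 0.95}[binary]
    return "16100", confidence


result = read_with_feedback(None, fake_binarize, fake_recognize)
```

In this run the first attempt is rejected, the second one reaches the required
confidence, and the third parameter value is never tried.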