A communiqué from Joe Lubenow describes the publication of International Postal Address Components and Templates by the Universal Postal Union (UPU) as Standard S42-1. The specification had been approved by the UPU Standards Board for further testing in November, 2002. As summarized in Lubenow's overview document, the UPU address standard defines elements, address templates, and rendition instructions.
Based upon CEN's "Components of Postal Addresses" specification, the UPU standard defines a comprehensive list of name and address elements corresponding to "the smallest meaningful parts of names and addresses. This set of elements has been extended as necessary to cover additional situations, but so far has been sufficient to represent names and addresses in a number of non-European countries, including the US, taking account of some terminological differences." The UPU address templates define "unique combinations and orderings of elements, or in more general terms, address types, within a country. Templates in UPU S42 are described both in natural language and using an XML format known as the Postal Address Template Description Language (PATDL)." Rendition instructions govern "the production of addresses on an output medium such as an address label or a computer display screen. Included in the standard is a registry of rendition instructions, which can be formatting rules for final presentation, including abbreviation and prioritization of data elements when there are constraints on available space."
An IDEAlliance ADIS work group is developing an implementation of the UPU S42 standard within the broader context of business mail; ADIS allows address data to be described in XML and provides additional user options/extensions, while supporting all the elements of UPU S42 and the PATDL template description language.
Bibliographic Information
International Postal Address Components and Templates. Universal Postal Union (UPU). Data Definition and Encoding Standards. Reference: S42-1. 78 pages. Original publication (UPU Status 0): 4-November-2002. Approved: April 2003. The UPU S42-1 standard is based in part upon the CEN TC 331/WI 015 Postal services -- Address data bases -- Part 1: Components of Postal Addresses. Print update expected mid-2003. Available as part of the UPU Technical Standards collection.
Extract from "Overview: UPU S42 Standard on International Postal Address Components and Templates"
Joe Lubenow supplied an overview, here quoted in part:
The [UPU S42-1] standard is based upon a comprehensive list of name and address elements that originated in the work of the European standardization organization CEN, which has an agreement with the UPU to work together on postal standards. These elements define the smallest meaningful parts of names and addresses. The set of elements is extended as necessary to cover additional situations, but so far has been sufficient to represent names and addresses in a number of non-European countries, including the US, taking account of some terminological differences.
A second major concept within UPU S42 is the address template, which describes unique combinations and orderings of elements, or in more general terms, address types, within a country. Templates in UPU S42 are described both in natural language and using an XML format known as the Postal Address Template Description Language (PATDL). PATDL supports multiple sub-templates with branching based on field values, business rules, decision tables, or other defined algorithms. Templates refer to elements by their names or by using codes assigned by the UPU, and can also utilize externally defined elements or code sets. Templates for some countries, such as the United Kingdom, are substantially more complex than for others, such as the United States. By using the templates, the names and addresses can be stored in a permanently parsed format and reconstituted when necessary according to the requirements of a specific situation.
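[Illustration, not part of the quoted overview:] The idea of storing an address in parsed form and reconstituting it from a template might be sketched as follows. This is only a rough sketch: the element names, the template layout, and the render function are hypothetical and are not taken from UPU S42 or from PATDL, which expresses templates in XML and supports branching that is not modeled here.

```python
# Hypothetical template: each inner list is one output line, naming the
# elements that appear on it in order. Not the PATDL format.
US_STREET_TEMPLATE = [
    ["given_name", "surname"],
    ["street_number", "thoroughfare_name", "thoroughfare_type"],
    ["town", "region", "postcode"],
]


def render(elements, template):
    """Rebuild address lines from parsed elements, skipping empty lines."""
    lines = []
    for line_spec in template:
        parts = [elements[name] for name in line_spec if elements.get(name)]
        if parts:
            lines.append(" ".join(parts))
    return lines


# The address is stored once, element by element (made-up sample data).
parsed = {
    "given_name": "Pat",
    "surname": "Example",
    "street_number": "123",
    "thoroughfare_name": "Main",
    "thoroughfare_type": "St",
    "town": "Springfield",
    "region": "IL",
    "postcode": "62701",
}

address_lines = render(parsed, US_STREET_TEMPLATE)
```

Because the stored form is the element dictionary rather than composite lines, a different template could be applied to the same data without re-parsing.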
The third major part of the standard has to do with rendition, or the production of addresses on an output medium such as an address label or a computer display screen. Included in the standard is a registry of rendition instructions, which can be formatting rules for final presentation, including abbreviation and prioritization of data elements when there are constraints on available space, and upstream procedures designed to govern the rendition process as a whole, to decide among alternatives, or to implement user preferences. A simple example of rendition instructions is the formatting of a postal code, while a more complex example is the movement of apartment information as recommended by the USPS to the line above the street address as an alternative to abbreviating or shortening the street name or omitting the apartment information.
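[Illustration, not part of the quoted overview:] The apartment example above can be approximated by a small rule that keeps the unit on the street line when it fits and otherwise moves it to the line above. The function name place_unit and the max_width parameter are assumptions made for this sketch; the actual rendition instructions registered in the standard are not reproduced here.

```python
def place_unit(street, unit, max_width):
    """Return the street line(s), moving the unit above when space is tight.

    A simplified stand-in for a rendition instruction: nothing is
    abbreviated or dropped; the unit is only relocated.
    """
    if not unit:
        return [street]
    combined = f"{street} {unit}"
    if len(combined) <= max_width:
        return [combined]
    # Too long for one line: put the unit on the line above the street.
    return [unit, street]


fits = place_unit("123 Main St", "Apt 4", 35)
wrapped = place_unit("12345 North Meadowbrook Boulevard", "Apt 4", 35)
```

In the second call the combined line would exceed the assumed 35-character limit, so the unit is emitted as its own line ahead of the street line.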
It turns out that within a template an element such as the UPU "thoroughfare qualifier" may have multiple occurrences in different positions, such as pre-directionals and postdirectionals in US addresses, and other elements such as the UPU "postcode" need to be divided into parts in order to be properly rendered, such as the US ZIP+4 code with its hyphen after the first five digits. These situations have in common that they raise issues of cardinality not dealt with in the list of elements itself. The POST*Code group agreed to define element sub-types in order to handle the issue of cardinality in both forms by making it possible to represent any multiplicity or subdivision of elements in the templates. These element sub-types are explicitly defined within the standard as the need for them is recognized.
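[Illustration, not part of the quoted overview:] One way to picture element sub-types is to key each stored value by an element name plus a sub-type, so that the same element can occur twice or be split into parts. The ElementRef class and the sub-type labels ("pre", "post", "zip5", "plus4") are invented for this sketch and are not the sub-types defined in the standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementRef:
    """An element name plus an optional, hypothetical sub-type label."""
    element: str
    subtype: str = ""


# Made-up sample data: two occurrences of one element, and a split postcode.
address = {
    ElementRef("thoroughfare_qualifier", "pre"): "N",
    ElementRef("thoroughfare_name"): "Main",
    ElementRef("thoroughfare_type"): "St",
    ElementRef("thoroughfare_qualifier", "post"): "SW",
    ElementRef("postcode", "zip5"): "62701",
    ElementRef("postcode", "plus4"): "1234",
}

STREET_ORDER = [
    ElementRef("thoroughfare_qualifier", "pre"),
    ElementRef("thoroughfare_name"),
    ElementRef("thoroughfare_type"),
    ElementRef("thoroughfare_qualifier", "post"),
]
POSTCODE_PARTS = [ElementRef("postcode", "zip5"), ElementRef("postcode", "plus4")]

street_line = " ".join(address[ref] for ref in STREET_ORDER if ref in address)
# The hyphen is added at rendition time, not stored with the data.
postcode = "-".join(address[ref] for ref in POSTCODE_PARTS if ref in address)
```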
Through surveys and discussions at the UPU, it has been learned that at least twenty countries either have or are developing a delivery point database. By this is meant a full definition of the specific addresses to which deliveries are made, without resort to summaries, range files, or other methods that cause loss of information about whether a certain set of address elements represents a complete and correct address. Without such a database, the technology that the UPU standard facilitates can only distinguish between addresses that might be valid and those that are definitely invalid. But with a delivery point database, the same technology can distinguish between the addresses that are valid and those that are invalid. Current approaches to address maintenance typically store composite address lines and cannot always make those key distinctions correctly because they require an additional step of parsing the address elements and mapping them to the database fields, which can fail if there are extraneous, misplaced or ambiguous elements.
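[Illustration, not part of the quoted overview:] The difference between a range file and a delivery point database can be shown with a toy lookup. All data and function names below are made up for the sketch; real postal reference files are far richer.

```python
# Hypothetical range file: street and postcode mapped to a house-number range.
RANGE_FILE = {("MAIN ST", "62701"): (100, 198)}

# Hypothetical delivery point database: every deliverable address listed.
DELIVERY_POINTS = {
    (100, "MAIN ST", "62701"),
    (104, "MAIN ST", "62701"),
    (198, "MAIN ST", "62701"),
}


def check_with_ranges(number, street, postcode):
    """A range file can only rule addresses out, never confirm them."""
    bounds = RANGE_FILE.get((street, postcode))
    if bounds is None or not bounds[0] <= number <= bounds[1]:
        return "invalid"
    return "possibly valid"


def check_with_delivery_points(number, street, postcode):
    """A full delivery point list gives a definite answer."""
    if (number, street, postcode) in DELIVERY_POINTS:
        return "valid"
    return "invalid"


by_range = check_with_ranges(150, "MAIN ST", "62701")
by_points = check_with_delivery_points(150, "MAIN ST", "62701")
```

Number 150 falls inside the range, so the range check cannot reject it, while the delivery point lookup shows that no such address exists in the sample data.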
The standard needs additional testing and development of more templates before it can be utilized on a worldwide basis. Currently fourteen countries have agreed to participate and half of those countries have provided mappings of elements, natural language templates, basic rendition rules, and sample addresses representing the known address types that involve different orderings of elements. There are two approaches to deriving the formal PATDL XML template from the inputs provided, which can be deployed separately or together. One is to translate directly from the rules implicit in the natural language template, and this works if that template is sufficiently precise and complete. The other is to generalize upward from the sample addresses to find a PATDL template that can generate all the renditions correctly, and this works if the sample is robust enough. Actually some combination of deductive and inductive approaches is needed in order to ensure that the template is capable of accomplishing the objective of properly formatting all valid addresses for the country. Thereafter any template may be further elaborated and customized with options and user preferences. Both the natural language and PATDL templates will be published by the UPU if appropriate approvals have been obtained.
Overview of UPU S42-1 International Postal Address Components and Templates
"This UPU standard provides a dictionary of the possible components of postal addresses, together with examples of and constraints on their use. The standard defines three conceptual levels of postal address component: (1) elements, such as organization name or legal status, which correspond to the lowest level of component which it may be useful to distinguish in address representations; (2) constructs, such as organization identification, which group elements into units which are more meaningful for human interpretation; (3) segments, such as addressee specification, which correspond to major logical portions of a postal address. To cover multiple occurrences and locations of elements in an address, the standard defines a fourth level... Postal address components: [as described in Section 5] the standard defines the decomposition of a postal address specification into segments, constructs and elements... A postal address specification comprises one to four segments: (1) an addressee specification [optional]; (2) a mailee specification [optional]; (3) recipient dispatching information [optional]; (4) a delivery point specification [mandatory]. Segments are built up from postal address constructs and elements..."
[Background and rationale:] "... increasing volumes and labour cost rates long ago reached the point at which automation became not only economic, but essential. As a result, it has become more and more vital to ensure that the vast majority of postal items are addressed in a way which can be processed automatically, without risk of misinterpretation. Today, the vast majority of postal items carry printed addresses which are extracted from computer databases. Such databases need to be maintained in the face of population mobility, creation and suppression of delivery points and changes in their specification such as renaming of streets, renumbering of properties, etc. Moreover, there is a growing tendency for companies to exchange or trade address data and, in the context of the European Single Market, for companies in one country to hold address data of organizations and individuals in other countries, which may use different approaches to the structuring of printed addresses. In this context, the UPU Postal Operations Council's POST*Code Project Team charged its sub-project team 2 to develop a standard, covering the definition of address components and postal address templates. This standard, International postal address components and templates, is the result of this development..."
"Postal services form part of the daily life of people all over the world. The Universal Postal Union (UPU) is the specialized institution of the United Nations that regulates the universal postal service. The postal services of its 189 member countries form the largest physical distribution network in the world. Some 6.2 million postal employees working in over 700,000 post offices all over the world handle an annual total of 430 billion letters, printed matter and parcels in the domestic service and almost 10 billion letters, printed matter and parcels in the international service. Keeping pace with the changing communications market, posts are increasingly using new communication and information technologies to move beyond what is traditionally regarded as their core postal business. They are meeting higher customer expectations with an expanded range of products and value-added services."
"Standards are important prerequisites for effective postal operations and for interconnecting the global network. The UPU's Standards Board develops and maintains a growing number of standards to improve the exchange of postal-related information between posts, and promotes the compatibility of UPU and international postal initiatives. It works closely with posts, customers, suppliers and other partners, including various international organizations. The Standards Board ensures that coherent standards are developed in areas such as electronic data interchange (EDI), mail encoding, postal forms and meters."
[Adapted from the Introduction, Foreword, and Scope Statement]
Summary of UPU International Bureau Design for Address Elements
"The International Bureau is currently compiling a list of all the elements that may be included in international addresses, on the basis of model addresses collected from all the UPU member countries. Once this work has been completed, we shall include a list of international address elements in this chapter. This list will include, for each element: (1) the unique code for the element; (2) the name of the element; (3) the definition of the element; (4) specific examples from several different countries, showing different ways of using the element. This list will form an integral part of a UPU standard that includes, for each country, a formal description of the structure of the country's model addresses. This formal description, supplied in natural language and in other languages (XML, etc.) will therefore include, for each country and each address type: [1] the position of each address element on a line; [2] information on whether the element is mandatory or optional; [3] layout rules describing, amongst other things, the conditions under which a logical line may be divided into two physical lines, or under which two logical lines may be compressed into one physical line, when words that may be abbreviated, etc. The formal description of the structures of countries' model addresses will... be the subject of a separate publication within the framework of the UPU standards. A first version of this standard is planned for November 2002..." [adapted 2003-06-16 from the slightly dated online text, referenced from the UPU page on International Addressing.]
Related Publication: US FGDC Address Data Content Standard
The US Federal Geographic Data Committee is "a 19 member interagency committee composed of representatives from the Executive Office of the President, Cabinet-level and independent agencies. The FGDC is developing the National Spatial Data Infrastructure (NSDI) in cooperation with organizations from State, local and tribal governments, the academic community, and the private sector. The NSDI encompasses policies, standards, and procedures for organizations to cooperatively produce and share geographic data." FGDC announced an open review for a draft Address Data Content Standard in April 2003:
"Address Data Content Standard Public Review Draft." Subcommittee on Cultural and Demographic Data, [US] Federal Geographic Data Committee (FGDC). Version 2. April 17, 2003. 41 pages. "Addresses provide a means of locating people, structures and other spatial objects. More specifically, addresses are used to reference and uniquely identify particular points of interest, to access and deliver to specific locations, and as a means for positioning geographic data based on location. Most organizations maintain address lists or have databases or datasets that contain addresses. In many organizations, the primary purpose for creating and maintaining address lists and address information is mail delivery. Organizations often have detailed specifications about the structure of their address information without defining the content, i.e., the elements that constitute an address within their system. Knowledge of both structure and content is required to successfully share information in a digital environment. The purpose of this standard is to facilitate the exchange of address information. The Address Data Content Standard (the Standard) simplifies the address data exchange process by providing a method for documenting the content of address information... The objective of the Standard is to provide a method for documenting the content of address information. As a data usability standard, the Standard describes a way to express the content, applicability, data quality and accuracy of a dataset or data element. The Standard additionally codifies some commonly used discrete units of address information, referred to as descriptive elements. It provides standardized terminology and definitions to alleviate inconsistencies in the use of descriptive elements and to simplify the documentation process. The Standard establishes the requirements for documenting the content of addresses. It is applicable to addresses of entities having a spatial component. 
The Standard does not apply to addresses of entities lacking a spatial component and specifically excludes electronic addresses, such as e-mail addresses. The Standard is to be used only in the exchange of addresses. The Standard places no requirement on internal organization of use or structure of address data. However, the principles of the Standard can be extended to all addresses, including addresses maintained within an organization, even if they are not shared..." See: (1) details in the announcement "U.S. Federal Geographic Data Committee (FGDC) Draft Address Data Content Standard for Public Review"; (2) "Draft Proposal for a National Spatial Data Infrastructure Standards Project." [source PDF, also in HTML and .DOC format]
Principal references:
- Overview: UPU S42 Standard on International Postal Address Components and Templates. By Joe Lubenow. [source PDF]
- UPU Technical Standards
- UPU International Bureau Address Elements. See excerpt above. [cache]
- UPU postal addressing formats
- Universal Postal Union (UPU) website
- Personal contact: Joe Lubenow
- Institutional contact: Universal Postal Union, International Bureau, Standards Programme, 3000 Berne 15, Switzerland; Tel: +41 31 350 3111; Fax: +41 31 350 3110; email: standards@upu.int
- IDEAlliance ADIS website
- "Address Data Interchange Specification (ADIS)." - Main reference page.
- "Markup Languages for Names and Addresses" - Main reference page.