[February 03, 2001] A research report from René van Horik (Researcher and Project Manager, Netherlands Institute for Scientific Information Services - NIWI) describes the use of XML in the European Visual Archive (EVA). The European Visual Archive is a searchable image resource containing thousands of historical photographs from the collections of the London Metropolitan Archives and the Stadsarchief Antwerpen. The EVA project partners are: Antwerp City Archives (Belgium), Telepolis (Belgium), London Metropolitan Archives (UK), Netherlands Institute for Scientific Information Services (the Netherlands), SailLabs GmbH (Germany) and the European Commission on Preservation and Access (the Netherlands). The development of the EVA-system is part of Workpackage 6 of the EVA (European Visual Archive) Project. EVA is a project for the Info2000 initiative launched by DG XIII of the European Commission, responsible for telecommunications and the information market. According to the report, "The EVA Project aims to investigate relevant issues to enhancing access to historical photographic collections. These issues include: copyright issues, selection procedures, user surveys, digitization techniques, description standards, pricing policy and digital information management systems. Based on the outcomes of this research a Web-based information system is being developed called the EVA system. This system contains descriptions and digital images that belong to the photographic holdings of two City archives: the London Metropolitan Archives and the City archives of Antwerp. The EVA project has two main audiences: Image producers and image consumers. Based on the outcomes of the project an archive will be able to digitize and document its photographs in a well thought-out way. The low threshold for collections to join the EVA system provides them with a tool to get in contact with a huge potential of image consumers...
For the implementation of the data exchange between the local archive information systems and the central EVA system the project decided to use the XML standard. This is an application independent data structure. For each description of a photograph a separate XML file is created. An XML document contains special instructions called tags, which usually enclose identifiable parts of the document. The elements that are allowed are specified in a DTD (Document Type Definition). The DTD used by the EVA system is called EVOlite DTD [Appendix B]. In this way self-describing documentation units are created. The creation of the XML files in principle is the responsibility of the archives. Within the project software and procedures were developed to assist them in the creation of output in XML format. In the future probably more and more information systems will facilitate the creation of data in XML format and it will become easier to manage data consistency between a local archive management system and the Web-based access system. Just like with the images the XML files are sent via FTP to the server of the EVA system. The archives can independently add, change and delete descriptions. The EVA system aims at two types of usage: End-users interested in access to a catalogue of images and descriptions, and users interested in the results of the EVA project and the model of the EVA system. Based on the information on the Web site an archive employee should be able to evaluate the relevance of the project results for the conversion and dissemination of its own collection. The XML formatted descriptions are automatically converted to the database on which the EVA system is based. Periodically the database is refreshed with new information that is sent to the server by the archives with the help of the FTP protocol."
Appendix D of the EVA design specification describes 'Multilingual Query Processing' features. In this connection, the interchange format used is OLIF (Open Lexicon Interchange Format); "OLIF is based on a XML/SGML type notation. Each OLIF exchange file has a XML/SGML header containing basically global definitions of features and values used in the OLIF document type definition and a body containing the entries."
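The report's rule of one XML file per photograph description could be sketched in Python roughly as follows. This is only an illustration: the <photo> root, the child element names and the sample values are hypothetical placeholders, not the element names defined by the EVOlite DTD, and the FTP transfer to the EVA server is left out.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def write_description(out_dir, fields):
    """Write one photograph description to its own XML file; return the path.

    `fields` maps element names to text values and must contain an
    "identifier" entry, which also supplies the file name. All element
    names here are hypothetical; the real ones come from the EVOlite DTD.
    """
    if "identifier" not in fields:
        raise ValueError("a description needs an 'identifier' field")
    root = ET.Element("photo")
    for name, value in fields.items():
        ET.SubElement(root, name).text = value
    path = Path(out_dir) / f"{fields['identifier']}.xml"
    ET.ElementTree(root).write(str(path), encoding="utf-8", xml_declaration=True)
    return path


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        # Invented sample record, for illustration only.
        written = write_description(
            tmp, {"identifier": "demo-0001", "title": "Example street scene"}
        )
        print(written.read_text(encoding="utf-8"))
```

Keeping each description in its own file means an archive can add, change or delete a single record by transferring or removing a single file.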
"The EVA-system uses 10 of the 15 Dublin Core elements: Title, Description, Creator, Date, Relation, Coverage, Language, Subject, Identifier, Publisher. The following DC elements are not used: Contributor, Source, Type, Format and Rights Management. It is the intention of the EVA project to develop an EVO.DTD that covers all attributes of all objects that are relevant for the dissemination of historical photographic collections. The EVA-system is based on the EVOLite.DTD (see appendix 7.2). This DTD contains elements that can be mapped with Dublin Core..." [from design document Appendix E]
From Appendix D ('Multilingual Query Processing') of the EVA design specification: "One of the intentions of EVA is to let European citizens participate in the cultural heritage of other countries, multilinguality being one of the obstacles to overcome. Requirements of multilinguality, however, need to be analysed in detail: Users should be able to communicate with the system in their native language. This is a requirement for user interfaces and communication. EVA will address this by providing access in at least five European languages, to demonstrate the possibility to extend to even more. The user can input his query in several languages. The EVA prototype will process queries in Dutch, German and English. The query will be matched by the query translation tool with term translations of the processed languages and with a prototypical set of concept relations which are defined by a prototypical thesaurus... [system support is] provided by a QTE (Query Translation and Expansion) service." For exchange: "The lexicon needs to be connected with supplementary lexical resources for updates or for adding further languages. For this purpose a special lexicon interchange format is used: OLIF (Open Lexicon Interchange Format).. The first version of OLIF was defined in the framework of the EC OTELO project. An OLIF standard will be defined in the framework of the OLIF consortium (members are e.g. Microsoft, SAP, Logos Sail Labs, Lotus, IBM). The lexicon communicates with external data sources by reading entries in the OLIF format and distributes entry features and values from an OLIF input file into the respective internal tables (= import functionality). OLIF is based on a XML/SGML type notation. Each OLIF exchange file has a XML/SGML header containing basically global definitions of features and values used in the OLIF document type definition and a body containing the entries. Basic items are features and values. 
Features are defined as elements, with a start and an end item..." (see page 29)
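The query translation and expansion step described in Appendix D can be illustrated with a toy Python sketch. It is not the QTE service itself: the term table, the concept relations and the function name are all invented for illustration, and a real system would load its lexicon from OLIF files rather than hard-code it.

```python
# Toy term translations per concept (invented entries, for illustration only).
TRANSLATIONS = {
    "harbour": {"en": "harbour", "nl": "haven", "de": "Hafen"},
    "ship": {"en": "ship", "nl": "schip", "de": "Schiff"},
}

# Toy thesaurus: concept -> related concepts (also invented).
RELATED = {"harbour": ["ship"]}


def translate_and_expand(term, lang):
    """Return search terms in every language for `term`, entered in `lang`.

    The term is first matched to a concept, then expanded with the related
    concepts from the toy thesaurus, then rendered in all known languages.
    """
    needle = term.lower()
    concept = next(
        (c for c, words in TRANSLATIONS.items()
         if words.get(lang, "").lower() == needle),
        None,
    )
    if concept is None:
        return [term]  # unknown term: search for it exactly as typed
    terms = []
    for c in [concept] + RELATED.get(concept, []):
        for word in TRANSLATIONS[c].values():
            if word not in terms:
                terms.append(word)
    return terms


if __name__ == "__main__":
    print(translate_and_expand("haven", "nl"))
```

With this toy data a Dutch query for "haven" is expanded to the harbour and ship terms in all three languages, so a description written in English or German can still be matched.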
References:
EVOlite DTD - DTD and sample instances in XML. From the design document.
System design for the EVA-System. Workpackage 6 EVA-project ('Working Model'). Project code: PUB 1128 EVA 25001 (D 6.1 and D 6.2 Functional and technical design). "The purpose of this document is to describe the functional and architectural design of the EVA-system. In order to achieve a design a problem analysis has been performed. The document contains the result of the internal discussions about the functionality needed in the EVA system. NIWI acts as the editor of this document... As a starting point the content of the EVA system will be a collection of 20.000 photographs of the city archives of Antwerp (SAA) and London (LMA). Both SAA and LMA are considered as representative for most European main city archives. Local traditions and perspectives are respected by trying to establish an approach that will fit in both infrastructures..."
[February 03, 2001] "Archives and Photographs: the 'European Visual Archive' Project (EVA)." By René van Horik (Researcher and Project Manager, Netherlands Institute for Scientific Information Services - NIWI). In Cultivate Interactive Issue 3 (January 29, 2001). ['An article on the EVA project which details how they used Dublin Core for their description elements and XML for data exchange.'] "The EVA Project aims to investigate relevant issues to enhancing access to historical photographic collections. These issues include: copyright issues, selection procedures, user surveys, digitization techniques, description standards, pricing policy and digital information management systems. Based on the outcomes of this research a Web-based information system is being developed: the EVA system. This system contains descriptions and digital images that belong to the photographic holdings of two City archives: the London Metropolitan Archives and the City archives of Antwerp. The EVA project has two main audiences: Image producers and image consumers. Based on the outcomes of the project an archive will be able to digitize and document its photographs in a well thought-out way. The low threshold for collections to join the EVA system provides them with a tool to get in contact with a huge potential of image consumers. These users can search the image descriptions, view reference images and order images for specific use. The purpose of this article is to report on the main outcomes of the studies carried out within the framework of the project and to describe the starting points on which the EVA system is based... Based on the results of the preparatory studies the content providers of the project, the City archives of Antwerp and London Metropolitan Archives, each started the process of selecting photographs by creating 10.000 digital master files. 
These digital images had to be 'rich' enough to serve as the basis for derivative images that are published online in the EVA system... Digitizing historical photographs is more than just putting photographic prints on a scanner. A lot of information associated with the creation of digital images is relevant for (future) use, access, update and maintenance of the images and the relation with the original prints. This information (or data) about data is called metadata. It turned out to be that within the 'universe of discourse' of the EVA project several metadata schemes are of potential importance. This is because roughly speaking the EVA project covers three related 'things': firstly, the historical photograph as a physical medium, secondly, the digital surrogate that is based on the photograph and thirdly, that what is visible on the photograph and the processed digital image. For the sake of abstraction these three 'things' together (the photograph format, the digital image and the visible scene or content) are called an 'EVA visual object', abbreviated as EVO... For the implementation of the data exchange between the local archive information systems and the central EVA system the project decided to use the XML standard. This is an application independent data structure. For each description of a photograph a separate XML file is created. An XML document contains special instructions called tags, which usually enclose identifiable parts of the document. The elements that are allowed are specified in a DTD (Document Type Definition). The DTD used by the EVA system is called EVOlite DTD. In this way self-describing documentation units are created. Two examples of XML formatted descriptions [are provided]. The creation of the XML files in principle is the responsibility of the archives. Within the project software and procedures were developed to assist them in the creation of output in XML format. 
In the future probably more and more information systems will facilitate the creation of data in XML format and it will become easier to manage data consistency between a local archive management system and the Web-based access system. Just like with the images the XML files are sent via FTP to the server of the EVA system. The archives can independently add, change and delete descriptions... Individual archives create XML files, extracted from the local archive information system. The XML files are sent by the standard Internet protocol FTP to the server of the EVA system. Archives can add, delete and replace XML files independently... The EVA system aims at two types of usage: End-users interested in access to a catalogue of images and descriptions, and users interested in the results of the EVA project and the model of the EVA system. Based on the information on the Web site an archive employee should be able to evaluate the relevance of the project results for the conversion and dissemination of its own collection. The XML formatted descriptions are automatically converted to the database on which the EVA system is based. Periodically the database is refreshed with new information that is sent to the server by the archives with the help of the FTP protocol. The interface between the database and the end-user consists of several Web pages. The input fields are based on the database that contains information that originates from the XML formatted files provided by the archives."
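The periodic refresh described in the article, where the XML files delivered by the archives are converted into the database behind the Web pages, might be sketched in Python as below. This is a minimal illustration under stated assumptions: SQLite stands in for the unnamed EVA database, the table layout and the identifier/title/description element names are hypothetical, and the files are assumed to have already arrived in a local directory via FTP.

```python
import sqlite3
import xml.etree.ElementTree as ET
from pathlib import Path


def refresh_database(xml_dir, conn):
    """Mirror a directory of per-photograph XML files into one table.

    New and changed files are inserted or replaced; rows whose XML file
    has disappeared are deleted. Returns the number of files loaded.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS photo "
        "(identifier TEXT PRIMARY KEY, title TEXT, description TEXT)"
    )
    seen = set()
    for path in sorted(Path(xml_dir).glob("*.xml")):
        root = ET.parse(str(path)).getroot()
        ident = root.findtext("identifier")  # hypothetical key element
        if ident is None:
            continue  # skip files that lack the key element
        conn.execute(
            "INSERT OR REPLACE INTO photo VALUES (?, ?, ?)",
            (ident, root.findtext("title"), root.findtext("description")),
        )
        seen.add(ident)
    # Drop descriptions that the archive has removed since the last refresh.
    existing = {row[0] for row in conn.execute("SELECT identifier FROM photo")}
    conn.executemany(
        "DELETE FROM photo WHERE identifier = ?",
        [(ident,) for ident in existing - seen],
    )
    conn.commit()
    return len(seen)
```

Calling refresh_database on a schedule keeps the table in step with whatever files the archives have added, replaced or deleted, which is one simple way to realise the independent add/change/delete behaviour the article describes.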
Contact: eva-info@eva-eu.org