The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Created: March 18, 2005.

Fedora Version 2.0 Open-Source Repository Supports XML and Web Services.

Update 2007-08-14: In August 2007 the Fedora Commons announced the award of a four year, $4.9M grant from the Gordon and Betty Moore Foundation to develop the organizational and technical frameworks necessary to effect revolutionary change in how scientists, scholars, museums, libraries, and educators collaborate to produce, share, and preserve their digital intellectual creations. Fedora Commons is a new non-profit organization that will continue the mission of the Fedora Project, the successful open-source software collaboration between Cornell University and the University of Virginia. The Fedora Project evolved from the Flexible Extensible Digital Object Repository Architecture (Fedora) developed by researchers at Cornell Computing and Information Science. See the proposal to the Moore Foundation and the announcement "Fedora Commons Awarded $4.9M Grant to Develop Open-Source Software for Building Collaborative Information Communities."

[March 18, 2005] The Fedora Project at The University of Virginia Library and Cornell University has announced the release of Fedora (Flexible Extensible Digital Object and Repository Architecture) Version 2.0. This Fedora Project, not to be confused with a Red Hat Inc. initiative with the same name, is a "general purpose repository service devoted to the goal of providing open-source repository software that can serve as the foundation for many types of information management systems. Fedora software demonstrates how distributed digital information management can be deployed using web-based technologies, including XML and web services."

Fedora began with funding from DARPA and the National Science Foundation as a research project of Carl Lagoze and Sandy Payette at Cornell University's Digital Library Research Group in 1997, where the first reference implementation and a CORBA-based technical implementation were built. Major funding for Phase 1 and Phase 2 has been provided by the Andrew W. Mellon Foundation. The project reports that Fedora is in use at some twenty-three (23) universities and research institutes.

New features in Fedora Version 2.0 include "the ability to represent and query relationships among digital objects, a simple XML encoding for Fedora digital objects, enhanced ingest and export interfaces for interoperability with other repository systems, enhanced administrative features, and improved documentation. Fedora is capable of serving as the foundation for many types of information management applications, including institutional repositories, digital libraries, records management systems, archives, and educational software."

The Fedora architecture is "an extensible framework for the storage, management, and dissemination of complex objects and the relationships among them. Fedora accommodates the aggregation of local and distributed content into digital objects and the association of services with objects. This allows an object to have several accessible representations, some of them dynamically produced."

Fedora "expresses relationships by defining a base relationship ontology using RDF Schema (RDFS) and provides a slot in the digital object abstraction for RDF expression of relationships based on this ontology. Assertions from other ontologies may also be included along with the base Fedora relationships. All relationships are reflected in a native RDF triple-store using Kowari. The query interface to this triple-store is exposed as a web service, providing a rich information foundation for external services."

The Fedora architecture is "implemented as a web service, with all aspects of the complex object architecture and related management functions exposed through REST and SOAP interfaces. Fedora digital objects are managed within the Fedora Service Framework which consists of a set of loosely coupled services that interact and collaborate with each other. At the core of the Fedora Service Framework is the Fedora Repository Service which exposes interfaces for managing and accessing digital objects in a repository. Each service interface is defined using the Web Service Description Language (WSDL).

The basic components of a Fedora digital object are: (1) PID, a persistent, unique identifier for the object; (2) Object Properties, a set of system-defined descriptive properties that are necessary to manage and track the object in the repository; (3) Datastream(s), the component in a Fedora object that represents a MIME-typed content item, where an object can have one or more Datastreams and the content of a Datastream can be either data or metadata, and this content can either be stored internally in the Fedora repository, or stored remotely; (4) Disseminator(s), the component in a Fedora object that associates an external service with the object for the purpose of providing extensible views of the object or of its datastream content."
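The four components listed above can be pictured as a small data structure. The following is a minimal sketch in Python, not Fedora's own implementation (which is Java-based): the class names, field names, and the sample PID and URL are all hypothetical and chosen only to mirror the terms in the quoted description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Datastream:
    """A MIME-typed content item, stored internally or referenced remotely."""
    ds_id: str
    mime_type: str
    content: Optional[bytes] = None   # set when the bytes live inside the repository
    location: Optional[str] = None    # set when the bytes live outside the repository

    @property
    def is_remote(self) -> bool:
        return self.content is None and self.location is not None


@dataclass
class Disseminator:
    """Associates an external service with the object to provide extra views."""
    diss_id: str
    service_url: str


@dataclass
class DigitalObject:
    pid: str                                                   # persistent, unique identifier
    properties: Dict[str, str] = field(default_factory=dict)  # system-defined properties
    datastreams: List[Datastream] = field(default_factory=list)
    disseminators: List[Disseminator] = field(default_factory=list)

    def datastream(self, ds_id: str) -> Datastream:
        """Return the datastream with the given ID, or raise KeyError."""
        for ds in self.datastreams:
            if ds.ds_id == ds_id:
                return ds
        raise KeyError(ds_id)


# Hypothetical sample object: one internal metadata stream, one remote image.
obj = DigitalObject(pid="demo:1", properties={"label": "Sample image object"})
obj.datastreams.append(Datastream("DC", "text/xml", content=b"<dc/>"))
obj.datastreams.append(Datastream("IMG", "image/jpeg", location="http://example.org/img.jpg"))
print(obj.datastream("IMG").is_remote)
```

Keeping `content` and `location` as separate optional fields is one simple way to express the "stored internally or stored remotely" distinction; a real repository would need stricter validation than this sketch performs.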

From an implementation perspective, "Fedora digital objects can be serialized and stored as XML. The Fedora object model is directly expressed using XML Schema language in a format known as Fedora Object XML (FOXML). FOXML defines a digitalObject root element that contains a set of objectProperties, one or more datastream components, and one or more disseminator components. Although FOXML is the preferred XML serialization format for storing objects in a Fedora repository, Fedora supports ingest and export of digital objects in other XML formats. Currently, the system supports a Fedora profile of the Metadata Encoding and Transmission Standard (METS) and it will soon support the OAI-PMH harvesting of objects encoded in MPEG21 Digital Object Description Language (DIDL)."
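To make the shape of such a serialization concrete, here is a rough sketch that builds a FOXML-like document with Python's standard library. Only the element names digitalObject, objectProperties, and datastream come from the description above. The namespace URI, the `property` child element, and every attribute name (`PID`, `NAME`, `VALUE`, `ID`, `MIMETYPE`) are assumptions made for illustration, and the output is not claimed to validate against the published FOXML schema.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; consult the published FOXML schema for the real one.
FOXML_NS = "info:fedora/fedora-system:def/foxml#"
ET.register_namespace("foxml", FOXML_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the (assumed) FOXML namespace."""
    return f"{{{FOXML_NS}}}{tag}"


def build_object(pid, properties, datastreams):
    """Build a simplified digitalObject element.

    properties:  dict of property name -> value (hypothetical attribute layout)
    datastreams: list of (datastream ID, MIME type) pairs
    """
    root = ET.Element(q("digitalObject"), {"PID": pid})
    props = ET.SubElement(root, q("objectProperties"))
    for name, value in properties.items():
        ET.SubElement(props, q("property"), {"NAME": name, "VALUE": value})
    for ds_id, mime in datastreams:
        ET.SubElement(root, q("datastream"), {"ID": ds_id, "MIMETYPE": mime})
    return root


root = build_object("demo:1", {"label": "Sample"}, [("DC", "text/xml")])
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The sketch omits disseminator components and datastream content entirely; it is meant only to show how the object model maps onto nested XML elements.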

The Fedora object model "defines several metadata entities that pertain to managing the integrity of digital objects. These entities are the object's relationship metadata, access control policy, and audit trail. Integrity entities are modeled as datastream components with reserved identifiers. As such, the integrity entities are stored like other datastreams; however the Fedora Repository system recognizes them as special and asserts constraints over how they are created and modified. A Policy datastream is used to express authorization policies for digital objects, both to protect the integrity of an object and to enable fine-grained access controls on an object's content. In Fedora objects, a policy is expressed using the Extensible Access Control Markup Language (XACML), a flexible XML-based language used to assert statements about who can do what with an object, and when they can do it. Object policies are enforced by the authorization module (i.e., AuthZ) implemented within the Fedora Repository Service." [Note 1]
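The "who can do what with an object, and when" idea can be illustrated without XACML itself. The sketch below is a plain-Python analogue, not an XACML engine and not Fedora's AuthZ module: the `Rule` class, the role names, the action names, and the first-match, default-deny evaluation strategy are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import time
from typing import List


@dataclass(frozen=True)
class Rule:
    role: str      # who
    action: str    # what
    start: time    # when: window start (inclusive)
    end: time      # when: window end (exclusive)
    permit: bool   # decision if the rule matches


def is_permitted(rules: List[Rule], role: str, action: str, at: time) -> bool:
    """The first matching rule decides; a request matching no rule is denied."""
    for rule in rules:
        if rule.role == role and rule.action == action and rule.start <= at < rule.end:
            return rule.permit
    return False


# Hypothetical per-object policy: curators may modify at (almost) any time,
# the public may read during working hours only.
policy = [
    Rule("curator", "modifyDatastream", time(0, 0), time(23, 59, 59), True),
    Rule("public", "getDatastream", time(8, 0), time(18, 0), True),
]

print(is_permitted(policy, "public", "getDatastream", time(9, 30)))
print(is_permitted(policy, "public", "modifyDatastream", time(9, 30)))
```

Denying by default keeps the sketch safe when no rule applies; real XACML policies offer several rule-combining algorithms, which this example does not attempt to model.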

Fedora digital objects may be related to other Fedora objects in several ways: "for example a Fedora object may represent a collection and other objects that are members of that collection. Also, it may be the case that one object is considered a part of another object, a derivation of another object, a description of another object, or even equivalent to another object. Object-to-Object relationship metadata is a way of asserting these various kinds of relationships for Fedora objects. A default set of common relationships is defined in the Fedora relationship ontology [fedora-relsext-ontology.rdfs], although other community or user-defined relationships may also be asserted. Fedora object-to-object relationship metadata is the basis for enabling advanced access and management functionality driven from metadata that is managed within the repository. Example uses of relationship metadata include: (1) organizing objects into collections to support management, OAI harvesting, and user search/browse; (2) defining bibliographic relationships among objects such as those defined in the IFLA Study Group FRBR; (3) defining semantic relationships among resources to record how objects relate to some external taxonomy or set of standards; (4) modeling a network overlay where resources are linked together based on contextual information; (5) encoding natural hierarchies of objects; (6) making cross-collection linkages among objects, for example, to show that a particular document in one collection can also be considered part of another collection."
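Relationship assertions of this kind reduce to subject-predicate-object triples. The following sketch uses a tiny in-memory store to show how "which objects are members of this collection?" becomes a pattern match. It is not Kowari and not the Fedora Resource Index: the `TripleStore` class is hypothetical, and the namespace URI, the predicate names `isMemberOf` and `isPartOf`, and the `info:fedora/...` object URIs are assumptions rather than values taken from the ontology file.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Assumed ontology namespace and predicate names, for illustration only.
REL = "info:fedora/fedora-system:def/relations-external#"
IS_MEMBER_OF = REL + "isMemberOf"
IS_PART_OF = REL + "isPartOf"


class TripleStore:
    """A toy triple store supporting wildcard pattern matching."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return all triples matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self._triples
            if all(want is None or want == got for want, got in zip(pattern, t))
        )


store = TripleStore()
store.add("info:fedora/demo:1", IS_MEMBER_OF, "info:fedora/demo:collection")
store.add("info:fedora/demo:2", IS_MEMBER_OF, "info:fedora/demo:collection")
store.add("info:fedora/demo:3", IS_PART_OF, "info:fedora/demo:1")

# All members of the hypothetical collection object.
members = [s for s, _, _ in store.match(predicate=IS_MEMBER_OF, obj="info:fedora/demo:collection")]
print(members)
```

A linear scan is adequate for a demonstration; a production triple store would index triples by subject, predicate, and object to answer such patterns efficiently.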

The Fedora Repository System is free open-source software distributed under the Mozilla Public License Version 1.1. It requires the Sun Java Software Development Kit, version 1.4 or higher.

Trademark Note: there is also a Fedora Project of Red Hat, Inc.

About Fedora Object XML (FOXML)

FOXML is "a simple XML format that directly expresses the Fedora digital object model. At the highest level, the FOXML XML schema defines elements that correspond directly to the fundamental Fedora digital object components. As of Fedora 2.0, digital objects will be stored internally in a Fedora repository in the FOXML format. In addition, FOXML can be used for ingesting and exporting objects to/from Fedora repositories. The Fedora extension of METS continues to be supported as an ingest and export format. In upcoming releases, Fedora will also support other formats for ingest and export such as METS 1.4 and MPEG21/DIDL.

The introduction of FOXML was motivated by several requirements: (1) simplicity, (2) optimization and performance, and (3) flexibility in evolving Fedora. Regarding simplicity, user feedback called for a conceptually easy mapping of the Fedora concepts to an XML format. Users wanted an obvious sense of how to create Fedora ingest files, especially those who are not familiar with formats such as METS and MPEG21/DIDL.

Regarding optimization and performance, the FOXML schema was designed to improve repository performance, both at ingest and during disseminations. Overall ingest performance has been positively affected with FOXML, especially in the validation phases. Regarding flexibility, establishing FOXML as the internal storage format for Fedora objects enables easier evolution of functionality in the Fedora repository, without requiring ongoing extensions to other community formats.

An official version of the FOXML XML schema is published on the Fedora web site. Also, a copy of the schema is provided with the Fedora open-source distribution. The Fedora repository service validates all Fedora objects against this schema before objects are permanently stored in the repository...

The Fedora repository service is designed to be able to accommodate different XML formats for encoding digital objects through its ingest and export operations, available via the Fedora management service interface (API-M). Currently, Fedora supports ingest and export of the new FOXML format as well as the Fedora extension of the Metadata Encoding and Transmission Standard (METS). Prior to Fedora 2.0, the METS extension format was the only XML format supported by Fedora. As of Fedora 2.0, the system is moving in a direction where FOXML will be the preferred internal storage format of Fedora objects, but the repository will accept objects encoded in other formats and will export objects encoded in other formats..." [adapted/excerpted]

Fedora Repository Web Service Interfaces

"The interface to the system consists of three open APIs that are exposed as web services:

  • Management API (API-M) — defines an interface for administering the repository. It includes operations necessary for clients to create and maintain digital objects and their components. API-M is implemented as a SOAP-enabled web service.
  • Access API (API-A) — defines an interface for accessing digital objects stored in the repository. It includes operations necessary for clients to perform disseminations on objects in the repository and to discover information about an object using object reflection. API-A is implemented as a SOAP-enabled web service.
  • Access-Lite API (API-A-Lite) — defines a light-weight version of the Fedora Access Service that is implemented as a REST-based web service that can be invoked with a simple URL syntax.
  • Management-Lite API (API-M-Lite) — this interface is under development. The intent is to provide a light-weight version of the Fedora Management Service implemented as a REST-based web service that can be invoked with a simple URL syntax. Currently, only one operation is implemented: getNextPID. A full version of this interface may be provided in a future release, depending on user demand.
  • Search API (part of API-A-Lite) — this interface provides a simple search of the repository. The search operates upon the default index for the repository which contains object properties and the default Dublin Core record for the repository. This search interface is intended as a basic field search of the object registry. It is expected that other services will be created to index Fedora objects at a finer-grained level than provided by the default search.
  • Resource Index Search API — this is a new interface as of Fedora 2.0 that provides searching of the new Resource Index. The Resource Index is an RDF-based index of the Fedora repository that includes the following for each digital object: object properties; object-to-object relationships; metadata about datastreams and disseminations; default Dublin Core record..." [adapted from the Features List]
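Because the "Lite" interfaces are described as REST-based services invoked with a simple URL syntax, a client can be little more than a URL builder. The sketch below shows that idea only. The base URL, the `/get/` and `/search` path segments, and the query parameter names are assumptions for illustration and should be checked against the Fedora API documentation before use.

```python
from urllib.parse import quote, urlencode

# Hypothetical host, port, and context path for a local repository.
BASE = "http://localhost:8080/fedora"


def datastream_url(pid: str, ds_id: str) -> str:
    """Build an (assumed) API-A-Lite style URL for one datastream of one object."""
    return f"{BASE}/get/{quote(pid, safe=':')}/{quote(ds_id)}"


def search_url(terms: str, fields=("pid", "title")) -> str:
    """Build an (assumed) simple-search URL asking for the given result fields."""
    params = {"terms": terms}
    params.update({name: "true" for name in fields})
    return f"{BASE}/search?{urlencode(params)}"


print(datastream_url("demo:1", "DC"))
print(search_url("maps of virginia"))
```

Percent-encoding the PID and query terms with `urllib.parse` avoids malformed URLs when identifiers or search phrases contain spaces or reserved characters.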

The Fedora Repository: Major Features

Key features of the Fedora Release 2.0 Repository:

  • Open Source — The Fedora repository system is open source software licensed under the Mozilla Public License.
  • Flexible Digital Object Model — The Fedora digital object model provides the flexibility to create many kinds of objects including documents, images, electronic books, multi-media learning objects, datasets, metadata, and more. The model supports the aggregation of one or more content items as "Datastreams". The bytestream content of a Datastream can be any media type and can be either stored locally in the repository, or referenced by a digital object (i.e., content stored outside the repository). The Fedora object model also provides a mechanism (known as a Disseminator) for associating services with an object to produce dynamic or computed content from digital objects.
  • Content Versioning — The Fedora content versioning system is enabled with the release of Fedora 1.2. Any modifications made to a Datastream or Disseminator through the Fedora management interface (API-M) will automatically result in the creation of a new version of that Datastream or Disseminator. The Fedora object contains a record of all versions, thereby creating a history of how objects changed over time. Additionally, Fedora maintains an audit trail record of the nature of the object change events.
  • XML Ingest and Export — Digital objects can be submitted to a Fedora repository as XML-encoded files that conform to either the Fedora Object XML (FOXML) schema or an extension of the Metadata Encoding and Transmission Standard (METS) schema.
  • XML Storage — By default, Fedora digital objects are stored in a Fedora repository as XML-encoded files that conform to Fedora Object XML (FOXML) format. Content bytestreams that are aggregated as Datastreams in a digital object are stored in their native formats in the repository persistent storage area.
  • Object-to-Object Relationships — Object-to-Object relationship metadata is a way of asserting various kinds of relationships among Fedora objects including the notion that an object is a member of a collection, is considered a part of another object, is a derivation of another object, is a description of another object, or is equivalent to another object.
  • Access Control and Authentication — Release 1.2 included a simple form of access control to provide access restrictions based on IP address. IP range restriction is supported in both the Management and Access APIs. In addition, the Management API is protected by HTTP Basic Authentication. [The V2.1 Winter/Spring 2005] release introduces a new Access Control and Authentication module that includes the ability to enforce fine-grained access control policies expressed using XACML [RefPage].
  • Simple Search — Fedora automatically creates two indexes of the repository. The default search index is a simple index of the repository searchable by object properties and object Dublin Core elements. The second is the RDF-based Resource Index, described in the next item. Both of these indexes are searchable via REST-based web service interfaces.
  • RDF-based Resource Index — As of Fedora 2.0 there is the new RDF-based Resource Index, which includes more information about objects plus object-to-object relationships.
  • OAI Metadata Harvesting Provider — The OAI Protocol for Metadata Harvesting is a standard for sharing metadata across repositories. Every Fedora digital object has a primary Dublin Core record that conforms to the XML schema, [making] metadata accessible using the OAI Protocol for Metadata Harvesting, v2.0.
  • Migration Utility — A new migration utility is provided to perform mass export and mass ingest of objects. At the core, the migration utility is built upon two newly enhanced command-line functions: fedora-export and fedora-ingest. Used together, these two command line functions can support a variety of scenarios involving moving or copying objects between repositories.
  • Batch Utility — The Fedora repository system includes a Batch Utility as part of the Fedora Administrator client that enables the mass creation and modification of Fedora digital objects.
  • Reporting Utility — A reporting utility provides different management views of the contents of the Fedora repository..." [adapted from the summary]
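The Content Versioning item in the list above describes a simple rule: every modification creates a new version, and an audit trail records what happened. A minimal sketch of that behavior follows. The class names, fields, and audit message format are hypothetical and do not reflect how Fedora stores versions internally.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Version:
    number: int
    content: bytes
    created: datetime


@dataclass
class VersionedDatastream:
    ds_id: str
    versions: List[Version] = field(default_factory=list)
    audit_trail: List[str] = field(default_factory=list)

    def modify(self, content: bytes, reason: str) -> Version:
        """Never overwrite: append a new version and log the change event."""
        version = Version(len(self.versions) + 1, content, datetime.now(timezone.utc))
        self.versions.append(version)
        self.audit_trail.append(f"{self.ds_id} v{version.number}: {reason}")
        return version

    def current(self) -> Version:
        """Return the most recent version."""
        return self.versions[-1]


ds = VersionedDatastream("DC")
ds.modify(b"<dc><title>Draft</title></dc>", "initial ingest")
ds.modify(b"<dc><title>Final</title></dc>", "corrected title")

print(len(ds.versions), ds.current().number)
print(ds.audit_trail)
```

Appending rather than replacing is what preserves the history of how an object changed over time; earlier versions remain available in `ds.versions`.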

Fedora 2.0 Announcement

Cornell University and the University of Virginia announced the release of Fedora 2.0 "A Powerful Open-Source Solution for Digital Repositories" on February 24, 2005:

"...This release represents a significant increase in features and functionality over previous releases. New features include the ability to represent and query relationships among digital objects, a simple XML encoding for Fedora digital objects, enhanced ingest and export interfaces for interoperability with other repository systems, enhanced administrative features, and improved documentation. More than ever, Fedora is capable of serving as the foundation for many types of information management applications, including institutional repositories, digital libraries, records management systems, archives, and educational software."

"As with prior versions of the software, all Fedora functionality is exposed through web service interfaces. At the core of this functionality is the Fedora object model that enables the aggregation of multiple content items into digital objects. This allows objects to have several accessible "representations." For example, a digital object can represent an electronic document in multiple formats, a digital image with its descriptive metadata, or a complex science publication containing text, data, and video. Services can be associated with digital objects, allowing dynamically-produced views, or "virtual representations" of the objects. Historical views of digital objects are preserved through a powerful content versioning system."

"The new Fedora 2.0 introduces the 'Resource Index' which is a module that allows a Fedora repository to be viewed as a graph of inter-related objects. Using the Resource Description Framework (RDF), relationships among objects can be declared, and queries against these relationships are supported by an RDF-based triple store. Fedora 2.0 also introduces 'Fedora Object XML' (FOXML) which is a simple XML format for encoding Fedora digital objects. To support multiple XML standards, Fedora's ingest/export interface has been enhanced, permitting digital objects to be encoded in different formats. Currently, there is support for METS and FOXML. In future releases other XML formats will be supported, including MPEG21-DIDL. Other new features include a mass-update utility for modifying objects, a new administrative reporting interface, improved documentation, and tutorials."

"The Fedora open-source software is jointly developed by Cornell University and the University of Virginia with generous funding from the Andrew W. Mellon Foundation. Fedora 2.0 marks the final milestone in Phase I, a three year project to develop the core Fedora Repository system. Now underway, Fedora Phase II is a three year development project that will focus on advanced features including workflow, digital preservation, policy enforcement, information networks, and federated repositories."

Use of Fedora

Institutions reporting use of Fedora:

  • American Geophysical Union
  • Amrita Vishwa Vidyapeetham
  • Cornell University: Cornell Information Technologies
  • University of Delaware
  • Entidad Pública Empresarial Red.es
  • Hamilton College
  • Indiana University: Digital Library Program
  • Library of Congress: I Hear America Singing Site
  • Monash University
  • National Library of Australia
  • National Library of Wales
  • New York University: The Humanities Computing Group
  • Northwestern University: A Library/Academic Computing Team
  • Octagon Data Systems
  • Rutgers University: Library
  • Swinburne University of Technology
  • Tibetan Buddhist Resource Center
  • Tufts University: The Digital Collections and Archives Department
  • University of New South Wales
  • University of Virginia: Digital Library
  • WebOPAC Application Pvt. Ltd.
  • Yale University: Electronic Records Archive
  • VTLS, Inc. [deployment page]

Document URI: http://xml.coverpages.org/ni2005-03-18-a.html
Robin Cover, Editor: robin@oasis-open.org