The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Created: November 02, 2007.

SNIA Demonstrates Extensible Access Method (XAM) Interoperability.

The Storage Networking Industry Association (SNIA) recently announced successful interoperability demonstrations of the Extensible Access Method (XAM) specification at the Storage Networking World Solutions Center. Four distinct information management applications based on the XAM specification are provided by EMC, HP, Sun Microsystems, and Vignette. The demonstration illustrates XAM's ability to protect end user information from technology lock-in by decoupling storage systems from data applications.

The three-part Extensible Access Method (XAM) specification addresses the problem of preserving and managing reference information, also called fixed content. Fixed content, as distinct from transactional content, "consists of data such as digital images, e-mail messages, presentations, video content, medical images and check images that don't change over time. Unlike transaction-based data, whose usefulness is short, fixed content data must be kept for long periods of time, often to comply with retention periods and provisions that government regulations such as the Sarbanes-Oxley Act of 2002 have specified" [Network World]. According to several estimates, most born-digital data is now fixed content, which is rapidly gaining prominence over transactional data.

The XAM (Extensible Access Method) Interface specification "defines a standard access method (API) between Consumers (application and management software) and Providers (storage systems) to manage fixed content reference information storage services. XAM includes metadata definitions to accompany data to achieve application interoperability, storage transparency, and automation for ILM-based practices, long term records retention, and information security. XAM will be expanded over time to include other data types as well as support additional implementations based on the XAM API to XAM conformant storage systems."

"The SNIA XAM standard access method is designed to benefit storage vendors, software developers and the end user community. It provides: (1) Interoperability: applications can work with any XAM conformant storage system; information can be migrated and shared; (2) Compliance: integrated record retention and disposition metadata; (3) ILM Practices: a framework for classification, policy, and implementation; (4) Migration: the ability to automate migration process to maintain long-term readability; (5) Discovery: application-independent structured discovery avoids application obsolescence."

Two parts of the XAM specification are available for public review, both at Version 0.64 as of October 2007. The XAM Architecture Specification Working Draft, produced by members of the SNIA Fixed Content Aware Storage Technical Working Group (FCAS TWG), defines the architecture of the XAM API and Vendor Interface Module (VIM) API. Among the core features:

  • Reference information is associated with a globally unique name. By binding reference information to a unique name, an application can efficiently manage the reference information without concern for the data's actual location.
  • Metadata is raised to the same level of importance as the reference information itself. By bundling together data and metadata (contextual data about the information being stored), applications can more easily manage and share reference information, which facilitates ILM.
  • Storage systems are accessed via a standard, pluggable architecture. The XAM architecture is a software framework that allows XAM-enabled applications to interface with XAM-compliant vendor devices in order to store and retrieve reference information in a vendor-independent and location-independent manner.
  • A standard XAM storage provider interface is supported. XAM Storage System vendors can plug their systems into the XAM API by creating a provider for the Vendor Interface Module API (VIM API). XAM also provides a standardized set of management disciplines and semantics for fixed content, such as retention, expiration, etc.

The XAM C API Specification Working Draft forms part of the XAM Software Development Kit (SDK). It is a complete reference document for C application development using the XAM API. It is intended for experienced programmers, for those developing applications that interface with storage systems that support the XAM API, and for those developing components of the XAM Library itself. The XAM SDK is a software implementation of the XAM API, including a library, a reference implementation and tools to speed the implementation and adoption of the XAM API. A XAM Java API Specification is planned as well.

Both specifications were produced by members of SNIA's Fixed Content Aware Storage Technical Working Group (FCAS TWG). The SNIA Fixed Content Aware Storage TWG was chartered to serve as a center of technical activities related to new application-level interfaces for storage of unchanging data (fixed content) and associated metadata based on a variety of naming schemas including Content Addressed Storage (CAS) and global content-independent identifiers.

XAM has three primary objects: the XSet, the XSystem, and the XAM Library. An XSet is the unit of data that an application can commit to persistent storage within XAM. An XSet is the addressable unit of storage in the XAM architecture from the application's perspective. For an application to store data in an XSystem, the application must create an empty XSet, populate the XSet fields with its data, and then commit the XSet to persistent storage. If the commit is successful, the application is given a name for the XSet, called a XUID. The application can use the XUID to access the data it stored, exchange the XUID with another application so that it can retrieve the XSet, use it to create application-specific relationships between XUIDs, or use it for other purposes. An 'XSystem Resource Identifier' is used to specify a target XSystem that can include optional parameters that are useful when connecting to an actual XAM Storage System. The syntax is similar to an Internationalized Resource Identifier (IRI), with field definitions specific to XAM.

An XSystem is a logical container of one or more XSets. An XSystem may provide additional capabilities for data storage management, which may ultimately influence XSet data access and data management. The XAM Library enables an application to discover and communicate with multiple XAM Storage Systems. It allows applications to create and manage XAM Sessions, to connect to and manipulate XSystems. Besides these capabilities, the XAM Library also presents a number of fields (properties or XStreams), which are pertinent to the XAM Library, and describe its capabilities, configuration, and other characteristics.
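
The create, populate, commit, and retrieve cycle described above can be illustrated with a toy in-memory model. This is only a sketch: the class and method names (XSet, XSystem, create_xset, commit, open_xset), the field name "com.example.patient", and the use of a random UUID as a stand-in for a XUID are all assumptions made for illustration and are not taken from the XAM C API.

```python
import uuid


class XSet:
    """Hypothetical stand-in for an XSet: a bag of named fields."""

    def __init__(self):
        self.fields = {}
        self.xuid = None  # assigned only after a successful commit

    def set_field(self, name, value):
        self.fields[name] = value


class XSystem:
    """Hypothetical in-memory XSystem: a logical container of XSets."""

    def __init__(self):
        self._store = {}

    def create_xset(self):
        # Step 1: the application starts from an empty XSet.
        return XSet()

    def commit(self, xset):
        # Step 3: persist the XSet and hand back its name.
        # A random UUID is a placeholder here; the real XUID format
        # is defined by the specification, not by this sketch.
        xuid = uuid.uuid4().hex
        xset.xuid = xuid
        self._store[xuid] = dict(xset.fields)
        return xuid

    def open_xset(self, xuid):
        # Any application holding the XUID can retrieve the XSet.
        restored = XSet()
        restored.fields = dict(self._store[xuid])
        restored.xuid = xuid
        return restored


xsystem = XSystem()
xset = xsystem.create_xset()
xset.set_field("com.example.patient", "12345")  # Step 2: populate fields
xuid = xsystem.commit(xset)
same = xsystem.open_xset(xuid)
```

The point of the sketch is that the application only ever handles the XUID; where and how the XSystem keeps the data stays hidden behind the commit and open calls.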

XAM defines an XSet Canonical Format in order to support primary requirements for interoperability and performance. Interoperability is required so that XSets can be moved between different XSystems without a loss of data; good performance is required to enable XSystems to efficiently export or import large numbers of XSets in a reasonable time. The canonical XSet format is packaged in two main parts: an XML document describing the properties and streams of the XSet, or XSet manifest, and the binary representation of the streams. Since properties are compact, they are fully defined in the XML document. Since XStreams can be rather large, only the attributes of the stream shall be included in the XML document; the actual contents shall be outside the XML document in a separate part of the package. The format of the package adheres to the XML-binary Optimized Packaging (XOP) format.

SNIA intends to advance the XAM Specification to become a SNIA Architecture Standard, then an ANSI standard, and then an ISO standard. SNIA also intends to make the XAM SDK available beyond its own membership, to the public at large.

Bibliographic Information

  • XAM Architecture Specification. XAM Arch-S Working Draft. Version 0.64. August 6, 2007. 156 pages. Produced by members of the SNIA Fixed Content Aware Storage Technical Working Group (FCAS TWG). Copyright © 2007 Storage Networking Industry Association. Appendix B describes the XSD (XML Schema Definition) for the XML manifest used by the XSet canonical format when importing and exporting an XSet. NS URI: http://www.snia.org/2006/xam/export.

    Status: Publication of this Working Draft for review and comment has been approved by the FCAS Technical Working Group. This draft represents a best effort attempt by the FCAS Technical Working Group to reach preliminary consensus, and it may be updated, replaced, or made obsolete at any time. This document should not be used as reference material or cited as other than a 'work in progress.' Suggestion for revision should be directed to fcastwg@snia.org.

    Purpose and Audience: This document is intended to be used by two broad audiences. The first audience is individuals and companies that are application programmers that wish to use the XAM Application Programmers Interface (API) to create, access, manage, and query reference content through standardized methods that are collectively referred to as XAM (Extensible Access Method). The second audience is individuals and companies that implement reference content stores that wish to provide access to their stores through the XAM standardized methods.

  • XAM C API Specification. XAM C API-S Working Draft. Version 0.64. July 26, 2007. 202 pages. Produced by members of the SNIA Fixed Content Aware Storage Technical Working Group (FCAS TWG). Copyright © 2007 Storage Networking Industry Association.

    Status: Publication of this Working Draft for review and comment has been approved by the FCAS Technical Working Group. This draft represents a best effort attempt by the FCAS Technical Working Group to reach preliminary consensus, and it may be updated, replaced, or made obsolete at any time. This document should not be used as reference material or cited as other than a 'work in progress.' Suggestion for revision should be directed to fcastwg@snia.org.

    Purpose and Audience: This document forms part of the XAM Software Development Kit (SDK). It is a complete reference document for C application development using the XAM API. It is intended for experienced programmers, for those developing applications that interface with storage systems that support the XAM API, and for those developing components of the XAM Library itself. For an overview of the SNIA XAM, refer to the Business Overview chapter in the XAM Architecture Specification.

    This document does not normatively specify the semantics of the interfaces: the specification of the semantics in the XAM standard is contained in the XAM Architecture Specification. Any semantics described in this document are intended to be informative and to simplify the understanding of the interfaces described herein.

From the Announcement

Excerpted from the announcement: "SNIA Demonstrates First Multi-Vendor Fixed-Content Solution Based on XAM. SNW Fall Demonstration by EMC, HP, Sun, and Vignette Highlights Strength of XAM Specification and Value of Standardized Access Method for Fixed Content."

The Storage Networking Industry Association (SNIA) today announced that it is utilizing the Extensible Access Method (XAM) specification for the first time in a demonstration in the Storage Networking World Solutions Center. Four distinct information management applications based on the XAM specification are provided by EMC, HP, Sun Microsystems, and Vignette. The demonstration illustrates XAM's ability to protect end user information from technology lock-in by decoupling storage systems from data applications. XAM delivers unmatched interoperability and storage transparency to end users, along with the ability to assure information immutability, to meet long-term digital information retention requirements. The demonstrated storage platforms are provided by EMC, HP, and Sun, highlighting the interoperability and ease of deployment of XAM.

"This demonstration is not possible without the standard interface to fixed content that XAM provides. As increases occur in the number of solutions, applications, and end users focused on managing and retaining the ever-expanding volume of digital information for the long term, the need for a standard to help deliver information based upon Information Lifecycle Management best practices continues to increase," said Christina Casten, Co-Chair of the SNIA XAM Initiative.

The XAM Initiative and associated specification development efforts continue to progress and build momentum. More than forty-five (45) companies, which include software providers, storage vendors, and application developers, are participating in the initiative and its two associated Technical Work Groups. XAM development efforts are on-track as the Association plans to deliver the specification in 2008 followed by its submission for ANSI and ISO accreditation. In parallel with the specification work is the development of the XAM Software Development Kit, which will be licensed to the industry.

"We are very excited to be able to demonstrate the power of XAM and show that all of our collaborative hard work is delivering a specification that provides tangible value to end users, vendors, and application developers," said Vincent Franceschini, SNIA's 2007 Chairman. "In our first public demonstration at Storage Networking World this week, we are able to show a records and management application, a file archiving system, a database archiving solution, and a fourth custom imaging application, all utilizing the current version 0.6 XAM Application Programming Interface to store, retrieve, query, and manage data across disparate fixed content storage systems."

The XAM specification is aimed at delivering value to three audiences: Independent Software Vendors (ISVs), storage end users, and storage vendors. XAM as a standard interface provides a framework for the coordination of information and content metadata among applications and storage systems. The specification will help simplify the porting and testing procedures for ISVs and broaden the market for storage vendors. For storage users looking to automate their storage migration and align their efforts with compliance requirements, the XAM specification will help reduce vendor lock-in to long-term digital information retention solutions by decoupling the back-end storage from the application. The specification is also aimed at alleviating some of the challenges associated with migrating data from different storage devices within an electronic archive, which typically occurs every three to five years.

For more information on the XAM Initiative and specification, please visit:

     www.snia.org/xam

About the SNIA: The Storage Networking Industry Association (SNIA) is a not-for-profit global organization, made up of some 400 member companies and 7,000 individuals spanning virtually the entire storage industry. SNIA's mission is to lead the storage industry worldwide in developing and promoting standards, technologies, and educational services to empower organizations in the management of information. To this end, the SNIA is uniquely committed to delivering standards, education, and services that will propel open storage networking solutions into the broader market. For additional information, visit the SNIA web site at www.snia.org.

About the XAM Initiative: The SNIA XAM Initiative's mission is to drive adoption of its forthcoming Extensible Access Method (XAM) specification. This Initiative will build and serve a XAM community that includes storage vendors, independent software vendors, and end users to ensure that the specification fulfills market needs for a fixed content data management interface standard. These needs include interoperability, information assurance (security), storage transparency, long-term records retention and automation for Information Lifecycle Management (ILM)-based practices. To learn more, visit www.snia.org/xam.

XAM: Business and Technical Overview

Extracts from the XAM Architecture Specification:

SNIA XAM Background: The amount of reference Information (also known as fixed content) has been growing rapidly each year. At the same time, business demand for timely access to that data, in both the private and public sectors, has been growing. Beyond timely access to this data, businesses need a way to relocate data across diverse hardware platforms, without compromising data integrity.

Current products for storing and managing reference information have significant limits when integrating with other storage products and applications. Vendor-specific data access and data management methods (e.g., for naming, retention, and deletion) are common, requiring application software modifications, sometimes extensively, to integrate with each storage product. These integration obstacles also limit the ability to share reference information among applications, and no standards exist for moving reference information across different storage products.

To meet these challenges, the storage industry requires a set of standard interfaces to enable more functional and sophisticated products. These interfaces need to allow multiple vendors to provide different classes of hardware and software products that store, retrieve, and manage reference information reliably and seamlessly.

The XAM Approach: XAM provides an application programming interface (XAM-API) that allows XAM applications to store data in a fashion that does not depend on the specific storage system. XAM provides the following important functionality to applications and storage systems:

  • Reference information is associated with a globally unique name. By binding reference information to a unique name, an application can efficiently manage the reference information without concern for the data's actual location. Location independence provides a mechanism for implementing Information Lifecycle Management (ILM) practices within a XAM-based storage system itself.

  • Metadata is raised to the same level of importance as the reference information itself. By bundling together data and metadata (contextual data about the information being stored), applications can more easily manage and share reference information, which facilitates ILM.

  • Storage systems are accessed via a standard, pluggable architecture. By standardizing the architecture, customers can add and remove storage products without impacting applications. The XAM architecture is a software framework that allows XAM-enabled applications to interface with XAM-compliant vendor devices in order to store and retrieve reference information in a vendor-independent and location-independent manner.

  • A standard XAM storage provider interface. XAM Storage System vendors can plug their systems into the XAM API by creating a provider for the Vendor Interface Module API (VIM API). XAM also provides a standardized set of management disciplines and semantics for fixed content, such as retention, expiration, etc.

The XAM Architecture: The XAM architecture is a software framework that allows XAM-enabled applications to interface with XAM-compliant vendor devices. The goal of this architecture is to allow applications to take advantage of the XAM API (Application Programming Interface) to store and retrieve reference information in a vendor-independent and location-independent manner.

A primary requirement of the architecture is the ability to support access to multiple vendors' XAM Storage Systems and multiple versions of the same vendor's XAM Storage System. That is, different versions of the XAM specification must be able to access the same XAM Storage System, or, the same version of the XAM specification must be able to access different versions of a XAM Storage System. The architecture also allows multiple applications to access the same XAM Storage System.

The XAM architecture provides a mechanism for XAM Storage System vendors to create Vendor Interface Modules (VIMs) that act as bridges between the XAM standard APIs and the vendor's XAM Storage Systems. How the VIMs connect to their respective devices (for example, TCP/IP, SCSI, or a file system) is transparent to the XAM standard API and the application. The connection is completely encapsulated by the VIM; the applications should be unaware of the VIM's existence and functionality.
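
The bridging role of a VIM can be sketched as a plug-in interface. The example below is a hypothetical illustration only: VendorInterfaceModule, InMemoryVim, DirectoryVim, and Library are invented names, and the two-method store/retrieve contract is far smaller than anything the VIM API would actually define. It shows the same application loop running unchanged against two back ends that reach their storage in different ways.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class VendorInterfaceModule(ABC):
    """Hypothetical VIM contract: what each vendor bridge must provide."""

    @abstractmethod
    def store(self, name, data):
        ...

    @abstractmethod
    def retrieve(self, name):
        ...


class InMemoryVim(VendorInterfaceModule):
    """Pretend vendor A: keeps content in a dictionary."""

    def __init__(self):
        self._blobs = {}

    def store(self, name, data):
        self._blobs[name] = bytes(data)

    def retrieve(self, name):
        return self._blobs[name]


class DirectoryVim(VendorInterfaceModule):
    """Pretend vendor B: keeps content as files under a directory."""

    def __init__(self, root):
        self._root = root

    def store(self, name, data):
        with open(os.path.join(self._root, name), "wb") as handle:
            handle.write(data)

    def retrieve(self, name):
        with open(os.path.join(self._root, name), "rb") as handle:
            return handle.read()


class Library:
    """Hypothetical library layer: routes calls to the registered VIM."""

    def __init__(self):
        self._vims = {}

    def register(self, system_name, vim):
        self._vims[system_name] = vim

    def store(self, system_name, name, data):
        self._vims[system_name].store(name, data)

    def retrieve(self, system_name, name):
        return self._vims[system_name].retrieve(name)


library = Library()
library.register("memory", InMemoryVim())
library.register("disk", DirectoryVim(tempfile.mkdtemp()))

# The application code below never touches a VIM directly.
results = {}
for system_name in ("memory", "disk"):
    library.store(system_name, "record-1", b"fixed content")
    results[system_name] = library.retrieve(system_name, "record-1")
```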

The XAM Approach, Object Model: Three software modules (Toolkit, XAM, and VIM APIs) are defined within the XAM architecture. The XAM architecture uses these software modules to create a logical view of the XAM Storage System. This logical view defines a set of objects that are arranged hierarchically, providing a consistent abstraction that is independent of a variety of implementation approaches.

XAM has three primary objects: the XSet, the XSystem, and the XAM Library:

  • An XSet is the unit of data that an application can commit to persistent storage within XAM... An XSet is the addressable unit of storage in the XAM architecture from the application's perspective. For an application to store data in an XSystem, the application must create an empty XSet, populate the XSet fields with its data, and then commit the XSet to persistent storage. If the commit is successful, the application is given a name for the XSet, called a XUID. The application can use the XUID to access the data it stored, exchange the XUID with another application so that it can retrieve the XSet, use it to create application-specific relationships between XUIDs, or use it for other purposes.

  • An XSystem is a logical container of one or more XSets. An XSystem may provide additional capabilities for data storage management, which may ultimately influence XSet data access and data management. These capabilities, such as resource, security, migration, virtualization, resiliency, and performance, are outside the scope of XAM. XAM accommodates these XSystem capabilities by providing an XSet abstraction that obligates the XSystem to the mutually agreed-to data storage management behavior and rules.

  • The XAM Library enables an application to discover and communicate with multiple XAM Storage Systems. The XAM Library allows applications to create and manage XAM Sessions, to connect to and manipulate XSystems. Besides these capabilities, the XAM Library also presents a number of fields (properties or XStreams), which are pertinent to the XAM Library, and describe its capabilities, configuration, and other characteristics... An XSystem may map to a single storage array supplied by a storage vendor, or maybe to a physical or logical partition of this array. It may also map to an aggregation of several arrays, or several partitions residing on the same or different arrays, supplied by the same or different vendors. The implementations of these arrays may include different types of storage hardware and media, e.g. Fibre Channel or SATA disk drives, or optical disks or tape drives...

Data can be attached to any of the primary objects. XAM defines the unit of data that can be attached to a primary object as a field, of which there are two kinds: properties and XStreams. Properties are used to contain simple kinds of data (strings, integers, etc.), and have a simple set/get style API. XStreams are used to contain larger and potentially more complex data (JPEGs, XML files, or binary data) and are accessed as a stream of data through a read/write style API. Regardless of the object to which the field is attached, the same XAM field-manipulation APIs are used; they are scoped to the appropriate object on which they operate (XAM Library, XSystem, or XSet).
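
The distinction between the two kinds of field can be sketched as follows. This is a hypothetical model, not the XAM API: FieldContainer and its method names are invented for illustration, and io.BytesIO stands in for whatever stream handle a real implementation would return. The sketch shows properties handled through set/get calls, XStreams handled through read/write calls, and the same field-manipulation methods shared by all three primary objects.

```python
import io


class FieldContainer:
    """Hypothetical base for any primary object that can carry fields."""

    def __init__(self):
        self._properties = {}
        self._xstreams = {}

    # Properties: simple values with a set/get style interface.
    def set_property(self, name, value):
        self._properties[name] = value

    def get_property(self, name):
        return self._properties[name]

    # XStreams: larger payloads accessed as a stream of bytes.
    def create_xstream(self, name):
        stream = io.BytesIO()
        self._xstreams[name] = stream
        return stream

    def open_xstream(self, name):
        stream = self._xstreams[name]
        stream.seek(0)  # rewind so the caller reads from the start
        return stream


# The same field methods apply whichever object they are scoped to.
class XAMLibrary(FieldContainer):
    pass


class XSystem(FieldContainer):
    pass


class XSet(FieldContainer):
    pass


xset = XSet()
xset.set_property("com.example.modality", "x-ray")

writer = xset.create_xstream("com.example.image")
writer.write(b"first chunk, ")
writer.write(b"second chunk")

modality = xset.get_property("com.example.modality")
image_bytes = xset.open_xstream("com.example.image").read()
```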

XAM Query Language Grammar: The XAM Query Language (XAM QL, or XAMQuery) is modelled on the SQL select statement. Two parts to the statement allow the application writer to control the contents of the query. The first part (the select clause) specifies that the application is requesting a list of XUID values. Unlike SQL, the return value ".xset.xuid" is required, and shall be the only allowable value. The second part (the where clause) allows specification of a subset of XSets to be returned in the results. For XAM 1.0, the select clause shall be present and contain only the keyword "select .xset.xuid". The second part of the query, the where clause, is optional and provides the greatest amount of application control...

XAM query should not be confused with the more general-purpose SQL relational databases. XAM query is not intended to provide the same performance guarantees seen in a mature relational database management system. XAM Storage Systems are generally designed to be archives of data, rather than relational databases. Example uses of query include locating the following types of records: (1) Archived medical data records for a patient; (2) A collection of telephone data records referencing some phone number; (3) A computer backup data set containing a named file. Refinements of these basic searches can be extended using the XAM query relational operators to narrow the search...
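
The two-part shape of a query (a fixed select clause followed by an optional where clause) can be checked with a few lines of code. The sketch below is an assumption-laden illustration: split_query is an invented helper, matching is case-sensitive and expects exactly one space inside the select clause (the grammar may be more permissive), and the where clause text in the usage line is a made-up example rather than syntax taken from the specification. The helper does not interpret the where clause at all; it only separates it out.

```python
SELECT_CLAUSE = "select .xset.xuid"


def split_query(query):
    """Validate the select clause and return the where-clause text, or None.

    Hypothetical helper: it checks only the overall two-part shape
    described in the text and leaves the where clause uninterpreted.
    """
    text = query.strip()
    if not text.startswith(SELECT_CLAUSE):
        raise ValueError("query must begin with 'select .xset.xuid'")

    remainder = text[len(SELECT_CLAUSE):]
    if not remainder:
        return None  # the where clause is optional
    if not remainder[0].isspace():
        raise ValueError("unexpected text after the select clause")

    parts = remainder.split(None, 1)
    keyword = parts[0]
    condition = parts[1].strip() if len(parts) > 1 else ""
    if keyword != "where" or not condition:
        raise ValueError("expected 'where' followed by a condition")
    return condition


everything = split_query("select .xset.xuid")
# The condition below is illustrative only, not real XAM QL syntax.
filtered = split_query("select .xset.xuid where com.example.patient = '12345'")
```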

XSet Canonical Format

Section 8.8.4 of the XAM Architecture Specification defines the XSet Canonical Format:

The main goals for the canonical format are interoperability and performance. Interoperability is required so that XSets can be moved between different XSystems without a loss of data. Good performance is required to enable XSystems to efficiently export or import large numbers of XSets in a reasonable time. Secondary goals are the use of existing standards and parsers, where possible, and the ability to do offline inspection of exported XSets. The format should also support the ability to describe multiple XSets within the same data.

With these design goals in mind, the canonical XSet format shall be packaged in two main parts: an XML document describing the properties and streams of the XSet, or XSet manifest, and the binary representation of the streams. Since properties are compact, they shall be fully defined in the XML document. Since XStreams can be rather large, only the attributes of the stream shall be included in the XML document; the actual contents shall be outside the XML document in a separate part of the package. The format of the package shall adhere to the XML-binary Optimized Packaging (XOP) format. With this format, the binary data is attached as MIME attachments following the XML document [application/xop+xml media type]. The MIME attachments shall conform to the MIME Multipart/Related Content-type specification as defined in RFC 2387.

A table of contents attachment shall be added as the first attachment to list the offsets of each of the XStream's binary data. This attachment allows for parallel access to the XStreams if desired. The format of the XML document shall allow multiple XSets to be included in one canonical XSet package, although the current version of the specification only imports or exports a single XSet at a time. If the XSet does not contain any XStreams, then no MIME attachments shall follow the XML document. As per the XOP format, the ordering of the MIME parts shall not be considered significant, except for the purpose of determining the root MIME part.
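
The packaging described above can be sketched by assembling a MIME Multipart/Related body by hand. This is a simplified illustration under several assumptions: build_xop_package, the boundary string, the Content-ID values, and the stub manifest are all invented for the example; the table of contents attachment is omitted because its format is not described in this article; and the caller is assumed to pick a boundary that does not occur in any payload. The header parameters follow the general pattern used for XOP packages (a root part of type application/xop+xml named by the start parameter).

```python
def build_xop_package(manifest_xml, streams, boundary="xset-boundary"):
    """Assemble a multipart/related body: XML root part, then binary parts.

    manifest_xml: bytes of the XML manifest (root part).
    streams: dict mapping a Content-ID string to raw payload bytes.
    """
    crlf = b"\r\n"
    marker = boundary.encode("ascii")
    delimiter = b"--" + marker
    root_cid = b"<manifest@example.invalid>"  # hypothetical Content-ID

    lines = [
        b"MIME-Version: 1.0",
        b'Content-Type: multipart/related; boundary="' + marker
        + b'"; type="application/xop+xml"; start="' + root_cid + b'"',
        b"",
        delimiter,
        b'Content-Type: application/xop+xml; charset=UTF-8; type="text/xml"',
        b"Content-Transfer-Encoding: 8bit",
        b"Content-ID: " + root_cid,
        b"",
        manifest_xml,
    ]
    for content_id, payload in streams.items():
        lines += [
            delimiter,
            b"Content-Type: application/octet-stream",
            b"Content-Transfer-Encoding: binary",  # raw bytes, no Base64
            b"Content-ID: <" + content_id.encode("ascii") + b">",
            b"",
            payload,
        ]
    lines.append(delimiter + b"--")  # closing delimiter
    return crlf.join(lines) + crlf


# Stub manifest: a real one would describe properties and XStreams.
stub_manifest = b'<xsets xmlns="http://www.snia.org/2006/xam/export"/>'
payload = b"\x00\x01\x02 raw stream bytes"
package = build_xop_package(stub_manifest, {"image-1@example.invalid": payload})
```

Because the stream travels as a binary MIME part, the payload bytes appear in the package unchanged, which is the property the canonical format relies on for performance.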

The use of XML enables the use of standard XML parsers. However, XML is not designed to handle binary data efficiently; it requires Base64 encoding, thus expanding the size of the data by 33%. By using XOP, the applications can use existing XML tools to parse and understand the XSet structure, while not requiring them (and the XSystem) to Base64 encode/decode the XStreams.
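
The 33% figure follows from Base64 mapping every 3 input bytes to 4 output characters. A short check using Python's standard base64 module illustrates the arithmetic; the 3072-byte sample is an arbitrary choice, picked as a multiple of 3 so that no padding is involved.

```python
import base64

# 3072 bytes of sample data (a multiple of 3, so no '=' padding is added).
raw = bytes(range(256)) * 12
encoded = base64.b64encode(raw)

# Every 3 raw bytes become 4 encoded characters: 3072 * 4 / 3 = 4096.
overhead = len(encoded) / len(raw) - 1
print(len(raw), len(encoded), round(overhead * 100, 1))
```

Line-wrapped Base64, as commonly used in MIME bodies, adds line-break characters on top of this, so the real overhead of an inline encoding would be slightly higher.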

Annex B of the XAM Arch-S specification supplies the W3C XSD (XML Schema Definition) for the XML manifest part of the XSet canonical format. This XSD defines the XSet XML manifest that contains the complete definitions and content of properties and definitions of XStreams for the XSet(s) being imported or exported. The contents of the XStreams are found in the MIME attachments that are contained in the same XOP package as the XML manifest. Declared namespace: http://www.snia.org/2006/xam/export.

The root element for the XSet XML format is "xsets". The format of the XML allows for multiple XSets to be included in a single package, although this version of the specification only specifies the import/export of a single XSet in a package. This element is followed by the element "xset", which contains the manifest for a single XSet. The element contains two sub-elements, "properties" and "xstreams". The "properties" element contains the entries for all the properties exported for this XSet. Each property is described by a "property" element containing the name, stype, read-only, binding, and length attributes of the property. A "value" sub-element contains the actual value of the property.

The next element is the "xstreams" element, which defines the XStream information. Each XStream is represented by an "xstream" element, which contains, as XML attributes, the attributes of the field, like the "property" element. One child element, "xop:include" is mandated by XOP. It contains the content-id (in an "href" attribute), prefaced by cid:, to the MIME part containing the XStream contents...
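
The manifest layout described above can be sketched with Python's standard xml.etree.ElementTree module. The element names, the attribute names, and the export namespace come from the description in this section; everything else is an assumption made for illustration, including the build_manifest helper, the "xam" prefix, the attribute values (such as stype="string"), and the field names. The W3C XOP recommendation spells its element xop:Include, with href as an attribute, in the namespace http://www.w3.org/2004/08/xop/include, so the sketch uses that form. It has not been validated against the XSD in the specification.

```python
import xml.etree.ElementTree as ET

XAM_NS = "http://www.snia.org/2006/xam/export"
XOP_NS = "http://www.w3.org/2004/08/xop/include"

# Prefixes are a serialization detail; "xam" is an arbitrary choice.
ET.register_namespace("xam", XAM_NS)
ET.register_namespace("xop", XOP_NS)


def xam(tag):
    """Qualify a tag name with the XAM export namespace."""
    return "{%s}%s" % (XAM_NS, tag)


def build_manifest(properties, xstreams):
    """Build a manifest for one XSet.

    properties: dict of property name -> string value.
    xstreams: dict of XStream name -> (content_id, length_in_bytes).
    Attribute values below are illustrative placeholders.
    """
    root = ET.Element(xam("xsets"))
    xset = ET.SubElement(root, xam("xset"))

    props_el = ET.SubElement(xset, xam("properties"))
    for name, value in properties.items():
        prop = ET.SubElement(props_el, xam("property"), {
            "name": name,
            "stype": "string",
            "read-only": "false",
            "binding": "false",
            "length": str(len(value)),
        })
        # Properties are compact, so the value lives in the XML itself.
        ET.SubElement(prop, xam("value")).text = value

    streams_el = ET.SubElement(xset, xam("xstreams"))
    for name, (content_id, length) in xstreams.items():
        stream = ET.SubElement(streams_el, xam("xstream"), {
            "name": name,
            "stype": "application/octet-stream",
            "read-only": "false",
            "binding": "false",
            "length": str(length),
        })
        # Only a reference goes in the XML; the bytes are in a MIME part.
        ET.SubElement(stream, "{%s}Include" % XOP_NS,
                      {"href": "cid:" + content_id})

    return ET.tostring(root, encoding="utf-8")


manifest = build_manifest(
    {"com.example.patient": "12345"},
    {"com.example.image": ("image-1@example.invalid", 4)},
)
```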

Commentary on XAM

  • "SNIA Reduces API Barriers." By Paul Weinberg. From eChannel Line (October 29, 2007). "EMC is the major beneficiary in the Storage Networking Industry Association's eXtensible Access Method (XAM) specification for the application program interface (API) that some of the member storage vendors (EMC, HP, Sun Microsystems and Vignette) have negotiated through the industry association. That is the observation of Greg Schulz, founder and principal analyst at Storage I/O... Vincent Franceschini, SNIA's 2007 chairman, estimated that more than 45 companies, including software providers, storage vendors and application developers are participating in the XAM initiative, which is slated to provide the specification next year. He also revealed that a XAM software development kit is being made available..."

  • "Universal Data Retention Specification Demonstrated. XAM Specification Divorces Data from the Applications that Created It." By Lucas Mearian. From Computerworld (October 15, 2007). "Vendors such as EMC Corp., Hewlett-Packard Co., Sun Microsystems Inc. and Vignette Corp. demonstrated their software interfaces for a new specification that offers a universal way for users to store and access unchanging or fixed data regardless of the application that created it. The specification, Extensible Access Method (XAM), was demonstrated at Storage Networking World here for the first time. The specification was announced last spring and is expected to be presented to the American National Standards Institute for review as a standard early next year. XAM essentially acts as a layer of abstraction between operating systems, various fixed content applications — such as e-mail, file or database archiving products — and the management software that accesses the fixed-content data so that users can retrieve that data no matter what application created it..."

  • "Entering the Digital Dark Ages?" By Jason Snyder. From InfoWorld (August 06, 2007). "Welcome to the Digital Dark Ages — an era of unprecedented information gathering likely to leave no lasting impression on the future, thanks in large part to a cross-departmental lack of understanding of the business requirements for data archiving. Or at least that's the tenor of a recent study conducted by SNIA's 100 Year Archive Task Force, undertaken to shed light on the long-term fate of digital information as dictated by today's datacenter migration and archiving policies. Chief among the SNIA task force's post-survey directives is to get application providers and storage system vendors on the same page when it comes to providing organizations with a means for reproducing original content unaltered over time. To achieve this, SNIA is betting on XAM (eXtensible Access Method), which it believes will provide much-needed metadata communication between applications and storage systems, thereby easing the ability to move data around in heterogeneous storage environments without application awareness..."

  • "What is Fixed-Content Storage?" By Mark A. Carlson. Blog (June 20, 2007). "[Fixed content is] data that doesn't need to change once it is created. Likely that data will need to be retained for some period of time, and will also need to be deleted at some point. This data could be medical records, media such as images, video or audio, engineering documents, and so on. An important concept associated with this data is the metadata (or data about this data) that also needs to be retained. For example, the metadata for medical records could include patient info, type (x-ray, CAT scan, etc.), doctor, radiologist, disease diagnosis, and so forth. The traditional way to store the data and metadata might be to put the metadata in a database and the actual data in a file, but then how do you ensure that the same retention policy is applied to the data stored in these two disparate storage mechanisms? Note that retention policies can be influenced by regulation (Sarbanes-Oxley), corporate policy, and even legal action. So this is where Fixed-Content Aware Storage comes in. The idea is that the storage device itself implements the retention policies for both the stored data and the stored metadata..."

  • "XAM Working Group to Standardize Metadata." By Lucas Mearian. From Computerworld (April 17, 2007). "A new working group announced at Storage Networking World yesterday will help vendors and users write standardized interfaces between applications that generate fixed content and the storage platforms that store it. The working group will focus its efforts around the Extensible Access Method (XAM) standards proposal from the Storage Networking Industry Association (SNIA), which is leading the effort to create a protocol for the way organizations store metadata that describes fixed content, such as e-mails, medical records and financial data. The XAM working group will create a software development kit to help vendors and corporations write their applications to the proposed XAM specification. The XAM standards proposal, which is expected to be completed by early 2008 for review by the American National Standards Institute, will be aimed at e-mail archival first, according to Wayne Adams, SNIA's chairman emeritus..."

  • "Vendors Debate Different XAM Strategies." By Jo Maitland. From TechTarget.com (October 2006). "Developing standards in the storage industry is like achieving consensus at the United Nations--it takes a while. The Extensible Access Method (XAM) Interface, which would define a single access method for archiving devices like EMC's Centera and Hewlett-Packard's (HP) Reference Information Storage System (RISS), is no exception. On the surface, XAM appears to have everyone's support. EMC, HP, Hitachi Data Systems (HDS), IBM and Sun Microsystems are all in the same camp. All agree that a single API — instead of proprietary APIs developed by each vendor — will grow the market much faster... The question is whether vendors can agree on the best format for the standard. Three approaches are up for discussion: a client-side API or driver, a pure protocol-only approach and a file-system version."

  • "SNIA Pushes Extensible Access Method Standard Forward." By Brian Fonseca. From eWEEK (December 22, 2005). "Aiming to create a new set of application interfaces to enable different vendor storage systems to more easily communicate with each other, as well as share and store sensitive data for growing compliance concerns, the Storage Networking Industry Association is currently reviewing a proposed specification interface called Extensible Access Method. X-Access Method, or XAM for short, is on the fast track to becoming a ubiquitous standard for reference information — data that changes infrequently once written — and fixed content access in the coming years. In fact, XAM could make its way toward partial adoption as part of a SDK (software development kit) by late 2006 or in 2007..."

Related: Self-Describing Self-Contained Data Format (SD-SCDF)

Self-Describing, Self-Contained Data Format (SD-SCDF) is a SNIA proposal for "an open logical format standard based on marrying the OAIS-'Archival Information Package' with XAM. The intent is that through a XAM-based approach, applications can write an information object which solves the problems associated with maintaining the logical readability of information over very long periods of time. The two biggest obstacles in long-term preservation of information are physical and logical migration. SD-SCDF is SNIA's new program to define standards to solve the logical migration problem." SNIA's Long-Term Archive and Compliance Storage Initiative (LTACSI) includes the activities of the 100 Year Archive Task Force. As reported in the September 2007 presentation for "SNIA Data Management Forum Re-launch LTACSI and ATF," current DMF goals are to ensure coordination with the Fixed Content Aware Storage Technical Working Group (FCAS TWG) to incorporate SD-SCDF functionality, to define the SD-SCDF standard, and to define application use cases for SD-SCDF. The XML Library includes an SD-SCDF (Self-Describing, Self-Contained Data Format) container.

From: "Towards Self-Describing Self-Contained Data Format (SD-SCDF)." By Simona Cohen. SNIA 100 Year Archive Task Force. April, 2007. (adapted)

OAIS (Open Archival Information System) is an ISO [ISO 14721:2003 OAIS] standard for preservation that provides fundamental ideas, concepts and a reference model for long-term archives. OAIS Archival Information Package (AIP) Logical Structure supports: (1) Content Data Object — the raw data that is the focus of the preservation; (2) Representation Information — the information required to interpret the raw data to its designated community; (3) Reference — globally unique and persistent identifiers for the content information; (4) Provenance — the history and the origin of the content information and any changes that may have taken place since it was originated, and who has had custody of it since it was originated; (5) Context — documents reason for creation of the content information and relationship to its environment; (6) Fixity — a demonstration that the particular content information has not been altered in an undocumented manner.
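The six AIP components listed above can be pictured as fields of a single record. The following is a minimal Python sketch, not part of OAIS or of the SD-SCDF proposal: the class name, field names, and method names are hypothetical, and the use of SHA-256 for fixity and a UUID URN for the reference are assumptions chosen only for illustration (OAIS does not prescribe either).

```python
import hashlib
import uuid
from dataclasses import dataclass, field


@dataclass
class ArchivalInformationPackage:
    """Illustrative container for the six AIP components (names are hypothetical)."""
    content_data_object: bytes   # (1) raw data that is the focus of preservation
    representation_info: str     # (2) how to interpret the raw data
    context: str                 # (5) reason for creation, relation to environment
    # (3) persistent identifier; a UUID URN is an assumption made for this sketch
    reference: str = field(default_factory=lambda: uuid.uuid4().urn)
    # (4) history of custody and changes, as (custodian, action) pairs
    provenance: list = field(default_factory=list)
    # (6) digest recorded at creation; SHA-256 is an assumption, not an OAIS rule
    fixity: str = ""

    def __post_init__(self):
        if not self.fixity:
            self.fixity = hashlib.sha256(self.content_data_object).hexdigest()

    def record_event(self, custodian, action):
        """Append one provenance entry."""
        self.provenance.append((custodian, action))

    def verify_fixity(self):
        """Return True if the content still matches the recorded digest."""
        current = hashlib.sha256(self.content_data_object).hexdigest()
        return current == self.fixity


aip = ArchivalInformationPackage(
    content_data_object=b"example image bytes",
    representation_info="TIFF 6.0, 8-bit grayscale",
    context="Created for a hypothetical radiology archive",
)
aip.record_event("archive-ingest", "stored")
print(aip.verify_fixity())
```

Keeping the digest next to the content means an undocumented change to the bytes makes `verify_fixity()` return False, which is the property the Fixity component is meant to demonstrate.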

The XAM Canonical XSet Format is relevant to the goals of a Self-Describing Self-Contained Data Format. An XSet is packaged according to the XML-binary Optimized Packaging (XOP) format. The XSet package includes: (1) an XML document describing the compact parts of the XSet, namely attributes and properties; (2) the actual contents of the XStreams, attached as MIME attachments following the XML document; the MIME attachments shall conform to the MIME Multipart/Related Content-type specification as defined in RFC 2387; (3) a Table of Contents attachment, added as the first attachment, that lists the offsets of each XStream's binary data; this allows for parallel access to the XStreams if desired; (4) as per the XOP format, the ordering of the MIME parts shall not be considered significant, except for the purpose of determining the root MIME part.
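The packaging described above can be sketched with Python's standard email library: an XML root part that refers to each binary stream through an xop:Include element, followed by a table-of-contents attachment and the stream attachments, all inside one MIME multipart/related message. This is only an illustration under stated assumptions, not the XAM canonical format. The XML element names, the `example.invalid` Content-IDs, and the `build_package` function are hypothetical, and the table of contents here records stream lengths rather than the byte offsets the specification calls for. The base64 transfer encoding is simply the library default.

```python
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from xml.sax.saxutils import quoteattr

XOP_NS = "http://www.w3.org/2004/08/xop/include"


def build_package(properties, xstreams):
    """Build a multipart/related package: XML root, TOC, then binary streams.

    Stream names are assumed to be simple tokens usable inside a Content-ID.
    """
    # Root XML: properties inline, each stream replaced by an xop:Include.
    lines = ['<xset xmlns:xop="%s">' % XOP_NS]  # element names are hypothetical
    for name, value in properties.items():
        lines.append("  <property name=%s value=%s/>"
                     % (quoteattr(name), quoteattr(value)))
    for name in xstreams:
        href = "cid:%s@example.invalid" % name
        lines.append("  <xstream name=%s><xop:Include href=%s/></xstream>"
                     % (quoteattr(name), quoteattr(href)))
    lines.append("</xset>")
    root_xml = "\n".join(lines).encode("utf-8")

    package = MIMEMultipart("related", type="application/xop+xml",
                            start="<root@example.invalid>")
    root = MIMEApplication(root_xml, "xop+xml", type="text/xml")
    root["Content-ID"] = "<root@example.invalid>"
    package.attach(root)

    # Simplified TOC: one "content-id length" line per stream (not offsets).
    toc_text = "".join("%s@example.invalid %d\n" % (name, len(data))
                       for name, data in xstreams.items())
    toc = MIMEText(toc_text, "plain")
    toc["Content-ID"] = "<toc@example.invalid>"
    package.attach(toc)

    for name, data in xstreams.items():
        part = MIMEApplication(data)  # application/octet-stream, base64
        part["Content-ID"] = "<%s@example.invalid>" % name
        package.attach(part)
    return package.as_bytes()


raw = build_package({"retention": "7y"}, {"scan": b"\x00\x01binary"})
parsed = message_from_bytes(raw)
for part in parsed.get_payload():
    print(part["Content-ID"], part.get_content_type())
```

A reader locates a stream by matching the `cid:` reference in the root XML against the Content-ID headers, which is why the order of the non-root parts does not need to be significant.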

Proposal: SD-SCDF is a proposal for an open logical format standard based on marrying the OAIS AIP with XAM. The data format will include the OAIS concepts of RepInfo, reference, provenance, fixity, and context. It will use XOP for a cluster of AIPs, where each AIP is a set of XSets. An open question is whether a media unit (e.g., a volume) contains one XOP package or many. Inter-link and external-link mechanisms need to be added, along with a TOC that points to the various AIPs on the media. A VTL could be a natural translation point and search staging area for offline data.

SD-SCDF Contact Information: Gary Zasman (Chair, 100 Year Archive Task Force) or Simona Cohen (SD-SCDF Lead Developer, 100 Year Archive Task Force, IBM Haifa Labs).

Document URI: http://xml.coverpages.org/ni2007-11-02-a.html
Robin Cover, Editor: robin@oasis-open.org