Proposal: OpenCMIS Apache Incubator Project for CMIS
OpenCMIS Incubator for Content Management Interoperability Services (CMIS)
- Apache OpenCMIS Proposal
- OpenCMIS Architecture
- Chemistry and OpenCMIS APIs
- CMIS Implementation Experiences
OpenCMIS will deliver a Java implementation of the OASIS CMIS specification.
OpenCMIS provides a Java implementation of the OASIS CMIS specification. This includes a library to connect as a consumer to a CMIS repository, and a library to provide the CMIS protocol handlers on top of an existing repository. All the protocol bindings defined by the CMIS specification will be supported.
The OASIS CMIS (Content Management Interoperability Services) specification is a standardization effort to enable interoperability of Enterprise Content Management (ECM) systems. Just as SQL became the standard for accessing database systems, CMIS aims to become the standard for accessing document management systems. CMIS was started by IBM, EMC and Microsoft. Most ECM vendors have since joined the OASIS Technical Committee (TC) for CMIS as well.
The need for a common, open source CMIS library came up during the standardization work. David Caruana, David Ward, Florian Mueller, Jens Huebel, Paul Goetz, Martin Hermes, and Stephan Klevenz from Alfresco, Open Text and SAP started an initiative and a design outline to found this project. Code and some design ideas from an existing open source project owned by Florian Mueller were the initial contribution to the project.
The aim is to build an object oriented Java implementation of CMIS that encapsulates the CMIS protocol bindings, mainly to support clients using CMIS. The focus of this project is to support the needs of an enterprise environment, that is, reliability, performance, and monitoring.
With CMIS being adopted by various ECM vendors, there is a strong need for repositories and applications dealing with content to support CMIS. As CMIS defines only a domain model and protocol bindings, Java developers would otherwise have to implement the protocol bindings from scratch.
The CMIS specification focuses on the protocols, and is therefore service oriented. An object oriented API which encapsulates these services makes it easier for Java developers to use CMIS. In turn, easy adoption of CMIS by Java applications should help the standard become widely adopted.
- Implement the CMIS 1.0 protocol binding for SOAP
- Implement the CMIS 1.0 protocol binding for AtomPub
- Implement a library with an object oriented API to encapsulate the CMIS protocol bindings for consumers
The OpenCMIS contributors recognize the desirability of running the project as a meritocracy. We are eager to engage other members of the community and operate to the standard of meritocracy that Apache emphasizes; we believe this is the most effective method of growing our community and enabling widespread adoption.
The OASIS Technical Committee (TC) is the community for the CMIS standard definition. Most of the TC members provide Java based ECM implementations, and are also interested in helping to build a CMIS library for Java.
The project was started by Florian Mueller (Open Text) and Jens Huebel (Open Text). David Caruana (Alfresco) contributed, as well as Martin Hermes (SAP), Stephan Klevenz (SAP) and Paul Goetz (SAP).
The contributors are working for companies relying on this library. There is minimal risk of this work becoming non-strategic. The contributors are confident that a larger community will form within the project.
Inexperience with Open Source
The initial committers have varying degrees of experience with open source projects. There is limited experience developing code with an open source development process. We do not, however, expect any difficulty in executing under normal meritocracy rules.
The initial committers work for different companies (Open Text, Alfresco, and SAP). They work on different projects and knew each other only through their participation in the OASIS TC.
Reliance on Salaried Developers
Although the initial committers are salaried developers, OpenCMIS development was done both on work time and spare time. As the OpenCMIS library will be used in commercial products, some of the companies will dedicate work time to the project.
Relationships with Other Apache Products
OpenCMIS uses other Apache Products (Commons Codec and Commons Logging; use of CXF is planned). Maven is used as build infrastructure.
An Excessive Fascination with the Apache Brand
The developers of OpenCMIS could use other channels to generate publicity. We hope that the Apache brand helps to build a vendor independent, truly interoperable CMIS library. We would feel honored at getting the opportunity to join.
Information about the OASIS CMIS Technical Committee can be found at: http://www.oasis-open.org/committees/cmis
The announcement of the public review for the CMIS 1.0 specification (containing the links to the specification) can be found at: http://lists.oasis-open.org/archives/tc-announce/200910/msg00015.html
The current implementation can be found on http://svn.berlios.de/svnroot/repos/opencmis
Source and Intellectual Property Submission Plan
- The initial source (see above)
- Additional source from Open Text developers (CLA in progress)
- Additional source from SAP developers (CCLA filed, CLA in progress)
- Additional source from Alfresco developers (CLA filed)
- The domain opencmis.org from Alfresco
All the external dependencies of the initial codebase comply with Apache licensing policies:
- Apache Commons (Apache v2.0)
- Apache Maven (Apache v2.0)
- Sun JAXB and JAX-WS (CDDL v1.0, GPL v2)
- JUnit (CPL v1.0)
OpenCMIS does not implement or use cryptographic code.
- opencmis-private (with moderated subscriptions)
Web Site: Confluence (OpenCMIS)
- Florian Mueller (Open Text)
- Jens Huebel (Open Text)
- David Caruana (Alfresco)
- David Ward (Alfresco)
- Martin Hermes (SAP)
- Stephan Klevenz (SAP)
- Paul Goetz (SAP)
The initial committers listed are employed by Open Text, Alfresco, and SAP. One objective of the incubator is to extend the community of contributors; we assume that future contributors will have other affiliations.
- Looking for mentors
Subject: [PROPOSAL] OpenCMIS incubator for Content Mangement Interoperability Services (CMIS)
From: "Goetz, Paul" <email@example.com>
To: "firstname.lastname@example.org" <email@example.com>
Date: Wed, 9 Dec 2009 18:21:21 +0100
Message-ID: <2CDD44F49E3D47459EB0282354AE42AE22ADC47842@DEWDFECCR05.wdf.sap.corp>
We would like to propose a new incubator podling called OpenCMIS. Please find below the plain-text version of the proposal. Any feedback would be greatly appreciated.
Best regards, Paul
OpenCMIS Architecture
Subject: RE: [PROPOSAL] OpenCMIS incubator for Content Mangement Interoperability Services (CMIS)
Date: Thu, 10 Dec 2009 10:03:29 +0100
Message-ID: <56C255F88C54014E92512ED3E7848F9404609D6B@MUCXGC1.opentext.net>
From: Florian Mueller <firstname.lastname@example.org>
I can talk a bit about the OpenCMIS architecture. That might help to distinguish it from Chemistry.
OpenCMIS consists of two layers. We call them Provider layer and Client layer.
The Provider layer implements and hides the CMIS bindings. The API of the Provider layer maps the CMIS domain model. That is, the CMIS specification can be used as the documentation of the Provider layer. There are the same services, the same operations and the same parameters. The AtomPub and Web Services bindings are hidden behind this API. The application does not need to know in advance which binding it will eventually use.
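The idea of hiding both bindings behind one API can be illustrated with a minimal sketch. Every name in it (SketchObjectService, SketchObjectData, SketchBindings, the "sketch:binding" property) is hypothetical and chosen for illustration; it is not the OpenCMIS Provider API, and the two implementations return canned data instead of talking to a server.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class ProviderSketchDemo {
    public static void main(String[] args) {
        // The binding name could just as well come from a configuration file.
        String binding = args.length > 0 ? args[0] : "atompub";
        SketchObjectService service = SketchBindings.forBinding(binding);
        SketchObjectData data = service.getObject("repo-1", "doc-42");
        System.out.println(data.getId() + " " + data.getProperties());
    }
}

/** Immutable data object (hypothetical shape, not the OpenCMIS class). */
final class SketchObjectData {
    private final String id;
    private final Map<String, String> properties;

    SketchObjectData(String id, Map<String, String> properties) {
        this.id = id;
        // Defensive copy plus read-only view keeps the object immutable.
        this.properties = Collections.unmodifiableMap(new LinkedHashMap<>(properties));
    }

    String getId() { return id; }
    Map<String, String> getProperties() { return properties; }
}

/** One CMIS operation, expressed without any reference to a binding. */
interface SketchObjectService {
    SketchObjectData getObject(String repositoryId, String objectId);
}

/** Stand-in for an AtomPub implementation; a real one would issue HTTP requests. */
final class AtomPubObjectService implements SketchObjectService {
    @Override
    public SketchObjectData getObject(String repositoryId, String objectId) {
        return SketchBindings.canned(objectId, "atompub");
    }
}

/** Stand-in for a Web Services implementation; a real one would send SOAP calls. */
final class WebServicesObjectService implements SketchObjectService {
    @Override
    public SketchObjectData getObject(String repositoryId, String objectId) {
        return SketchBindings.canned(objectId, "webservices");
    }
}

final class SketchBindings {
    private SketchBindings() { }

    /** Picks the binding by name; callers only ever see the interface. */
    static SketchObjectService forBinding(String binding) {
        switch (binding) {
            case "atompub": return new AtomPubObjectService();
            case "webservices": return new WebServicesObjectService();
            default: throw new IllegalArgumentException("Unknown binding: " + binding);
        }
    }

    /** Canned response so the sketch runs without a server. */
    static SketchObjectData canned(String objectId, String binding) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("cmis:objectId", objectId);
        props.put("sketch:binding", binding); // made-up property, for demonstration only
        return new SketchObjectData(objectId, props);
    }
}
```

Application code written against SketchObjectService stays the same when the binding name changes, which is the property the paragraph above describes.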
This layer is fully implemented except for some details. There are some open spec issues that prevent us from finalizing it. It needs extensive testing, though.
Although the Provider layer allows fine-grained control over the calls to the CMIS server, it doesn't provide a nice Java-like interface. It deals with immutable data objects.
The Client layer sits on top of the Provider layer and provides this nice Java-like interface. It has all the classes and methods you would expect from an object-oriented interface. We also will make sure that it fits into enterprise framework environments.
We are currently designing the API of this layer. The proposals are not public yet.
Application developers can choose which API is the most suitable for their use case. If fine-grained control or cacheable and serializable objects are relevant, then the Provider layer is the right choice. If a nice interface and framework integration are important, the Client layer is the better option.
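The difference between the two levels can be sketched as follows. The Client layer API was not public when this message was written, so the classes below (RawObject, RawObjectService, ClientDocument) are purely hypothetical and only illustrate the layering: a plain immutable property bag at the lower level, and an object with convenience methods built on top of it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class LayerSketchDemo {
    public static void main(String[] args) {
        // In-memory stand-in for a provider-level service (no server involved).
        RawObjectService service = (repositoryId, objectId) -> {
            Map<String, String> props = new HashMap<>();
            props.put("cmis:objectId", objectId);
            props.put("cmis:name", "report.pdf");
            return new RawObject(props);
        };
        ClientDocument doc = new ClientDocument(service, "repo-1", "doc-1");
        System.out.println(doc.getId() + " -> " + doc.getName());
    }
}

/** Lower level: a plain, immutable property bag (hypothetical shape). */
final class RawObject {
    private final Map<String, String> properties;

    RawObject(Map<String, String> properties) {
        this.properties = Collections.unmodifiableMap(new HashMap<>(properties));
    }

    Map<String, String> getProperties() { return properties; }
}

/** Lower level: mirrors a CMIS operation and offers no convenience. */
interface RawObjectService {
    RawObject getObject(String repositoryId, String objectId);
}

/** Upper level: an object with behaviour, built on the lower-level calls. */
final class ClientDocument {
    private final RawObjectService service;
    private final String repositoryId;
    private RawObject data;

    ClientDocument(RawObjectService service, String repositoryId, String objectId) {
        this.service = service;
        this.repositoryId = repositoryId;
        this.data = service.getObject(repositoryId, objectId);
    }

    String getId() { return data.getProperties().get("cmis:objectId"); }
    String getName() { return data.getProperties().get("cmis:name"); }

    /** Re-reads the object through the lower layer and swaps in the new snapshot. */
    void refresh() { data = service.getObject(repositoryId, getId()); }
}
```

An application that needs to cache or serialize snapshots would work with RawObject directly; one that prefers a conventional object model would work with ClientDocument.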
Regarding the instability of the CMIS spec: Yes, there are still open issues but those are details. We and other companies were confident enough to spend a lot of energy to implement CMIS and do these small corrections later. It's the right time to implement the CMIS spec.
I hope that helps.
Chemistry and OpenCMIS APIs
Subject: RE: [PROPOSAL] OpenCMIS incubator for Content Mangement Interoperability Services (CMIS)
Date: Fri, 11 Dec 2009 19:10:38 +0100
Message-ID: <56C255F88C54014E92512ED3E7848F9404679495@MUCXGC1.opentext.net>
From: Florian Mueller <email@example.com>
To: <firstname.lastname@example.org>
Cc: "Incubator-General" <email@example.com>
In the end the APIs should be somewhat similar since they are implementing the same spec.
But you are actually comparing two different levels of APIs. The opencmis-provider-api handles simple immutable data objects while chemistry-api follows an object-oriented approach. As far as I know Chemistry has nothing comparable to the opencmis-provider-api. The opencmis-client-api would be the right level to look at but the code of this API is not in SVN yet. We will make it available on Monday.
The APIs are not the main reason why I think that Chemistry and OpenCMIS are different. I would like to avoid the word "superior". I never used that in this context. The two projects came from different backgrounds; that's why they are different.
Chemistry uses Abdera to communicate with the server while OpenCMIS is based on JAXB and some CMIS specific XML coding. There is a lot of code sharing between the AtomPub and the Web Services binding. (I couldn't find a Web Services client in Chemistry. So I can't comment on that.) OpenCMIS has a caching infrastructure that is specific to CMIS and to how OpenCMIS works. There is nothing like that in Chemistry. The overall architecture and principles below the API are very, very different. Bringing both together would require philosophy changes on both sides. I'm not saying that this isn't possible, but it's a lengthy process.
We derived our design from a lot of prototypes and applications that we have built over the past 20 months. Some code fragments and concepts are actually pretty old now. We had a lot of it in one shape or another when Chemistry started. That's why Chemistry was never an option for us. The code bases of Chemistry and OpenCMIS have been developed at the same time, taking different routes. Chemistry did that in public; most of OpenCMIS was created behind closed doors.
Here we are with a working code base that we cannot give up and that we will maintain in the future for obvious reasons. Our idea was to make it Open Source and let others benefit from our work. Apache seemed to be the right place - at least three days ago. It was never meant to be an attack against Chemistry.
Subject: CMIS Implementation Experiences
Date: Tue, 15 Dec 2009 16:38:23 +0100
Message-ID: <56C255F88C54014E92512ED3E7848F9404679781@MUCXGC1.opentext.net>
From: Florian Mueller <firstname.lastname@example.org>
To: <email@example.com>
I would like to foster the technical discussion between the Chemistry team and the people behind the OpenCMIS proposal. If you think this is inappropriate on this list, please let me know.
In order to explain the rationale behind the OpenCMIS design I would like to talk about some of the experience that we have gained with CMIS client and server implementations.
We also started with Abdera on the server side. It turned out to be more pain than joy. With a pure JAXB design we ran into compatibility issues. A good tradeoff between efficiency, correctness and maintainability seems to be StAX with JAXB. OpenCMIS handles all AtomPub related tags with StAX and all CMIS related data with JAXB. The JAXB objects are not exposed to the application. They are just interim objects. The same StAX/JAXB design should work on the server side as well. The effort to implement AtomPub is manageable. I've done this in my CMIS FileShare project.
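The StAX half of such a design might look like the sketch below. It is an assumption-laden illustration, not OpenCMIS code: the class AtomEntrySketch and the sample entry are made up, and only the standard javax.xml.stream API is used. The JAXB hand-off is indicated in a comment only, because JAXB is not bundled with current JDKs and including it would make the example depend on an extra library.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class AtomEntrySketch {
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";
    static final String CMISRA_NS = "http://docs.oasis-open.org/ns/cmis/restatom/200908/";

    /** What the sketch extracts: the Atom title and whether a CMIS object was present. */
    static final class Result {
        final String title;
        final boolean hasCmisObject;

        Result(String title, boolean hasCmisObject) {
            this.title = title;
            this.hasCmisObject = hasCmisObject;
        }
    }

    public static void main(String[] args) {
        // Made-up sample entry, reduced to the two elements the sketch looks at.
        String xml = "<entry xmlns='" + ATOM_NS + "' xmlns:cmisra='" + CMISRA_NS + "'>"
                + "<title>Quarterly report</title><cmisra:object/></entry>";
        Result result = parse(xml);
        System.out.println(result.title + ", CMIS object present: " + result.hasCmisObject);
    }

    /** Walks the entry with StAX; Atom tags are handled here, CMIS data is only located. */
    static Result parse(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // no DTDs needed for Atom
        String title = null;
        boolean hasObject = false;
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                        continue;
                    }
                    String ns = reader.getNamespaceURI();
                    String name = reader.getLocalName();
                    if (title == null && ATOM_NS.equals(ns) && "title".equals(name)) {
                        title = reader.getElementText();
                    } else if (CMISRA_NS.equals(ns) && "object".equals(name)) {
                        hasObject = true;
                        // A StAX/JAXB design would hand the reader to JAXB at this point,
                        // for example via Unmarshaller.unmarshal(XMLStreamReader, Class).
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Entry is not well-formed XML", e);
        }
        return new Result(title, hasObject);
    }
}
```

Keeping the Atom handling in StAX means the generated JAXB classes only ever see the CMIS payload, which matches the split described above.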
Another detail we learned is that implementing both bindings in parallel saves you a lot of refactoring later. The two CMIS bindings are really different. If you align your classes and flows to just one binding you might have to refactor a lot later to make the other binding work smoothly. This insight is reflected in OpenCMIS in two areas. First of all, there is a strict decoupling of the binding implementation (Provider layer) and the nicer Java API (Client layer). If somebody showed up with a third CMIS binding, we would only have to touch the Provider layer. The second area is within the Provider layer. We tried to reuse as much code and as many concepts as possible between both binding implementations. For example, both binding implementations share the generated JAXB classes, the caching infrastructure and several utilities.
We introduced type (and repository info) caching based on our experiences with applications using a CMIS library. Applications need type information all over the place and it is expensive to fetch it over and over again. From a library perspective one can argue that caching should be done a level above the library. From a practical standpoint it would be nice if it is done once and right. So we decided to put it into OpenCMIS. If an application doesn't want it, it can switch it off. The caching works implicitly. Whenever a type definition runs through the library the data is cached or refreshed.
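A minimal sketch of implicit, switchable type caching is shown below. The classes TypeDefinitionSketch and CachingTypeLookup are hypothetical, and the fetcher function stands in for a round trip to the repository; this is one possible shape, not the OpenCMIS implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class TypeCacheDemo {
    public static void main(String[] args) {
        // The fetcher stands in for an expensive call to the repository.
        CachingTypeLookup lookup = new CachingTypeLookup(
                id -> new TypeDefinitionSketch(id, "fetched from server"), true);
        System.out.println(lookup.getType("cmis:document").displayName);
        // A type definition seen in some other response refreshes the cache implicitly.
        lookup.observe(new TypeDefinitionSketch("cmis:document", "seen in a response"));
        System.out.println(lookup.getType("cmis:document").displayName);
    }
}

/** Hypothetical, heavily reduced type definition. */
final class TypeDefinitionSketch {
    final String id;
    final String displayName;

    TypeDefinitionSketch(String id, String displayName) {
        this.id = id;
        this.displayName = displayName;
    }
}

final class CachingTypeLookup {
    private final Function<String, TypeDefinitionSketch> fetcher;
    private final boolean cacheEnabled;
    private final Map<String, TypeDefinitionSketch> cache = new ConcurrentHashMap<>();

    CachingTypeLookup(Function<String, TypeDefinitionSketch> fetcher, boolean cacheEnabled) {
        this.fetcher = fetcher;
        this.cacheEnabled = cacheEnabled;
    }

    /** Returns the type definition, asking the fetcher only on a cache miss. */
    TypeDefinitionSketch getType(String typeId) {
        if (!cacheEnabled) {
            return fetcher.apply(typeId); // cache switched off: always fetch
        }
        return cache.computeIfAbsent(typeId, fetcher);
    }

    /** Called for every type definition that passes through: cache or refresh it. */
    void observe(TypeDefinitionSketch type) {
        if (cacheEnabled) {
            cache.put(type.id, type);
        }
    }
}
```

The boolean constructor argument plays the role of the "switch it off" option; with caching disabled, every getType call goes to the fetcher.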
CMIS provides no mechanism to detect type changes. So there is a slight chance that the type cache holds outdated data. In an enterprise scenario (and that's what OpenCMIS is aiming at) type changes shouldn't happen often. They are usually interconnected with an update or re-deployment of the application. A paranoid application developer can switch off the cache (and accept the performance penalty) or clear the cache regularly (every hour or every five minutes or every 30 seconds...) or create a new session once in a while. Since sessions are bound to logins, there is a regular exchange of sessions, and with them caches, anyway.
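Clearing the cache at a fixed interval could be sketched like this. ExpiringTypeCache is a hypothetical class; type data is represented by a plain String and the clock is injected so the behaviour can be checked without waiting. It is one simple way to bound staleness, not a description of how OpenCMIS does it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

final class ExpiringCacheDemo {
    public static void main(String[] args) {
        // Clear at most every five minutes, using the system clock.
        ExpiringTypeCache cache = new ExpiringTypeCache(
                typeId -> "definition of " + typeId, 5 * 60 * 1000L, System::currentTimeMillis);
        System.out.println(cache.get("cmis:document"));
    }
}

final class ExpiringTypeCache {
    private final Function<String, String> fetcher; // typeId -> type data (String as stand-in)
    private final long maxAgeMillis;
    private final LongSupplier clock;
    private final Map<String, String> entries = new HashMap<>();
    private long lastClearedAt;

    ExpiringTypeCache(Function<String, String> fetcher, long maxAgeMillis, LongSupplier clock) {
        this.fetcher = fetcher;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
        this.lastClearedAt = clock.getAsLong();
    }

    /** Drops all entries once the interval has elapsed, then serves from the cache. */
    synchronized String get(String typeId) {
        long now = clock.getAsLong();
        if (now - lastClearedAt >= maxAgeMillis) {
            entries.clear();
            lastClearedAt = now;
        }
        return entries.computeIfAbsent(typeId, fetcher);
    }

    /** Manual reset, for applications that know the types have changed. */
    synchronized void clear() {
        entries.clear();
        lastClearedAt = clock.getAsLong();
    }
}
```

Clearing everything at once keeps the sketch short; the interval is the upper bound on how long outdated type data can be served.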
Another aspect that we think is important is extensions. CMIS defines a lot of extension points and repositories will make use of them sooner or later. Applications should be able to access and set extension data. Sure, it is against the idea of a standard but it will happen and the library should be prepared for that. The difficult part here is to make the binding invisible to the application since some extension points are very binding specific. Using JAXB in both bindings covers a lot but not everything. OpenCMIS has the infrastructure in place but is not perfect in this regard, yet.
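One way to keep extension data independent of the binding is to expose it as a small generic tree of namespaced elements. The sketch below assumes a hypothetical ExtensionElement class and a made-up vendor namespace; it illustrates the idea only and is not the OpenCMIS extension infrastructure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ExtensionSketchDemo {
    public static void main(String[] args) {
        String ns = "http://example.com/vendor-ext"; // made-up vendor namespace
        ExtensionElement root = ExtensionElement.node(ns, "retention", Arrays.asList(
                ExtensionElement.leaf(ns, "policy", "7-years"),
                ExtensionElement.leaf(ns, "locked", "true")));
        ExtensionElement policy = root.find(ns, "policy");
        System.out.println(policy.getName() + " = " + policy.getValue());
    }
}

/** Binding-neutral extension data: a namespaced element with a value or children. */
final class ExtensionElement {
    private final String namespace;
    private final String name;
    private final String value; // null when the element carries children instead
    private final List<ExtensionElement> children;

    private ExtensionElement(String namespace, String name, String value,
                             List<ExtensionElement> children) {
        this.namespace = namespace;
        this.name = name;
        this.value = value;
        this.children = Collections.unmodifiableList(new ArrayList<>(children));
    }

    static ExtensionElement leaf(String namespace, String name, String value) {
        return new ExtensionElement(namespace, name, value, Collections.emptyList());
    }

    static ExtensionElement node(String namespace, String name, List<ExtensionElement> children) {
        return new ExtensionElement(namespace, name, null, children);
    }

    String getNamespace() { return namespace; }
    String getName() { return name; }
    String getValue() { return value; }
    List<ExtensionElement> getChildren() { return children; }

    /** Depth-first search for the first element with the given namespace and name. */
    ExtensionElement find(String ns, String localName) {
        if (namespace.equals(ns) && name.equals(localName)) {
            return this;
        }
        for (ExtensionElement child : children) {
            ExtensionElement hit = child.find(ns, localName);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }
}
```

Each binding would translate its own XML representation to and from this tree, so the application never sees which binding delivered the extension.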
I hope that's the beginning of a fruitful conversation,
Prepared by Robin Cover for The XML Cover Pages archive.