OASIS Unstructured Information Management Architecture (UIMA) TC
OASIS Forms Committee to Standardize Content Analytics
Complementary Open Standards and Open Source Efforts Launch
EMC, IBM, SAIC, Thomson, and Others Collaborate on Unstructured Information Management Architecture (UIMA)
Boston, MA, USA. November 16, 2006.
The open standards consortium OASIS has initiated a new effort aimed at standardizing semantic search and content analytics. The work of the OASIS Unstructured Information Management Architecture (UIMA) Technical Committee will advance a common method for meaningfully accessing data contained in text such as e-mails, blog entries, news feeds, and notes, as well as in audio recordings, images, and video. The OASIS work will be complemented by an Apache Software Foundation incubator project for developing UIMA-based open source software.
"UIMA will enable the productive use of content that exists as natural language text, speech, and video — information created by humans for humans to understand," explained David A. Ferrucci of IBM, convener of the OASIS UIMA Technical Committee. "By assigning semantics to this content, UIMA will allow information to be exploited by database management systems, information retrieval systems, and other traditional application infrastructure."
OASIS will refine and finalize a set of UIMA specifications based on an initial contribution from IBM with input from DARPA, Carnegie Mellon University, Columbia University, Stanford University, University of Massachusetts-Amherst, MITRE Corporation, and Science Applications International Corporation (SAIC).
"Our goal is to define a platform-independent specification that supports the interoperability, discovery, and composition of analytics across modalities, domain models, and frameworks," noted Eric Nyberg, Associate Professor at the School of Computer Science, Carnegie Mellon University. "By enabling enterprises to access the intelligence contained in their unstructured information, UIMA will empower organizations to uncover relationships, identify patterns, and predict outcomes."
"Dynamic discovery and negotiation of diverse content, and smart consumption, will be essential 21st-century processing skills. UIMA offers an exciting solution," observed James Bryce Clark, director of standards development at OASIS. "We're pleased to see another instance of the virtuous circle between complementary open standards and open source development projects. An open standard will permit multiple devices and implementations to talk to each other about multi-modal information; the open source project will help a broad range of users take advantage of the growing global body of multi-modal content and analytics."
The OASIS UIMA Technical Committee will operate under the Royalty Free on Limited Terms mode, as defined by the OASIS Intellectual Property Rights Policy. Participation in the Committee remains open to all companies, non-profit groups, governments, academic institutions, and individuals. As with all OASIS projects, archives of the Committee's work will be accessible to both members and non-members, and OASIS will host an open mail list for public comment.
OASIS (Organization for the Advancement of Structured Information Standards) is a not-for-profit, international consortium that drives the development, convergence, and adoption of e-business standards. Members themselves set the OASIS technical agenda, using a lightweight, open process expressly designed to promote industry consensus and unite disparate efforts. The consortium produces open standards for Web services, security, e-business, and standardization efforts in the public sector and for application-specific markets. Founded in 1993, OASIS has more than 5,000 participants representing over 600 organizations and individual members in 100 countries. Approved OASIS Standards include AVDL, BCM, CAP, DITA, DocBook, DSML, ebXML CPPA, ebXML Messaging, ebXML Registry, EDXL-DE, EML, OpenDocument, SAML, SOA-RM, SPML, UBL, UDDI, WSDM, WS-Notification, WS-Reliability, WSRF, WSRP, WS-Security, XACML, XCBF, and XML Catalogs.
For More Information
- Unstructured Information Management Architecture (UIMA) - IBM web site
- "Towards an Interoperability Standard for Text and Multi-Modal Analytics." By David Ferrucci, Adam Lally, Daniel Gruhl, Edward Epstein, Marshall Schor, J. William Murdock, Andy Frenkiel, Eric W. Brown, Thomas Hampp, Yurdaer Doganata, Christopher Welty, Lisa Amini, Galina Kofman, Lev Kozakov, and Yosi Mass. IBM Research Report. RC24122 (W0611-188). IBM Research Division (Thomas J. Watson Research Center, Almaden Research Center, Haifa Research Laboratory) and IBM Software Group (Boeblingen, Germany). November 28, 2006. 106 pages. "This report motivates and proposes elements of an architecture specification for creating and composing text and multi-modal analytics for processing unstructured information, based on the UIMA project originated at IBM Research." Posted to the OASIS Unstructured Information Management Architecture (UIMA) TC list by TC Chair David A. Ferrucci, PhD.
- Apache UIMA, a project of the Apache incubator
- uima-dev list at 'incubator.apache.org'. Post mail to email@example.com.
- IBM announcement
- "IBM Continues Open Source UIMA Push." From CBR Online.
Prepared by Robin Cover for The XML Cover Pages archive.