The OASIS Cover Pages: The Online Resource for Markup Language Technologies
Created: July 18, 2001.

'Regular Fragmentations' Tool for Fragmenting Textual Content Into XML Elements.

Simon St.Laurent (O'Reilly & Associates) has released a Java SAX filter called 'Regular Fragmentations', which uses regular expressions to fragment content into XML elements. "Regular fragmentations are an approach to processing textual content as if it had been represented as more finely-grained markup. The XML Schema Datatypes specification, for instance, offers a number of lexically compound types among its primitive types, requiring developers to rely on extension functions or XML Schema processing to manipulate them with XSLT. Regular fragmentations allow developers to specify the application of regular expressions to element content (attribute content coming soon!) using an XML-based rules syntax. An open source SAXFilter implementation allows the use of regular fragmentations in a wide variety of XML processing environments... XML developers are constantly faced with questions about how fine-grained their data structures should be, and the difficult problem of dealing with cases where other people chose coarse-grained structures. While tools like XSLT can do an excellent job retrieving needles from haystacks, it's much easier to extract needles that are labelled and cleanly separated from the surrounding content. The com.simonstl.fragment package allows developers to specify rules using regular expressions which are applied to element content during the parsing process. While the document is parsed, those rules are applied to the textual content of the specified elements and new child elements are created, adding extra markup information to the document."
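
To make the idea concrete, the following is a minimal sketch of the general technique described above: a SAX filter that applies a regular expression to the text of one element and emits the captured groups as new child elements. It is not the com.simonstl.fragment API and does not use its XML-based rules syntax; the class name `DateFragmenter`, the helper `fragment`, the hard-coded ISO date pattern, and the child element names `year`, `month`, and `day` are all assumptions chosen for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical illustration only; NOT the com.simonstl.fragment implementation.
class DateFragmenter extends XMLFilterImpl {
    // Assumed rule: split an ISO date (YYYY-MM-DD) into three child elements.
    private static final Pattern ISO_DATE = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");
    private static final String[] CHILD_NAMES = {"year", "month", "day"};

    private final String targetElement;
    private StringBuilder buffer; // non-null only while collecting target text

    DateFragmenter(XMLReader parent, String targetElement) {
        super(parent);
        this.targetElement = targetElement;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        // A child element inside the target means mixed content: pass text through as-is.
        if (buffer != null) flushUnchanged();
        super.startElement(uri, localName, qName, atts);
        if (qName.equals(targetElement)) buffer = new StringBuilder();
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (buffer != null) buffer.append(ch, start, length); // hold text until the end tag
        else super.characters(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (buffer != null) {
            Matcher m = ISO_DATE.matcher(buffer.toString().trim());
            if (m.matches()) {
                buffer = null;
                for (int i = 0; i < CHILD_NAMES.length; i++) {
                    emitChild(CHILD_NAMES[i], m.group(i + 1));
                }
            } else {
                flushUnchanged(); // no match: leave the content untouched
            }
        }
        super.endElement(uri, localName, qName);
    }

    private void flushUnchanged() throws SAXException {
        char[] chars = buffer.toString().toCharArray();
        buffer = null;
        super.characters(chars, 0, chars.length);
    }

    private void emitChild(String name, String value) throws SAXException {
        super.startElement("", name, name, new AttributesImpl());
        super.characters(value.toCharArray(), 0, value.length());
        super.endElement("", name, name);
    }

    // Convenience helper (an assumption of this sketch): parse, filter, serialize.
    static String fragment(String xml, String targetElement) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader filter =
                new DateFragmenter(factory.newSAXParser().getXMLReader(), targetElement);
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new SAXSource(filter, new InputSource(new StringReader(xml))),
                               new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("fragmentation failed", e);
        }
    }
}
```

Calling `DateFragmenter.fragment("<order><shipped>2001-07-18</shipped></order>", "shipped")` should return the same document with the text of `shipped` replaced by `year`, `month`, and `day` child elements, while other elements pass through unchanged. Because the work happens in a filter between the parser and the downstream handler, later stages such as an XSLT transformation see the finer-grained markup without the source document being altered.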

"Regular Fragmentations: Treating Complex Textual Content as Markup." By Simon St.Laurent (O'Reilly & Associates). Paper to be presented at Extreme Markup Languages 2001, August 12-17, 2001, Montréal, Canada.

Document URI: http://xml.coverpages.org/ni2001-07-18-c.html
Robin Cover, Editor: robin@oasis-open.org