[Cache from http://www.hit.uib.no/claus/mlcd/papers/texmecs.html 2001-05-10; please use this canonical URL/source for later/revised/authoritative versions if possible.]
document ::= /* */ | document chunk
chunk ::= sole-tag | start-tag | end-tag | suspend-tag | resume-tag
        | start-tag-set | end-tag-set | virtual-element
        | internal-entity | external-entity | character-ref
        | cdata-section | comment | datacharacter

sole-tag ::= '{' eid atts '}'
virtual-element ::= '{^' eid '^' idref atts '}' /* WFC: idref OK */

WFC: idref OK. The idref value in a virtual element must appear on some element in the document as the value of an id.

start-tag ::= '{' eid atts '{' /* WFC: end-tag match */
end-tag ::= '}' gi '}' /* WFC: start-tag match */
start-tag-set ::= '{|' eid atts '|{' /* WFC: end-tag-set match */
end-tag-set ::= '}|' gi '|}' /* WFC: start-tag-set match */

WFC: end-tag match. Each start-tag must be paired with a matching end-tag appearing later in the data stream. Two tags match if and only if they have the same generic identifier (gi), including any suffix. A start-tag s and an end-tag e are paired

suspend-tag ::= '}-' gi '}' /* WFC: suspend-tag OK */
resume-tag ::= '{+' gi '{' /* WFC: resume-tag OK */

WFC: suspend-tag OK. Each suspend-tag must be paired both with (a) a matching start-tag or resume-tag appearing earlier in the data stream, and (b) a matching resume-tag appearing later in the data stream. A suspend-tag t is paired with a preceding start- or resume-tag rs if and only if:

internal-entity ::= '{&' NAME '}' | '{&' NAME S '.' S NAME '}' /* CF: structured internal entities */
external-entity ::= '{<' URL '>}' /* CF: external entities */
character-ref ::= '{#' [dD] [0-9]+ '}' | '{#' [xX] [A-Fa-f0-9]+ '}'
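As an illustration of the character-ref production, the following is a minimal Python sketch that expands decimal and hexadecimal references in a string. The function name `expand_char_refs` is hypothetical, and the sketch deliberately ignores escaped braces and CDATA sections, inside which a real processor would not expand references.

```python
import re

# Mirrors: character-ref ::= '{#' [dD] [0-9]+ '}' | '{#' [xX] [A-Fa-f0-9]+ '}'
CHAR_REF = re.compile(r"\{#(?:[dD]([0-9]+)|[xX]([A-Fa-f0-9]+))\}")


def expand_char_refs(text: str) -> str:
    """Replace each character reference with the character it names.

    Simplification (an assumption of this sketch): escaped braces and
    CDATA sections are not recognized.
    """
    def decode(match: re.Match) -> str:
        decimal, hexadecimal = match.groups()
        if decimal is not None:
            return chr(int(decimal))
        return chr(int(hexadecimal, 16))

    return CHAR_REF.sub(decode, text)


if __name__ == "__main__":
    print(expand_char_refs("fr{#xF6}iden and {#d65}{#X42}"))
```

Text that only resembles a reference, such as `{#q12}`, matches neither alternative and is left untouched.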
cdata-section ::= '{#CDATA{' cdchars '}#CDATA}' /* CF: CDATA sections */
cdchars ::= CHAR* - (CHAR* ('{#CDATA{' | '}#CDATA}') CHAR*)
comment ::= '{*' commcontent '*}'
commcontent ::= /* */ | commcontent commentdata | commcontent comment
commentdata ::= CHAR+ - (CHAR* ('{*' | '*}') CHAR*)
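Because commcontent may itself contain a comment, comments nest, and a scanner has to count depth rather than search for the first closing delimiter. The following Python sketch shows one way to do that; the function name `strip_comments` is hypothetical, and backslash escapes and CDATA sections are not handled.

```python
def strip_comments(text: str) -> str:
    """Remove TexMECS comments, honouring nesting.

    Assumption of this sketch: '{*' always opens a comment and '*}'
    closes one when a comment is open; escapes and CDATA are ignored.
    """
    kept = []
    depth = 0  # number of comments currently open
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "{*":
            depth += 1
            i += 2
        elif pair == "*}" and depth > 0:
            depth -= 1
            i += 2
        else:
            if depth == 0:
                kept.append(text[i])
            i += 1
    if depth != 0:
        raise ValueError("unterminated comment")
    return "".join(kept)


if __name__ == "__main__":
    print(strip_comments("a{* outer {* inner *} still outer *}b"))
```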
eid ::= gi? "@" id?
gi ::= NAME SUFFIX? | SUFFIX
id ::= NAME /* WFC: unique ID */
idref ::= NAME /* WFC: idref OK */

atts ::= avs* S?
avs ::= S NAME S? '=' S? QUOTED

(Note: avs is short for ‘attribute-value specification’.)
CHAR ::= [#x00-#x7F]

From the point of view of the lexical scanner, the data stream contains either single characters which are legal characters, or else it contains curly braces which have been escaped with a backslash.

CHAR ::= ([#x00-#x7F] - [{}]) /* braces cannot occur 'naturally' */
       | '\{' /* but may occur if escaped */
       | '\}'

CHAR ::= [#x00-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
         /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */

(Note, however, that if a system were to cheat and have the lexical scanner simply return 16-bit characters, including surrogates, it would make no difference for TexMECS parsing.)
datacharacter ::= CHAR
NAME ::= Nameinit Namechar*
Nameinit ::= [a-zA-Z_]
Namechar ::= Nameinit | [0-9] | ':' | '.' | '-'
SUFFIX ::= '~' Namechar*

Note that while formally names can begin with underscores, it is recommended that underscores be used only in the translation of MECS poly-element codes.
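The NAME, SUFFIX, and gi productions translate directly into regular expressions. The Python sketch below is one possible transcription; the helper names `is_name` and `is_gi` are hypothetical.

```python
import re

# Transcribed from the productions above.
NAMEINIT = r"[a-zA-Z_]"
NAMECHAR = r"[a-zA-Z_0-9:.\-]"
NAME = NAMEINIT + NAMECHAR + "*"
SUFFIX = "~" + NAMECHAR + "*"

NAME_RE = re.compile(NAME)
# gi ::= NAME SUFFIX? | SUFFIX
GI_RE = re.compile("(?:" + NAME + "(?:" + SUFFIX + ")?|" + SUFFIX + ")")


def is_name(s: str) -> bool:
    """True if the whole string is a NAME."""
    return NAME_RE.fullmatch(s) is not None


def is_gi(s: str) -> bool:
    """True if the whole string is a generic identifier, suffix included."""
    return GI_RE.fullmatch(s) is not None


if __name__ == "__main__":
    for candidate in ["lg", "l~A", "~", "l3.4", "3l", "l~A~B"]:
        print(candidate, is_name(candidate), is_gi(candidate))
```

Note that a bare `~` is a legal gi under these productions, since SUFFIX permits zero Namechars, while a second suffix as in `l~A~B` is not.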
QUOTED ::= '"' dqstring '"' | "'" sqstring "'"
dqstring ::= ((CHAR - '"') | internal-entity | character-ref)*
sqstring ::= ((CHAR - "'") | internal-entity | character-ref)*
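To make the atts, avs, and QUOTED productions concrete, here is a minimal Python sketch that parses the attribute portion of a tag into a dictionary. The function name `parse_atts` is hypothetical. As a simplification, values are returned verbatim: entity and character references inside them are not expanded, and escaped braces are not checked.

```python
import re

# S ::= (#x20 | #x9 | #xD | #xA | #x85 | #x2028 | #x2029)+
WS_CHARS = " \t\r\n\x85\u2028\u2029"
S = "[" + WS_CHARS + "]"
NAME = r"[a-zA-Z_][a-zA-Z_0-9:.\-]*"

# avs ::= S NAME S? '=' S? QUOTED
AVS = re.compile(
    S + "+(" + NAME + ")" + S + "*=" + S + "*"
    + "(?:\"([^\"]*)\"|'([^']*)')"
)


def parse_atts(atts: str) -> dict:
    """Parse the text matched by atts into a name -> value mapping."""
    result = {}
    pos = 0
    while True:
        match = AVS.match(atts, pos)
        if match is None:
            break
        name, double_quoted, single_quoted = match.groups()
        result[name] = double_quoted if double_quoted is not None else single_quoted
        pos = match.end()
    # atts ::= avs* S?  (only optional whitespace may remain)
    if atts[pos:].strip(WS_CHARS):
        raise ValueError("malformed attribute text: " + repr(atts[pos:]))
    return result


if __name__ == "__main__":
    print(parse_atts(' who="PEER GYNT" wit=\'A B C\''))
```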
S ::= (#x20 | #x9 | #xD | #xA | #x85 | #x2028 | #x2029)+
URL ::= (CHAR - '>')*
EETAGO = "{" // empty-element tag open and close
EETAGC = "}"
STAGO = "{" // start-tag
STAGC = "{"
ETAGO = "}" // end-tag
ETAGC = "}"
ITAGO = "}-" // suspend (interrupt) tag
ITAGC = "}"
RTAGO = "{+" // resume-tag
RTAGC = "{"
STAGSO = "{|" // start-tag for set
STAGSC = "|{"
ETAGSO = "}|" // end-tag for set
ETAGSC = "|}"
VTAGO = "{^" // virtual element tag open
VTAGC = "}"
VSEP = "^" // virtual-element gi/target separator
SUFFIXSEP = "~" // suffix separator (for self-overlap)
ERO = "{&" // entity-reference open
ENVS = " . " // entity name-value separator
ERC = "}"
EERO = "{<" // external-entity reference open, close
EERC = ">}"
COMO = "{*" // comment open, close
COMC = "*}"
MDO = "{!" // markup declaration open, close
MDC = "!}"
CDATAMSO = "{#CDATA{" // CDATA marked-section start, close
CDATAMSC = "}#CDATA}"
{s{{a{ John {b{ loves }a} Mary }b}}s}
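The overlap in this example can be made explicit by computing, for each element, the range of character data it covers. The Python sketch below handles only attribute-free start- and end-tags. Because the full pairing rule is not reproduced here, it assumes that an end-tag pairs with the most recently opened, still unclosed start-tag bearing the same gi. The function name `element_ranges` and the loosened gi pattern are likewise assumptions of the sketch.

```python
import re

# Simplified gi pattern: allows '~' anywhere, looser than the grammar.
GI = r"[A-Za-z_~][A-Za-z_0-9:.\-~]*"
TAG = re.compile(r"\{(" + GI + r")\{|\}(" + GI + r")\}")


def element_ranges(doc: str):
    """Return (text, ranges), where text is the character data with tags
    removed and ranges is a list of (gi, start, end) offsets into text,
    in the order in which the end-tags occur."""
    pieces = []
    length = 0          # length of the character data seen so far
    open_starts = {}    # gi -> stack of start offsets
    ranges = []
    pos = 0
    for match in TAG.finditer(doc):
        chunk = doc[pos:match.start()]
        pieces.append(chunk)
        length += len(chunk)
        pos = match.end()
        start_gi, end_gi = match.groups()
        if start_gi is not None:
            open_starts.setdefault(start_gi, []).append(length)
        else:
            stack = open_starts.get(end_gi)
            if not stack:
                raise ValueError("end-tag without start-tag: " + end_gi)
            ranges.append((end_gi, stack.pop(), length))
    pieces.append(doc[pos:])
    unclosed = sorted(gi for gi, stack in open_starts.items() if stack)
    if unclosed:
        raise ValueError("start-tag without end-tag: " + ", ".join(unclosed))
    return "".join(pieces), ranges


if __name__ == "__main__":
    text, ranges = element_ranges("{s{{a{ John {b{ loves }a} Mary }b}}s}")
    for gi, start, end in ranges:
        print(gi, repr(text[start:end]))
```

For the example above, the ranges of a and b both contain " loves " while neither contains the other, which is exactly the overlap that a single tree cannot represent.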
{sp{{speaker{AASE}speaker}{l{Peer, you're lying!}-l}}sp}
{sp{{speaker{PEER GYNT }speaker} {stage{without stopping}stage}{+l{No, I'm not!}l}}sp}
{sp{{speaker{AASE}speaker}{l{Well then, swear to me it's true.}l}}sp}
{sp{{speaker{PEER GYNT}speaker}{l{Swear? why should I?}-l}}sp}
{sp{{speaker{AASE}speaker}{+l{See, you dare not!}l} {l{Every word of it's a lie.}l}}sp}
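A rough check of the suspend-tag and resume-tag constraints can be sketched with two counters per gi: one for elements currently open and one for elements currently suspended. The Python below is such a sketch, not a full implementation of the well-formedness constraints, whose detailed pairing conditions are not reproduced here. It handles attribute-free tags only, and the function name `check_suspend_resume` is hypothetical.

```python
import re
from collections import Counter

# Simplified gi pattern: allows '~' anywhere, looser than the grammar.
GI = r"[A-Za-z_~][A-Za-z_0-9:.\-~]*"
# Group 1: '+' marks a resume-tag; group 3: '-' marks a suspend-tag.
TAG = re.compile(r"\{(\+?)(" + GI + r")\{|\}(-?)(" + GI + r")\}")


def check_suspend_resume(doc: str) -> bool:
    """Return True if every suspend-tag follows an open element and is
    later resumed, and every element is eventually closed."""
    open_count = Counter()
    suspended = Counter()
    for match in TAG.finditer(doc):
        plus, opened, minus, closed = match.groups()
        if opened is not None:
            if plus:  # resume-tag: needs an earlier suspend-tag
                if suspended[opened] == 0:
                    raise ValueError("resume without suspend: " + opened)
                suspended[opened] -= 1
            open_count[opened] += 1
        else:
            if open_count[closed] == 0:
                raise ValueError("close without open: " + closed)
            open_count[closed] -= 1
            if minus:  # suspend-tag: must be resumed later
                suspended[closed] += 1
    dangling = sorted(
        gi for gi in set(open_count) | set(suspended)
        if open_count[gi] or suspended[gi]
    )
    if dangling:
        raise ValueError("never closed or resumed: " + ", ".join(dangling))
    return True


if __name__ == "__main__":
    sample = (
        "{sp{{speaker{AASE}speaker}{l{Peer, you're lying!}-l}}sp} "
        "{sp{{speaker{PEER GYNT }speaker} {stage{without stopping}stage}"
        "{+l{No, I'm not!}l}}sp}"
    )
    print(check_suspend_resume(sample))
```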
{sp who="AASE"{{l{Peer, you're lying!}sp}
{sp who="PEER GYNT"{ {stage{without stopping}stage}No, I'm not!}l}}sp}
{sp who="AASE"{{l{Well then, swear to me it's true.}l}}sp}
{sp who="PEER GYNT"{{l{Swear? why should I?}sp}
{sp who="AASE"{See, you dare not!}l} {l{Every word of it's a lie.}l}}sp}

{sp who="AASE"{{l{Peer, you're lying!}sp}
{sp who="PEER GYNT"{ }-l}{stage{without stopping}stage}{+l{No, I'm not!}l}}sp}
{sp who="AASE"{{l{Well then, swear to me it's true.}l}}sp}
{sp who="PEER GYNT"{{l{Swear? why should I?}sp}
{sp who="AASE"{See, you dare not!}l} {l{Every word of it's a lie.}l}}sp}
{sp who="HUGHIE"{{p{How did that translation go?}p}
{lg type="haiku"{{l{da de dum de dum,}l}
{l@frog{gets a new frog,}l}
{l{...}l}}lg}
}sp}
{sp who="LOUIS"{{p{Er ...}p}
{lg{{l@new{it's a new pond.}l}}lg}
}sp}
{sp who="DEWEY"{
{p{Ah ...}p}
{lg{{l@pond{When the old pond}l}}lg}
{p{Right. That's it.}p}
}sp}
{lg{{^l^pond}{^l^frog}{^l^new}}lg}
{lg{
{l{Bloody Mary is the girl I love.}l}
{l{Bloody Mary is the girl I love.}l}
{l{Bloody Mary is the girl I love.}l}
{l{Now ain't that too damn bad.}l}
}lg}
{lg{
{l{Bloody Mary's chewing betel nuts.}l}
{l{She is always chewing betel nuts.}l}
{l{Bloody Mary's chewing betel nuts}l}
{l{And she don't use Pepsodent.}l}
}lg}
{lg{
{l{Her skin is tender as {app{
{rdg wit="A"{DiMaggio's}rdg}
{rdg wit="B C"{a {app{{rdg wit="B"{baseball}rdg}
{rdg wit="C"{leather}rdg}}app} }rdg}
}app} glove.}l}
...
{l{Now ain't that too damn bad.}l}
}lg}
{* Witness A *}
{lg@stanza-1{
{l{Bloody Mary is the girl I love.}l}
{l{Bloody Mary is the girl I love.}l}
{l{Bloody Mary is the girl I love.}l}
{l{Now ain't that too damn bad.}l}
}lg}
{lg@stanza-2{
{l{Bloody Mary's chewing betel nuts.}l}
{l{She is always chewing betel nuts.}l}
{l{Bloody Mary's chewing betel nuts}l}
{l{And she don't use Pepsodent.}l}
}lg}
{lg{
{l@l3.1{{@L3.1a{Her skin is tender as}} {@dm{DiMaggio's}} {@L3.1z{glove.}}}l}
{^l@l3.2^l3.1}
{^l@l3.3^l3.1}
{l@l3.4{Now ain't that too damn bad.}l}
}lg}
...
{* Witness B *}
{^lg^stanza-1}
{^lg^stanza-2}
{lg{
{l@B-l3.1{{^L3.1a} a baseball {^L3.1z}}l}
{^l^B-l3.1}
{^l^B-l3.1}
{^l^l3.4}
}lg}
...
{* Witness C *}
{^lg^stanza-1}
{^lg^stanza-2}
{lg{
{l@C-l3.1{{^L3.1a} a leather {^L3.1z}}l}
{^l^C-l3.1}
{^l^C-l3.1}
{^l^l3.4}
}lg}
{lg wit="A B C"{
{l{Bloody Mary is the girl I love.}l}
{l{Bloody Mary is the girl I love.}l}
{l{Bloody Mary is the girl I love.}l}
{l{Now ain't that too damn bad.}l}
}lg}
{lg wit="A B C"{
{l{Bloody Mary's chewing betel nuts.}l}
{l{She is always chewing betel nuts.}l}
{l{Bloody Mary's chewing betel nuts}l}
{l{And she don't use Pepsodent.}l}
}lg}
{lg wit="A B C"{
{l~A@A3.1 wit="A"{{l~B@B3.1 wit="B"{{l~C@C3.1 wit="C"{ Her skin is tender as }-l~B}}-l~C}DiMaggio's}-l~A}
{+l~B{a baseball}-l~B}
{+l~C{a leather{+l~A{{+l~B{ glove. }l~A}}l~B}}l~C}
{^l@A3.2^A3.1 wit="A"}
{^l@A3.3^A3.1 wit="A"}
{^l@B3.2^B3.1 wit="B"}
{^l@B3.3^B3.1 wit="B"}
{^l@C3.2^C3.1 wit="C"}
{^l@C3.3^C3.1 wit="C"}
{l@l3.4 wit="A B C"{Now ain't that too damn bad.}l}
}lg}
{text{{body{
{lg@Sic{
{l{Ich wirbe umb allez daz ein man}l}
{l{ze wereltlîchen fröiden iemer haben sol.}l}
{l{daz ist ein wîp der ich enkan}l}
{l{nâch ir vil grôzen werdekeit gesprechen wol.}l}
{l{lob ich si sô man ander frowen tuot,}l}
{l{dazn nimet eht si von mir niht für guot.}l}
{l{do swer ich des, sist an der stat}l}
{l{dâz ûz wîplîchen tugenden nie fuoz getrat.}l}
{l{daz ist in mat.}l}
}lg}
{lg@Ssi{
{l{Si ist mir liep, und dunket mich}l}
{l{daz ich ir volleclîche gar unmære sî.}l}
{l{nu waz dar umbe? daz lîd ich,}l}
{l{und bin ir doch mit triuwen stæteclîchen bî.}l}
{l{waz obe ein wunder lîhte an mir geschiht,}l}
{l{daz si mich eteswenne gerne siht?}l}
{l{sâ denne lâze ich âne haz,}l}
{l{swer giht daz ime an fröiden sî gelungen baz:}l}
{l{der habe im daz.}l}
}lg}
{lg@Sal{
{l{Als eteswenne mir der lîp}l}
{l{dur sîne bœse unstæte râtet daz ich var}l}
{l{und mir gefriunde ein ander wîp,}l}
{l{sô wil iedoch daz herze niender wan dar.}l}
{l{wol ime des deiz sô reine welen kan}l}
{l{und mir der süezen arbeite gan.}l}
{l{des hân ich mir ein liep erkorn}l}
{l{dem ich ze dienste, und wære ez al der werlte zorn,}l}
{l{muoz sîn geborn.}l}
}lg}
{lg@Ssw{
{l{Swaz jâre ich noch ze lebenne hân,}l}
{l{swie vil der wære, irn wurde ir niemer tac genommen.}l}
{l{sô gar bin ich ir undertân}l}
{l{daz ich unsanfte ûz ir genâden möhte komen.}l}
{l{ich fröu mich des daz ich ir dienen sol.}l}
{l{si gelônet mir mit lîhten dingen wol:}l}
{l{geloube eht mir, swenn ich ir sage}l}
{l{die nôt diech inme herzen von ir schulden trage}l}
{l{dick inme tage.}l}
}lg}
{lg@Sun{
{l{Und ist daz mirs mîn sælde gan}l}
{l{deich abe ir redendem munde ein küssen mac versteln,}l}
{l{gît got deichz mit mir bringe dan,}l}
{l{sô wil ichz tougenlîche tragen und iemer heln.}l}
{l{und ist daz siz für grôze swære hât}l}
{l{und vêhet mich dur mîne missetât,}l}
{l{waz tuon ich danne, unsælic man?}l}
{l{dâ heb i'z ûf und legez hin wider dâ ichz dâ nan,}l}
{l{als ich wol kan.}l}
}lg}
}body}
}text}
{* Lachmann and E *}
{text{{body{
{^lg^Sic}{^lg^Ssi}{^lg^Sal}{^lg^Ssw}{^lg^Sun}
}body}}text}

Manuscripts b and C give the stanzas in a different order:

{* b and C *}
{text{{body{
{^lg^Sic}{^lg^Sal}{^lg^Sun}
{^lg^Ssi}{^lg^Ssw}
}body}}text}

Manuscript A, as so often, stands isolated:

{* A *} {text{{body{ {^lg^Ssw}{^lg^Sic}{^lg^Ssi} {^lg^Sal}{^lg^Sun} }body}}text}
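The stanza orderings above rely on the constraint "WFC: idref OK": every idref in a virtual element must match an id declared somewhere in the document. The Python sketch below extracts the declared ids and the virtual-element targets with regular expressions and reports the order for one witness. The function name `stanza_order` is hypothetical, the base text is abbreviated to empty stanzas for illustration, and suffixed generic identifiers and attributes are not handled.

```python
import re

NAME = r"[A-Za-z_][A-Za-z_0-9:.\-]*"
# An id declared on a tag: '{' gi? '@' id
ID_DECL = re.compile(r"\{(?:" + NAME + r")?@(" + NAME + r")")
# A virtual element: '{^' eid '^' idref; only the idref is captured.
VIRTUAL = re.compile(
    r"\{\^(?:" + NAME + r")?(?:@(?:" + NAME + r"))?\^(" + NAME + r")"
)


def stanza_order(base: str, view: str) -> list:
    """Return the idrefs used in view, in document order, after checking
    that each one is declared as an id in base."""
    declared = set(ID_DECL.findall(base))
    order = VIRTUAL.findall(view)
    missing = [ref for ref in order if ref not in declared]
    if missing:
        raise ValueError("idref without matching id: " + ", ".join(missing))
    return order


if __name__ == "__main__":
    # Abbreviated stand-in for the full text: stanza content omitted.
    base = "{lg@Sic{}lg}{lg@Ssi{}lg}{lg@Sal{}lg}{lg@Ssw{}lg}{lg@Sun{}lg}"
    manuscript_a = (
        "{text{{body{ {^lg^Ssw}{^lg^Sic}{^lg^Ssi} {^lg^Sal}{^lg^Sun} }body}}text}"
    )
    print(stanza_order(base, manuscript_a))
```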
Grammar ::= Rule*
Rule ::= Nonterminal '::=' Expression
Expression ::= Seq ('|' Seq)* | Diff
Seq ::= Term+
Term ::= Factor [*+?]?
Factor ::= Nonterminal | Literal | Charref | Charset | '(' Expression ')'
Diff ::= Term '-' Term
Nonterminal ::= [a-zA-Z][a-zA-Z0-9-]* // any simple name
Literal ::= ('"' [^"]* '"') | ("'" [^']* "'") // any quoted string
Charref ::= '#x'[a-zA-Z0-9]+ // a hex reference
Charset ::= '[' '^'? (Charspec)+ ']' // a bracketed group
Charspec ::= (SChar ('-' SChar)?) | (Charref ('-' Charref)?)
SChar ::= Char - ('^' | '[' | ']' | '-') // no []^- in Charsets!
Char ::= [#x20-#x7E]
Bray, Tim, Jean Paoli, and C. M. Sperberg-McQueen, ed. Extensible Markup Language (XML) 1.0. W3C Recommendation. [Cambridge, Sophia-Antipolis, Tokyo]: World Wide Web Consortium, 8 February 1998. Second edition 6 October 2000 ed. Eve L. Maler. http://www.w3.org/TR/REC-xml
Jaakkola, Jani, and Pekka Kilpeläinen. sgrep: search a file for a structured pattern. (Man page for sgrep). http://www.cs.helsinki.fi/u/jjaakkol/sgrepman.html
[Kraus, Carl von, ed.] Des Minnesangs Frühling. Nach Karl Lachmann, Moriz Haupt und Friedrich Vogt neu bearbeitet von Carl von Kraus. 30. Auflage. Leipzig: S. Hirzel, 1950.
Lindsey, William D. Stupid NET Tricks. Posting to w3c-sgml-wg@w3.org, 14 September 1996. http://lists.w3.org/Archives/Public/w3c-sgml-wg/1996Sep/0139.html
Lindsey, William D. lml -- Lambda Markup Language: A.K.A. "Stupid NET Tricks". 15 August 1999. http://www.blnz.com/lml/index.html
Marsh, Jonathan, ed. XML Base. W3C Proposed Recommendation 20 December 2000. [Cambridge, Sophia-Antipolis, Tokyo]: World Wide Web Consortium, 2000. http://www.w3.org/TR/xmlbase/
Sperberg-McQueen, C. M., and Claus Huitfeldt. GODDAG: A Data Structure for Overlapping Hierarchies. Paper presented at the conference Principles of Digital Document Processing, Munich, September 2000.
Number of IDREFs | Specify or take GI and attributes | Take children or nodes as children | Comments |
---|---|---|---|
one | specify | children | This is what MSM has been tending toward. |
one | specify | nodes | Pointless? Leads to chains. |
one | take | children | Does not allow for retagging (use of virtual elements to represent alternative interpretations of the material, cf. TEI <certainty>). |
one | take | nodes | No good: leads to chains, and reduplicates nodes on them. |
n | specify | children | Cf. TEI <join> with scope="branches" |
n | specify | nodes | Cf. TEI <join> with scope="root" |
n | take | children | Logical problem: can't take one GI and attribute set if there are n places to take it from. |
n | take | nodes | Logical problem: can't take one GI and attribute set if there are n places to take it from. |