
Working Draft for ChessML, Version 0.4, Level 0

Date : 23.06.00
Author: Oliver Sick, Bonn, Germany
Status: 1st   Working Draft



This draft describes an attempt at an XML standardisation of chess. The author learnt a lot during the development phase, especially how difficult it is to define a good markup language. The goal of a fully working ChessML (Chess Markup Language) is definitely not reached by the XML implementation described here, but perhaps the remarks and attempts given here can be used to go one step further, so that the goal of a worldwide accepted ChessML becomes more realistic.

Foreword on ChessML

In the search for a data exchange format for chess problems we have to divide the development into several steps. As agreed by the majority of the people involved in this development, the FIDE Album should serve as a measure for the complexity of the standard. The very detailed FIDE Album 1989-1991 is a perfect model for a modern publication on chess problems. On the other hand there are a lot of database and solving programs around, such as Popeye, Alybadix, PDB and many others. At least for Popeye and the PDB it is very important to get a simple mechanism for communication between these programs. And this is exactly what we hope to get solved by ChessML.

For chess games the most influential and widely used publication is the Chess Informant, published several times a year. Its worldwide popularity comes from the fact that, being multilingual, it can be read by a large majority of chess players around the world. Nowadays chess databases and playing programs also play a tremendous role in the development of chess. Their rise in the mid-1980s marked a milestone in chess history. Today the use of such databases and programs is absolutely necessary for almost any strong player, and even the style of chess has changed dramatically.
ChessML supports electronic data exchange and processing between the various forms of chess information: journals, books, databases, playing and solving programs.

Finally, the author wants to remark that similar constructions can be found in other places. In particular we have to mention ChessGML and the drafts of some members of the PCCC Computer group.

Foreword on XML and SGML

The author of this draft is quite new to the field of DTD development, so it was necessary to look around for technical help and guidance. Luckily there is an excellent book out there, namely Developing SGML DTDs: From Text to Model to Markup by Eve Maler and Jeanne El Andaloussi (Prentice Hall, 1995). Because of the complexity of the subject the author of this text was of course not able to absorb all the fine points and hints given in this book, but it helped to get rid of some beginner's errors much more quickly. So many thanks to the authors.
In contrast to the title of this book, the DTD described here is not an SGML DTD but an XML DTD. The reasons for the restriction to XML are manifold. They are the usual arguments for XML and against SGML and have nothing to do with chess in particular. Of course it would be possible to invent a purely chess-specific data language which would probably fit better than XML, SGML or other known standards, but because of the wide acceptance of XML it makes no sense to invent a new standard: too many things would have to be reinvented.

A Foreword on PGN

The Portable Game Notation (PGN) is the chess data exchange format accepted by almost all chess playing and database programs. It was developed mainly for this purpose and has only a very low complexity. There are only rudimentary possibilities for commentary and metadata storage, and almost no support for anything related to chess problems (such as fairy pieces, non-standard rules, non-standard board sizes, ...). But PGN is successful, so it would be nice to carry this success over to a good SGML or XML implementation of (general) chess. If you are not satisfied with this argument, there is a more extensive discussion of the improvement of PGN on the ChessGML web site.

Goals for the ChessML development

Here we want to collect the points which are the most urgent in the opinion of the author.

Questions, Errors and Comments

If you want to comment or to report bugs in the DTD or serious errors in this text, don't hesitate to send me a mail: chessml@schachprobleme.de

Level 0

As noted at the beginning of this draft, we have two standard publications. Each is the most important publication in its field. They should serve as a measure for the usability of a good implementation:

THE FIDE ALBUM (chess problems)


Standardizing the FIDE-Album

In the following we use the FIDE Album for the years 1989-1991 as a basis for our discussion.

What is the FIDE Album?

The FIDE Album is a collection of (by definition) the best chess problems published during a fixed period of years. These (approx. 1000) problems are chosen from a much larger number of problems (approx. 8000). The problems are usually separated into several different sections. In each section the problems are ordered first by the number of moves, then by the number of pieces. All problems get a rating of 0-12 points. This is the sum of the ratings given by a judging committee of three judges, which usually consists of very experienced problemists.
A problem needs at least eight points (seven for a study) to get published in the Album. Most of the problems are published in the same scheme. Exceptions to this scheme are only textual problems, such as mathematical problems, which often have the structure of a mathematical essay rather than a chess problem. But these problems are simpler from the point of view of chess standardization, and their standardization is in some way covered by MathML, the XML implementation of mathematical structures.
Finally the newest FIDE Album contains further statistical information on the current and the previous FIDE Album, several indices, a foreword and the usual technical information for a book in general.

Standardizing the Chess Informant

It is interesting to see that up to a certain degree the structure of the Chess Informant coincides with the structure of the FIDE Album. The structure of a typical Informant is almost a substructure of the FIDE Album structure described above. So for the time being it is enough to discuss only the implementation of the FIDE Album in the following. Where necessary, differences between the FIDE Album style and the Chess Informant style are emphasised and discussed separately.

The ChessML DTD, Version 0.4 Level 0

The Structure of the DTD

The (physical) module structure of the 0.4 Version of ChessML is shown in the diagram below :
  _________      __________________________     ______
 |         |    |                          |    |      |
 | Problem |====|         CHESSML          |====| Game |
 |_________|    |__________________________|    |______|
               /  |          |       |      \
              /   |          |       |       \
         ____/  __|__   _____|__   __|_______ \ ___________
        |    | |     | |        | |          | |           |
        |Move| |Event| |Notation| |References| |Definitions|
        |____| |_____| |________| |__________| |___________| 
The diagram should be understood in the following sense:
The horizontal lines ====... connect CHESSML to problems and games, the two main top-level structures, which can also live in separate sub-structures; both depend on the five modules below the top-level CHESSML. This separation into several modules is helpful for several reasons. One point is very important: publications of chess games and problems are not very often combined in a single document such as a book, a database or an internet site, and there is also a very strong initiative to separate games and problems in general. This can now be done quite easily.
Now we are going to describe the several parts of the DTD not by using the physical model above but by using the hierarchical structure.
There is only one module (namely the move module) which is purely chess related, in contrast to the other modules, which can be seen as chess-adapted versions of everyday DTDs used for letters, office work and documentation. Using these modules, four types of chess DTDs were generated:
  1. CHESSML : chessml.dtd
  2. CHESSMLITE : chessmlite.dtd
  3. CGML : cgml.dtd
  4. CPML : cpml.dtd
The structure of the DTDs is such that CGML.dtd, CPML.dtd and ChessMLite.dtd are pure sub-DTDs of ChessML.dtd. On the other hand any ChessMLite document containing only chess games (resp. chess problems) is also a CGML (resp. CPML) document. In general a given XML document is not valid against both CGML and CPML.

Guide to the DTD structure

A canonical CHESSML document separates into several parts. The chess problem and chess game sections may occur in any number and order, so that in principle it is possible to mix and refer to chess games and chess problems inside a document in any way. But usually it is not advisable to mix them; see CPML.dtd resp. CGML.dtd, which cover chess problems resp. chess games separately. Both <chessproblem> and <chessgame> always contain two main parts, a head and a body. Similarly to their well-known counterparts in HTML, the head section contains administrative information not directly connected to chess, whereas the body is totally devoted to chess and all non-chess information should be reduced to a minimum.

Part 0:The introductory part

The introductory part consists of a sequence of sections of three different types:
The <event> or the <references> or a set of <definition>.


A typical <references> part should look like
    <source id="chessinformant" sourcename="The Chess Informant">
      <date year="1994"/>
      <issue id="chessinformant64" nr="64">The Chess Informant 64</issue>
      <issue id="chessinformant60" nr="60">The Chess Informant 60
        <section id="ci60game389"/>
      </issue>
    </source>

The "nr=" attribute helps to distinguish and order periodical publications (in this case the Chess Informant).
The "id=" attribute is used to refer from chess games or problems to the particular sources.
The <section> tag is used to locate a certain portion of the publication; in this case it means game 389 in Chess Informant 60.
For a finer reference mechanism a hierarchy of <part>, <chapter>, <section> and <subsection> is also allowed. This is a strong simplification of the structuring used in LaTeX or in DocBook.
Inside <source>, further important tags are <author>, <copyright>, <date> and <publisher>. Their content is self-explanatory. They also have a variety of attributes which will be explained elsewhere.


A typical application for the event section is given in the following code part:
 <event category="tournament" participant-no="??">
     <person><name firstname="G." lastname="Flear"/> </person>
     <person><name firstname="R." lastname="Vera"/>  </person>
     <person><name firstname="???" lastname="???"/> </person>
   . . .
     <list description="1st round results">
        <listitem content="ci64game368">draw
        <listitem content="???">???
        <listitem content="???">???
        . . .
     <list description="2nd round results">
        <listitem content="???">???
        <listitem content="???">???
        <listitem content="???">???
        . . .
     . . .  
(The ??? entries should of course be filled with other useful information.)
This section is very rough and must definitely be improved. Perhaps the table model of HTML could serve as a design pattern. On the other hand, our event model is sufficient for the information given in the context of the Chess Informant.


The definition section in general plays no role in a chess game file (and it is excluded in CGML). It contains definitions of the chess pieces, conditions and stipulations. But of course if you want to use ChessML for non-orthodox chess games, such as Fischer chess, progressive chess and so on, this section could be quite useful. For clarity its usage is described in the chess problem part below.

Part 1: Chess games

Source: The whole chess game as a ChessML file: Example 1.

The <head> section of a chess game

A typical head looks like this:
 <head source="chessinformant64" event="elgoibar" annotator="r. vera" 
     copyright="all copyrights, chess informant publisher Yugoslavia">
  <white title="gm" rating="2505">
  <black title="gm" rating="2485">
  <date year="1994"/>
This is sufficient to cover the meta information given for a particular game in the Chess Informant. Further helpful information is given by the <date> tag, which can be used in the following way
 <date year="1994" month="12" day="4" time="19:00.00" timezone="CET" type="AD"/>
"AD" stands for Anno Domini, "CET" for Central European Time, and so on. Another tag is the <mode> tag, which helps to describe the playing mode
 <mode timecontrol="40m-2h, 20m-1h, *-1h">
This could be interpreted as: two hours for the first 40 moves, one hour for the next 20 moves and one hour for all following moves. But the content of the timecontrol attribute is not fixed to this kind of content.
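As a minimal sketch of the interpretation just described, the following Python function splits a timecontrol value of exactly this shape into periods. The function name `parse_timecontrol` and the strict "moves-hours" pattern are assumptions made for illustration only; the draft explicitly leaves the attribute content open, so other values would need other handling.

```python
def parse_timecontrol(value):
    """Split a value such as "40m-2h, 20m-1h, *-1h" into (moves, hours) pairs.

    moves is an int, or None when the period is written as "*"
    (meaning: all following moves). Hypothetical helper, not part of ChessML.
    """
    periods = []
    for part in value.split(","):
        moves_text, hours_text = part.strip().split("-")
        moves = None if moves_text == "*" else int(moves_text.rstrip("m"))
        hours = int(hours_text.rstrip("h"))
        periods.append((moves, hours))
    return periods


print(parse_timecontrol("40m-2h, 20m-1h, *-1h"))
# prints [(40, 2), (20, 1), (None, 1)]
```

A value that does not follow this pattern raises a ValueError here, which fits the fact that the attribute is free-form.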

The <body> section of a chess game

There are always two kinds of representation of a ChessML body: a compact and an extended form.

A typical body in the compact form could look like this:
<body ply-count="40">
  <classification type="ECO" key="B31" keylist="ECOlist"/>
  <moves party="w">
        <d2/><d4/>     <d7/><d5/>
        <c2/><c4/>     <c7/><c6/>
    <N/><b1/><c3/>     <e7/><e6/>
        <e2/><e4/>     <d5/><e4/>
    <N/><c3/><e4/> <B/><f8/><b4/>
    <B/><c1/><d2/> <Q/><d8/><d4/>
    <B/><d2/><b4/> <Q/><d4/><e4/>
    <B/><f1/><e2/> <P/><c6/><c5/>
    <Q/><e4/><g2/> <Q/><d1/><d2/>
    <Q/><g2/><h1/> 0-0-0
    <N/><b8/><d7/> <m id="N"><Q/><h1/><g2/>!N</m>
    . . .
The <m> tag contains a half move; it may also contain information on the current ply count and some chess information (check, capture, castling, ...). But of course this representation can be compressed much further, so that it arrives at a PGN-style notation:
<body ply-count="40">
  <classification type="ECO" key="B31" keylist="ECOlist"/>
  <moves party="w">
     d2d4  d7d5
     c2c4  c7c6
    Nb1c3  e7e6
     e2e4  d5e4
    Nc3e4 Bf8b4
    Bc1d2 Qd8d4
    Bd2b4 Qd4e4
    Bf1e2  c6c5
    Qe4g2 Qd1d2
    Qg2h1 0-0-0
    Nb8d7 <m id="N">Qh1g2!N</m>
    . . .
But this mixed model of (chess) character data and tagged areas could make it difficult to parse such a region.
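To illustrate what parsing the character data involves, here is a small Python sketch that picks moves out of plain text written in the coordinate style of the example above. It assumes the piece letters K, Q, R, B, N, P, squares a1 to h8 and castling written as 0-0 or 0-0-0; the name `parse_compact` is hypothetical. Annotation symbols such as "!N" are simply skipped, and the sketch does not handle embedded tags, which is exactly the difficulty mentioned above.

```python
import re

# One half move: either castling, or an optional piece letter
# followed by a start square and a target square.
MOVE = re.compile(
    r"(?P<castle>0-0-0|0-0)"
    r"|(?P<piece>[KQRBNP]?)(?P<src>[a-h][1-8])(?P<dst>[a-h][1-8])"
)


def parse_compact(text):
    """Return the half moves found in a block of compact move text."""
    moves = []
    for match in MOVE.finditer(text):
        if match.group("castle"):
            moves.append(("castle", match.group("castle")))
        else:
            # A missing piece letter is read as a pawn move.
            piece = match.group("piece") or "P"
            moves.append((piece, match.group("src"), match.group("dst")))
    return moves


print(parse_compact("d2d4  d7d5\nNb1c3  e7e6"))
```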

So it seems natural in the context of structured data to use the extended form, which could look like this:
<body ply-count="40">
  <classification type="ECO" key="B31" keylist="ECOlist"/>
  <moves party="w">
    <m ply="1"><d2/><d4/></m>
    <m action="capture"><d5/><e4/></m>
    <m action="capture"><Q/><d8/><d4/></m>
    <m action="capture"><B/><d2/><b4/></m>  
    <m action="capture"><Q/><d4/><e4/></m>
    <m action="capture"><P/><c6/><c5/></m>
    <m action="capture"><Q/><e4/><g2/></m>
    <m action="capture"><Q/><g2/><h1/></m>
    <m action="queensidecastle"><K/><e1/><c1/></m>
    <m id="N"><Q/><h1/><g2/><comment value="!"/><comment value="N"/></m> 
    . . .
The only difficulty is that this representation only works for orthodox chess. But there are further extensions, such as the <sq> tag or the <m piece=""> attribute, which apply to the case of fairy chess.
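The benefit of the extended form is that a generic XML parser can read a half move without any chess-specific tokenizing. The following Python sketch reads one well-formed <m> element with the standard library. The helper name `read_move` and the returned dictionary layout are assumptions for illustration; the sketch also assumes that every child element that is neither a piece letter nor a <comment> is a square.

```python
import xml.etree.ElementTree as ET

PIECES = {"K", "Q", "R", "B", "N", "P"}


def read_move(xml_text):
    """Read one <m> element of the extended form into a plain dictionary."""
    m = ET.fromstring(xml_text)
    piece = "P"          # no piece element is read as a pawn move
    squares = []
    comments = []
    for child in m:
        if child.tag in PIECES:
            piece = child.tag
        elif child.tag == "comment":
            comments.append(child.get("value"))
        else:
            squares.append(child.tag)
    return {
        "piece": piece,
        "from": squares[0],
        "to": squares[1],
        "action": m.get("action"),
        "comments": comments,
    }


print(read_move('<m action="capture"><Q/><d8/><d4/></m>'))
```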
The representation of chess moves has a lot of implementations and it is not clear a priori which representation is the best one. Probably we should allow all these representations, because each one is useful for different applications.
Because it is allowed to use <moves> regions inside another <moves> region, it is also very easy to describe variations. If you want to use not only moves for analysis but also text and other useful tags, you can use the <analysis> tag, which is very similar to <moves> but allows more human-like comments. Continuing the code above, this gives
<body ply-count="40">
<classification type="ECO" key="B31" keylist="ECOlist"/>
<moves party="w">
  . . .
  <m action="queensidecastle"><K/><e1/><c1/></m>
  <m id="N"><Q/><h1/><g2/><comment value="!"/><comment value="N"/></m> 
      <text>This is an improvement of</text>
      <moves party="b" type="variation" position="N" id="ref1">
        <m action="capture check"><Q/><h1/><d1/></m>
      <referto reference="ci60game389" refpoint="thisposition"/>
    . . .
For further applications and possibilities see the examples section (coming soon).

Part 2: Chess problems

Source: The whole chess problem as a ChessML file: Example 2.

The <head> section of a chess problem

The only main difference between the <head> of a chess problem and that of a chess game file is given by the <definition>. As stated before, this section is intended as a place where all the changes made by non-orthodox rules and pieces are explained or at least mentioned. A typical definition section may look like the following snippet:
<definition defclass="piece" example="???" type="hopper" group="hopper" class="no">
 <defname lang="en">
 <defname lang="de">
 <defname lang="fr">
 <inventor src="strategems06">
   Ben Good
 <description src="strategems06" lang="en">
   The Soucie is a Queen-lines hopper whose move length is
   determined by the number of men (incl. the Soucie itself)
   on that particular file, rank or diagonal. 
 <description src="os" lang="de">
   Der Soucie ist eine Grashüpfer, dessen Zuglänge
   durch die Anzahl der auf der Linie oder Diagonalen stehenden
   Figuren (inkl. des Soucie selbst) bestimmt ist.
 <description src="os" lang="fr">
   ???? (The French of the author is too weak ...Sorry)
 <rule src="strategems06">
This can be explained as follows: the definition is the definition of a chess piece (defclass="piece"). This piece is a hopper, but it is not a class of its own (the Soucie is not the class hopper itself...). src="" is a pointer to the source from which the definition was taken. The example attribute is a reference to a particular chess problem which contains this piece.
The tags <defname>, <inventor>, <rule> and <description src="???" lang="de"> are the main sections, which give the name, the meaning and a rule (such as S1:2 for a knight or R1:1 for a bishop; the form of a rule description that can be used by chess programs is another question...). One tag not mentioned here is the <var> (i.e. variation) tag. It contains alternatives to the first definition.
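To make the idea of a machine-usable rule a little more concrete, here is a Python sketch under a strong assumption: that "S" denotes a leaper (one step only, as for the knight S1:2) and "R" a rider (repeated steps, as for the bishop R1:1). The draft gives only these examples and leaves the actual rule format open, so this reading, the function names `parse_rule` and `targets`, and the empty-board simplification are all illustrative.

```python
import re


def parse_rule(rule):
    """Split a rule such as "S1:2" or "R1:3" into (kind, a, b)."""
    kind, a, b = re.fullmatch(r"([SR])(\d+):(\d+)", rule.strip()).groups()
    return kind, int(a), int(b)


def targets(rule, square, size=8):
    """Squares reachable from `square` on an empty board of the given size."""
    kind, a, b = parse_rule(rule)
    file, rank = ord(square[0]) - ord("a"), int(square[1]) - 1
    # All sign and axis combinations of the basic a:b step.
    steps = {(sx * x, sy * y)
             for x, y in ((a, b), (b, a))
             for sx in (1, -1) for sy in (1, -1)}
    result = set()
    for dx, dy in steps:
        if (dx, dy) == (0, 0):
            continue
        f, r = file + dx, rank + dy
        while 0 <= f < size and 0 <= r < size:
            result.add(chr(ord("a") + f) + str(r + 1))
            if kind == "S":      # a leaper makes a single step
                break
            f, r = f + dx, r + dy
    return sorted(result)


print(targets("S1:2", "a1"))  # prints ['b3', 'c2']
print(targets("R1:3", "a1"))  # prints ['b4', 'c7', 'd2', 'g3']
```

Blocking pieces, captures and pieces whose move depends on the position (such as the Soucie above) are outside what such a simple step notation can express.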
Further hints and practical recipes for the usage of all these tags (inside <definition>) can be found in the files on the example page of this site. As a special example, a large file of over 200 different chess piece definitions will be given there in the next few days. A typical header of a chess problem will look like
 <head source="Schwalbe">
   <author firstname="Nikita" lastname="Plaksin"/>
   <date year="1986"/>

The <body> section of a chess problem

The body of a chess problem contains
  1. <setup>
  2. <condition>
  3. <stipulation>
  4. <twin>
  5. <solution>
  6. <theme>

<setup> gives the starting position (here a retro problem by Nikita Plaksin). Here is the compact form given by a FEN string:
<fen>S4Lk1/1bbbbbl1/1b6/2d1B1bB/8/1BB2B2/B3BbB1/ts3KTs/ </fen>
But it could also be given in a much more elaborate way as follows:
  <piecelist color="white" notationtype="SAN">
    <piece name="knight"> <a8/> </piece>
    <piece name="bishop"> <e8/> </piece>
    . . . 
  <piecelist color="black" notationtype="SAN">
    <piece name="king"> <f8/> </piece>
    <piece name="pawn"> <b7/> </piece>
    <piece name="pawn"> <c7/> </piece>
    <piece name="pawn"> <d7/> </piece>
    . . .
This form of representation is intended for programs which are not designed specifically for chess games.

<condition> describes the modification of chess rules used in the problem:
<condition definition="koeko">Kölner Kontaktschach</condition>
<condition definition="maximummer">Längstzüger</condition>
In this case the definition attribute points, for example, to a definition section. In the case of a twin we could use the "mode" attribute to distinguish between conditions which are added or removed
  <condition definition="neukoeko" mode="add">Neu-Kölner Kontaktschach</condition>
  <condition definition="koeko" mode>Kölner Kontaktschach</condition>


<stipulation> gives the task which has to be solved for the particular position.
  <stipulation set="1" tries="4" type="direct" tree="1.*;1.*;1">#3</stipulation> 
The type helps to sort problems by categories; here it is a direct problem. The tree attribute could say that every white move should be unique (=1) and Black has an arbitrary choice (="*") to reach the goal. The use of this attribute is not predefined and makes more sense in the context of help play (such as helpmate or help-stalemate). There is of course an alternative that makes the content of a stipulation more precise, for example <stipulation set="1" tries="4" type="direct" tree="1.*;1.*;1"></#><moves party="w"><number>3</number>. But this form is not yet implemented in Version 0.4.

<twin> is used to define modifications of the problem which themselves give a new problem.
  <condition definition="ander" mode="add">Andernach chess</condition>
  <stipulation set="1" type="help">exact-#3</stipulation>


<solution> contains the set play, the tries and the solution of the problem. Here is a short example using a well-known problem by S. Loyd:
  <position fen="kbK5/pp6/1P6/8/8/8/8/R7"/>
  <stipulation set="1" tries="1">  #2  </stipulation>
  <solution phase="set">
   <co checked="+"/>
   <m>1...B any</m><m>2.Rxa7#</m>
  <solution phase="try">
   <co checked="+"/>
   <m>1.Ra any ?</m><m> 2.a6!</m>
 <solution phase="authors solution">
   <co checked="+"/>
   <moves party="w" type="mainline" ply="1">
     <moves type="variation" party="b" ply="2"><m>1...B any</m><m>2.Rxa7#</m>


<theme> is an auxiliary tag helpful for the systematic description of the logical content of the chess problem. Applying this to the previous problem by S. Loyd we get
<theme category="logical" id="lp">logical problem</theme>
<theme category="zz" id="szz">schwarzer Zugzwang</theme>
<theme category="sacrifice" id="ws">white sacrifice</theme>

Level 1

Introducing more complicated structures

This section is empty, simply because there are so many directions in which generalizations and refinements of the current status are possible. Probably the section will be filled with real information when more practical experience has been collected in working with the Level 0 implementation. A good point would probably be to improve the structure so that ChessML documents are optimized for querying and provide support for specific databases and chess programs. Improving the querying ability in particular is a subtle but also very demanding task for the future and needs more than the simple hacks of the current documents. It is also a very difficult goal to implement the artistic structures of chess problems (such as a concise and general definition of the themes in the context of two-movers). Perhaps it is an idea to use the enriched structures of XML Schema.


From a pragmatic point of view it is very important to develop working tools which really help to apply ChessML in the everyday work of a chess player or a composer. The quick development of such tools explains the success of PGN. Applications useful for ChessML would be
  1. Validation:
    Such tools should be able to check the correctness of a chess score. The main difficulty is to get such a validator for non-standard chess rules. In the opinion of the author this could only be accomplished by enhancing the definition mechanism. The first step could be to implement non-standard pieces by the definition of their rule, such as <rule> R1:3 </rule>, which means a 1:3-Rider. More difficult pieces such as the Imitator or Siamesic pieces are up to now too difficult to handle (for automatic processing) with the current definition.
  2. Transformation:
    The author and other people involved in the development of ChessML are currently working on transformation tools to and from PGN. Furthermore there are also tools (in an alpha phase) for transformation from LaTeX to ChessML.
  3. Processing: Developing processing tools from ChessML files to PDF (using Formatting Objects), LaTeX, RTF and other formats is of course a basic task, because this would dramatically simplify the work of chess publishers. It is also extremely helpful to get tools for handling the main database formats, such as ChessBase, NicBase and others.
  4. Distributing: We need tools working on all the main systems (win32, Solaris, Linux, ...) for handling, generating and converting to and from ChessML. Indeed there is something on the way...
  5. Storing: In order to reduce file sizes it is an interesting task to think about a compression scheme for ChessML. There are binary compressions for PGN and XML, but perhaps in the special case of ChessML we could find improvements on these (ChessML files of game databases are really BIG).
  6. Querying: A goal (far away) would be to invent something like a CQL (chess query language), a standard query language which helps especially with the most difficult task in chess programming: pattern recognition. A clever solution to this job would be really big progress for chess games and chess problems.

Other Projects

Some different approaches to the problem are listed below. The author of this implementation did not study all the details (this can be explained simply by lack of time...). By the way: ChessML should not be compared with high-level SGML implementations such as DocBook or MathML. These are long-term, full-time and very complex projects with a tremendously larger scope of application than any chess implementation.


The Chess Game Markup Language. A quite complete XML implementation of the PGN standard and much more, also including Java classes and stylesheets.


The Board Game Markup Language implementation. An XML approach to games in general. Not updated since March 1999.


The Caxton XML chess implementation. The only (published) commercial approach to XML and chess; an implementation still in development and no converting-, parsing- and generating-tools are given.


The Smart Game Format; a non-XML standard mainly developed for the Asian game of Go.


The SGF Markup Language; an XML implementation of SGF.


The Java program Jago comes with an XML standard developed for the Asian game of Go. Similar to the situation of ChessML vs. PGN, this Go XML standard extends SGF.


The Game Markup Language; A markup language for game programming !!

LaTeX styles: chess.sty, diagram.sty

These are high-level implementations of chess games (chess.sty) and chess problems (diagram.sty) in the form of LaTeX style files. By design they are very practical for typesetting or desktop publishing, but they are not very useful for storing, exchanging and searching.

