[Mirrored from: http://www.bcs.org.uk/ivanhoe/part-2/g8.htm]
David Penfold MBCS CEng was originally a physicist. He moved through academic journal publishing, and then through running a small company that converted disks for typesetting and built databases to produce books, to his current work as a consultant to organisations such as Helicon Publishing (publishers of the Hutchinson Encyclopaedia), MCB University Press and the Open University. In 1993, he was the project coordinator for the SuperJournal Project, Electronic Journals on SuperJANET. He is currently writing a book on SGML and is closely involved with the Electronic Publishing Specialist Group of the British Computer Society.
Over the last 15 years there has been much publicity about how the newspaper business has changed and introduced computers. What has not been made so public is how the same thing has happened in other areas of publishing as well, and not just through the introduction of desktop publishing. Several years ago I was involved in converting the text of the Revised English Bible from a proprietary word-processing system so that it could be typeset. The Bible took over a hundred disks and we had to analyse how the coding scheme worked before we could convert it. Today, with the advent of the Standard Generalized Markup Language (SGML), such basic problems do not have to be solved every time a new job comes in, but there is still scope for the programmer and systems analyst to work on many different problems.
Of course, the world of publishing and printing has commercial systems, just like any other industry, but these usually operate in a very different world from the databases and programs actually used to produce publications, which today appear not just on paper but also on optical media, such as CD-ROM, or on the World Wide Web. Today's publications are not restricted to text and graphics; in a multimedia world they also include video and audio. While publishing is becoming harder and harder to define, the part which technology plays in it is ever increasing, which means that there is a growing role for IS specialists.
At the basic end, there is software development, which is not strictly part of the publishing industry but has a great influence on its development. While the big companies like Microsoft develop much of their software in the USA, not everything is developed there. For example, most of the work on the Oracle text retrieval tools was done in the UK, and one of the leading mathematical typesetting systems, Advent 3B2, was also designed and developed in the UK.
At the next level, there are applications programmers, who take the tools provided by Microsoft and others and write programs either to carry out a specific task, such as converting a text database on the fly from UK to US usage, or to customise the way data is presented in a CD-ROM product. It is difficult to imagine all the possible tasks which need to be done, as almost every application requires some new program. At Helicon, the database which forms the basis of the Hutchinson Encyclopaedia and many spin-off products, both on paper and in electronic form, has been in existence for over eight years, and yet new products and the requirements of new media mean that there is a constant need for new programs to check or convert text and other data.
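As an illustration of the kind of small conversion program described above, here is a minimal sketch in Python of UK to US spelling conversion. The word list `UK_TO_US` and the function names are hypothetical and chosen only for this example; it is not the program used at Helicon, and a real product would need a far larger, editorially checked list and rules for words that should not be changed (proper names, quotations and so on).

```python
import re

# Hypothetical, deliberately tiny word list for illustration only.
UK_TO_US = {
    "colour": "color",
    "catalogue": "catalog",
    "encyclopaedia": "encyclopedia",
    "analyse": "analyze",
}

# A "word" here is simply a run of ASCII letters.
_WORD = re.compile(r"[A-Za-z]+")


def _match_case(source, target):
    """Give `target` the same broad capitalisation as `source`."""
    if source.isupper():
        return target.upper()
    if source[0].isupper():
        return target.capitalize()
    return target


def to_us_usage(text):
    """Replace whole words found in UK_TO_US, leaving all other text alone."""
    def swap(match):
        word = match.group(0)
        us_word = UK_TO_US.get(word.lower())
        return _match_case(word, us_word) if us_word else word

    return _WORD.sub(swap, text)


if __name__ == "__main__":
    print(to_us_usage("The Colour catalogue was hard to analyse."))
```

Because the lookup is applied to whole words only, a word such as "recolouring" is left untouched unless it is added to the list, which keeps the conversion predictable at the cost of needing a fuller list.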
Indeed, one of the most important areas of IS development in publishing today is the use of SGML and generic coding. For years material was keyed in a way which related to the way it would appear on paper. In the 1980s, however, it was realised that holding data in a way which related to its content and structure gave the opportunity not only for a variety of print-on-paper products, but also eventually for electronic products. There are two approaches to this: one is to use a commercial database structure, either a text database or a relational database; the other is to use the concept of the document type definition (DTD) developed as part of SGML. A DTD is a formalisation of the hierarchical structure of a document, and software is available to parse documents to check that the coding used to delineate the structure actually conforms to the DTD. Document analysis and the writing of DTDs are a relatively new form of programming, but one which is just as challenging as working with numbers. The work can, of course, also be subject-related, for there are projects, such as the Text Encoding Initiative, going ahead at present, both nationally and internationally, in humanities and literary computing, as well as in science and technology.
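The idea of checking a document against a formal description of its hierarchy can be sketched in a few lines of Python. This is a toy, not an SGML parser: Python's standard library does not ship a validating SGML parser, so the example uses XML syntax (a simplified descendant of SGML) as a stand-in, and the content model `ALLOWED_CHILDREN`, with its element names `entry`, `headword`, `body` and `para`, is invented for illustration. A real DTD also governs the order and repetition of elements and their attributes, all of which are ignored here.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: element name -> names of permitted children.
ALLOWED_CHILDREN = {
    "entry": {"headword", "body"},
    "headword": set(),
    "body": {"para"},
    "para": set(),
}


def structure_errors(xml_text, root_name="entry"):
    """Return a list of messages describing where the markup breaks the model."""
    root = ET.fromstring(xml_text)
    errors = []
    if root.tag != root_name:
        errors.append(f"root element is <{root.tag}>, expected <{root_name}>")

    stack = [root]
    while stack:
        element = stack.pop()
        allowed = ALLOWED_CHILDREN.get(element.tag)
        if allowed is None:
            errors.append(f"<{element.tag}> is not declared")
            continue
        for child in element:
            if child.tag in allowed:
                stack.append(child)  # descend only into permitted children
            else:
                errors.append(
                    f"<{child.tag}> is not allowed inside <{element.tag}>"
                )
    return errors


if __name__ == "__main__":
    good = "<entry><headword>SGML</headword><body><para>Text.</para></body></entry>"
    bad = "<entry><headword>SGML</headword><para>Text.</para></entry>"
    print(structure_errors(good))
    print(structure_errors(bad))
```

An empty list means the document fits the model; each message otherwise names the element that is out of place, which is the kind of report a validating parser produces on a much larger scale.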
Another area where there is an increasing need for technical skills is handling colour. Colour is now available at the desktop, but how colour is perceived and represented on screens and printers is properly understood by only a few experts. If colour is to be reproduced accurately on different devices and transmitted over networks, it will be necessary to know the effects of compression, resolution, the number of bits and many other factors. Just as in handling text, where an appreciation of language does not go amiss, so in this area an appreciation of design and aesthetics can be useful.
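To give a feel for just one of these factors, the number of bits, the following Python sketch reduces an 8-bit-per-channel RGB colour to a smaller bit depth and scales it back for display. The function names are hypothetical and the simple linear rounding is an assumption made for illustration; real colour management also has to account for device characteristics and human perception, which this ignores.

```python
def quantise_channel(value, bits):
    """Reduce an 8-bit channel value (0-255) to `bits` bits, then rescale to 0-255."""
    if not 0 <= value <= 255:
        raise ValueError("value must be between 0 and 255")
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    top_code = (1 << bits) - 1              # highest code at the reduced depth
    code = round(value * top_code / 255)    # nearest available level
    return round(code * 255 / top_code)     # back on the 0-255 scale


def quantise_rgb(rgb, bits):
    """Apply quantise_channel to each of the red, green and blue values."""
    return tuple(quantise_channel(channel, bits) for channel in rgb)


def distinct_colours(bits):
    """Number of different colours available at `bits` bits per channel."""
    return 2 ** (3 * bits)


if __name__ == "__main__":
    original = (200, 100, 50)
    for depth in (8, 4, 2):
        print(depth, quantise_rgb(original, depth), distinct_colours(depth))
```

At 8 bits per channel the colour is unchanged, while at 2 bits per channel the green and blue values of the example collapse to the same level, which shows how quickly distinctions between colours are lost as the number of bits falls.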
Computing with words
It can be argued that analysis of the structure of an early printed book, the conversion of a catalogue from a database to form a printed, or even an electronic, product and the assembly of a manual in such a way that it can be easily used (and just as easily updated) are all different aspects of publishing, if not of typesetting. IS in publishing is thus a field which, potentially, provides opportunities not only for the real 'techies', but also for those who like the idea of 'computing with words' - and increasingly pictures, audio and video as well - and want to combine their professional skills with a wider perspective.