SGML: BNC FREQUENTLY ASKED QUESTIONS




           -----------------------------------------------
           B r i t i s h    N a t i o n a l    C o r p u s
           -----------------------------------------------
          
                   FREQUENTLY ASKED QUESTIONS
                      last update: 3 July 95

                              ***

Q. What's in the BNC?

A. Extracts from 4124 modern British English texts of all kinds, both
   spoken and written.  Each text is segmented into orthographic sentence
   units, and each word is automatically assigned a part-of-speech code.
   There are about 6.25 million sentences and over 100 million words.
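
   As a rough illustration of that structure, the sketch below models one
   text as a list of sentence units, each a list of (word, code) pairs,
   and counts sentences, words and codes. The representation, the sample
   sentences and the codes "DET", "NOUN" and "VERB" are all hypothetical
   placeholders; they are not the corpus's actual markup or tag set. The
   last line simply divides the two totals quoted above.

```python
# Hypothetical in-memory form of one segmented, tagged text: a list of
# sentence units, each a list of (word, code) pairs. The codes "DET",
# "NOUN" and "VERB" are placeholders, not the corpus's real tag set.
text = [
    [("The", "DET"), ("corpus", "NOUN"), ("grows", "VERB")],
    [("Words", "NOUN"), ("matter", "VERB")],
]


def count_units(sentences):
    """Return (sentence count, word count) for one segmented text."""
    return len(sentences), sum(len(s) for s in sentences)


def code_frequencies(sentences):
    """Count how often each part-of-speech code occurs."""
    freq = {}
    for sentence in sentences:
        for _word, code in sentence:
            freq[code] = freq.get(code, 0) + 1
    return freq


n_sentences, n_words = count_units(text)
print(n_sentences, n_words)            # 2 5
print(code_frequencies(text))          # {'DET': 1, 'NOUN': 2, 'VERB': 2}

# Rough average implied by the totals quoted above
# (100 million words over 6.25 million sentences).
print(100_000_000 / 6_250_000)         # 16.0
```

   On those totals, a sentence unit averages at least 16 words, since the
   word count is given as "over" 100 million.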

Q. Where did it come from?

A. It was produced by a consortium of leading dictionary publishers
   (OUP, Longman, Chambers-Harrap) and academic research centres (Oxford
   University Computing Services, Unit for Computer Research in the
   English Language at Lancaster University, British Library Research and
   Development).

Q. What use is it?

A. It provides a unique and authoritative view of the state of the English
   language today, with carefully balanced representation of as many
   different varieties of English as possible. It can be used to 
   exercise NLP systems of all kinds, as a fertile source of real life
   examples for language learners, or simply to explore the way the
   language is currently used.


Q. What do I have to do to use it?

A. If you want to use the corpus solely for purposes of academic
   research, all you have to do is agree to the terms of the licence. If
   you want to use it for other purposes, we will refer your request to the
   BNC Consortium, who will discuss licensing arrangements with you. 


Q. How much does it cost?

A. BNC Release 1.0 costs 220 GBP, exclusive of VAT. This price
   includes 10 GBP for a BNC licence, which is valid for 5 years.

Q. What do I get for my money?

A. The first release of the BNC comprises:
   -- the full text of the 100 million word corpus
   -- printed and online documentation
   -- a full word index to the whole corpus
   -- ANSI C source code for the SARA server program and for a simple
      SARA client program
   All of this is packaged on 3 CD-ROMs.

Q. What kind of computer system will I need to use it?

A. You can unpack the distribution CDs on any Unix system capable of
   reading ISO 9660 format. The corpus texts alone occupy nearly
   2 GB unpacked. The SARA index occupies a further 2 GB.
   The BNC is an SGML document complying with ISO 8879.
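
   Before unpacking, it may be worth checking that the target filesystem
   has room for both the texts and the index. The sketch below adds the
   two figures of roughly 2 gigabytes each given above and compares the
   total with the free space reported by Python's shutil.disk_usage. It
   assumes a gigabyte of 10**9 bytes, which this FAQ does not specify,
   and uses "." as a placeholder for the intended install directory.

```python
import shutil

# Figures taken from the answer above, rounded up. A gigabyte is
# assumed to be 10**9 bytes; the FAQ does not say which is meant.
CORPUS_BYTES = 2 * 10**9   # corpus texts, "nearly 2" gigabytes unpacked
INDEX_BYTES = 2 * 10**9    # SARA index, "a further 2" gigabytes


def required_bytes(with_index=True):
    """Return the approximate space needed for an unpacked installation."""
    return CORPUS_BYTES + (INDEX_BYTES if with_index else 0)


def has_room(path, with_index=True):
    """Report whether the filesystem holding `path` has enough free space."""
    free = shutil.disk_usage(path).free
    return free >= required_bytes(with_index)


if __name__ == "__main__":
    # "." stands in for the (hypothetical) directory you plan to unpack into.
    print("corpus only:   ", has_room(".", with_index=False))
    print("corpus + index:", has_room("."))
```

   The totals are deliberately rounded up, so a filesystem that passes
   the check should have a small margin to spare.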

Q. How can I order a copy?

A. You will need to get a copy of the order form and two copies of the
   licence. You can download these from our Web site or request them
   from the address below.
   
Q. What are the licensing conditions?

A. The licence says you can use the corpus for any non-commercial
   purposes, subject to the "fair-dealing" provisions of the Copyright
   Act. At present, you must be located in a member state of the
   EU. There are also a number of other conditions designed to protect the
   owners of IPR in the corpus contents and the interests of the
   commercial partners in the BNC Consortium.

Q. Is it available online?

A. Not yet. We have been running an experimental online service
   for some months, but the software is not yet ready for release.
   Watch this space for further announcements!


--------------------------------------------------------

British National Corpus
Oxford University Computing Services
13 Banbury Road
Oxford OX2 6NN

http://info.ox.ac.uk/bnc

tel +44 (1865) 273 280    
fax +44 (1865) 273 275

natcorp@oucs.ox.ac.uk
   
----------------------------------------------------------