PROJECT ENVISION FINAL REPORT
A User-Centered Database from the Computer Science Literature
NSF Grant IRI-9116991
Edward A. Fox, Lenwood S. Heath, Deborah Hix
Department of Computer Science
Virginia Polytechnic Institute and State University
Blacksburg, VA 24061-0106
Converted to HTML Wed Jul 5 17:41:14 EDT 1995
With the support of the National Science Foundation
and the Association for Computing Machinery (ACM),
the Envision project has developed a prototype digital library
of computer science literature
that is highly usable (from user-centered design),
highly structured (from SGML and an object database),
and highly integrated (from hypertext links among objects).
The result is a representation of part of the computer science literature
as a cohesive body of knowledge
that can be searched and viewed in innovative ways.
The user interface was designed with careful
attention to user needs and desires
(through interviews with potential users),
to graphic detail
(through involvement of an artist
and attention to the research literature
on graphical perception and psychophysics),
and
to usability
(through an iterative process of usability evaluation).
Recognizing the need to translate enormous quantities of documents
in an unlimited variety of input formats into a single standard format,
the project developed a flexible system for analyzing the structures
(e.g., titles, authors, paragraphs, and references)
within a document and translating
that structure into any standard markup scheme.
The Envision distributed server supports simultaneous
access to the library by a number of users and in a variety of ways.
The Envision software is soon to be installed at ACM headquarters
and made available to ACM members.
The Envision system will continue in use at Virginia Tech
and Norfolk State University
to support the work of a related NSF Educational Infrastructure grant.
The list of publications resulting from Envision research
appears in the References section.
The data collected during this project
include electronic versions of computer science literature
(Section 2.1).
A great deal of software was created or adapted
during this project
(Section 2.2).
A number of people have contributed to the success
of the Envision project.
They are listed in Appendix A.
We are particularly proud of the number of undergraduate students
who were able to obtain research experience on the Envision project.
The library contains bibliographic records,
full-text articles,
and scanned page images.
The bulk of the approximately 100,000 bibliographic records are
from ACM's Computing Archive.
We have
also incorporated publicly available bibliographies from Ohio State University,
the University of Arizona,
and the University of Melbourne.
We have approximately 700 full-text articles
from Communications of the ACM
and several of the ACM Transactions.
Finally,
we have about 13,000 scanned page images,
from various ACM publications and the technical report series of the
Virginia Tech Department of Computer Science.
The major software components of the Envision system
are the following.
-
The Envision Client.
This component interacts with a user to accomplish the tasks of querying
the Envision library
and visualizing result sets in the Envision graphical display.
This client interface is a major innovation of the Envision project
and required the greatest amount of effort
in interaction design and evaluation, in software design,
and in software development.
-
A WWW Viewer.
Envision employs a WWW browser
as its presentation front end.
Currently we use Mosaic running on a UNIX workstation.
-
The Envision Intermediary.
This component communicates with the Envision client
over the network
to maintain session information,
packages queries for the MARIAN search system,
and packages result sets to pass back to the Envision client.
-
The MARIAN Search System.
This component,
developed in a separate research effort to access a library catalog,
searches the Envision library
for documents relevant to the user's query.
The search can be based on a combination of title, author,
and content words.
Result sets are ranked by estimated relevance.
-
Enhanced WWW Server.
Envision documents are viewed via a WWW interface
that accesses a WWW server
enhanced by CGI scripts that retrieve Envision objects from
the object database and package them into HTML for presentation.
-
The Object Database.
The Envision object database maintains our view of the structure
of the library
in terms of classes such as document, person (author), institution,
publication, and keywords.
Objects in this database refer to related objects,
providing a rich hypermedia structure.
-
The DELTO System.
The DELTO (Document Analysis and Translation) system
addresses the need to convert documents in many ill-defined input formats
that are received for inclusion in the Envision library
into the standard SGML structural representation
needed by the Envision object database and MARIAN searchers.
This system emphasizes flexibility and automation.
DELTO is a major innovation of the Envision project.
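The kind of structural analysis and translation that DELTO performs can be illustrated with a minimal sketch. This is not the DELTO implementation, whose methods are not described in this report. It assumes a hypothetical plain-text input in which blank lines separate blocks, the first block is the title, the second is the author list, and a block reading "References" starts the reference list. The function names (analyze_structure, to_markup) and the tag names (article, title, authors, para, reference) are illustrative assumptions, not the Envision SGML document type.

```python
import re
from html import escape


def analyze_structure(text):
    """Label each blank-line-separated block with a structural tag.

    Heuristics (assumptions for this sketch only): block 0 is the title,
    block 1 is the author list, a block reading "References" switches to
    reference mode, and everything else is a paragraph.
    """
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    elements = []
    in_references = False
    for index, block in enumerate(blocks):
        flat = " ".join(block.split())  # undo hard line wraps
        if index == 0:
            elements.append(("title", flat))
        elif index == 1:
            elements.append(("authors", flat))
        elif flat.lower() == "references":
            in_references = True  # the heading itself is not emitted
        elif in_references:
            elements.append(("reference", flat))
        else:
            elements.append(("para", flat))
    return elements


def to_markup(elements, root="article"):
    """Translate (tag, content) pairs into simple SGML-style markup."""
    lines = [f"<{root}>"]
    for tag, content in elements:
        # Escape &, <, and > so the content cannot be read as markup.
        lines.append(f"  <{tag}>{escape(content, quote=False)}</{tag}>")
    lines.append(f"</{root}>")
    return "\n".join(lines)


if __name__ == "__main__":
    raw = (
        "A Sample Title\n\n"
        "A. Author, B. Author\n\n"
        "First paragraph,\nwrapped over two lines.\n\n"
        "References\n\n"
        "[1] A cited work.\n"
    )
    print(to_markup(analyze_structure(raw)))
```

Keeping the analysis step (which produces tagged elements) separate from the translation step (which emits markup) is one way to support the flexibility described above: a different output scheme needs only a different translator, and a different input format needs only different heuristics.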
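Components 1 and 2 (the Envision client and the WWW viewer) run under the X Window System;
these have been tested on Sun, DECstation, and DEC Alpha workstations.
Components 3 and 4 (the Envision intermediary and the MARIAN search system)
run on a NeXTstation.
Components 5, 6, and 7 (the enhanced WWW server, the object database, and the DELTO system)
run on a DEC Alpha and should port easily to other UNIX systems.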
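As an illustration of how linked objects and ranked search fit together, the following minimal sketch models two of the object classes named above (document and person) with references in both directions, and ranks documents against a query over title, author, and content words. It is a sketch under stated assumptions, not the Envision or MARIAN code: the class layout, the function names, and the weighted word-overlap score are all hypothetical stand-ins, since this report does not describe MARIAN's actual relevance estimation.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare by identity; the objects reference each other
class Person:
    name: str
    documents: list = field(default_factory=list)


@dataclass(eq=False)
class Document:
    title: str
    authors: list
    content: str = ""

    def __post_init__(self):
        # Maintain the reverse link so an author leads back to documents.
        for author in self.authors:
            author.documents.append(self)


def words(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def score(doc, query, weights=(3.0, 2.0, 1.0)):
    """Hypothetical relevance estimate: weighted overlap of query words
    with title, author, and content words, scaled to the range 0..1."""
    q = words(query)
    if not q:
        return 0.0
    w_title, w_author, w_content = weights
    author_words = set()
    for author in doc.authors:
        author_words |= words(author.name)
    raw = (
        w_title * len(q & words(doc.title))
        + w_author * len(q & author_words)
        + w_content * len(q & words(doc.content))
    )
    return raw / (sum(weights) * len(q))


def search(library, query):
    """Return (score, document) pairs with nonzero score, best first."""
    hits = [(score(doc, query), doc) for doc in library]
    hits = [hit for hit in hits if hit[0] > 0]
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits


if __name__ == "__main__":
    fox = Person("E. A. Fox")
    heath = Person("L. S. Heath")
    library = [
        Document("Digital libraries", [fox], "overview of digital library research"),
        Document("Heuristics for laying out information graphs", [heath]),
    ]
    for value, doc in search(library, "digital libraries fox"):
        print(f"{value:.2f}  {doc.title}")
```

Because each Person keeps a list of its documents, a result set can be extended by following links (for example, from a retrieved document to its author and on to that author's other documents), which is the kind of hypermedia navigation the object database is meant to support.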
A public release of the Envision software
is due during the summer of 1995.
The Envision client will be freely available
over the Internet by anonymous ftp from Virginia Tech.
Initially,
the server components
(3, 4, 5, 6, and 7)
and
the actual library of electronic documents
will be released to the ACM,
as well as used in a related NSF Educational Infrastructure project
at Virginia Tech and Norfolk State University.
References
- 1
-
G. A. Averboch.
A system for document analysis, translation, and automatic hypertext
linking.
Master's thesis, Department of Computer Science, Virginia Polytechnic
Institute and State University, Blacksburg, Virginia, 1995.
- 2
-
S. Betrabet, E. A. Fox, and Q. Chen.
A query language for information graphs.
Technical Report TR 93-03, Department of Computer Science, Virginia
Polytechnic Institute and State University, 1993.
- 3
-
D. J. Brueni, B. Cross, E. A. Fox, L. S. Heath, D. Hix, L. T. Nowell, and W. C.
Wake.
What if there were desktop access to the computer science literature?
In Proceedings of the 21st Annual ACM Computer Science
Conference, pages 15-22, 1993.
Also available as Tech. Report TR 92-42, Department of Computer
Science, Virginia Polytechnic Institute and State University.
- 4
-
K. Dalal and E. A. Fox.
Document translation: Dissertations and technical reports.
Technical Report TR 93-31, Department of Computer Science, Virginia
Polytechnic Institute and State University, 1993.
- 5
-
E. A. Fox.
Digital libraries.
IEEE Computer, 26(11):79-81, November 1993.
"Hot Topics" section.
- 6
-
E. A. Fox.
From information retrieval to networked multimedia information
access.
In G. Knorz, J. Krause, and C. Womser-Hacker, editors, Information Retrieval '93, Proc. der 1. Tagung Information Retrieval '93,
pages 116-124. Univ. of Konstanz Press, 1993.
Keynote address, 13-15 September, 1993, University of Regensburg,
Germany.
- 7
-
E. A. Fox.
Sourcebook on digital libraries: Report for the National Science
Foundation.
Technical Report TR 93-35, Department of Computer Science, Virginia
Polytechnic Institute and State University, December 1993.
Available by anonymous FTP from directory pub/DigitalLibrary on
fox.cs.vt.edu. Over 400 pages.
- 8
-
E. A. Fox.
A user-centered hypermedia database from the computer science
literature.
In Proceedings 159th National Meeting of the AAAS, AAAS'93:
Science and Engineering for the Future, page 145, 1993.
Invited presentation.
- 9
-
E. A. Fox.
A digital library connecting Envision, KMS, and Mosaic with
interfaces, communications, and data interchange.
In 1994 Workshop on Digital Libraries: Current Issues, 1994.
Invited presentation.
- 10
-
E. A. Fox.
A digital library connecting Envision, KMS, and Mosaic with
interfaces, communications, and data interchange.
Invited presentation for 1994 Workshop on Digital Libraries: Current
Issues, sponsored by Rutgers and Purdue Universities, AT&T, and Bellcore. At
Rutgers Univ., Newark, NJ, May 19-20, 1994. Abstract in SIGOIS Bulletin,
15(1):6, August 1994.
- 11
-
E. A. Fox.
How to make intelligent digital libraries.
In Methodologies for Intelligent Systems, Proceedings of the 8th
International Symposium, ISMIS '94, volume 869 of Lecture Notes in
Artificial Intelligence, pages 27-38. Springer-Verlag, October 1994.
- 12
-
E. A. Fox.
Seamless multimedia integration for digital libraries.
In Dagstuhl Seminar on Fundamentals and Perspectives of
Multimedia Systems, pages 118-123, July 1994.
Invited position paper.
- 13
-
E. A. Fox.
World-Wide Web and Computer Science reports.
Communications of the ACM, 38(4):43-44, April 1995.
- 14
-
E. A. Fox and G. Abdulla.
Digital video delivery for a digital library in computer science.
In High-Speed Networking and Multimedia Computing Workshop,
IS&T/SPIE Symposium on Electronic Imaging Science and Technology, February
1994.
7 pages.
- 15
-
E. A. Fox, R. M. Akscyn, R. K. Furuta, and J. J. Leggett.
Guest editors' introduction to digital libraries.
Communications of the ACM, 38(4):22-28, April 1995.
- 16
-
E. A. Fox and D. Barnette.
Improving education through a computer science digital library with
three types of WWW servers.
In Proceedings Second International WWW '94: Mosaic and the
Web, WWW'94, October 1994.
- 17
-
E. A. Fox, N. D. Barnette, C. A. Shaffer, L. S. Heath, W. C. Wake, L. T.
Nowell, J. Lee, D. Hix, and H. R. Hartson.
Progress in interactive learning with a digital library in computer
science.
Invited presentation. To appear in Proceedings ED-MEDIA 95, World
Conference on Educational Multimedia and Hypermedia, Graz, Austria, 1995.
- 18
-
E. A. Fox, Q. F. Chen, A. M. Daoud, and L. S. Heath.
Order-preserving minimal perfect hash functions and information
retrieval.
ACM Transactions on Information Systems, 9(3):281-308, 1991.
- 19
-
E. A. Fox, Q. F. Chen, and L. S. Heath.
A faster algorithm for constructing minimal perfect hash functions.
In Proceedings of the 15th Annual International Conference on
Research and Development in Information Retrieval, pages 266-273, 1992.
- 20
-
E. A. Fox, Q. F. Chen, and L. S. Heath.
LEND and faster algorithms for constructing minimal perfect hash
functions.
Technical Report TR 92-02, Department of Computer Science, Virginia
Polytechnic Institute and State University, 1992.
- 21
-
E. A. Fox, R. France, E. Sahle, A. Daoud, and B. Cline.
Development of a modern OPAC: From REVTOLC to MARIAN.
In Proceedings 16th Annual International ACM SIGIR Conference on
Research and Development in Information Retrieval, SIGIR '93, pages
248-259, June-July 1993.
Also available as Tech. Report TR 93-06, Department of Computer
Science, Virginia Polytechnic Institute and State University.
- 22
-
E. A. Fox, L. S. Heath, Q. F. Chen, and A. M. Daoud.
Practical minimal perfect hash functions for large databases.
Communications of the ACM, 35(1):105-121, January 1992.
- 23
-
E. A. Fox, L. S. Heath, and D. Hix.
A user-centered database from the computer science literature.
In W. W. Chu, A. F. Cardenas, and R. K. Taira, editors, Proceedings of the NSF Scientific Database Projects 1991-1993, pages
70-75, February 1993.
AAAS Workshop on Advances in Data Management for the Scientist and
Engineer, Boston, Massachusetts.
- 24
-
E. A. Fox, D. Hix, L. T. Nowell, D. J. Brueni, W. C. Wake, L. S. Heath, and
D. Rao.
Users, user interfaces, and objects: Envision, a digital library.
Journal of the American Society for Information Science,
44(8):480-491, 1993.
- 25
-
E. A. Fox and L. Lunin.
Introduction and overview to Perspectives on digital libraries.
Journal of the American Society for Information Science,
44(8):441-443, September 1993.
Guest editor's introduction to special issue.
- 26
-
J. C. French, E. A. Fox, K. Maly, and A. L. Selman.
Wide area technical report service: Technical reports online.
Communications of the ACM, 38(4):45, April 1995.
- 27
-
J. L. Ganley and L. S. Heath.
Local search for the retrieval layout problem.
Submitted to OR Spektrum. Also available as Tech. Report TR 93-28,
Department of Computer Science, Virginia Polytechnic Institute and State
University, 1993.
- 28
-
J. L. Ganley and L. S. Heath.
Heuristics for laying out information graphs.
Computing, 52:389-405, 1994.
Also available as Tech. Report TR 93-27, Department of Computer
Science, Virginia Polytechnic Institute and State University.
- 29
-
J. L. Ganley and L. S. Heath.
Optimal and random partitions of random graphs.
The Computer Journal, 37:641-643, 1994.
Also available as Tech. Report TR 93-24, Department of Computer
Science, Virginia Polytechnic Institute and State University.
- 30
-
H. Gladney, Z. Ahmed, R. Ashany, N. Belkin, E. A. Fox, and M. Zemankova.
Digital library: Gross structure and requirements (report from a
workshop).
Technical Report TR 94-25, Department of Computer Science, Virginia
Polytechnic Institute and State University, 1994.
Revised version accepted for publication in Electronic Publishing:
Origination, Dissemination, and Design.
- 31
-
H. Gladney, E. A. Fox, Z. Ahmed, R. Ashany, N. Belkin, and M. Zemankova.
Digital library: Gross structure and requirements: Report from a
March 1994 workshop.
In J. Schnase, J. Leggett, R. Furuta, and T. Metcalfe, editors, Digital Libraries '94, pages 101-107, 1994.
- 32
-
L. S. Heath, D. Hix, L. T. Nowell, W. C. Wake, G. A. Averboch, E. Labow, S. A.
Guyer, D. J. Brueni, R. K. France, K. Dalal, and E. A. Fox.
Envision: A user-centered database of computer science literature.
Communications of the ACM, 38(4):52-53, April 1995.
- 33
-
J. W. Lavinus.
Heuristics for laying out information graphs.
Master's thesis, Department of Computer Science, Virginia Polytechnic
Institute and State University, Blacksburg, Virginia, August 1992.
- 34
-
J. Leggett, J. Schnase, J. Smith, and E. A. Fox.
Final report of the NSF workshop on hyperbase systems.
Department of Computer Science, Texas A&M University, Hypermedia
Research Lab Report TAMU-HRL 93-002, July 1993. For workshop on October
15-16, 1992, in Washington, D.C.
- 35
-
K. Maly, E. A. Fox, J. C. French, and A. L. Selman.
Wide area technical report service.
Technical Report TR 92-44, Department of Computer Science, Old
Dominion University, 1992.
- 36
-
K. Maly, J. C. French, A. L. Selman, and E. A. Fox.
The wide area technical report service.
In Proceedings Second International WWW '94: Mosaic and the
Web, WWW'94, pages 523-533, October 1994.
Also available as Tech. Report TR 94-13, Department of Computer
Science, Old Dominion University.
- 37
-
L. T. Nowell.
Psychophysical Foundations of Information Visualization for the
Envision Digital Library.
PhD thesis, Department of Computer Science, Virginia Polytechnic
Institute and State University, Blacksburg, Virginia, 1996.
In progress.
- 38
-
L. T. Nowell, E. A. Fox, L. S. Heath, D. Hix, W. C. Wake, and E. A. Labow.
Seeing things your way: Information visualization for a user-centered
database of computer science literature.
Technical Report TR 94-06, Department of Computer Science, Virginia
Polytechnic Institute and State University, 1994.
- 39
-
L. T. Nowell and D. Hix.
User interface design for the project Envision database of computer
science literature.
In Proceedings of the Twenty-second Annual Virginia Computer
Users Conference, pages 29-33, 1992.
- 40
-
L. T. Nowell and D. Hix.
Query composition: Why does it have to be so hard?
In Proceedings of the East-West International Conference on
Human-Computer Interaction, volume I, pages 226-241, 1993.
Moscow, Russia, August, 1993. Also available as Tech. Report
TR 93-19, Department of Computer Science, Virginia Polytechnic Institute and
State University.
- 41
-
L. T. Nowell and D. Hix.
Visualizing search results: User interface development for the
project Envision database of computer science literature.
In Human-Computer Interaction: Software and Hardware
Interfaces, volume 19B of Advances in Human Factors/Ergonomics,
Proceedings of HCI International '93, 5th International Conference on Human
Computer Interaction, pages 56-61. Elsevier, 1993.
- 42
-
W. C. Wake.
Account number: A pattern.
In J. O. Coplien and D. C. Schmidt, editors, Pattern Languages
of Programs. Addison-Wesley, Reading, MA, May 1995.
- 43
-
W. C. Wake.
A Model and Interface for Documents with Multiple Views.
PhD thesis, Department of Computer Science, Virginia Polytechnic
Institute and State University, Blacksburg, Virginia, 1995.
In progress.
Appendix A
To be inserted when latex2html can manage the table.
Edward A. Fox
Wed Jul 5 17:41:08 EDT 1995