PROJECT ENVISION FINAL REPORT
A User-Centered Database from the Computer Science Literature
NSF Grant IRI-9116991

Edward A. Fox, Lenwood S. Heath, Deborah Hix
Department of Computer Science
Virginia Polytechnic Institute and State University
Blacksburg, VA 24061-0106

Converted to HTML Wed Jul 5 17:41:14 EDT 1995

Summary of Completed Project

 

With the support of the National Science Foundation and the Association for Computing Machinery (ACM), the Envision project has developed a prototype digital library of computer science literature that is highly usable (from user-centered design), highly structured (from SGML and an object database), and highly integrated (from hypertext links among objects). The result is a representation of part of the computer science literature as a cohesive body of knowledge that can be searched and viewed in innovative ways. The user interface was designed with careful attention to user needs and desires (through interviews with potential users), to graphic detail (through involvement of an artist and attention to the research literature on graphical perception and psychophysics), and to usability (through an iterative process of usability evaluation). Recognizing the need to translate enormous quantities of documents in an unlimited variety of input formats into a single standard format, the project developed a flexible system for analyzing the structures (e.g., titles, authors, paragraphs, and references) within a document and translating that structure into any standard markup scheme. The Envision distributed server supports simultaneous access to the library by a number of users and in a variety of ways. The Envision software is soon to be installed at ACM headquarters and made available to ACM members. The Envision system will continue in use at Virginia Tech and Norfolk State University to support the work of a related NSF Educational Infrastructure grant.
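The structure analysis and translation step described above can be illustrated with a minimal sketch. This is not the project's translation system. It assumes a hypothetical plain-text input convention (first block is the title, second block is a comma-separated author list, a block beginning with "References" holds one reference per line, and every other block is a paragraph), and the function names and tag names are invented for illustration.

```python
from html import escape  # escapes &, <, > in text content


def analyze(text):
    """Split a plain-text document into (element, content) pairs.

    Assumed (hypothetical) input convention: block 0 is the title, block 1 is
    a comma-separated author list, a block starting with "References" holds
    one reference per line, and any other block is a paragraph.
    """
    blocks = [b.strip() for b in text.split("\n\n") if b.strip()]
    elements = []
    for i, block in enumerate(blocks):
        if i == 0:
            elements.append(("title", block))
        elif i == 1:
            for name in block.split(","):
                elements.append(("author", name.strip()))
        elif block.startswith("References"):
            # Skip the heading line; each remaining line is one reference.
            for line in block.splitlines()[1:]:
                elements.append(("reference", line.strip()))
        else:
            # Join wrapped lines so a paragraph becomes a single string.
            elements.append(("para", " ".join(block.split())))
    return elements


def to_markup(elements, root="doc"):
    """Render the recognized structure in a simple SGML-like tag scheme."""
    lines = [f"<{root}>"]
    for tag, content in elements:
        lines.append(f"  <{tag}>{escape(content, quote=False)}</{tag}>")
    lines.append(f"</{root}>")
    return "\n".join(lines)
```

Keeping analysis (recognizing structure) separate from rendering (emitting tags) is what would let the same recognized structure be written out in more than one markup scheme; only to_markup would need to change.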

Technical Information

 

The publications resulting from Envision research are listed in the References section. The data collected during this project include electronic versions of computer science literature (Section 2.1, Computer Science Literature). A great deal of software was created or adapted during this project (Section 2.2, Envision Software). The many people who contributed to the success of the Envision project are listed in Appendix A. We are particularly proud of the number of undergraduate students who obtained research experience on the Envision project.

Computer Science Literature

The library contains bibliographic records, full-text articles, and scanned page images. Most of the approximately 100,000 bibliographic records come from ACM's Computing Archive. We have also incorporated publicly available bibliographies from Ohio State University, the University of Arizona, and the University of Melbourne. We have approximately 700 full-text articles from Communications of the ACM and several of the ACM Transactions. Finally, we have about 13,000 scanned page images from various ACM publications and from the technical report series of the Virginia Tech Department of Computer Science.

Envision Software

 

The major software components of the Envision system are the following.

  1.   The Envision Client. This component interacts with a user to accomplish the tasks of querying the Envision library and visualizing result sets in the Envision graphical display. This client interface is a major innovation of the Envision project and required the greatest amount of effort in interaction design and evaluation, in software design, and in software development.
  2.   A WWW Viewer. Envision employs a WWW browser as its presentation front end. Currently we use Mosaic running on a UNIX workstation.
  3.   The Envision Intermediary. This component communicates with the Envision client over the network to maintain session information, packages queries for the MARIAN search system, and packages result sets to pass back to the Envision client.
  4.   The MARIAN Search System. This component, developed in a separate research effort to access a library catalog, searches the Envision library for documents relevant to the user's query. The search can be based on a combination of title, author, and content words. Result sets are ranked by estimated relevance.
  5.   Enhanced WWW Server. Envision documents are viewed via a WWW interface that accesses a WWW server enhanced by CGI scripts that retrieve Envision objects from the object database and package them into HTML for presentation.
  6.   The Object Database. The Envision object database maintains our view of the structure of the library in terms of classes such as document, person (author), institution, publication, and keywords. Objects in this database refer to related objects, providing a rich hypermedia structure.
  7.   The DELTO System. The DELTO (Document Analysis and Translation) system converts documents, which are received for inclusion in the Envision library in many ill-defined input formats, into the standard SGML structural representation needed by the Envision object database and MARIAN searchers. This system emphasizes flexibility and automation. DELTO is a major innovation of the Envision project.

Components 1 and 2 run under the X Window System; these have been tested on Sun, DECstation, and DEC Alpha workstations. Components 3 and 4 run on a NeXTstation. Components 5, 6, and 7 run on a DEC Alpha and should port easily to other UNIX systems.
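
As a rough illustration of how components 4, 5, and 6 fit together, the sketch below models a few linked objects, ranks them against a query over title, author, and content words, and packages one object as HTML with links to related objects. It is a toy under stated assumptions, not the Envision or MARIAN implementation: the class fields, the field weights, the scoring rule, and the /person/ URL scheme are all hypothetical.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Person:
    name: str
    # Back-links to Document objects; excluded from repr/eq to avoid cycles.
    documents: list = field(default_factory=list, repr=False, compare=False)


@dataclass
class Document:
    doc_id: str
    title: str
    authors: list   # list of Person
    keywords: list  # list of str
    text: str = ""

    def __post_init__(self):
        # Register the reverse link so each author refers to this document.
        for author in self.authors:
            author.documents.append(self)


def score(doc, query_words, weights=(3.0, 2.0, 1.0)):
    """Toy relevance estimate. The weights are assumptions: a title match
    counts most, then an author match, then a content or keyword match."""
    title = set(doc.title.lower().split())
    authors = {w for a in doc.authors for w in a.name.lower().split()}
    content = set(doc.text.lower().split()) | {k.lower() for k in doc.keywords}
    w_title, w_author, w_content = weights
    return sum(
        w_title * (w in title) + w_author * (w in authors) + w_content * (w in content)
        for w in query_words
    )


def search(library, query):
    """Return matching documents ordered by decreasing estimated relevance."""
    words = query.lower().split()
    hits = [(score(d, words), d) for d in library]
    hits = [(s, d) for s, d in hits if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)  # stable for ties
    return [d for _, d in hits]


def to_html(doc):
    """Package a document object as HTML with a link per related author."""
    authors = ", ".join(
        f'<a href="/person/{escape(a.name.replace(" ", "_"))}">{escape(a.name)}</a>'
        for a in doc.authors
    )
    return f"<h1>{escape(doc.title)}</h1>\n<p>{authors}</p>"
```

Storing references in both directions (document to author and author to document) is one simple way to obtain the kind of navigable hypermedia structure that component 6 describes.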

A public release of the Envision software is due during the summer of 1995. The Envision client will be freely available over the Internet by anonymous ftp from Virginia Tech. Initially, the server components (3, 4, 5, 6, and 7) and the actual library of electronic documents will be released to the ACM, as well as used in a related NSF Educational Infrastructure project at Virginia Tech and Norfolk State University.

References

1
G. A. Averboch. A system for document analysis, translation, and automatic hypertext linking. Master's thesis, Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, 1995.

2
S. Betrabet, E. A. Fox, and Q. Chen. A query language for information graphs. Technical Report TR 93-03, Department of Computer Science, Virginia Polytechnic Institute and State University, 1993.

3
D. J. Brueni, B. Cross, E. A. Fox, L. S. Heath, D. Hix, L. T. Nowell, and W. C. Wake. What if there were desktop access to the computer science literature? In Proceedings of the 21st Annual ACM Computer Science Conference, pages 15-22, 1993. Also available as Tech. Report TR 92-42, Department of Computer Science, Virginia Polytechnic Institute and State University.

4
K. Dalal and E. A. Fox. Document translation: Dissertations and technical reports. Technical Report TR 93-31, Department of Computer Science, Virginia Polytechnic Institute and State University, 1993.

5
E. A. Fox. Digital libraries. IEEE Computer, 26(11):79-81, November 1993. "Hot Topics" section.

6
E. A. Fox. From information retrieval to networked multimedia information access. In G. Knorz, J. Krause, and C. Womser-Hacker, editors, Information Retrieval '93, Proc. der 1. Tagung Information Retrieval '93, pages 116-124. Univ. of Konstanz Press, 1993. Keynote address, 13-15 September, 1993, University of Regensburg, Germany.

7
E. A. Fox. Sourcebook on digital libraries: Report for the National Science Foundation. Technical Report TR 93-35, Department of Computer Science, Virginia Polytechnic Institute and State University, December 1993. Available by anonymous FTP from directory pub/DigitalLibrary on fox.cs.vt.edu. Over 400 pages.

8
E. A. Fox. A user-centered hypermedia database from the computer science literature. In Proceedings 159th National Meeting of the AAAS, AAAS'93: Science and Engineering for the Future, page 145, 1993. Invited presentation.

9
E. A. Fox. A digital library connecting Envision, KMS, and Mosaic with interfaces, communications, and data interchange. In 1994 Workshop on Digital Libraries: Current Issues, 1994. Invited presentation.

10
E. A. Fox. A digital library connecting Envision, KMS, and Mosaic with interfaces, communications, and data interchange. Invited presentation for 1994 Workshop on Digital Libraries: Current Issues, sponsored by Rutgers and Purdue Universities, AT&T, and Bellcore. At Rutgers Univ., Newark, NJ, May 19-20, 1994. Abstract in SIGOIS Bulletin, 15(1):6, August 1994.

11
E. A. Fox. How to make intelligent digital libraries. In Methodologies for Intelligent Systems, Proceedings of the 8th International Symposium, ISMIS '94, volume 869 of Lecture Notes in Artificial Intelligence, pages 27-38. Springer-Verlag, October 1994.

12
E. A. Fox. Seamless multimedia integration for digital libraries. In Dagstuhl Seminar on Fundamentals and Perspectives of Multimedia Systems, pages 118-123, July 1994. Invited position paper.

13
E. A. Fox. World-Wide Web and Computer Science reports. Communications of the ACM, 38(4):43-44, April 1995.

14
E. A. Fox and G. Abdulla. Digital video delivery for a digital library in computer science. In High-Speed Networking and Multimedia Computing Workshop, IS&T/SPIE Symposium on Electronic Imaging Science and Technology, February 1994. 7 pages.

15
E. A. Fox, R. M. Akscyn, R. K. Furuta, and J. J. Leggett. Guest editors' introduction to digital libraries. Communications of the ACM, 38(4):22-28, April 1995.

16
E. A. Fox and D. Barnette. Improving education through a computer science digital library with three types of WWW servers. In Proceedings Second International WWW '94: Mosaic and the Web, WWW'94, October 1994.

17
E. A. Fox, N. D. Barnette, C. A. Shaffer, L. S. Heath, W. C. Wake, L. T. Nowell, J. Lee, D. Hix, and H. R. Hartson. Progress in interactive learning with a digital library in computer science. Invited presentation. To appear in Proceedings ED-MEDIA 95, World Conference on Educational Multimedia and Hypermedia, Graz, Austria, 1995.

18
E. A. Fox, Q. F. Chen, A. M. Daoud, and L. S. Heath. Order-preserving minimal perfect hash functions and information retrieval. ACM Transactions on Information Systems, 9(3):281-308, 1991.

19
E. A. Fox, Q. F. Chen, and L. S. Heath. A faster algorithm for constructing minimal perfect hash functions. In Proceedings of the 15th Annual International Conference on Research and Development in Information Retrieval, pages 266-273, 1992.

20
E. A. Fox, Q. F. Chen, and L. S. Heath. LEND and faster algorithms for constructing minimal perfect hash functions. Technical Report TR 92-02, Department of Computer Science, Virginia Polytechnic Institute and State University, 1992.

21
E. A. Fox, R. France, E. Sahle, A. Daoud, and B. Cline. Development of a modern OPAC: From REVTOLC to MARIAN. In Proceedings 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '93, pages 248-259, June-July 1993. Also available as Tech. Report TR 93-06, Department of Computer Science, Virginia Polytechnic Institute and State University.

22
E. A. Fox, L. S. Heath, Q. F. Chen, and A. M. Daoud. Practical minimal perfect hash functions for large databases. Communications of the ACM, 35(1):105-121, January 1992.

23
E. A. Fox, L. S. Heath, and D. Hix. A user-centered database from the computer science literature. In W. W. Chu, A. F. Cardenas, and R. K. Taira, editors, Proceedings of the NSF Scientific Database Projects 1991-1993, pages 70-75, February 1993. AAAS Workshop on Advances in Data Management for the Scientist and Engineer, Boston, Massachusetts.

24
E. A. Fox, D. Hix, L. T. Nowell, D. J. Brueni, W. C. Wake, L. S. Heath, and D. Rao. Users, user interfaces, and objects: Envision, a digital library. Journal of the American Society for Information Science, 44(8):480-491, 1993.

25
E. A. Fox and L. Lunin. Introduction and overview to Perspectives on digital libraries. Journal of the American Society for Information Science, 44(8):441-443, September 1993. Guest editor's introduction to special issue.

26
J. C. French, E. A. Fox, K. Maly, and A. L. Selman. Wide area technical report service: Technical reports online. Communications of the ACM, 38(4):45, April 1995.

27
J. L. Ganley and L. S. Heath. Local search for the retrieval layout problem. Submitted to OR Spektrum. Also available as Tech. Report TR 93-28, Department of Computer Science, Virginia Polytechnic Institute and State University, 1993.

28
J. L. Ganley and L. S. Heath. Heuristics for laying out information graphs. Computing, 52:389-405, 1994. Also available as Tech. Report TR 93-27, Department of Computer Science, Virginia Polytechnic Institute and State University.

29
J. L. Ganley and L. S. Heath. Optimal and random partitions of random graphs. The Computer Journal, 37:641-643, 1994. Also available as Tech. Report TR 93-24, Department of Computer Science, Virginia Polytechnic Institute and State University.

30
H. Gladney, Z. Ahmed, R. Ashany, N. Belkin, E. A. Fox, and M. Zemankova. Digital library: Gross structure and requirements (report from a workshop). Technical Report TR 94-25, Department of Computer Science, Virginia Polytechnic Institute and State University, 1994. Revised version accepted for publication in Electronic Publishing: Origination, Dissemination, and Design.

31
H. Gladney, E. A. Fox, Z. Ahmed, R. Ashany, N. Belkin, and M. Zemankova. Digital library: Gross structure and requirements: Report from a March 1994 workshop. In J. Schnase, J. Leggett, R. Furuta, and T. Metcalfe, editors, Digital Libraries '94, pages 101-107, 1994.

32
L. S. Heath, D. Hix, L. T. Nowell, W. C. Wake, G. A. Averboch, E. Labow, S. A. Guyer, D. J. Brueni, R. K. France, K. Dalal, and E. A. Fox. Envision: A user-centered database of computer science literature. Communications of the ACM, 38(4):52-53, April 1995.

33
J. W. Lavinus. Heuristics for laying out information graphs. Master's thesis, Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, August 1992.

34
J. Leggett, J. Schnase, J. Smith, and E. A. Fox. Final report of the NSF workshop on hyperbase systems. Department of Computer Science, Texas A&M University, Hypermedia Research Lab Report TAMU-HRL 93-002, July 1993. For workshop on October 15-16, 1992, in Washington, D.C.

35
K. Maly, E. A. Fox, J. C. French, and A. L. Selman. Wide area technical report service. Technical Report TR 92-44, Department of Computer Science, Old Dominion University, 1992.

36
K. Maly, J. C. French, A. L. Selman, and E. A. Fox. The wide area technical report service. In Proceedings Second International WWW '94: Mosaic and the Web, WWW'94, pages 523-533, October 1994. Also available as Tech. Report TR 94-13, Department of Computer Science, Old Dominion University.

37
L. T. Nowell. Psychophysical Foundations of Information Visualization for the Envision Digital Library. PhD thesis, Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, 1996. In progress.

38
L. T. Nowell, E. A. Fox, L. S. Heath, D. Hix, W. C. Wake, and E. A. Labow. Seeing things your way: Information visualization for a user-centered database of computer science literature. Technical Report TR 94-06, Department of Computer Science, Virginia Polytechnic Institute and State University, 1994.

39
L. T. Nowell and D. Hix. User interface design for the project Envision database of computer science literature. In Proceedings of the Twenty-second Annual Virginia Computer Users Conference, pages 29-33, 1992.

40
L. T. Nowell and D. Hix. Query composition: Why does it have to be so hard? In Proceedings of the East-West International Conference on Human-Computer Interaction, volume I, pages 226-241, 1993. Moscow, Russia, August, 1993. Also available as Tech. Report TR 93-19, Department of Computer Science, Virginia Polytechnic Institute and State University.

41
L. T. Nowell and D. Hix. Visualizing search results: User interface development for the project Envision database of computer science literature. In Human-Computer Interaction: Software and Hardware Interfaces, volume 19B of Advances in Human Factors/Ergonomics, Proceedings of HCI International '93, 5th International Conference on Human Computer Interaction, pages 56-61. Elsevier, 1993.

42
W. C. Wake. Account number: A pattern. In J. O. Coplien and D. C. Schmidt, editors, Pattern Languages of Programs. Addison-Wesley, Reading, MA, May 1995.

43
W. C. Wake. A Model and Interface for Documents with Multiple Views. PhD thesis, Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, 1995. In progress.

Envision Researchers

To be inserted when latex2html can manage the table.


