+ Page 1 +

-----------------------------------------------------------------
The Public-Access Computer Systems Review

Volume 5, Number 6 (1994)                          ISSN 1048-6542
-----------------------------------------------------------------

To retrieve an article file as an e-mail message, send the GET
command given after the article information to
listserv@uhupvm1.uh.edu. (Files are also available from the
University of Houston Libraries' Gopher server: info.lib.uh.edu,
port 70.)

                           CONTENTS

COMMUNICATIONS

The World-Wide Web and Mosaic: An Overview for Librarians
By Eric Lease Morgan (pp. 5-26)

To retrieve this file: GET MORGAN PRV5N6 F=MAIL

URL: gopher://info.lib.uh.edu:70/00/articles/e-journals/uhlibrary/pacsreview/v5/n6/morgan.5n6

This paper overviews the World-Wide Web (frequently abbreviated as
the "Web") and related systems and standards. First, it introduces
Web concepts and tools and describes how they fit together to form
a coherent whole, including the client/server model of computing,
the Uniform Resource Locator (URL), selected Web client and server
programs, the HyperText Transfer Protocol (HTTP), the HyperText
Markup Language (HTML), selected HTML converters and editors, and
Common Gateway Interface (CGI) scripts. Second, it discusses
strategies for organizing Web information. Finally, it advocates
the direct involvement of librarians in the development of Web
information resources.

COLUMNS

Public-Access Provocations: An Informal Column

And Only Half of What You See, Part III: I Heard It Through the
Internet
By Walt Crawford (pp. 27-30)

To retrieve this file: GET CRAWFORD PRV5N6 F=MAIL

URL: gopher://info.lib.uh.edu:70/00/articles/e-journals/uhlibrary/pacsreview/v5/n6/crawford.5n6

+ Page 2 +

-----------------------------------------------------------------
The Public-Access Computer Systems Review
-----------------------------------------------------------------

Editor-in-Chief

Charles W. Bailey, Jr.
University Libraries
University of Houston
Houston, TX 77204-2091
(713) 743-9804
Internet: lib3@uhupvm1.uh.edu

Associate Editors

Columns: Leslie Pearse, OCLC
Communications: Dana Rooks, University of Houston

Editorial Board

Ralph Alberico, University of Texas, Austin
George H. Brett II, Clearinghouse for Networked Information
     Discovery and Retrieval
Priscilla Caplan, University of Chicago
Steve Cisler, Apple Computer, Inc.
Walt Crawford, Research Libraries Group
Lorcan Dempsey, University of Bath
Pat Ensor, University of Houston
Nancy Evans, Pennsylvania State University, Ogontz
Charles Hildreth, University of Oklahoma
Ronald Larsen, University of Maryland
Clifford Lynch, Division of Library Automation, University of
     California
David R. McDonald, Tufts University
R. Bruce Miller, University of California, San Diego
Paul Evan Peters, Coalition for Networked Information
Mike Ridley, University of Waterloo
Peggy Seiden, Skidmore College
Peter Stone, University of Sussex
John E. Ulmschneider, North Carolina State University

+ Page 3 +

Technical Support

Tahereh Jafari, University of Houston

Publication Information

Published on an irregular basis by the University Libraries,
University of Houston. Technical support is provided by the
Information Technology Division, University of Houston.

Circulation: 8,372 subscribers in 65 countries (PACS-L) and 2,711
subscribers in 52 countries (PACS-P).

Back issues are available from listserv@uhupvm1.uh.edu. To
retrieve a cumulative index to the journal, send the following
e-mail message to the list server: GET INDEX PR F=MAIL.
Back issues are also available from the University of Houston
Libraries' Gopher server. Point your Gopher client at
info.lib.uh.edu, port 70, and follow this menu path:

     Looking for Articles
     Electronic Journals
     E-Journals Published by the University of Houston Libraries
     The Public-Access Computer Systems Review

The journal's URL is
gopher://info.lib.uh.edu:70/11/articles/e-journals/uhlibrary/pacsreview.

The first three volumes of The Public-Access Computer Systems
Review are also available in book form from the American Library
Association's Library and Information Technology Association
(LITA). (Volume four is forthcoming.) The price of each volume is
$17 for LITA members and $20 for non-LITA members. All three
volumes can be ordered as a set for $45 (indicate that you want
the PACS Review set, order number 7712-X). To order, contact: ALA
Publishing Services, Order Department, 50 East Huron Street,
Chicago, IL 60611-2729, (800) 545-2433.

+ Page 4 +

-----------------------------------------------------------------
The Public-Access Computer Systems Review is an electronic journal
that is distributed on the Internet and on other computer
networks. There is no subscription fee.

To subscribe, send an e-mail message to listserv@uhupvm1.uh.edu
that says: SUBSCRIBE PACS-P First Name Last Name.

The Public-Access Computer Systems Review is Copyright (C) 1994 by
the University Libraries, University of Houston. All Rights
Reserved.

Copying is permitted for noncommercial use by academic computer
centers, computer conferences, individual scholars, and libraries.
Libraries are authorized to add the journal to their collection,
in electronic or printed form, at no charge. This message must
appear on all copied material. All commercial use requires
permission.
-----------------------------------------------------------------

+ Page 27 +

-----------------------------------------------------------------
Public-Access Provocations: An Informal Column
-----------------------------------------------------------------

-----------------------------------------------------------------
Crawford, Walt. "And Only Half of What You See, Part III: I Heard
It Through the Internet." The Public-Access Computer Systems
Review 5, no. 6 (1994): 27-30. To retrieve this file, send the
following e-mail message to listserv@uhupvm1.uh.edu: GET CRAWFORD
PRV5N6 F=MAIL. Or, use the following URL:
gopher://info.lib.uh.edu:70/00/articles/e-journals/uhlibrary/pacsreview/v5/n6/crawford.5n6.
-----------------------------------------------------------------

Effective public access requires skeptical users, a point that the
previous two Public-Access Provocations tried to make indirectly.
Just because something comes from "the computer," there is no
reason to believe that it's correct--and, although library
cataloging represents one of the treasures of the profession,
catalogs aren't always completely trustworthy either. But at least
library catalogs represent sincere efforts to provide useful,
validated, even authority-controlled information. Similarly,
although commercial online databases are rife with typos and other
errors, it is still true that the databases available on Eureka,
FirstSearch, Dialog, and the like represent reasonable attempts to
organize data into useful information with good levels of
correctness.
Then there's the Internet, the nascent Information Superhighway
according to some, where everything's up to date and the hottest
information is available by clicking away at Mosaic or using WAIS
to find out everything you could ever want to know, magically
arranged so that the first thing you get is the most useful! And,
with disintermediation and direct usage from every home (and a
cardboard box under the freeway?), tomorrow's super-Internet will
offer this wonderland to everyone, all the time, making everyone
potentially an up-to-date expert on whatever.

Skeptical? Why? It's hot, it's happening, it's now--it's on the
Internet!

+ Page 28 +

Seventy Elements: More Than Enough!

Thus we can expect to have fledgling scientists learning the new
and improved seventy-element periodic table with innovative new
element symbols. It must be right--it's on the Internet.

I could go on with hundreds of examples; as one version of that
famous cartoon goes, "On the Internet, nobody knows you're a
fraud."

Of course, truly up-to-date users may be wary of something that's
just boring old ASCII. If they can't chew up bandwidth with neat
color pictures or (preferably) important live video--such as vital
visual information on how the coffee maker at some university lab
is doing right now--why would they want to be bothered? The newest
and most correct information will all be graphical, accessed
through Mosaic or some replacement.

Traditionally, well-done presentations have added weight to
content: there was an assumption that anyone with the resources to
do high-quality graphics and good text layout would probably pay
attention to the content. That was never a good assumption, of
course, but at least it separated well-funded frauds from casual
cranks and those who simply couldn't be bothered to check their
facts.

That's all changed. It doesn't take much to build truly impressive
World-Wide Web servers. Anyone with an Internet connection and a
decent graphics toolkit can create pages just as impressive as
anything from the Library of Congress or NASA--but without any
regard for factuality or meaning. You don't even need good taste
to build impressive presentations; modern software will provide
professional defaults so that you just add your erroneous or
misleading text and graphics.

Knowing the Source

The anarchic nature of the Internet and the leveling effect of
today's software raise the importance of cultivating appropriate
skepticism among users, which must begin with appropriate
skepticism among librarians and other library staff. For starters,
Internet searchers must be trained to look for (and understand)
the source of stuff that comes over the Net, but they must also
learn to go beyond simple source awareness.

+ Page 29 +

Some Internet navigation tools tend to mask sources, and that can
be dangerous. There are thousands of cranks on the Internet now,
and there will be even more in the future. Given a few thousand
dollars and a few weeks of time, I could prepare a Library of
Regress server that could be seen as a serious competitor to the
Library of Congress--never mind that everything at the Library of
Regress was at least half wrong, or at best meaningless. A
neo-Marxist crank could create an impressive news bureau and be
taken quite as seriously as a major news agency, even if that
crank made up the supposed news flashes and wildly misinterpreted
real events. A few MIT students with good software could provide a
steady stream of Rubble Telescope (or Hobbled Telescope?)
discoveries based on creatively modified clip art--and they would
probably even have a ".mit.edu" suffix, assuring credibility. (To
the best of my knowledge, all of these examples are hypothetical.
I use MIT as an example because of its reputation for ingenious
pranks.)

What's the solution? Certainly not to restrict Internet access to
a few hallowed and licensed information providers. That would be
even more dangerous to our society than having huge gobs of
erroneous material on the Net and is, I believe, an impossibility
as things stand. Rather, if there is a solution, it is to
inculcate caution and healthy skepticism among users of the
Internet and other immediate resources: to make them understand
that being online and apparently up-to-date confers no authority
or even probability of correctness on the information they see.

One way to start may be to use a different name for the Internet.
It's not the Information Superhighway; it's the Stuff Swamp. There
is a lot of good stuff out there, to be sure--but it's still a
swamp, and a heavily polluted one at that. Wear your hip boots
when you go out on the Internet; the stuff can get pretty thick at
times.

About the Author

Walt Crawford, Senior Analyst, The Research Libraries Group, Inc.,
1200 Villa Street, Mountain View, CA 94041-1100. Internet:
br.wcc@rlg.stanford.edu.

+ Page 30 +

-----------------------------------------------------------------
The Public-Access Computer Systems Review is an electronic journal
that is distributed on the Internet and on other computer
networks. There is no subscription fee.

To subscribe, send an e-mail message to listserv@uhupvm1.uh.edu
that says: SUBSCRIBE PACS-P First Name Last Name.

This article is Copyright (C) 1994 by Walt Crawford. All Rights
Reserved.

The Public-Access Computer Systems Review is Copyright (C) 1994 by
the University Libraries, University of Houston. All Rights
Reserved.

Copying is permitted for noncommercial use by academic computer
centers, computer conferences, individual scholars, and libraries.
Libraries are authorized to add the journal to their collection,
in electronic or printed form, at no charge. This message must
appear on all copied material. All commercial use requires
permission.
-----------------------------------------------------------------

+ Page 5 +

-----------------------------------------------------------------
Morgan, Eric Lease. "The World-Wide Web and Mosaic: An Overview
for Librarians." The Public-Access Computer Systems Review 5, no.
6 (1994): 5-26. To retrieve this file, send the following e-mail
message to listserv@uhupvm1.uh.edu: GET MORGAN PRV5N6 F=MAIL. Or,
use the following URL:
gopher://info.lib.uh.edu:70/00/articles/e-journals/uhlibrary/pacsreview/v5/n6/morgan.5n6.
-----------------------------------------------------------------

1.0 Introduction

     The WorldWideWeb (W3) is the universe of network-accessible
     information, an embodiment of human knowledge. It is an
     initiative started at CERN, now with many participants. It
     has a body of software, and a set of protocols and
     conventions. W3 uses hypertext and multimedia techniques to
     make the web easy for anyone to roam, browse, and contribute
     to. [1]

This paper overviews the World-Wide Web (frequently abbreviated as
"W3," "WWW," or the "Web") and related systems and standards. [2]
First, it introduces Web concepts and tools and describes how they
fit together to form a coherent whole, including the client/server
model of computing, the Uniform Resource Locator (URL), selected
Web client and server programs, the HyperText Transfer Protocol
(HTTP), the HyperText Markup Language (HTML), selected HTML
converters and editors, and Common Gateway Interface (CGI)
scripts. Second, it discusses strategies for organizing Web
information. Finally, it advocates the direct involvement of
librarians in the development of Web information resources.

2.0 Background

In 1989, Tim Berners-Lee of CERN (a particle physics laboratory in
Geneva, Switzerland) began work on the World-Wide Web. The Web was
initially intended as a way to share information between members
of the high-energy physics community. [3] By 1991, the Web had
become operational.

The Web is a hypertext system. The hypertext concept was
originally described by Vannevar Bush, [4] and the term
"hypertext" was coined by Theodor H. Nelson. [5] In a hypertext
system, the reader is presented with a document that has "links"
to other documents that relate to the original document and
provide further information about it.

+ Page 6 +

Scholarly journal articles represent an excellent application of
this technology. For example, scholarly articles usually include
multiple footnotes. With an article in hypertext form, the reader
could select a footnote number in the body of the article and be
"transported" to the appropriate citation in the notes section.
The citation, in turn, could be linked to the cited article, and
the process could go on indefinitely. The reader could also
backtrack and follow links back to where he or she started.

The HyperText Transfer Protocol (HTTP) that allows Web servers and
clients to communicate is older than the Gopher protocol. The
original CERN Web server ran under the NeXTStep operating system,
and, since few people owned NeXT computers, HTTP did not become
very popular. Similarly, the client side of the HTTP equation
included a terminal-based system few people thought was
aesthetically appealing. [6] All this was happening just as the
Gopher protocol was becoming more popular. Since Gopher server and
client software was available for many different computing
platforms, the Gopher protocol's popularity grew while HTTP's
languished.

It wasn't until early 1993 that the Web really started to become
popular. At that time, Rob McCool and Marc Andreessen, who worked
for the National Center for Supercomputing Applications (NCSA),
wrote both Web client and server applications. Since the server
application (httpd) was available for many flavors of UNIX, not
just NeXTStep, the server could be easily used by many sites.
Since the client application (NCSA Mosaic for the X Window System)
supported graphics, WAIS, Gopher, and FTP access, it was head and
shoulders above the original CERN client in terms of aesthetic
appeal as well as functionality. Later, a more functional
terminal-based client (Lynx) was developed by Lou Montulli, who
was then at the University of Kansas. Lynx made the Web accessible
to the lowest common denominator devices, VT100-based terminals.
When NCSA later released Macintosh and Microsoft Windows versions
of Mosaic, the Web became even more popular. Since then, other Web
client and server applications have been developed, but the real
momentum was created by the developers at NCSA. [7]

3.0 The Client/Server Model

To truly understand how much of the Internet operates, including
the Web, it is important to understand the concept of
client/server computing. The client/server model is a form of
distributed computing where one program (the client) communicates
with another program (the server) for the purpose of exchanging
information. [8]

+ Page 7 +

The client's responsibility is usually to:

     o  Handle the user interface.
     o  Translate the user's request into the desired protocol.
     o  Send the request to the server.
     o  Wait for the server's response.
     o  Translate the response into "human-readable" results.
     o  Present the results to the user.

The server's functions include:

     o  Listen for a client's query.
     o  Process that query.
     o  Return the results back to the client.

A typical client/server interaction goes like this:

     1.  The user runs client software to create a query.
     2.  The client connects to the server.
     3.  The client sends the query to the server.
     4.  The server analyzes the query.
     5.  The server computes the results of the query.
     6.  The server sends the results to the client.
     7.  The client presents the results to the user.
     8.  Repeat as necessary.

This client/server interaction is a lot like going to a French
restaurant. At the restaurant, you (the user) are presented with a
menu of choices by the waiter (the client). After making your
selections, the waiter takes note of your choices, translates them
into French, and presents them to the French chef (the server) in
the kitchen. After the chef prepares your meal, the waiter returns
with your dinner (the results). Hopefully, the waiter returns with
the items you selected, but not always; sometimes things get "lost
in the translation."

+ Page 8 +

Flexible user interface development is the most obvious advantage
of client/server computing. It is possible to create an interface
that is independent of the server hosting the data. Therefore, the
user interface of a client/server application can be written on a
Macintosh and the server can be written on a mainframe. Clients
could also be written for DOS- or UNIX-based computers. This
allows information to be stored in a central server and
disseminated to different types of remote computers.

Since the user interface is the responsibility of the client, the
server has more computing resources to spend on analyzing queries
and disseminating information. This is another major advantage of
client/server computing; it tends to use the strengths of
divergent computing platforms to create more powerful
applications. Although its computing and storage capabilities are
dwarfed by those of a mainframe, there is no reason why a
Macintosh could not be used as a server for less demanding
applications.

In short, client/server computing provides a mechanism for
disparate computers to cooperate on a single computing task.
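As a rough, hypothetical illustration of the interaction outlined
above--added for this overview and not part of the original
article--the following sketch in Python shows a toy server that
listens for a one-line query and a toy client that sends one and
waits for the results. The host name and port number are
arbitrary.

     import socket

     HOST, PORT = "localhost", 8888   # arbitrary address for this sketch

     def run_server():
         # The server listens for a client's query, processes it, and
         # returns the results (steps 4 through 6 above).
         with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
             server.bind((HOST, PORT))
             server.listen(1)
             connection, _ = server.accept()
             with connection:
                 query = connection.recv(1024).decode()
                 results = query.upper()     # a trivial "computation"
                 connection.sendall(results.encode())

     def run_client(query):
         # The client connects, sends the query, waits for the response,
         # and presents the results (steps 1 through 3 and 7 above).
         with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
             client.connect((HOST, PORT))
             client.sendall(query.encode())
             return client.recv(1024).decode()

     # Running run_server() in one session and run_client("hello") in
     # another returns "HELLO".

Real Web clients and servers exchange HTTP requests and HTML
documents rather than upper-cased strings, but the division of
labor is the same.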
4.0 Uniform Resource Locator

The Uniform Resource Locator (URL) is a fundamental part of the
Web. It concisely describes both the protocol used to access an
Internet resource and the resource's location. [9] In general, a
URL has the following form:

     protocol://host/path/file

"Protocol" denotes the type of Internet resource. The most common
are: "gopher," "wais," "ftp," "telnet," "http," "file," and
"mailto" (electronic mail). "Host" denotes the name or IP
(Internet Protocol) address of the remote computer (e.g.,
152.1.39.42 or www.lib.ncsu.edu). "Path" is a directory or
subdirectory on a remote computer. "File" is the name of the file
you want to access.
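To make these components concrete, here is a short sketch--added
for this overview in Python, not part of the original article--
that breaks one of the URLs used later in this paper into its
protocol, host, path, and file parts:

     from urllib.parse import urlparse

     url = "http://www.lib.ncsu.edu/stacks/alawon-index.html"
     parts = urlparse(url)

     print(parts.scheme)   # "http" -- the protocol
     print(parts.netloc)   # "www.lib.ncsu.edu" -- the host
     print(parts.path)     # "/stacks/alawon-index.html" -- path and file
     print(parts.path.rsplit("/", 1)[-1])   # "alawon-index.html" -- the file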
Using variations of this general form, you can use URLs and Web
browsers to access just about any Internet resource. Here is an
example of a URL for an FTP session:

     ftp://ftp.lib.ncsu.edu/pub/stacks/alawon/alawon-v1n04

This URL results in the following actions:

     1.  FTP to ftp.lib.ncsu.edu,
     2.  log on as anonymous,
     3.  change the directory to /pub/stacks/alawon/, and
     4.  get the file alawon-v1n04.

Since Web browsers understand and implement the File Transfer
Protocol (FTP), you do not have to remember all the commands
necessary to do FTP. All you have to remember is how to create a
URL for an FTP session.

+ Page 9 +

Here is an example of a URL for an HTML document:

     http://www.lib.ncsu.edu/stacks/alawon-index.html

This URL opens up an HTTP connection to www.lib.ncsu.edu, changes
the directory to stacks, and retrieves the file alawon-index.html.

URLs can be more complicated than the general form illustrated
above; URLs can also provide the means to present the logon name
for Telnet connections, a communications port, an index/search
query, and/or an HTML anchor. Here is an example of a URL for a
Telnet session:

     telnet://library@library.ncsu.edu:23/

In this example, "library" denotes the logon name and "23" denotes
the communications port. (Port 23 is the standard Telnet
communications port.) Thus, a Web browser can initiate a Telnet
session. This example opens up a Telnet connection to
"library.ncsu.edu," and, depending on the user's browser, the user
may be reminded to log on as "library." This URL does not use the
"path" or "file" parameters because they are meaningless for
Telnet sessions.

On the other hand, to manually query the Geographic Name Server,
the URL would be:

     telnet://martini.eecs.umich.edu:3000/

Since the Geographic Name Server requires no password, no password
is specified; however, since the Geographic Name Server "listens"
on port 3000, a nonstandard port number must be specified.

WAIS searches can be specified using URLs. Unfortunately, at the
present time, only NCSA Mosaic for the X Window System directly
implements the WAIS protocol. WAIS URLs have the following form:

     wais://host:port/database?query

"Port" is assumed to be 210 (the standard WAIS/Z39.50 port),
"database" is the source file to search, "?" delimits the database
from the query, and "query" is your search strategy. Here is an
example of a URL for a WAIS search:

     wais://vega.lib.ncsu.edu/alawon.src?nren

+ Page 10 +

Gopher servers and files can be specified with URLs as well. Since
Gopher resource specifications require "Type" identifiers and
paths to Gopher resources often include spaces, Gopher URLs
usually deviate from the norm. Here is an example of a URL for a
Gopher subdirectory:

     gopher://gopher.lib.ncsu.edu/11/library/

Notice the pair of 1's after the Internet name of the computer.
These 1's specify the resource as a directory. On the other hand,
the following URL specifies a specific text file within that
directory:

     gopher://gopher.lib.ncsu.edu/00/library/about

The "00" denotes a text file.

Constructing URLs is more difficult when the path and/or file
names of the Internet resources contain special characters like
spaces or colons. In these cases, escape codes must be used to
denote the special characters. For example:

     gopher://gopher.lib.ncsu.edu/0ftp%3amrcnext.cso.uiuc.edu%40/pub/etext/etext91/aesop11.txt

This long URL first asks a Gopher server (gopher.lib.ncsu.edu) to
FTP a file (aesop11.txt) from an anonymous FTP server
(mrcnext.cso.uiuc.edu). Notice the "%3a" and "%40" in the URL.
They are used to denote a colon (":") and an at sign ("@"),
respectively. Furthermore, notice the zero preceding the "ftp."
This is used to identify the remote file as a text file.
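If you do have to build such a URL by hand, the escape codes can
be computed rather than memorized. The following sketch--again in
Python, added for this overview and not part of the original
article--percent-encodes the special characters in the Gopher
selector shown above. (The library produces uppercase hexadecimal
codes such as "%3A," which are equivalent to the lowercase forms.)

     from urllib.parse import quote, unquote

     # Percent-encode the special characters in a Gopher selector
     # string, leaving the path slashes alone.
     selector = "ftp:mrcnext.cso.uiuc.edu@/pub/etext/etext91/aesop11.txt"
     print(quote(selector, safe="/"))
     # ftp%3Amrcnext.cso.uiuc.edu%40/pub/etext/etext91/aesop11.txt

     print(unquote("%3a"))   # ":"
     print(unquote("%40"))   # "@"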
As you can see, Gopher URLs are particularly difficult to
decipher. The easiest way to construct a URL for a Gopher item is
to access the Gopher server via a Web client, traverse the Gopher
menus until you locate the resource, and then copy the displayed
URL from the appropriate part of your client's screen.

In summary, URLs unambiguously describe the location of Internet
resources. Using URLs as a standard, Internet client programs like
Web browsers can interpret URLs and retrieve the desired
information. URLs describe the protocols and locations of Internet
resources without regard to the particular Internet client
software the user is employing to access them.

+ Page 11 +

5.0 Example Web Client Software

Four examples of Web client software are described here: MacWeb,
NCSA Mosaic for Microsoft Windows, Lynx, and NCSA Mosaic for the X
Window System. These particular pieces of software are described
because I think they presently represent the best clients for the
most common computing environments (i.e., Macintosh, Microsoft
Windows, character-terminal-based VMS or UNIX, and X Window
System).

The real power of these Web clients (usually referred to as
"browsers") is their ability to understand multiple Internet
protocols. Each of the browsers described understands how to FTP
files, act as a Gopher client, and read and interpret the output
of Web servers. Additionally, each of these pieces of software
understands "forms," an HTML extension allowing the user to
complete electronic forms similar to Gopher+ ASK blocks. While
none of these clients can directly understand the Telnet protocol,
each can be configured to load and run Telnet software.

5.1 MacWeb

As the name implies, MacWeb is a Web browser for the Macintosh.
Written at the Microelectronics and Computer Technology
Corporation (MCC), MacWeb is distributed via the Enterprise
Integration Network (EINet). [10] MacWeb requires System 7 and at
least MacTCP version 2.0.2. MacTCP is an operating system
extension available from Apple Computer that allows Macintosh
computers to understand the Transmission Control Protocol/Internet
Protocol (TCP/IP) necessary for Internet communications.

A very important piece of software called "StuffIt Expander" is
strongly recommended when using MacWeb or NCSA Mosaic for the
Macintosh (MacMosaic). [11] StuffIt Expander is a utility program
used to translate and uncompress files; compressed files are
usually retrieved via FTP archives.

The advantages of MacWeb are that it is fast, has an elegant and
easily customizable interface, supports the automatic creation of
HTML documents from its hotlists, and indirectly supports the WAIS
protocol by launching MCC's WAIS client, MacWAIS. Its
disadvantages are that you cannot select and copy text directly
from the screen and, when the displayed text is saved as a text
file, the displayed text loses all of its formatting.

+ Page 12 +

5.2 NCSA Mosaic for Microsoft Windows

NCSA Mosaic for Microsoft Windows is bound to be one of the more
popular Web browsers since most people have or will have Microsoft
Windows-based computers. [12] NCSA Mosaic for Microsoft Windows
requires a WINSOCK.DLL. Like MacTCP, the WINSOCK.DLL software
allows your computer to understand TCP/IP. Common WinSock packages
include LAN WorkPlace for DOS and Trumpet WinSock.
Additionally, NCSA Mosaic for Microsoft Windows requires the
32-bit Windows extensions (Win32s). Win32s runs on 80386, 80486,
or Pentium computers. The Win32s software is available via
anonymous FTP from NCSA.

One of the nicest features of NCSA Mosaic for Microsoft Windows is
the ability to customize its menu bar. By editing the MOSAIC.INI
file, you can delete or add menu items to the menu bar.
Consequently, you can configure the client and have it display
commonly used Internet resources.

At the present time, you cannot select or copy text from the
screen. Therefore, if you want to save displayed text, you must
use the application's "Load to Disk" option.

5.3 Lynx

Lynx is a basic Web browser that is intended to be used on DOS
computers or "dumb" terminals running under the UNIX or VMS
operating systems. [13] Lynx clients are wonderful when your only
Internet connection is located on a remote computer (i.e., most
dial-in access) or when you need to provide a lowest common
denominator interface (e.g., VT100 terminals).

Lynx clients don't support image or audio data, but they do
support the "mailto" URL. Mailto URLs are used for the Simple Mail
Transfer Protocol (SMTP), the Internet mail standard. When a Lynx
client user selects a mailto URL, the user will be presented with
a "form" to complete, and the resulting text from the form will be
delivered via Internet mail to the person or computer specified in
the URL.

+ Page 13 +

5.4 NCSA Mosaic for the X Window System

NCSA Mosaic for the X Window System, coupled with NCSA's Web
server (httpd), really gave the Web the momentum and visibility it
has today. [14] This full-featured browser supports copy and paste
from the display. Direct WAIS support is also provided, and URLs
such as wais://wais.lib.ncsu.edu/alawon?nren are valid. At the
present time, just about the only thing it doesn't support is the
mailto URL.

The disadvantage of NCSA Mosaic for the X Window System is that it
requires a relatively powerful computer. While a Macintosh
equipped with MacX or a Microsoft Windows computer with
HummingBird Communications' eXceed/W can run X Window terminal
sessions, NCSA Mosaic for the X Window System really requires
direct access to a UNIX or VMS machine running the X Window System
software.

6.0 Example Web Server Software

If you want to become a Web information provider, you need to
utilize Web server software. This section describes the most
popular Web server software for the most common computing
platforms (i.e., Macintosh, UNIX, VMS, and Microsoft Windows).

6.1 MacHTTP

MacHTTP is a Web server for Macintosh computers. [15] Written by
Chuck Shotton, MacHTTP is one of the easiest servers to set up and
configure. In fact, it is so easy it works "straight out of the
box." MacHTTP requires System 7 to support advanced features like
AppleScript. MacHTTP runs on Macintosh II-type computers (e.g.,
Macintosh IIci, SE/30, LC, Centris, and Quadra computers). It does
not run on low-end Macintoshes based on the Motorola 68000
microprocessor (e.g., Macintosh Plus, SE, and PowerBook 100
computers). MacHTTP requires MacTCP.

+ Page 14 +

Because of its simple installation, I recommend the use of MacHTTP
to learn the basics of Web servers. Since it is so small, just
about anyone can create a server on their desktop computer and
effectively experiment with serving HTML documents. A Macintosh is
not recommended as an institution's primary server, since the
potential user population may be very large.
On the other hand, a group of Macintosh servers that were linked
together via the HTTP protocol to form a single virtual server
could easily distribute the load, with each server supporting a
subset of an institution's HTML documents.

6.2 NCSA httpd

Based on the number of postings to comp.infosystems.www
newsgroups, NCSA's httpd seems to be the most popular Web server.
Running under the UNIX operating system, httpd is distributed both
as source code and in binary form for the many "flavors" of UNIX.
[16] This server is robust and only slightly difficult to
configure.

If you have a UNIX computer at your disposal and your server's
intended audience is large, then I recommend the use of NCSA
httpd. I recommend this for several reasons. First, this server is
widely supported by the Internet community; you can always find an
expert, and it is easier to get help for this server than for the
CERN server. Second, since it runs under UNIX, it is intended to
coexist with other applications running on the same computer, like
Gopher, WAIS, or a list server. Finally, many Common Gateway
Interface (CGI) scripts are written in Perl, a programming
language most at home on a UNIX computer. (CGI scripts are
described in more detail later.)

6.3 CERN httpd

If you have a VMS computer, you cannot use the NCSA httpd server;
however, there is an appropriate Web server available. It is a
port of the CERN httpd server by Foteos Macrides of the Worcester
Foundation for Experimental Biology. Like the servers described
previously, the CERN httpd server for VMS comes in binary form as
well as in source code form. [17] Configuration is not as easy as
with MacHTTP or NCSA httpd for Windows, but it is no more
difficult than with NCSA's httpd server for UNIX. Presently, the
server does not support the POST method, the preferred method of
transmitting information from forms to CGI scripts, but it works
just the same. One advantage of VMS is its strong scripting
language, DCL. DCL works well for CGI scripts.

+ Page 15 +

If you plan to maintain a server, your intended audience is large,
and you have a VMS computer at your disposal, then I recommend
using this server software. If you have a UNIX computer, use the
NCSA httpd server instead.

6.4 NCSA httpd for Windows

Robert B. Denny has ported the NCSA httpd server to Microsoft
Windows. [18] Like MacHTTP, it worked for me "right out of the
box," and it supports all the standard features, such as forms,
CGI scripts, graphics, and access control. Its disadvantages are
that it is considered slow and it requires a lot of system
resources (memory and CPU power) as well as a WinSock-compatible
TCP/IP driver (just like NCSA Mosaic for Microsoft Windows). This
server would make a good platform for PC users to learn the basics
of HTTP and server maintenance. As with MacHTTP, I would not
recommend this application as the main server of an institution,
such as an academic library.

7.0 Web Servers Versus Gopher Servers

There are several reasons why Web servers should be used instead
of Gopher servers. First, in terms of computing resources, Web
servers are more efficient since most of the information
processing is distributed to the client software. A Gopher client
can effectively have access to FTP and WAIS services, but the
Gopher server is doing all the work. On the other hand, Web
clients (for the most part) understand these protocols and take
the load off the server.

Second, because Web clients understand HTML, Web servers are not
limited to making their information available via menus. Thus,
more descriptive texts and abstracts can be added to hypertext
links, making it easier for the user to evaluate possible choices.

Third, Web servers are significantly easier to maintain. For
example, every "study carrel" of the North Carolina State
University Libraries' Web server consists of a single HTML file
created either with a public domain editor or via a report from a
database program. This is so much easier to maintain and manage
than all the link files and directories of the study carrels in
the Libraries' Gopher server.
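As a rough, hypothetical illustration of the "report from a
database program" approach--added for this overview in Python and
not part of the original article; the records, titles, and file
names are invented, and the HTML tags used are introduced in the
next section--the following sketch writes a simple page of links
from a list of records:

     # Hypothetical records, e.g. exported from a database of documents.
     records = [
         ("Study Carrel One", "carrel1.html"),
         ("Study Carrel Two", "carrel2.html"),
     ]

     lines = ["<html>", "<head>", "<title>Study Carrels</title>",
              "</head>", "<body>", "<ul>"]
     for title, filename in records:
         # Each record becomes one item in an unordered list of links.
         lines.append('<li><a href="%s">%s</a></li>' % (filename, title))
     lines += ["</ul>", "</body>", "</html>"]

     with open("carrels.html", "w") as out:
         out.write("\n".join(lines))

Regenerating the page is then a matter of rerunning the report,
rather than editing link files by hand.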
+ Page 16 +

8.0 HyperText Markup Language

The HyperText Markup Language (HTML) is used to format documents
delivered by Web servers. The formal HTML standard can be read
from the CERN server, [19] and a few style guides are available
from the WWW Developer's JumpStation. [20] HTML is a subset of the
Standard Generalized Markup Language (SGML), and its strengths and
weaknesses are well documented by Price-Wilkin [21] and Barry.
[22] Therefore, only a brief overview of HTML will be provided
here.

HTML files are simple ASCII files containing rudimentary "tags"
describing the format of a document. Creating an HTML document is
a lot like using the old word processing program WordStar.
(Remember WordStar?) For example, to print a word in boldface type
using WordStar, the user would first select text from the screen.
Then the user would enter a code like "^b." This code would be
inserted before and after the selected text. When the document was
printed, WordStar would interpret the "^b" and print boldface
letters until another "^b" was encountered.

HTML works in a similar fashion. The author goes through his or
her document surrounding text with special codes denoting format.
Since the Web employs the client/server model, there is little
control over the fonts and styles of formatted text at the client
end. Therefore, HTML provides logical rather than stylistic
formatting capabilities.

The basic structure of an HTML document looks like this:

     <html>
     <head>
     <title>My First HTML Document</title>
     </head>
     <body>
     Hello, World!
     </body>
     </html>

The <html> and </html> tags define the document as an HTML
document; the <head> and </head> tags denote the leading matter of
a document; the <title> and </title> tags specify the document's
title; and the <body> and </body> tags specify the location of the
formatted text. Notice how the second tag of each tag pair is
identical to the first tag except that the second tag includes a
slash ("/"); the slash denotes the completion of a logical
formatting option.

+ Page 17 +

Within the body of an HTML document there can be many other
formatting constructs. Examples include the <p> tag for paragraph
marks and the <br> tag for simple line breaks. There are also the
ordered list (<ol>) and unordered list (<ul>) tags.