_Current Cites_
               Volume 10, no. 11 November 1999
       The Library University of California, Berkeley
                   Edited by Teri Andrews Rinne

                       ISSN: 1060-2356
  http://sunsite.berkeley.edu/CurrentCites/1999/cc99.10.11.html

      Contributors: Terry Huwe, Michael Levy, Leslie Myrick,
         Margaret Phillips, Jim Ronningen, Lisa Rowlison,
                   Roy Tennant, Lisa Yesson
   
   Carnevale, Dan. "Web Services Help Professors Detect Plagiarism" The
   Chronicle of Higher Education
   (http://www.chronicle.com/free/v46/i12/12a04901.htm) - The Web has
   brought a double-edged sword into conventional and distance-education
   classrooms alike: easy access to digital information can mean
   increased access to plagiarizable information, whether in the form of
   online encyclopedia articles or from the growing online term-paper
   market. Moreover, "copying" bits of somebody else's work is now as
   arduous as cutting and pasting text. Ironically, the same nexus of
   search engines that students use to find articles online can be tapped
   by instructors to sniff out those "hauntingly familiar" or "overly
   ornate" passages. But while entering the offending phrases into a
   text-rich search engine is infinitely easier than a trip to a
   bookstore or library to pore through Cliff's Notes or the Encyclopedia
   Britannica, most instructors don't have the time to surf for purloined
   bits. Enter web entrepreneurship in the shape of companies such as
   Plagiarism.org, or IntegriGuard.com, which maintain databases of
   papers culled from various sources; the former also offers to send
   papers through a multiple search-engine gamut. Plagiarism.org's
   resulting originality report highlights suspect passages of eight
   words or more and provides a link to the web text it matches. In the
   manner of a badly concealed speed-trap, prevention may lie at least
   partially in the fact that professors openly register their students
   and in some cases students upload their own papers for scrutiny.
   Astonishingly, however, despite fair warning, in one early case study
   in a class held at UC Berkeley, some 45 papers out of a total of 320
   were found to contain "suspicious passages". - LM
   
   Coombs, Norman. "Enabling Technologies: New Patrons: New Challenges"
   Library Hi Tech 17(2) (1999): 207-210. - In his regular column on
   enabling technologies for the "print disabled"- those who are dyslexic
   and those who cannot hold and manipulate books - Coombs aims to
   highlight the hardware and software tools that libraries need utilize
   in order to make electronic resources accessible to the widest
   possible range of users. His aim is to "persuade librarians that
   taking on this new task will be a challenge and opportunity rather
   than another burden." As a blind professor, Coombs discusses his
   initial work with a speech synthesizer to access an online catalog
   through to the capability to read web documents. In particular he
   discusses IBMs most recent web browser for special needs patrons
   called Home Page Reader Version 2.0. Using the numeric keypad and a
   combination of individual other keys the user can send commands to the
   program. What makes HPR more useful than simple screen readers is that
   it allows for comprehensive HTML handling and navigation, so that it
   will deal with frames, tables, forms list and menus. Unlike regular
   screen readers it actually examines the HTML code itself but
   unfortunately does not handle Java. In Coombs informative review he
   has effectively highlighted some issues that should be of concern to
   all librarians. - ML
   
   Ganesan, Ravi. "The Messyware Advantage" Communications of the ACM
   42(11) (November 1999) - Librarians and other information organizers,
   take heart - we're messyware and we're indispensable. Playing devil's
   advocate, the author starts by describing the Internet commerce
   scenario which so many digital pundits espoused not long ago: a direct
   link between producer and consumer, with the hated middleman
   eliminated. In questioning why the opposite seems to be happening when
   we place a high value on a new kind of dot-com middleman such as
   Amazon or Yahoo, he introduces his concept of messyware, which he
   describes as "the sum of the institutional subject area knowledge,
   experienced human capital, core business practices, service, quality
   focus and IT assets required to run any business." Why the term
   "messyware"? While a software solution may be all you need when
   systems are running perfectly, real life tends to get messy. (The
   photographs accompanying the text get this point across admirably.
   They depict people on a rainy streetcorner buying cheap umbrellas from
   a roving umbrella salesman. Thanks to this middleman, they are getting
   exactly what they need, when and where they need it, and would
   certainly not benefit by cutting out the middleman and going directly
   to the source.) Ganesan, bless him, uses libraries as an example of
   the value of expert intermediation which can deal with the infomess.
   His primary focus is on business, but there is plenty to ponder here
   for all information professionals, including strategic pointers for
   leveraging the messyware advantage. This article is just one of many
   fascinating pieces on information discovery, the issue's special
   theme. - JR
   
   Jones, Michael L. W., Geri K. Gay, and Robert H. Rieger. "Project
   Soup: Comparing Evaluations of Digital Collection Efforts" D-Lib
   Magazine (November 1999)
   (http://www.dlib.org/dlib/november99/11jones.html). - The Human
   Computer Interaction Group at Cornell University has been evaluating
   particular digital library and museum projects since 1995. In this
   article they discuss their findings related to five projects (three
   museum and two library). Their conclusions include: Effective digital
   collections are complex sociotechnical systems; Involve stakeholders
   early; Backstage, content and usability issues are highly
   interdependent; Background issues should be "translucent" vs.
   transparent; Determine collection organization, copyright, and
   quantity goals around social, not technical or political, criteria;
   Design around moderate but increasing levels of hardware and user
   expertise; "Market" the collection to intended and potential user
   groups; and, Look elsewhere for new directions. - RT
   
   Lewis, Peter H. "Picking the Right Data Superhighway" New York Times
   (http://www.nytimes.com/library/tech/99/11/circuits/articles/11band.ht
   ml) - For surfers seeking that tubular high-bandwidth download, there
   is now more than one wave to catch (depending, of course, on
   availability), each with its own advantages and pitfalls. This article
   examines three modes of high-bandwidth Internet service: cable modem,
   DSL and satellite data services. Lewis was in the lucky position
   (Austin, TX; expense account) to test all three, using as his criteria
   speed, performance, price, security, and choice of ISP services. His
   assessment(your results may vary): while any of the three is
   preferable to an analog modem insofar as the connection is always on,
   satellite data services can be easily factored out for all but the
   most remotely situated users due to huge financial outlays, from
   hardware to installation to monthly fees and possible phone charges to
   distant ISP providers. Speed is also an issue, at a "measly" 400 kbps.
   Cable modems, while they offer theoretically the speediest of
   connections: (30 mbps possible), suffer from "Jekyl-and-Hyde"-like
   yawls in performance, since cable is a shared resource. The more
   neighbors to whom you gloat over your wealth of bandwidth, the worse
   it will become. A more likely figure is 1 mbps. You may also find you
   have security concerns. DSL, on the other hand, has a dedicated line,
   so there are no security problems. But it is hands down the costlier
   alternative. Moreover, outside of a radius of 17,500 feet from the
   phone company's central office (or about 3 miles), performance suffers
   significantly, unless you are willing to pay extravagant sums. Data is
   loaded at somewhat slower speeds than cable's best numbers: download
   can run from 384 kbps to 1.5 mbps, with upload consistently logy at
   128kbps. All these considerations aside, Lewis goes with DSL. The
   deciding factor is often in the details: having to deal with the
   telephone company vs the cable company, the choice of ISPs (in the
   case of cable modems, practically nonexistent), and so on. - LM
   
   Malik, Om. "How Google is That?" Forbes Magazine
   (http://www.forbes.com/tool/html/99/oct/1004/feat.htm) Walker, Leslie.
   ".COM-LIVE" (The Washington Post Interview with Sergey Brin, founder
   and CEO of Google)
   (http://www.washingtonpost.com/wp-srv/liveonline/business/walker/
   walker110499.htm) - For those users of the recently-launched search
   engine Google (http://www.google.com/) who have consistently found its
   searching and ranking facilities spot on, and wondered, "How do they
   DO that?", two recent articles offer some answers; but the algorithm
   remains a mystery. With the backing of the two biggest venture capital
   firms in the Silicon Valley, and a PC farm of 2000 computers, another
   boy-wonder team out of Stanford has revolutionized indexing and
   searching the Web. The results have been so satisfying that Google
   processes some 4 million queries a day. Google, whose name is based on
   a whimsical variant of googol, i.e. a 1 followed by 100 zeroes, claims
   to be one of the few search engines poised to handle the googolous
   volume of the Web, estimated to be increasing by 1.5 million new pages
   daily. It uses a patented search algorithm (PageRank technology) based
   not on keywords, but on hypertext and link analysis. Critics describe
   the ranking system as "a popularity contest"; the Google help page
   prefers to characterize it in terms of democratic "vote-casting" by
   one page for another (well, some votes "count more" than others ...).
   Basically, sites are ranked according to the number and importance of
   the pages that link to it. In a typical crawl, according to Brin,
   Google reads 200 million webpages and factors in 3 billion links.
   Decidedly NOT a portal, when Google came out of beta in late September
   the only substantive change made to the fast-loading white page
   inscribed with the company name and a single query textbox was a
   polished new logo. A helpful newish feature is Googlescout, which
   offers links to information related to any given search result. There
   are also specialized databases of US government and Linux resources.
   It appears that the refreshing lack of advertising on its search page
   will not last forever: in the works is a text-based (rather than
   banner-based) "context-sensitive" advertising scheme, generated
   dynamically from any given query. - LM
   
   Miller, Robert. "Cite-Seeing in New Jersey" American Libraries 30(10)
   (November 1999): 54-57. - Tracking down fragmentary citations or
   hard-to-locate material is a classic library service. But in this
   piece Miller highlights how the tools for performing this service have
   changed. Classic citation-tracking resources are still used, but now
   the Web can be used as well. A few interesting anecdotes illustrate
   how a little imagination, experience, and perseverance can make the
   Internet cough up the answer when the usual resources fail. Miller
   illustrates how the best librarians are those who can absorb new tools
   into their workflow as they become available, and therefore become
   more effective at their job. - RT
   
   netConnect. Supplement to Library Journal October 15, 1999. This very
   slim but incredibly pithy supplement to LJ is modestly subtitled "The
   Librarian's Link to the Internet". I doubt anyone needs this
   publication to get online, but the point is taken. It is aimed at
   bringing focused information regarding the Internet to LJ's audience.
   And if this first issue is any indication, they will be successful
   doing it. Contributions to this issue include Clifford Lynch on
   e-books (an absolute must-read for anyone interested in this
   technology), a couple pieces by Sara Weissman, co-moderator of the
   PubLib discussion, an article on net laws from an attorney at the
   Missouri Attorney General's Office, a practical article on creating
   low-bandwidth web images without sacrificing quantity and quality, and
   an article on Web-based multimedia from Pat Ensor, among others. This
   is a solid publication that I cannot wait to see again. Disclosure
   statement: I am a Library Journal columnist. - RT
   
   Pitti, Daniel. "Encoded Archival Description: An Introduction and
   Overview" D-Lib Magazine (November 1999)
   (http://www.dlib.org/dlib/november99/11pitti.html). - Encoded Archival
   Description (EAD) is a draft standard SGML/XML Document Type
   Definition (DTD) for online archival finding aids. In this overview
   article, the father of EAD explains what it is, why it exists, and
   what future developments may lie in store. - RT
   
   Planning Digital Projects for Historical Collections in New York State
   New York: New York State Library, New York Public Library, 1999
   (http://digital.nypl.org/brochure/). - This brochure serves as a
   useful high-level introduction to digitizing historical collections.
   Following a brief history of New York Public Library's digitization
   projects, it dives into the heart of the matter -- planning a
   digitization project. Main sections include: What does a digital
   project involve?; Why undertake a digital project?; How to plan for
   digital projets; How to select collections and materials for a digital
   project; How to organize information; and, How to deliver materials
   effectively. A brief list of resources is also included. Before
   getting started in such a project you will need to do much more
   reading than this, but it nonetheless is a useful place to start -- in
   either it's print or web format. - RT
   
   Seadle, Michael. "Copyright in the Networked World: Email Attachments"
   Library Hi Tech 17(2) (1999): 217-221. - Seadle takes two commonplace
   uses of copying and evaluates whether they are legally acceptable in a
   digital environment. He gives a brief overview of the four keys test
   for determining "fair use" before discussing the specific cases. The
   first case is that of a faculty member distributing via email an
   article from the online interactive edition of the Wall Street Journal
   to his entire class. He had previously done similar things with the
   print version of the Journal and felt that this new use was still fair
   use. Unfortunately it would appear that the ability to make a full and
   perfect reproduction of a digital document destroys any barriers to
   further copying by students and would invalidate a fair use
   justification of this practice. In the second scenario a reference
   librarian sends via email a list of citations and full-text articles
   to a patron from the FirstSearch database. The librarian decided that
   if she deleted her copy of the downloaded documents that the end user
   would be complying with specific language in the database allowing for
   the downloading and storing documents for no more than 90 days. The
   differences are the librarian is sending the information to one person
   and not to a class, and the patron could have found the articles
   himself. So in essence the library was making an allowable copy for
   the user. Seadle admits that his arguments are not conclusive or
   exhaustive but in a clear way he outlines two interesting, yet normal
   copyright situations facing librarians and faculty. - ML
   
   "Tomorrow's Internet" The Economist 353 (8145) (November 13, 1999):23 
   (http://www.economist.com/editorial/freeforall/19991113/index_sa0324.html).
   - The cover story of this issue of The Economist focuses on the
   aftermath of the now-notorious "findings of fact" in the Microsoft
   antitrust case. This related article describes in detail the emerging,
   network-intensive style of computing that may reduce or eliminate the
   need for costly operating systems like Windows. Look no further for a
   balanced treatment of the forces behind "open system" computing, "thin
   clients", netcomputers and the like. As with all their technology
   reporting, the editors rely on plain English and disdain technobabble.
   - TH
     _________________________________________________________________
   
   Current Cites 10(11) (November 1999) ISSN: 1060-2356
   Copyright (c) 1999 by the Library, University of California,
   Berkeley. _All rights reserved._
   
   Copying is permitted for noncommercial use by computerized bulletin
   board/conference systems, individual scholars, and libraries.
   Libraries are authorized to add the journal to their collections at no
   cost. This message must appear on copied material. All commercial use
   requires permission from the editor. All product names are trademarks
   or registered trade marks of their respective holders. Mention of a
   product in this publication does not necessarily imply endorsement of
   the product. To subscribe to the Current Cites distribution list, send
   the message "sub cites [your name]" to listserv@library.berkeley.edu,
   replacing "[your name]" with your name. To unsubscribe, send the
   message "unsub cites" to the same address. Editor: Teri Andrews Rinne,
   trinne@library. berkeley.edu.