Internet and the World Wide Web

In the 1980s, if someone had asked a friend how they kept in touch with a classmate, that friend would have responded "by phone" or "by mail" or perhaps "not often enough." In the 1980s, if someone wanted to find a movie review, they looked in the newspaper; a recipe, they looked in a cookbook; the sports scores, they turned their television to ESPN. Today, the way people do some or all of these things is likely to be drastically different, yet all are variations on a theme: "used e-mail," "looked it up on the Internet," "did a search on Yahoo," or "went to ESPN.com to get the score." The common theme, "the Internet" or "the web," is arguably one of the most potent political, economic, and social forces created in the twentieth century.

Definitions

What is the Internet? Wendy Lehnert (1998, p. 21) calls it "a global assemblage consisting of over 19 million computers in rapid interconnection," while Douglas Comer (1991, p. 493) describes the Internet as "The collection of networks and gateways… that use the TCP/IP [transmission control protocol/Internet protocol]… suite and function as a single, cooperative virtual network." These definitions differ in their specificity, yet both are still correct. While the roots of the Internet can be traced to a research project sponsored by the U.S. government in the late 1960s, the Internet today bears little resemblance to the Internet of the early 1970s or even the early 1980s. Drastic increases in computer power, network capacity, and software capability, accompanied by similar reductions in cost, have put Internet capability into an ever-growing set of hands. Access that was limited to a few research universities in the United States in the early 1980s has now expanded to libraries, schools, businesses, and especially private homes. In 1997, the Computer Industry Almanac estimated that more than 200 million people worldwide would be "Internet connected" by the year 2000. The Internet defies simple definition in part because it changes so rapidly. For example, Lehnert's mention of nineteen million computers was correct in July 1997, but by the same month three years later, according to the Internet Software Consortium (2001), her statement needed to be amended to "ninety-three million computers" in order to be accurate.

The World Wide Web (WWW or web) also defies simple definition. Tim Berners-Lee and his colleagues (1994) defined it generally as "the idea of a boundless information world in which all items have a reference by which they can be retrieved." It is interesting to note that almost in the next sentence they describe it much more narrowly, as the body of data accessible using specific addressing mechanisms and access technology. Thus, no less an authority than one of the founders of the web speaks of it both specifically and narrowly and also very generally, as a vision or "idea." Both are reasonable ways to think about the web.

Only one-third the age of the Internet, the web is nonetheless inextricably interwoven with the Internet. For many people there is essentially no difference between the two, because their sole interface to the Internet is through the "web browser," the software that runs on a home computer and is the mechanism through which a person reads e-mail, chats with friends, or surfs the web. While the Internet and the web are technically separate entities, with separate histories and attributes, they also now have an almost symbiotic relationship. The web takes advantage of Internet services, standards, and technology in order to request, find, and deliver content. The governance, engineering, and growth of the Internet are largely enabled through the use of web technologies.

History

As noted by Comer (1997) and Daniel Lynch (1993), the Internet has its roots in a networking research project that was started in the late 1960s by the U.S. government's Advanced Research Projects Agency (ARPA). The original ARPANET comprised only four sites, three academic and one industrial. As work progressed, more sites were added, and by the mid-1970s the ARPANET was operational within the U.S. Department of Defense.

By the late 1970s, it was becoming clear that a major barrier to expanding the ARPANET was the proprietary nature of the communication protocols used by computers that would need to be connected. This observation sparked the creation and deployment of an open set of communication standards that came to be known as the TCP/IP suite. In the mid-1980s, the National Science Foundation (NSF), interested in providing Internet access to a larger number of universities and researchers, built the NSFNET, which augmented the ARPANET and used the same common set of communication standards. By 1990, ARPANET had been officially disbanded, replaced by the NSFNET and a set of other major networks all running TCP/IP. Collectively, this was the Internet. In the mid-1990s, NSF got out of the network management business entirely, leaving the administration of the current set of physical networks that comprise the Internet to a small set of commercial concerns.

Web Beginnings

In 1989, Berners-Lee, a computer scientist at CERN, the European Particle Physics Laboratory in Switzerland, proposed a project to help that organization manage the vast quantities of information and technical documents that it produced. The fundamental idea was to create an information infrastructure that would allow separate departments within the laboratory to make documentation electronically accessible, to allow decentralized maintenance of the information, and to provide a mechanism for linking between projects and documents. That infrastructure, originally called "Mesh" and later the "World Wide Web," received internal funding in 1990, and Berners-Lee developed a prototype that ran on a computer called the NeXT. In 1991, Berners-Lee demonstrated the prototype at a conference in San Antonio, Texas. As related by Joshua Quittner and Michelle Slatalla (1998), by late 1992 the web had piqued the interest of Marc Andreessen and Eric Bina, two programmers at the National Center for Supercomputing Applications (NCSA). In short order they wrote the first widely available, graphics-capable web browser, called Mosaic.

The Web Takes Off

Mosaic was free and available to anyone to download from NCSA. It initially ran only on machines running the Unix operating system, but by late 1993, versions developed by other programmers at NCSA were available for both the Apple Macintosh and the International Business Machines (IBM) personal computer (PC). In the space of months, the World Wide Web went from being essentially of academic interest to having mass-market appeal. By mid-1994, the team of programmers at NCSA had departed en masse for California to help form what soon became Netscape Communications. By the end of that year, they had built the first version of Netscape Navigator, and in 1995, it quickly captured more than 75 percent of the browser market. Both Quittner and Slatalla (1998) and Jim Clark (1999) provide details on this exciting time in Internet history as viewed from the Netscape perspective.

The late 1990s were characterized by continued explosive growth; by the introduction of important technologies such as the Java programming language, live video, and audio; by the widespread use of secure connections to enable online commerce; and by continually more elaborate methods of describing web content. The emergence of Microsoft as a major influence also had a large effect on the growth of the Internet.

Technical Underpinnings

There is a core set of technical capabilities and standards that allow the Internet to function effectively. It is useful to discuss the technical infrastructure of the "net" by trying to answer the following questions. What does the network actually look like? How does data get transferred from one computer to another? How do computers communicate with each other and "agree" on a communication language?

The Internet Is Packet Switched

When a person calls someone on the telephone, the telephone company sets up a dedicated, person-to-person circuit between the caller and the person being called. The physical infrastructure of the telephone company is set up to provide such a circuit-switched network, which means that before connecting a call, the company must reserve network resources along the entire connection path between the two individuals. On the other hand, as described by Larry Peterson and Bruce Davie (2000), a packet-switched network, of which the Internet is the best-known example, is composed of links and switches (also called routers). A router can have multiple incoming links and multiple outgoing links. Data is encapsulated as a series of packets, each marked with a source and destination address. Each packet is passed along from link to link through a series of switches, where each switch has a table that indicates, by destination address, which outgoing link a packet should be sent along. Unlike a circuit-switched network, a packet-switched network such as the Internet provides no end-to-end reservation of resources. This means that packets from the same source machine can travel different paths in the network, they can be lost along the way, and they can arrive at the destination in a different order than the order in which they were sent.
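To make the switching idea concrete, the following sketch (in Python) mimics the forwarding decision made at a single router: a table maps destination-address prefixes to outgoing links, and each arriving packet is matched against that table. The prefixes, link names, and packet layout are invented for illustration and are greatly simplified compared to a real router.

```python
import ipaddress

# Invented forwarding table: blocks of destination addresses mapped to links.
forwarding_table = {
    "152.2.0.0/16": "link-A",    # traffic for this block goes out link A
    "209.67.96.0/24": "link-B",  # traffic for this block goes out link B
}

def choose_link(destination):
    """Longest-prefix match: pick the most specific table entry covering the address."""
    dest = ipaddress.ip_address(destination)
    best_net, best_link = None, "default-link"
    for prefix, link in forwarding_table.items():
        net = ipaddress.ip_network(prefix)
        if dest in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_link = net, link
    return best_link

packet = {"source": "10.0.0.5", "destination": "152.2.81.1", "data": b"hello"}
print(choose_link(packet["destination"]))   # -> "link-A"
```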

The standard for the interchange of packets on the Internet is called the Internet protocol (IP). The standard describes exactly what an IP-compliant packet should look like. For example, each packet must contain control information such as which computer sent the packet, which machine is supposed to receive it, and how long the packet has been "in the network." The packet, of course, contains data as well. IP also describes how adjacent links on the packet-switched network should communicate, what the error conditions are, and many other details. It is important to note that IP is about packets, not about end-to-end connections.
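The sketch below models that control information as a simple Python data structure. It is only an illustration, and the class and field names are invented; a real IPv4 header contains additional fields (a checksum, fragmentation flags, a protocol number, and so on) and a precise binary layout defined by the IP standard.

```python
from dataclasses import dataclass

@dataclass
class SimplifiedIPPacket:
    """A simplified view of an IP packet: control information plus data."""
    source: str        # which computer sent the packet, e.g. "152.2.81.1"
    destination: str   # which machine is supposed to receive it
    ttl: int           # "time to live": how long the packet may stay in the network
    payload: bytes     # the data being carried

    def hop(self):
        """Each router decrements the TTL; a packet whose TTL reaches 0 is discarded."""
        self.ttl -= 1
        return self.ttl > 0

packet = SimplifiedIPPacket("152.2.81.1", "209.67.96.22", ttl=64, payload=b"...")
```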

Reliable Delivery

By definition, a circuit-switched network provides a reliable, end-to-end communication channel between two parties. With the possibility of lost data and out-of-order delivery, a packet-switched network such as the Internet must have some other method of providing a reliable channel that runs "on top of" IP. As noted by Comer (1991), this is the function of the transmission control protocol (TCP). TCP software running at both ends of a communication channel can manage all of the details of chopping a large outgoing message into IP packets and of putting the packets back together at the other side. By using TCP software, higher-level services such as e-mail, file transfer, and web data are shielded from the messiness associated with packet transfer.
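The sketch below illustrates the practical effect of TCP from an application's point of view: the program simply writes to and reads from a reliable byte stream, while the chopping into IP packets, retransmission of losses, and reordering all happen beneath it. The host name and the request sent are placeholders, and the example assumes working network access.

```python
import socket

# The application sees a reliable byte stream; TCP (over IP) handles the rest.
with socket.create_connection(("www.example.com", 80)) as conn:
    conn.sendall(b"GET / HTTP/1.0\r\nHost: www.example.com\r\n\r\n")
    reply = b""
    while True:
        chunk = conn.recv(4096)      # bytes arrive in order, already reassembled
        if not chunk:                # an empty read means the other side is done
            break
        reply += chunk
print(reply.split(b"\r\n", 1)[0])    # the first line of the server's response
```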

Names and Addresses

If an individual writes a letter to a friend, it is easy to remember the name of the recipient, but it is typically harder to remember the full address. That is why address books exist. What is more, the postal service needs the right address to deliver the letter. If an individual puts a friend's name on the letter with nothing else, then the postal service will not know what to do with it, unless the friend's name is Santa Claus or "President of the United States." The situation on the Internet is not much different. The TCP/IP layer of the Internet does not know directly about names; rather, it needs a specific address in order to determine how to route data to its eventual destination.

Every computer that is attached to the Internet must be addressable. That is, it must have a unique identifier so that data destined for that computer can be tagged with the proper address, and routed through the network to the eventual destination. The "Internet address," also called an "IP address" or "IP number," is typically expressed as a sequence of four numbers between zero and 255, for example, <152.2.81.1> or <209.67.96.22>. If a person could examine any set of arbitrary packets being routed on the Internet, then every single one of them would have a destination and source IP number as part of the control information.

In theory, there are more than four billion IP addresses. In reality, vastly fewer addresses are available because of the way that addresses are distributed. In order for the network to function efficiently, IP addresses are given out in blocks. For example, if a new company needs a set of IP numbers, it asks for one or more 256-number blocks, and these are assigned to the company, although the company may ultimately use fewer addresses than have been allocated. As noted by Lyman Chapman (1993), distribution of IP addresses was even more inefficient in the past. An organization could ask for a 256-number block (a category "C" number) or a 65,536-number block (a category "B" number) but nothing in between. This led to many organizations owning a category B number but using only a fraction of the available range.
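These numbers can be checked with Python's standard ipaddress module; the addresses and prefix sizes below are simply the examples used in this article.

```python
import ipaddress

# The 32-bit IPv4 address space holds 2**32 (just over four billion) addresses.
print(2 ** 32)                                                 # 4294967296

# Addresses are handed out in blocks: a 256-address block and a 65,536-address block.
print(ipaddress.ip_network("209.67.96.0/24").num_addresses)    # 256
print(ipaddress.ip_network("152.2.0.0/16").num_addresses)      # 65536

# A dotted-quad address is just a 32-bit number written one byte at a time.
print(int(ipaddress.ip_address("152.2.81.1")))                 # 2550288641
```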

In general, humans do not deal with Internet addresses. For one thing, they are much harder to remember than names such as <www.espn.com> and <www.unc.edu>. However, the IP layer of the network, the place where all the data gets packetized and transported, does deal with IP addresses. Thus, there must be (and is) a translation service available that allows a human-readable name such as <ruby.ils.unc.edu> to be mapped to a particular IP address, such as <152.2.81.1>. This service is called the domain name service (DNS). When a person sends an e-mail message to a friend at <email.cool_u.edu>, the mail service must first consult a DNS server in order to get the Internet address of <email.cool_u.edu>. Once this is done, the mail software can establish a connection with a mail server at <email.cool_u.edu> by using the DNS-supplied address.
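A small sketch of the translation service in action: the call below asks the operating system's resolver, which in turn consults DNS servers, to map a name to whatever address is currently registered for it. The host names are taken from the examples in the text; the addresses printed depend on what the DNS returns at the time.

```python
import socket

# Name-to-address translation via the system resolver and, through it, the DNS.
for name in ("www.unc.edu", "www.espn.com"):
    print(name, "->", socket.gethostbyname(name))

# The reverse mapping, from an address back to a name, is also available:
# socket.gethostbyaddr("152.2.81.1")
```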

As Paul Albitz and Cricket Liu (1997) describe it, the "names" in the DNS server are arranged in an inverted tree structure, starting with a relatively small number of top-level domains, for example, <com>, <edu>, <org>, and <net>. There are considerably more second-level domains within a single top-level domain. For example, there are thousands of college and university domains that end in .edu, including <unc.edu> (University of North Carolina at Chapel Hill), <evergreen.edu> (The Evergreen State College), and <vt.edu> (Virginia Tech). The process continues with different departments at a university receiving third-level domains, such as <ils.unc.edu> and <chem.unc.edu>. Machines in those departments then receive names such as <ruby.ils.unc.edu> or <www.chem.unc.edu>.

One important feature of the DNS system is that it allows for distributed zones of control. Put simply, this means that if a university wants to connect a new department to the Internet, the new department need only get the university to establish a new administrative zone and assign a subblock of IP addresses to that zone. The new department is then free to choose its own computer names and assign particular addresses from its subblock to those machines, all without consulting the university or any higher-level authorities. DNS also has simple mechanisms for handling new and changed names.

The importance of having separate names and addresses cannot be overstated. It makes it easier for individuals to remember the name of a website or the e-mail address of a friend. Just as important, it allows flexibility in assigning a name to a computer. For example, an organization can keep its brand name (<www.nike.com>), but as necessity dictates, it can change the machine that hosts and serves its web content.

Clients, Servers, and Protocols

A fundamental concept that is important for understanding how the Internet works is that of the client and the server. Client-server computing revolves around the provision of some service. The user of the service is the client, and the provider of the service is the server. Client and server communicate using a prescribed set of interactions that together form a protocol (i.e., the rules for interaction). To illustrate, consider a person who enters a favorite fast-food restaurant to order fries and a hamburger. Typically, the person (the client) waits in line, makes an order to the cashier (the server), pays, waits for the food, and has the food served to him or her on a tray. Optionally, the person may choose to leave before ordering or decide to ask for extra ketchup. This set of required and optional interactions together form the protocol. On the Internet, the client and server are typically computer programs. Accordingly, the protocol itself is very prescriptive because computers are not tolerant of subtle errors that humans can easily accommodate.
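The toy example below sketches this client-server pattern in Python: the server waits for a connection, the client connects, sends a request, and reads the response. The port number and the one-line "protocol" (the server simply returns the request in upper case) are invented for illustration.

```python
import socket
import threading

srv = socket.create_server(("127.0.0.1", 5050))   # the server is listening

def handle_one_client():
    conn, _ = srv.accept()                 # wait for a client to connect
    with conn:
        request = conn.recv(1024)          # read the client's "order"
        conn.sendall(request.upper())      # send back the response

threading.Thread(target=handle_one_client, daemon=True).start()

# The client side of the same made-up protocol.
with socket.create_connection(("127.0.0.1", 5050)) as client:
    client.sendall(b"one hamburger and fries, please\n")
    print(client.recv(1024))               # b'ONE HAMBURGER AND FRIES, PLEASE\n'

srv.close()
```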

One way to think about the Internet is as a set of services provided to both humans and computers. For example, consider a simple interaction such as sending an e-mail message. There are several services needed to accomplish this task. First, because the friend has an e-mail address such as <jamie@email.cool_u.edu>, a service is needed to figure out the location of <email.cool_u.edu>. The DNS provides this. Second, once the e-mail software has the location of <email.cool_u.edu>, it must send the message to some entity that has access to Jamie's inbox. This entity, a program running on <email.cool_u.edu>, is called the mail server. Finally, the message itself is transported using the service that provides reliable, in-order delivery of data: TCP software. If one were to build a stack of Internet services that illustrated this example, it would look much like Figure 1, with the basic services (wires and cables) at the bottom and the high-level services (the software used to compose the e-mail message) at the top.
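The following sketch traces those same layers in code. It will not actually deliver mail, since <email.cool_u.edu> and both addresses are the article's hypothetical examples, but it shows the order in which the services are used: DNS to find the mail server, TCP (opened by the mail library) to reach it, and the mail protocol to hand over the message.

```python
import socket
import smtplib
from email.message import EmailMessage

mail_host = "email.cool_u.edu"                 # hypothetical host from the example

# 1. Name service: the DNS maps the mail server's name to an IP address.
address = socket.gethostbyname(mail_host)

# 2. Transport service: smtplib opens a reliable TCP connection to that
#    address and speaks the mail protocol over it.
message = EmailMessage()
message["From"] = "someone@another_u.edu"      # hypothetical sender
message["To"] = "jamie@email.cool_u.edu"       # the friend from the example
message["Subject"] = "Hello"
message.set_content("Testing the service stack.")

with smtplib.SMTP(mail_host) as mail_server:
    mail_server.send_message(message)          # 3. Mail service: hand the message to Jamie's server
```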

Web browsers such as Netscape's Navigator and Microsoft's Internet Explorer are multipurpose clients. An individual can send e-mail, read news, transfer files, and, of course, view web content all by using a single browser. These programs understand multiple protocols and are thus clients for all of the corresponding services.

Growth

The growth of the Internet and of the World Wide Web has been nothing short of explosive. In fact, the number of host computers on the Internet (essentially, the number of computers that have an Internet address) doubled every fourteen months between 1993 and 1999 (Internet Software Consortium, 2001; Netcraft Ltd., 2001).

Data on the growth of the web is less reliable and involves culling information from multiple sources. One reasonable measure of growth is the number of available web servers. Data through 1996 were obtained from Matthew Gray, a student at the Massachusetts Institute of Technology (MIT), who built software that specifically tried to count the number of servers. Later numbers are available from Netcraft Ltd. (2001) in England. The number of servers rose from thirteen in early 1993 to an estimated seven million in July 1999. Over those six years, the number of servers doubled every three and one-half months.

Governance

The Internet is governed by the Internet Society (ISOC), a nonprofit organization. Members of the society include companies, government agencies, and individuals, all with an interest in the development and viability of the Internet. Within the society, there are subgroups that manage the technical infrastructure and architecture of the Internet, as well as several other areas.

One of these subgroups, the Internet Engineering Task Force (IETF), manages the short- and medium-term evolution of the Internet. Through various working groups within the IETF, new standards are drafted, examined, prototyped, and deployed. The process is geared toward selecting the best technical solutions to problems. The output of these various efforts is a set of technical documents called "Requests for Comments" (RFCs). As noted by Comer (1997), "Request for Comments" is a misnomer, as an approved RFC is much more akin to a standard than to something open for debate. For example, there are RFCs that describe all of the various common protocols used on the Internet, including TCP, IP, mail, and the web's data-transfer protocol, the hypertext transfer protocol (HTTP). The IETF does most of its business through e-mail and online publishing. All RFCs are available online at a number of places, including the IETF website.

Web Basics

The discussion so far in this entry has assumed some basic terminology and concepts. It is now time to be more specific about things such as "hypertext" and "website."

Hypertext and Web-Pages

The notion of hypermedia was first suggested by Vannevar Bush (1945) when he described his idea of a "Memex"—a machine that would allow a person to organize information according to their personal tastes and provide for linkages between pieces of information. In the mid-1960s, Douglas Engelbart and Ted Nelson further developed the notion of "hypertext," the idea that pieces of text could be linked to other pieces of text and that one could build these linkages to arrange information in different ways, not in just the traditional left-to-right, top-to-bottom method.

A web-page is a single document that can contain text, images, sound, video, and other media elements, as well as hyperlinks to other pages. A website is a group of pages organized around a particular topic. Typically, all pages connected with a website are kept on the same machine and are maintained by the same individual or organization.

The structure of a typical web-page reflects the influences of all of this earlier work. The hypertext markup language (HTML), the web's content description language, allows content developers to specify links between different pieces of information very easily. The hyperlink can refer to information in the same document or to information that physically resides on another page, on another machine, or on another continent. From the user's perspective, all such links look and function the same. The user clicks on the link and the referred page is displayed, which itself may have links to other pieces of information in other places. This is one of the fundamental ideas behind the web.
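The short example below shows what such a link looks like in HTML and how software can locate the links in a page. The page fragment and its link targets are invented; the parsing is done with Python's standard html.parser module.

```python
from html.parser import HTMLParser

# An invented fragment of HTML: the <a href="..."> tag is how a page's author
# specifies a hyperlink, whether it points to the same site or another continent.
page = """
<h1>Recipes</h1>
<p>See my <a href="soups.html">soup recipes</a> or the
<a href="http://www.example.org/curry/index.html">curry page</a>.</p>
"""

class LinkCollector(HTMLParser):
    """Collect the target of every hyperlink in a page."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href" and value)

collector = LinkCollector()
collector.feed(page)
print(collector.links)   # ['soups.html', 'http://www.example.org/curry/index.html']
```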

Uniform Resource Locators

A uniform resource locator (URL, pronounced either by spelling out the letters U-R-L or by making it a word that rhymes with "pearl") is the closest thing a user has to the address of an information item. When a user clicks on a link in a web-page, the web browser must determine which Internet-connected computer to contact, what resource to ask for, and what language or protocol to speak. All of these pieces are represented in the URL.

Consider the following typical URL: <http://www.ils.unc.edu/viles/home/index.html>. The first part of the URL, <http> in this case, identifies the protocol the browser should speak: the hypertext transfer protocol (HTTP). HTTP defines the set of interactions that are possible between web servers and web browsers. The next piece of the URL, <www.ils.unc.edu>, is the domain name of the machine that is running the information server that has access to the resource. The last piece, </viles/home/index.html>, is the address of the resource on that machine; in this case, it is the path to a particular file. Other protocols are possible, including <ftp>, <telnet>, and <mailto>, but the vast majority of links that an individual encounters while searching or "surfing" the web are accessed using the web's native data-transfer protocol, HTTP.
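The decomposition just described can be seen directly with Python's standard urllib.parse module, applied to the article's example URL.

```python
from urllib.parse import urlparse

parts = urlparse("http://www.ils.unc.edu/viles/home/index.html")

print(parts.scheme)   # 'http'                    -> the protocol the browser should speak
print(parts.netloc)   # 'www.ils.unc.edu'         -> the machine running the information server
print(parts.path)     # '/viles/home/index.html'  -> the resource on that machine
```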

Unfortunately, as anyone who has bookmarked a "stale" page or done a search that yielded "dead" links knows, URLs are not guaranteed to be permanent. They are truly addresses in the sense that the item can move or be destroyed or the server can "die." There is considerable effort in the WWW community to create permanent URLs (PURLs) or uniform resource names (URNs) for those items that are truly supposed to last, but there had been no widespread deployment as of the year 2000.

Search Engines

As the web grew, it quickly became apparent that finding information was becoming increasingly difficult. One early method used to organize information was the web directory, an organized list of new and interesting sites that a person could use as a jumping-off point. The web soon grew too large to keep up with in this fashion. In 1994, the first search engines started to appear. As Lehnert (1999) notes, a search engine is a website that provides searchable access to a large number of web-pages. Search engines work using automated web-page harvesters called robots or spiders. The robot starts with a set of pages and fetches these from their locations. Each fetched page has links to other pages. By successively harvesting pages and following links, a search engine can build a very large set of web-pages. With appropriate database and search capability, these pages form a searchable archive of web-pages.
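The sketch below reduces a robot to its essentials: fetch a page, extract its links, and add them to the queue of pages still to be visited. A production spider adds politeness rules, duplicate and error handling, and an index built from the harvested text; the seed URL here is only a placeholder.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    """Gather the href target of every hyperlink in a fetched page."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href" and value)

def crawl(seed, limit=10):
    """Fetch pages starting from a seed URL, following links as they are found."""
    queue, seen, pages = deque([seed]), set(), {}
    while queue and len(pages) < limit:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url).read().decode("utf-8", errors="replace")
        except OSError:
            continue                      # unreachable pages are simply skipped
        pages[url] = html                 # a real engine would index this text
        collector = LinkCollector()
        collector.feed(html)
        queue.extend(urljoin(url, link) for link in collector.links)
    return pages

# pages = crawl("http://www.example.org/")   # placeholder seed URL
```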

As anyone who has ever performed the same search on multiple search engines knows, the results can vary tremendously, both in the identity of the documents returned as the "hit list" and in the usefulness of the hits in satisfying the user's information need. There are many reasons for this, all of which can contribute to variability in results. First, search engines use different methods for deciding how to include a web-page in their indexes. For example, Yahoo (<www.yahoo.com>) provides a human-categorized list of websites with both search and directory capability, while AltaVista (<www.altavista.com>) has automated methods for building its searchable database of pages. Second, search engines index vastly different numbers of documents, so documents in some engines are simply not available in others. In data compiled by Lehnert (1999), Infoseek (<www.infoseek.com>) includes about thirty million pages, while AltaVista includes almost five times that number. Third, each engine uses different information-retrieval techniques to build its searchable database. Some engines weight certain parts of a document more heavily than others do: for example, one search engine may use descriptive HTML meta tags that have been assigned by the author of the page, while another engine will ignore them. Some search engines give higher weight to a page that has many links pointing to it, so-called link popularity, while other engines do not consider this at all. Finally, some search engines will consider the entire text of the document when they build their indexes, while others may only consider the title, headings, and first few paragraphs.
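The toy scoring function below illustrates why such choices matter: the same page and the same query receive different scores under two invented weighting schemes, one that trusts meta tags and ignores link popularity, and one that does the opposite. The page, fields, and weights are all made up for illustration.

```python
def score(page, query_terms, engine_weights):
    """Count query terms field by field, weighting each field as the engine chooses."""
    total = 0.0
    for field in ("title", "body", "meta"):
        text = page.get(field, "").lower()
        total += engine_weights.get(field, 0.0) * sum(text.count(t) for t in query_terms)
    # Some engines also reward "link popularity": how many pages link to this one.
    total += engine_weights.get("link_popularity", 0.0) * page.get("inbound_links", 0)
    return total

page = {
    "title": "Chapel Hill weather",
    "body": "Forecast and current weather conditions for Chapel Hill...",
    "meta": "weather, forecast, North Carolina",
    "inbound_links": 42,
}
engine_a = {"title": 5.0, "body": 1.0, "meta": 2.0, "link_popularity": 0.0}   # trusts meta tags
engine_b = {"title": 2.0, "body": 1.0, "meta": 0.0, "link_popularity": 0.1}   # ignores meta tags

print(score(page, ["weather"], engine_a), score(page, ["weather"], engine_b))
```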

While search engines are extremely important and useful, they do not contain all of the information available on the Internet or even on the World Wide Web. For example, newly created pages do not appear because it can take a long time for search-engine spiders to find the pages, if in fact they are ever found. Information contained in web-accessible databases is not included either; this includes the millions of items available in online stores such as Amazon.com. For businesses trying to gain visibility or to make money through online sales, being found by new customers is very important, which means that being ranked highly by a search engine is very desirable. Knowledgeable businesses therefore design their websites with careful attention to how the main search engines work.

Browser Wars

The success and effect of the Internet and the web initially caught the Microsoft Corporation flat-footed, though as related by Paul Andrews (1999), the company was starting to develop a vision for modifying their software products to embrace the Internet. By late 1995, Microsoft responded and started releasing versions of Internet Explorer that eventually rivaled Netscape's Navigator in ease of use and features. Thus started the famous browser wars. As noted by Ian Graham (1997), one tactic that both companies used was to introduce HTML extensions that only worked in their own browsers. As market share for both browsers approached equality, these tactics left content providers in a quandary, as they needed to describe their content in a way that worked in both browsers. As of late 1999, standards organizations such as the World Wide Web Consortium had remedied some of the incompatibilities, but accommodating differences in browsers continued to occupy a considerable amount of time for content developers.

One feature of the computer landscape at this time was that Microsoft enjoyed huge penetration into the installed base of computers. According to data compiled by Michael Cusumano and David Yoffie (1998), more than 85 percent of all computers in the world ran versions of Microsoft's operating system, the software that controls the computer. One part of the Microsoft strategy was to enter into agreements with computer manufacturers to preload new computers with Internet Explorer. At the same time, Microsoft was developing versions of the browser that were more tightly coupled with the operating system. According to some observers, these tactics went too far in that computer manufacturers were required to preload Internet Explorer or suffer unwanted consequences. Microsoft denied the charges of attempting to exert such leverage, stating in part that the tighter integration of the browser with the operating system was part of their business strategy and to forbid this was to limit their ability to provide innovative products.

On May 18, 1998, the U.S. Department of Justice and the attorneys general from twenty states filed suit against Microsoft, alleging first that Microsoft was a monopoly under the Sherman Antitrust Act of 1890 and second that Microsoft had engaged in unfair business practices, including the illegal tying of products (in this case, tying the use of Internet Explorer to the use of the computer's operating system). In November 1999, Judge Thomas Penfield Jackson found that the Department of Justice had proved that Microsoft was indeed a monopoly and that it had used that power to the disadvantage of American consumers. Accordingly, on April 28, 2000, the Department of Justice filed a proposal in federal court to split Microsoft into two separate companies, one to make and sell the Windows operating system and the other to handle the company's other software products. Judge Jackson accepted this proposal on June 7, 2000. Microsoft vigorously pursued a reversal through the U.S. Court of Appeals. In a boost to that strategy, on September 26, 2000, the U.S. Supreme Court ruled 8-1 against a Department of Justice request to bypass the appeals court, effectively extending the case for at least a year.

The Future Internet

It is difficult and probably fruitless to speculate on what the Internet will look like in the future. In December 1995, no less an authority than Robert Metcalfe, an Internet pioneer and the inventor of Ethernet, predicted that the Internet would collapse in 1996. At a conference in 1997, Metcalfe literally ate his words, putting his written comments in a blender and drinking them like a milkshake as his audience cheered. It is safe to say that the Internet and World Wide Web will continue to grow, probably at the same breakneck speed. If Charles Goldfarb and Paul Prescod (1998) are to be believed, HTML, the workhorse of content description, will be at least augmented, if not replaced, by the more powerful families of languages defined by the extensible markup language (XML). Advances in wireless technology and portable computing devices mean that web browsing will no longer be relegated to the desktop. Greater network capacity pushed all the way to the home means that more interactive, multimedia capability may realistically be available to the typical consumer. One safe prediction is that the future promises to be exciting.

See also: Community Networks; Computer Literacy; Computer Software; Webmasters.

Bibliography

Albitz, Paul, and Liu, Cricket. (1997). DNS and BIND. Cambridge, MA: O'Reilly.

Andrews, Paul. (1999). How the Web Was Won. New York: Broadway Books.

Berners-Lee, Tim; Cailliau, Robert; Groff, Jean-Francois; and Pollermann, Bernd. (1992). "The World Wide Web: The Information Universe." Electronic Networking: Research, Applications, and Policy 1(2):52-58.

Berners-Lee, Tim; Cailliau, Robert; Luotonen, Ari; Nielsen, Henrik F.; and Secret, Arthur. (1994). "The World Wide Web." Communications of the ACM 37(8):76-82.

Bush, Vannevar. (1945). "As We May Think." Atlantic Monthly 176 (July):101-108.

Chapman, A. Lyman. (1993). "The Billion-Node Internet." In Internet System Handbook, eds. Daniel C. Lynch and Marshall T. Rose. Reading, MA: Addison-Wesley.

Clark, Jim. (1999). Netscape Time: The Making of the Billion Dollar Start-Up that Took on Microsoft. New York: St. Martin's Press.

Comer, Douglas E. (1991). Internetworking with TCP/IP, Volume I: Principles, Protocols, and Architecture. Englewood Cliffs, NJ: Prentice-Hall.

Comer, Douglas E. (1997). The Internet Book. Upper Saddle River, NJ: Prentice-Hall.

Cusumano, Michael A., and Yoffie, David B. (1998). Competing on Internet Time: Lessons from Netscape and Its Battle with Microsoft. New York: Free Press.

Engelbart, Douglas C., and English, William K. (1968). "A Research Center for Augmenting Human Intellect." In AFIPS Conference Proceedings of the 1968 Fall Joint Computer Conference, Vol. 33. San Francisco, CA: Thompson Book.

Goldfarb, Charles F., and Prescod, Paul. (1998). The XML Handbook. Upper Saddle River, NJ: Prentice-Hall.

Graham, Ian S. (1997). The HTML Sourcebook, 3rd edition. New York: Wiley.

Internet Engineering Task Force. (2001). "IETF: The Internet Engineering Task Force." <http://www.ietf.org/>.

Internet Society. (2001). "Welcome to ISOC." <http://www.isoc.org/>.

Internet Software Consortium. (2001). "Internet Domain Survey." <http://www.isc.org/ds/>.

Lehnert, Wendy G. (1998). Internet 101: A Beginner's Guide to the Internet and the World Wide Web. Reading, MA: Addison-Wesley.

Lehnert, Wendy G. (1999). Light on the Internet. Reading, MA: Addison-Wesley.

Lynch, Daniel C. (1993). "Historical Evolution." In Internet System Handbook, eds. Daniel C. Lynch and Marshall T. Rose. Reading, MA: Addison-Wesley.

Metcalfe, Robert. (1995). "From the Ether: Predicting the Internet's Catastrophic Collapse and Ghost Sites Galore in 1996." InfoWorld 17(49):61.

Netcraft Ltd. (2001). "The Netcraft Web Server Survey." <http://www.netcraft.com/survey/>.

Persistent Uniform Resource Locator. (2001). "PURL." <http://www.purl.org/>.

Peterson, Larry L., and Davie, Bruce S. (2000). Computer Networks: A Systems Approach. San Francisco, CA: Morgan Kaufmann.

Petska-Juliussen, Karen, and Juliussen, Egil, eds. (1997). Computer Industry Almanac, 8th edition. Dallas: Computer Industry Almanac, Inc.

Quittner, Joshua, and Slatalla, Michelle. (1998). Speeding the Net: The Inside Story of Netscape and How It Challenged Microsoft. New York: Atlantic Monthly Press.

World Wide Web Consortium. (2001). "W3C." <http://www.w3.org/>.

Charles L. Viles
