Internet Spider
Encyclopedia of Espionage, Intelligence, and Security
An Internet spider is a program designed to "crawl" over the World Wide Web, the portion of the Internet most familiar to general users, and record the locations and contents of Web pages. It is also called a Web crawler. Many search engines use crawlers to gather links, which are filed away in an index. When a user asks for information on a particular subject, the search engine consults that index and returns pages the spider has already recorded. Without spiders, the vast richness of the Web would be all but inaccessible to most users, much as the Library of Congress would be if its books were not catalogued.
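The relationship between crawled pages, the index, and a user's query can be illustrated with a small sketch. It assumes a simple inverted index (a mapping from each word to the pages that contain it), which is one common way to organize such data and not necessarily how any particular search engine works. The sample pages, URLs, and function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical pages a spider might already have recorded: URL -> page text.
SAMPLE_PAGES = {
    "http://example.test/spiders": "Spiders crawl the Web and gather links",
    "http://example.test/library": "The library catalogues its books",
    "http://example.test/engines": "Search engines index links gathered by spiders",
}

def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

INDEX = build_index(SAMPLE_PAGES)
```

Calling search(INDEX, "spiders links") looks up only the stored index, never the live Web, which is why results reflect what the spider saw on its last visit.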
Some search engines are human-based: they rely on people to submit links and other information, which the engine then categorizes, catalogues, and indexes. Most search engines today combine human and crawler input. Crawler-based engines send out spiders, computer programs that have sometimes been likened to viruses because they appear to move from site to site across cyberspace. The comparison is loose: a spider does not install itself on the computers it visits, but requests pages from Web servers much as a browser does, running all the while on the search engine's own machines.
Spiders visit Web sites, record the information there, read the meta tags that identify a site according to subjects, and follow the site's links to other pages. Because pages are so densely linked, a spider can start at almost any point on the Web and keep moving. The data it gathers are returned to the search engine's central repository, where they are organized and stored. The crawler periodically revisits sites to check for changed information, but until it does so the corresponding entries in the search engine's index remain unchanged. For this reason a search may at any time return "dead" links, entries for pages that have since been moved or deleted.
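The crawl described above (visit a page, read its meta tags, follow its links, note pages that cannot be found) can be sketched in a few lines. This is a minimal illustration, not the method of any actual search engine: it assumes a breadth-first traversal and replaces the network with a hypothetical in-memory dictionary, PAGES, so that it runs without Internet access. The class and function names are likewise invented for the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical stand-in for the Web: URL -> HTML. A real spider would fetch over HTTP.
PAGES = {
    "http://example.test/": '<meta name="keywords" content="spiders, search">'
                            '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a> <a href="/gone">Gone</a>',
    "http://example.test/b": '<meta name="keywords" content="index"><a href="/">Home</a>',
}

class LinkAndMetaParser(HTMLParser):
    """Collects link targets and the contents of the keywords meta tag."""
    def __init__(self):
        super().__init__()
        self.links = []
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.keywords += [k.strip() for k in content.split(",") if k.strip()]

def crawl(start_url, fetch=PAGES.get):
    """Breadth-first crawl; returns (url -> keywords, list of dead URLs)."""
    index, dead = {}, []
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:          # page cannot be found: a "dead" link
            dead.append(url)
            continue
        parser = LinkAndMetaParser()
        parser.feed(html)
        index[url] = parser.keywords
        for href in parser.links:
            target = urljoin(url, href)   # resolve relative links
            if target not in seen:        # visit each page only once
                seen.add(target)
                queue.append(target)
    return index, dead
```

The seen set is what keeps the spider from circling forever among pages that link to one another. A production crawler would also honor each site's robots.txt and limit its request rate, both of which are omitted here.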
No two search engines are exactly alike. One reason, among others, is that each uses a different algorithm to search and rank its index. Algorithms can be tuned to weigh how often certain keywords appear on a page, and also to discount keyword stuffing, or "spamdexing," the insertion of irrelevant or repeated search terms intended simply to draw traffic to a site.
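As a rough illustration of keyword-frequency scoring with a guard against stuffing, the sketch below scores a page by the share of its words that match each query term and ignores any term whose share exceeds a threshold. The threshold value, the function name, and the rule itself are assumptions made for the example; real ranking algorithms are far more elaborate and are generally not published.

```python
import re
from collections import Counter

def keyword_score(text, query, stuffing_cap=0.2):
    """Sum each query term's share of the page's words.

    A term whose share exceeds stuffing_cap (a hypothetical threshold)
    is treated as keyword stuffing and contributes nothing.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    score = 0.0
    for term in query.lower().split():
        density = counts[term] / len(words)
        if density <= stuffing_cap:
            score += density
    return score

HONEST = "spiders crawl the web and index pages for search engines"
STUFFED = "spiders spiders spiders spiders cheap pills"
```

Under these assumptions keyword_score(HONEST, "spiders") is higher than keyword_score(STUFFED, "spiders"), even though the stuffed page repeats the term more often.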
SEE ALSO
Computer Virus
Internet: Dynamic and Static Addresses
Internet Spam and Fraud
Internet Surveillance
Internet Tracking and Tracing
Copyright 2004, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.