Distributed Systems

The Internet consists of an enormous number of smaller computer networks that are linked together across the globe. No one central computer is responsible for the Internet's performance, or for the sea of available information that people obtain from it every day. Rather, this performance and information are distributed among, and affected by, millions of individual entities (individuals, companies, and organizations) and devices (routers, servers, workstations, desktop computers, and other pieces of the Internet's infrastructure). In this sense, the Internet is a distributed system.

This same principle applies to smaller computing environments used by companies and individuals who engage in e-commerce. For example, employees at a large company may use a software application to enter customer data into a database. Rather than being installed directly on each user's computer, this application is more often installed on one server and shared among hundreds or thousands of users via a network. Applications used in such distributed environments are often object-oriented programs, and the parts (objects) they consist of can be located on one or more machines and accessed by many users as needed. Even though different parts of a program may be located on different machines, to users it appears as if the application were running directly on their own computer.

The general concept of distributed systems grew in popularity and prominence along with the evolution of computer technology. When companies first began using computers, large mainframe systems, and later minicomputers, were used to solve complicated business problems and perform difficult computing tasks. Computers became tools for performing calculations and analyzing combinations of variables that would be virtually impossible, or far too time-consuming, for humans to handle. Since that time, such operations have become increasingly decentralized, and networks of smaller distributed systems, working collectively, have been applied to modern-day information processing challenges. As Red Herring explained: "Companies have begun using distributed computing to harness unused computing power to better solve problems. In the United States, there are already 100 million computers connected to the Internet. If you could put them all together, you'd have a computer 3,333 times more powerful than the most powerful computer on earth, IBM's ASCI White, which has the power of around 30,000 desktops."
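
The ratio in the Red Herring quotation follows directly from its two figures. The short Python check below only reproduces that arithmetic. The variable names are illustrative, and the calculation assumes, as the quotation does, that every connected computer contributes the power of one desktop.

```python
# Figures taken from the Red Herring quotation above.
connected_computers = 100_000_000      # computers on the Internet in the United States
desktops_per_supercomputer = 30_000    # ASCI White's power, expressed in desktops

# Combined power of all connected computers, measured in ASCI White equivalents.
ratio = connected_computers / desktops_per_supercomputer
print(round(ratio))
```

Dividing 100 million by 30,000 gives roughly 3,333, which matches the multiple cited in the quotation.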

The SETI Institute is one example of how an organization has used distributed computing in this way. SETI is "an institutional home for scientific and educational projects relevant to the nature, distribution, and prevalence of life in the universe." SETI scans radio frequencies from space in an attempt to discover communications from extraterrestrial life forms. Through the organization's SETI@home program, individuals with computers and Internet access were able to help SETI by loading a special screen saver onto their computers. The screen saver connected to SETI while computers were not being used, retrieved data from SETI, analyzed it for signs of life, and reported back to SETI. The alternative to this approach was for SETI to obtain an expensive supercomputer to perform analyses, which was beyond its financial means.
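
The retrieve, analyze, and report cycle described above can be sketched in a few lines of Python. This is a minimal in-process illustration, not SETI@home's actual software: the Coordinator class, the WorkUnit type, and the analyze function (which simply returns the strongest sample) are all hypothetical stand-ins, and a real system would exchange work units over the network.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkUnit:
    unit_id: int
    samples: list  # a slice of the recorded signal


class Coordinator:
    """Hypothetical central server: hands out work units and collects results."""

    def __init__(self, signal, unit_size):
        # Split the full recording into small, independent work units.
        self.pending = deque(
            WorkUnit(i, signal[start:start + unit_size])
            for i, start in enumerate(range(0, len(signal), unit_size))
        )
        self.results = {}

    def fetch(self):
        # Return the next unprocessed unit, or None when nothing is left.
        return self.pending.popleft() if self.pending else None

    def report(self, unit_id, peak):
        self.results[unit_id] = peak


def analyze(unit):
    # Stand-in analysis (an assumption): the strongest sample in the unit.
    return max(unit.samples)


def volunteer_loop(coordinator):
    """What one idle volunteer computer does: fetch, analyze, report, repeat."""
    done = 0
    while (unit := coordinator.fetch()) is not None:
        coordinator.report(unit.unit_id, analyze(unit))
        done += 1
    return done


signal = [0.1, 0.3, 0.2, 0.9, 0.4, 0.2, 0.1, 0.5]
coordinator = Coordinator(signal, unit_size=3)
completed = volunteer_loop(coordinator)
strongest_unit = max(coordinator.results, key=coordinator.results.get)
```

Because each work unit is independent, many volunteers could run the same loop at once, which is what lets idle desktops substitute for a single large machine.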

Distributed computing not only allows otherwise idle computing resources to be used, but also allows data to be distributed more efficiently. For example, companies that offer software or various forms of content downloads on the Internet can address potential traffic overload issues by distributing downloadable data at strategic Internet locations instead of forcing the world's users to download it from only one source. In the early 2000s, Digital Island was one company that specialized in making networks and content delivery faster and more reliable. Digital Island's Footprint network put content geographically closer to individual users on a network of special caches (devices that hold content and make it more readily available to users). In mid-2001 the company announced that it had teamed with Internet services software provider Novell Inc. to make content delivery faster across the Footprint network. This was partially in response to the habits of online users, who generally wait only a handful of seconds to obtain online content before looking elsewhere.
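
The idea of serving content from a nearby cache rather than a single source can be illustrated with a small Python sketch. It is not a description of how the Footprint network chose servers: the pick_source function, the cache names, and the latency figures are all hypothetical, and the sketch assumes that the lowest measured latency is a reasonable proxy for "closest."

```python
def pick_source(latencies_ms, origin="origin"):
    """Return the reachable source with the lowest measured latency.

    latencies_ms maps a source name to its round-trip time in milliseconds,
    or to None if the source did not respond. Falls back to the origin
    server when no source is reachable.
    """
    reachable = {name: ms for name, ms in latencies_ms.items() if ms is not None}
    if not reachable:
        return origin
    return min(reachable, key=reachable.get)


# Hypothetical measurements taken from one user's location.
measurements = {
    "origin": 180.0,
    "cache-eu": 35.0,
    "cache-us": 90.0,
    "cache-asia": None,  # unreachable from here
}
chosen = pick_source(measurements)
```

With these sample numbers the user is directed to cache-eu, so the download avoids the slower trip to the origin server and the origin carries less of the total traffic.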

The future of distributed computing looked very bright in the early 2000s. At that time, IBM had obtained what Technology Review described as "an apparently broad patent on a way to broker large computing tasks." This approach was similar to what SETI was doing in that a special screen saver coordinated the use of otherwise idle computing resources at homes or at large companies from a central computer. However, unlike SETI's method, which was applied to a single task, IBM's approach could handle multiple computing tasks. In addition to scientific applications, IBM's approach held potential for business use, such as financial data analysis. Supporting this, ZDNet reported that analysts predicted distributed computing would take off in 2001 as companies realized its value as a way to affordably analyze large amounts of data, as opposed to buying large, expensive supercomputers.