Computers and computer networks have changed the way in which people work, play, do business, run organizations and countries, and interact with one another on a personal level. The workplace of the early twentieth century was full of paper, pens, and typewriters. The office of the early twenty-first century is a place of glowing monitor screens, keyboards, mice, scanners, digital cameras, printers, and speech recognition equipment. The office is no longer isolated; it is linked by computer networks to others like it around the world. Computers have had such an effect that some say an information revolution is occurring. This revolution may be as important as the printing revolution of the fifteenth century, the industrial revolution of the nineteenth century, or the agricultural revolutions of the ancient and medieval worlds.
The computer was invented to perform mathematical calculations. It has become a tool for communication, for artistic expression, and for managing the store of human knowledge. Text, photographs, sounds, or moving pictures can all be recorded in the digital form used by computers, so print, photographic, and electronic media are becoming increasingly indistinguishable. As Tim Berners-Lee (1998), developer of the World Wide Web, put it, computers and their networks promise to become the primary medium in which people work and play and socialize, and hopefully, they will also help people understand their world and each other better.
During the last half of the twentieth century, electronic digital computers revolutionized business, learning, and recreation. Computers are now used in newspaper, magazine, and book publishing, and in radio, film, and television production. They guide and operate unmanned space probes, control the flow of telecommunications, and help people manage energy and other resources. They are used to categorize and preserve the store of human knowledge in libraries, archives, and museums. Computer chips called "embedded microprocessors" are found in the control systems of aircraft, automobiles, trains, telephones, medical diagnostic equipment, kitchen utensils, and farm equipment. The effect on society has been so great that digital information itself is now exchanged more rapidly and more extensively than the commodities or manufactured goods it was originally supposed to help manage. Information has become an essential commodity and, some would argue, a necessary social good.
The history of computing is several stories combined. One is a hardware story—a tale of inventions and technologies. Another is a software story—a tale of the operating systems that enabled specific computers to carry out their basic functions and the applications programs designed to deliver services to computer users. A third story tells how computers provide answers to the problems of society, and how they in turn create new possibilities for society.
Computers and the Media
The computer has transformed print journalism and magazine and book production, changing the ways in which stories are researched, written, transmitted to publishers, typeset, and printed. Through computing and telecommunications, a news story breaking in Asia can be sent within seconds to North America, along with digital pictures. Word-processing software and more sophisticated desktop publishing programs allow authors to create and revise documents easily and to check them for spelling, grammar, and readability.
Copies of digital documents can be printed on demand, and because computers check for transmission errors, all the copies will be identical. While the first word-processing programs offered little more than typewriter-style characters, the introduction of graphical user interfaces (GUIs) in the 1980s and 1990s opened new design possibilities. Writers could choose from a variety of type fonts, select different page layouts, and include photographs and charts. Some feared that this might eliminate jobs since tasks performed by authors, editors, typesetters, proofreaders, graphic designers, and layout artists could all be performed by one person with a computer.
Laptop or notebook computers gave writers even more flexibility. A reporter on location could compose a story and transmit it immediately to a newspaper (using a modem and a hotel room telephone) on the other side of the globe and, perhaps, to wire news services such as The Associated Press or the Reuters news agency. Satellite uplinks, cellular phones, and infrared "beaming" between machines provide even more possibilities. Moreover, digital photography eliminates the time taken to develop photographs, and digital pictures can be transmitted as easily as text.
Computers have revolutionized radio, television, and film production as well. Computerized camera switching and special-effects generators, electronic music synthesizers, photographic exposure control, and digital radio and television programming are all examples. Computer graphics can be used to superimpose sports statistics over a picture of a game in progress or allow a commentator to explain a key play by drawing a diagram over a television picture. Computers have made it possible to produce the entire programming lineup of a radio station without relying on tape recorders except for archival materials or for recordings made in the field.
Digital sound editing can eliminate noise, mix voice and music, and give producers second-by-second precision in the assembly of programs. Computerized film processing can provide better quality images or allow images to be converted from color to black-and-white and vice versa. While movie animation has traditionally involved photographing thousands of separately drawn pictures or "cels," computer animation can use fewer drawings and produce thousands of variations. Special effects are much more convincing when the computer handles the lighting, perspective, and movement within the movie scene.
Speech recognition and dictating software can convert voice recordings directly to word-processed text, and translation programs can then rewrite the word-processed text into another human language. Musicians can compose new works at a computer keyboard and create a printed score from the finished version.
Even when an organization's primary medium is print, radio, or television, it has become common to provide more in-depth coverage on an associated website. While some radio and television networks simultaneously broadcast and webcast their programming, perhaps the most powerful potential will be found in ever-growing digital archives. Using search engines and, increasingly, programs called "intelligent agents," users can retrieve items from the archives, print fresh copies, or compare different accounts of the same event.
Most young people probably first use a computer for entertainment. Individual- and multiple-player games, online "chat" rooms, newsgroups, electronic mailing lists, and websites provide computer-mediated education and leisure activities that were never possible before.
At first, computer programmers wrote games to amuse themselves. The classic "dungeons and dragons" game, Adventure, invented by Will Crowther and Don Woods, was a favorite. Players gave commands such as "go left" or "take lamp," and the computer printed replies such as "OK." There were no pictures. Simple games that used graphics, with names such as Pong and Pac-Man, became available during the 1970s and early 1980s. As personal computers and handheld games became practical to produce, an entire electronic games industry was born. Nintendo and Sega are two familiar games companies. Computerized video games and lottery ticket machines soon became such popular attractions in shopping malls and corner stores that critics began to warn that they might become addictive.
Computing has changed the way writers research and prepare scientific articles. During the early 1970s, a small number of databases containing "abstracts" (i.e., summaries of scholarly and popular articles) could be searched offline. Users submitted lists of subjects or phrases on coding forms. Keypunchers typed them onto computer cards, and operators processed them on mainframe computers. The answers would be available the next day. Library catalogs were printed on paper cards or computer output microform (COM). A microfiche is a transparent plastic slide, roughly the size of an ordinary index card, but it contains images of many pages of computer output.
The Library of Congress, and national libraries in other countries, had by this time converted most of the descriptions of the books they owned into machine-readable form. Toward the end of the 1970s, research databases and library catalogs were becoming widely available online. The Dialog database, and library services such as the Online Computer Library Center (OCLC), made it possible to search the contents of many journals or the holdings of many libraries at once. Standards such as the Machine-Readable Cataloging format (MARC) made it possible to exchange this information worldwide and to display it on many different types of computers. However, limits on computer disk space, telecommunications capacities, and computer processing power still made it impractical to store the full text of articles.
Because of the costs, researchers working for large institutions were the main users of these services. By the mid-1980s, when microcomputer workstations became widely available and compact disc read-only memory (CD-ROM) became a practical distribution method, much research could be conducted without connecting to large central databases. Companies such as EBSCO and InfoTrac began licensing CD-ROMs to their subscribers. With better magnetic "hard" disks and faster microcomputer chips, full-text storage and retrieval finally became workable.
By the end of the twentieth century, databases and catalogs could be accessed over the Internet, on CD-ROM, or through dial-up connections. Some of the special databases include ERIC (for educational issues), Medline and Grateful Med (for medical issues), and Inspec (for engineering issues). Legal research was simplified by services such as Lexis and Westlaw, which allowed identification and cross-referencing of U.S. and international statute and case law. In one of the more interesting applications of computing technology, the Institute for Scientific Information in Philadelphia introduced its citation indexing services, which allow researchers to discover important authors and issues by revealing which authors quote one another. Some databases are free of charge, and some are available for a fee.
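The idea behind a citation index can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the author names and citation lists are invented, and the functions `build_citation_index` and `most_cited` are hypothetical helpers, not a description of any commercial service.

```python
from collections import defaultdict

# Hypothetical data: each author is listed with the authors he or she cites.
papers = {
    "Adams": ["Baker", "Chen"],
    "Baker": ["Chen"],
    "Chen": [],
    "Diaz": ["Chen", "Baker"],
}

def build_citation_index(cites):
    """Invert the 'who cites whom' lists into 'who is cited by whom'."""
    cited_by = defaultdict(set)
    for author, references in cites.items():
        for reference in references:
            cited_by[reference].add(author)
    return cited_by

def most_cited(cited_by):
    """Rank authors by the number of distinct authors who cite them."""
    return sorted(cited_by, key=lambda name: (-len(cited_by[name]), name))

index = build_citation_index(papers)
ranking = most_cited(index)
print(ranking)
```

Inverting the lists is what lets a researcher start from one well-known author and find everyone who has quoted that author.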
A researcher at a public library, in a television newsroom, or in a medical practice can perform searches against thousands of special databases and millions of sites on the World Wide Web. While this sort of research was possible with printed directories in the past, it was time consuming and labor intensive. However, searching for data electronically can have unexpected results. Because the computer does not really understand what the string of letters "Jim Smith" means, it will faithfully report any occurrence it finds, regardless of the context. Information retrieval theory and informetrics are two fields that study the implications.
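A short Python sketch shows why a literal search ignores context. The sample sentences and the helper `find_occurrences` are assumptions made up for the illustration.

```python
def find_occurrences(text: str, phrase: str) -> list:
    """Return the starting position of every literal occurrence of phrase."""
    positions = []
    start = text.find(phrase)
    while start != -1:
        positions.append(start)
        start = text.find(phrase, start + 1)
    return positions

# Hypothetical documents: a person, a bridge, and a different surname.
documents = [
    "Jim Smith was elected mayor on Tuesday.",
    "The Jim Smith Memorial Bridge is closed for repairs.",
    "Contact Dr. Jim Smithson for details.",
]

# Every document matches, because the program compares letters, not meanings.
matches = [doc for doc in documents if find_occurrences(doc, "Jim Smith")]
print(len(matches))
```

All three sentences are reported, including the one about "Jim Smithson," which is the kind of unexpected result the paragraph above describes.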
The Computer Industry
In the late 1960s, some writers scoffed at the potential of computers. The mainframe machines of the time occupied entire rooms, and only large institutions could afford them. No computer ever conceived, suggested one writer, had ever weighed less than a human being or been capable of performing as many tasks.
Without the transistor and the integrated circuit, computers would still fill large rooms. Without the laser and improved plastics, optical storage media such as CD-ROMs and digital versatile discs (DVDs) would not be possible. Magnetic tapes and disks have also improved greatly over the years and can now store much more information than they could in the past. It is difficult to buy an item in the supermarket or to borrow a book from a library without that item having a barcode label on it. Credit and debit cards with magnetic strips make it easier to access bank accounts and make retail purchases. Inventions such as these are part of the story of computing, although they are often overlooked.
Costs have fallen as capacities have grown. A minicomputer of the mid-1980s could cost about $500,000 and could contain 64 kilobytes (KB) of random access memory (RAM). By the end of the century, a magnetic floppy disk with 1.4 megabytes (MB) of storage sold for less than a dollar, a CD-ROM disk that held 650 MB was less than two dollars, and desktop microcomputers with 64 MB of RAM were common household items.
As the industry grew, so did the legends of inventors who made fortunes or revolutionized the industry. William R. Hewlett and David Packard started their company in a garage. Graduate students David Filo and Jerry Yang developed the Yahoo! Internet directory in a dormitory room. Steve Jobs of Apple Computer, Bill Gates of Microsoft, and the heads of many other companies in California's Silicon Valley became known around the world.
Computer engineers and programmers have often exchanged their ideas openly, out of scientific duty. The Xerox Corporation hit on the idea of the graphical user interface (GUI), refined the "mouse" that Douglas Engelbart had invented at the Stanford Research Institute, and then told everyone how to produce them. Linus Torvalds developed the Linux operating system as a personal project and then made it available for free. Universities also have a long history of developing software and computers and then sharing the knowledge.
The History of Computers
While digital computers are a relatively recent invention, calculating devices have existed for thousands of years. The abacus, sometimes considered to be a computer, was used in medieval China and by the Aztecs of Central America, and earlier "counting boards" were found in ancient Babylon. The slide rule, an analog device, continues to have a following because some engineers still prefer it to electronic calculators. Circular slide rules, called "dead-reckoning computers," were used by aircraft pilots well into the 1970s to perform navigational tasks.
During the Middle Ages, the Franciscan scholar Ramon Llull used circular disks that had letters and numbers (representing terms from philosophy) written on them. By turning the wheels, Llull could come up with new combinations of concepts. Llull's work continued to influence logicians. Gottfried Wilhelm von Leibniz made it the topic of a treatise, Dissertatio de arte combinatoria, in 1666.
During the industrial revolution, mass-production devices such as the Jacquard loom became common. Designs to be woven into cloth could be punched onto the cards that controlled the loom. Charles Babbage, working with Lady Ada Lovelace in the early nineteenth century, first thought of using punched cards to do mathematics. Their Analytical Engine wove numbers into tables the way the loom wove cloth from strands of thread. The modern Ada computer language commemorates their work. Toward the end of the nineteenth century, Herman Hollerith, whose Tabulating Machine Company later became part of International Business Machines (IBM), developed the punched cards used in early digital computers.
In a 1936 paper, "On Computable Numbers," the British mathematician Alan Turing first suggested the idea of a general-purpose computing machine. With electronic digital computers, Turing's idea became realizable. Turing and the Hungarian-American mathematician John von Neumann are two of the many pioneers of digital computing. During World War II, Turing designed a machine called the Bombe to help break the "Enigma" cipher, a secret code used by Germany. A later British codebreaking machine, Colossus, was designed by the engineer Tommy Flowers to attack a different German cipher. Turing also proposed the famous "Turing test" for artificial intelligence. The Turing test suggests that if a person cannot tell the difference between responses from a computer and responses from a human, then the computer must be considered to be "intelligent."
The first generation of electronic computers, which included the Mark 1, the ENIAC, and other machines built with vacuum tubes, were huge, expensive, and apt to fail or "crash." Grace Hopper once repaired the U.S. Navy's Mark II computer by removing a moth from its circuitry. The term "debugging" is often associated with this incident.
The transistor made it possible to produce computers in quantity. However, mainframe computers such as the IBM 370 were still huge by modern standards, and only universities, government agencies, or large companies could afford them. By the 1980s, with integrated circuits, a new generation of minicomputers was born. Digital Equipment Corporation (later acquired by Compaq), Hewlett-Packard, and Data General were some of the key manufacturers. These machines were about the size of a refrigerator.
By the end of the 1970s, desktop microcomputers began appearing in smaller offices and in ordinary people's homes. Beginning with the Osborne, the Commodore 64, the Apple, and the IBM PC, microcomputers and their software systems came to dominate the market. These machines used microprocessor chips, which were room-sized central processing units shrunk to less than the size of a penny. The Intel 8080 and the Motorola 6800 were among the first widely used chips of this kind, appearing in the mid-1970s. Many programmers joked about these new "toys." During the next decade, microcomputers would grow into powerful workstations, powered by chips from Intel and Motorola and built by companies such as Sun Microsystems, IBM, Apple, Dell, Toshiba, Sony, and Gateway, to name just a few.
Computing involves three activities: input, process, and output. Data enters the computer through a keyboard or mouse, from a camera, or from a file previously recorded on a disk. A program or "process" manipulates the data and then outputs it to a screen, printer, disk, or communications line.
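The three activities can be pictured as three small functions chained together. The following Python sketch is only an illustration: the function names and the sample numbers are assumptions, and a typed line of text stands in for keyboard input.

```python
def read_input(raw: str) -> list:
    """Input: turn a line of typed text into a list of numbers."""
    return [float(token) for token in raw.split()]

def process(values: list) -> float:
    """Process: manipulate the data, here by averaging it."""
    return sum(values) / len(values)

def write_output(result: float) -> str:
    """Output: format the result for a screen, printer, or file."""
    return f"average = {result:.2f}"

line = "4 8 15 16 23 42"  # stands in for data typed at a keyboard
report = write_output(process(read_input(line)))
print(report)  # average = 18.00
```

Swapping any one stage, for example reading from a disk file instead of a typed line, leaves the other two unchanged.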
Over the years, many different input devices have been used, including punched paper tape, punched cards, keyboards, mice, microphones, touch-screens, and video cameras. Output devices have included paper printouts, teletypewriters, and video monitors. The part of the computer that does the processing is known as the central processing unit (CPU). Collectively, everything other than the CPU, including memory boards, disks, printers, keyboards, mice, and screens can be thought of as peripheral devices, or just "peripherals."
There are two sorts of computer software. Operating systems, such as Microsoft Windows, the Macintosh operating system, or UNIX, allow machines to perform their basic functions: accepting input, running programs, and sending output to users. Applications programs, such as word processors, Internet browsers, electronic mail programs, or database management programs, do the work required by computer users.
Digital computers use data that has been encoded as series of zeros and ones—binary digits or bits. Text, images, sounds, motion pictures, and other media can all be represented as strings of zeros and ones and processed by digital computers. Programs—the instructions on how to manipulate data—also are represented in binary form. The earliest digital computers were designed to store and manipulate the numbers and letters of the alphabet that were found on typewriter keyboards. The American Standard Code for Information Interchange (ASCII) uses 128 combinations of bits to represent the letters, numbers, and symbols on a typewriter keyboard. Plain text worked well when computers were used primarily for mathematics.
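To make the encoding concrete, the sketch below converts a short piece of text to ASCII bit patterns and back. It assumes the standard seven-bit form of ASCII (seven bits give exactly 128 combinations); the function names are invented for the example.

```python
def to_ascii_bits(text: str) -> list:
    """Return the seven-bit ASCII pattern for each character in text."""
    codes = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII text
    return [format(code, "07b") for code in codes]

def from_ascii_bits(patterns: list) -> str:
    """Rebuild the text from a list of seven-bit patterns."""
    return bytes(int(pattern, 2) for pattern in patterns).decode("ascii")

bits = to_ascii_bits("Hi!")
print(bits)                   # ['1001000', '1101001', '0100001']
print(from_ascii_bits(bits))  # Hi!
print(2 ** 7)                 # 128 possible seven-bit combinations
```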
Binary numbers can represent visual and audio information as well. By the end of the 1980s, designers had expanded the coding systems to store drawings, photographs, sounds, and moving pictures. Each dot on a screen is called a "picture element" (or "pixel"). To display graphics on the screen, computers use groups of binary numbers—ones and zeros—to represent the color, intensity of light, and position of each pixel.
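One common way of storing a pixel, assumed here for illustration, gives eight bits each to red, green, and blue and packs them into a single 24-bit number. The text does not specify a particular scheme, so the layout and the function names below are assumptions.

```python
def pack_pixel(red: int, green: int, blue: int) -> int:
    """Pack three 8-bit color values into one 24-bit number."""
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be between 0 and 255")
    return (red << 16) | (green << 8) | blue

def unpack_pixel(value: int) -> tuple:
    """Split a 24-bit number back into its red, green, and blue parts."""
    return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF

# A tiny 2-by-2 image stored row by row; a pixel's position is its row and column.
image = [
    [pack_pixel(255, 0, 0), pack_pixel(0, 255, 0)],
    [pack_pixel(0, 0, 255), pack_pixel(255, 255, 255)],
]

orange = pack_pixel(255, 165, 0)
print(format(orange, "024b"))  # 111111111010010100000000
```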
Modern computers almost always use some type of GUI. Programmers use small graphics called "icons" to represent a program, a document, a movie, or a musical work. When a user selects an icon, the computer can open a file or program that is associated with it. Interfaces of this kind are commonly built with object-oriented programming, in which each icon is treated as an object that combines data with the actions that can be performed on it.
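One way to picture the link between icons and object-oriented programming is to model each icon as an object with its own "open" behavior. The class names and file names in this Python sketch are hypothetical; real window systems are far more elaborate.

```python
class Icon:
    """An on-screen object that bundles a label with the action it triggers."""

    def __init__(self, label: str, target: str) -> None:
        self.label = label
        self.target = target

    def open(self) -> str:
        return f"Opening {self.target}"


class DocumentIcon(Icon):
    def open(self) -> str:
        return f"Loading document {self.target} into the word processor"


class ProgramIcon(Icon):
    def open(self) -> str:
        return f"Starting program {self.target}"


# The desktop treats every icon the same way; each object supplies its own behavior.
desktop = [DocumentIcon("Report", "report.txt"), ProgramIcon("Browser", "browser")]
messages = [icon.open() for icon in desktop]
print(messages)
```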
When the price of computers dropped, it became possible to distribute work among several machines on a network instead of using a large central computer. A piece of software called a "server" could now send information to smaller programs called "clients" located at the workstations. Shared files remain on large computers called "file servers," so several users can access them at once. Internet browsers, such as Netscape and Internet Explorer, are good examples of "client/server" design at work, where the browser is a client and an Internet site hosts the server software and the large files of information.
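The client/server arrangement can be sketched with Python's standard socketserver and socket modules. Everything specific here is an assumption for the example: the one-line request format, the file names, and the use of the local machine (127.0.0.1) in place of a real network.

```python
import socket
import socketserver
import threading

# Hypothetical shared files held by the server.
FILES = {"welcome.txt": "Hello from the server"}


class FileRequestHandler(socketserver.StreamRequestHandler):
    """Server side: read one file name per connection and send back its contents."""

    def handle(self) -> None:
        name = self.rfile.readline().decode("utf-8").strip()
        reply = FILES.get(name, "NOT FOUND")
        self.wfile.write((reply + "\n").encode("utf-8"))


def fetch(host: str, port: int, name: str) -> str:
    """Client side: connect, ask for a file by name, and return the reply."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall((name + "\n").encode("utf-8"))
        with conn.makefile("r", encoding="utf-8") as reader:
            return reader.readline().strip()


def demo() -> list:
    """Start a server on a free local port and make two client requests."""
    with socketserver.TCPServer(("127.0.0.1", 0), FileRequestHandler) as server:
        port = server.server_address[1]
        worker = threading.Thread(target=server.serve_forever, daemon=True)
        worker.start()
        try:
            return [
                fetch("127.0.0.1", port, "welcome.txt"),
                fetch("127.0.0.1", port, "missing.txt"),
            ]
        finally:
            server.shutdown()


if __name__ == "__main__":
    print(demo())
```

The server owns the data and waits; the client knows only the address and the name of what it wants, which is the same division of labor a browser and a website follow.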
There are many programming languages, each better at addressing certain types of problems. The Formula Translation language (FORTRAN) was developed to handle scientific problems. The Beginner's All-purpose Symbolic Instruction Code (BASIC) and the Common Business-Oriented Language (COBOL) were better for office automation. The languages C, C++, Java, and Visual Basic use libraries of small, interchangeable programs that perform frequently required tasks, such as sorting items or displaying them on a screen. Programmers can combine these small programs into more complex systems, allowing programmers to build new applications quickly. Other languages, such as Prolog and LISP, were invented for work in artificial intelligence, while Ada was designed to address military needs.
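The same reliance on library routines can be seen in Python, which is used here only as an illustration and is not one of the languages named above. The inventory data and the `display` helper are assumptions; `sorted` and `itemgetter` are standard library routines.

```python
from operator import itemgetter

# Hypothetical inventory: (item name, quantity on hand).
inventory = [("widgets", 40), ("bolts", 250), ("gears", 12)]

# Library routines do the sorting; the programmer only says what to sort by.
by_name = sorted(inventory)
by_quantity = sorted(inventory, key=itemgetter(1), reverse=True)

def display(rows) -> str:
    """Format rows as a simple two-column listing for the screen."""
    return "\n".join(f"{name:<10}{quantity:>5}" for name, quantity in rows)

print(display(by_quantity))
```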
Once personal computers were available, the demand for special software packages or "applications" increased. Spreadsheets, such as the early SuperCalc and Excel, have simplified accounting and statistical processes, and they allow users to try out various financial scenarios. If the costs or quantities of items change, the results will appear immediately on the screen. A whole range of database management packages, including dBase, FoxPro, Oracle, and Access, help users do inventories, maintain customer profiles, and more. Because records in databases can be matched against ones in different files, say a customer demographic file with a warehouse inventory file, businesses can predict supply and demand trends and improve the delivery of goods and services. Geographic information systems, online census data, and telephone directories make it easier to market products in areas where there is demand. Some critics argue that using data for reasons other than those for which it was collected is an invasion of privacy. In many countries, freedom of information and privacy protection laws have been passed to address these issues.
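Matching records across two files can be sketched as follows. The customer and inventory records, the field names, and the helper functions are all invented for the example; a real database package would do this with its own query language.

```python
# Hypothetical customer file: who wants which product.
customers = [
    {"customer": "Ada", "region": "North", "product": "lamp"},
    {"customer": "Ben", "region": "North", "product": "lamp"},
    {"customer": "Cy", "region": "South", "product": "desk"},
    {"customer": "Dee", "region": "South", "product": "lamp"},
]

# Hypothetical warehouse file: what is on the shelves.
inventory = [
    {"product": "lamp", "in_stock": 2},
    {"product": "desk", "in_stock": 5},
]

def match_records(customer_rows, inventory_rows) -> dict:
    """Match the two files on the shared 'product' field."""
    stock = {row["product"]: row["in_stock"] for row in inventory_rows}
    demand = {}
    for row in customer_rows:
        demand[row["product"]] = demand.get(row["product"], 0) + 1
    return {
        product: {"demand": count, "in_stock": stock.get(product, 0)}
        for product, count in demand.items()
    }

def shortfalls(matched: dict) -> list:
    """List the products for which demand exceeds the stock on hand."""
    return sorted(p for p, row in matched.items() if row["demand"] > row["in_stock"])

matched = match_records(customers, inventory)
print(shortfalls(matched))  # ['lamp']
```

The same matching step is what raises the privacy concern mentioned above: once two files share a field, facts collected for one purpose can be combined with facts collected for another.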
Computing and Knowledge
Computers have changed the world in which people live and work, and they have provided new ways of thinking about, and making sense of, that world. At the beginning of the twenty-first century, computer science is a mature academic discipline, with almost every university or college offering computer courses.
As an academic subject, computer science may involve information theory, systems analysis, software engineering, electrical engineering, programming, and information studies that examine the use of digital information. The founders of information theory, Claude Shannon and Warren Weaver, published The Mathematical Theory of Communication in 1949. The mathematician Norbert Wiener, who coined the term "cybernetics," showed how computing theories could be applied to problems of communication and control in both animals and machines. Ludwig von Bertalanffy founded general system theory because he saw that large complex systems did not necessarily behave in the same way that their individual components did. He is considered one of the founders of systems analysis.
Professional associations have also played important roles in the development of computing theory, practice, and standards. The Association for Computing Machinery, the Institute of Electrical and Electronics Engineers, the International Organization for Standardization, and the World Wide Web Consortium (W3C) are all agencies concerned with computing methods and standards. Less widely known groups, such as the International Society for the Systems Sciences and Computer Professionals for Social Responsibility, concern themselves with professional ethics and the social effect of computing. Computing has its own journals and magazines that are aimed at special groups of professionals and at consumers.
Modern computing researchers come from many backgrounds. In turn, scholars from other areas apply computing theory and systems analysis to their own disciplines, from philosophy to psychology to social work. Centers such as the Media Lab at the Massachusetts Institute of Technology or the Xerox Corporation's Palo Alto Research Center bring together experts from many fields to design "neural networks" that simulate the human brain, to build smaller and faster machines, or to find better ways of managing digital information. Nicholas Negroponte, Marvin Minsky, and their colleagues at the Media Lab are associated with developments in artificial intelligence and robotics.
Some people fear that while computers relieve humans of repetitive tasks, they may also "de-skill" workers who forget how to do such tasks by hand. Others suggest that having to cope with computers on the job adds extra stress, raises expectations of promptness, and requires ongoing retraining of workers. Because computing has made it possible to recombine and repackage stories, pictures, and sounds, some fear that the work of authors may one day be regarded as interchangeable, much like mechanical parts. In addition, as people depend more on computers, they become more vulnerable to system failure. If the world's computers should fail all at once, economic and social chaos might result. A series of Internet "worms" and "viruses" heightened concern over society's dependence on computers during 1999 and 2000. Governments, banks, companies, and individuals worried that the clocks in their computers might fail at the beginning of 2000, but the "Y2K" crisis they feared did not occur.
Computer designers and computer users think about computers in different terms, and they use different jargon. Hackers, who explore aspects of computers that designers could not have foreseen, have their own way of looking at and talking about computers. People who use computers for destructive purposes are more properly called "crackers." Finally, those people who do not have access to computers run the risk of economic and educational hardships.
The Internet and the Future
During the 1970s, the Defense Advanced Research Projects Agency (DARPA), the central research and development organization for the U.S. Department of Defense, commissioned work on a standard design for its wide area networks, computer connections that could link entire countries or continents. The resulting communications standards, called the Transmission Control Protocol and the Internet Protocol, were published in 1981.
Many computer networks, with names such as Decnet, Usenet, and Bitnet, were already in operation, but within about a decade, the new standards were adopted around the world. At first, because there were no graphics, the Internet was used for electronic mail and discussions and for text-only directory services such as Gopher (from the University of Minnesota) and WAIS (Wide Area Information Servers). Then Berners-Lee and his colleagues at CERN, the European nuclear research center in Switzerland, came up with a new set of protocols that could be used to mix pictures and sounds with text and let users locate any document on any network computer anywhere in the world. The result was the World Wide Web.
Briefly, this is how the web works. Every computer on the Internet has a numeric Internet Protocol (IP) address, which looks like four groups of numbers separated by periods, each group falling between 0 and 255. Because humans would have trouble with addresses such as 123.12.45.1, websites also have "domain names," such as "wayne.edu" or "acme.com," which are easier to understand. Scattered around the world, domain name servers (part of the Domain Name System, or DNS) provide large telephone-directory style lists, which map the names to the numbers.
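The directory idea can be sketched with a small lookup table. This is a toy model, not how real name servers are implemented: the table is a made-up stand-in, the numeric addresses are reserved documentation addresses rather than the real addresses of those sites, and the function names are hypothetical. Only the standard `ipaddress` module is real.

```python
import ipaddress
from typing import Optional

# Hypothetical directory: names from the text, made-up documentation addresses.
DIRECTORY = {
    "wayne.edu": "192.0.2.10",
    "acme.com": "198.51.100.7",
}

def is_valid_ipv4(address: str) -> bool:
    """Check for four groups of numbers, each between 0 and 255."""
    try:
        ipaddress.IPv4Address(address)
        return True
    except ValueError:
        return False

def resolve(name: str) -> Optional[str]:
    """Look a domain name up in the directory, ignoring letter case."""
    return DIRECTORY.get(name.lower())

print(resolve("Wayne.edu"))          # 192.0.2.10
print(is_valid_ipv4("192.0.2.10"))   # True
print(is_valid_ipv4("192.0.2.999"))  # False: 999 is larger than 255
```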
Every item on the web, whether a file of text, a picture, or a sound, can be found and retrieved by its uniform resource locator (URL). A URL contains the domain name of the computer on which the item is stored and, optionally, additional information about the file folders and file names on that computer. Documents on the web, called "pages," are written in the HyperText Markup Language (HTML) and exchanged using the Hypertext Transfer Protocol (HTTP).
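The parts of a URL can be pulled apart with Python's standard `urllib.parse` module. The sample URL is taken from this article's reference list; the function `describe_url` and the labels it returns are assumptions made for the illustration.

```python
from urllib.parse import urlsplit

def describe_url(url: str) -> dict:
    """Split a URL into its protocol, domain name, folders, and file name."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    # A path ending in "/" names a folder, not a file.
    has_file = bool(segments) and not parts.path.endswith("/")
    return {
        "protocol": parts.scheme,
        "domain": parts.hostname,
        "folders": segments[:-1] if has_file else segments,
        "file": segments[-1] if has_file else None,
    }

info = describe_url("http://www.w3.org/People/Berners-Lee/ShortHistory.html")
print(info)
```

Here the protocol is "http", the domain name is "www.w3.org", the folders are "People" and "Berners-Lee", and the file is "ShortHistory.html".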
Berners-Lee (1998) believes that once most of human knowledge is made available over the Internet, and once the Internet becomes the primary way in which individuals communicate with one another, humans will have the wisdom to use computers to help analyze society and to improve it.
While the promise is bright, the Internet presents many challenges for information scientists. While URLs provide a way of locating individual documents anywhere on the network, the web is always in flux, and URLs are quite "volatile" or apt to change from day to day or even from minute to minute. In addition, because material on the web may look highly polished, it is sometimes hard for users to distinguish reliable information from unreliable information. Metadata—data about data—is one of the schemes proposed to reduce confusion. Metadata tags are similar to subject, author, and title entries in a library catalog, and can be written at the top of a web document.
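A sketch of how a program might read such tags follows. The sample page, its author, and its subject are invented, and the `meta` tag layout shown is one common HTML convention assumed for the example rather than the only metadata scheme. The `HTMLParser` class is part of Python's standard library.

```python
from html.parser import HTMLParser

class MetadataReader(HTMLParser):
    """Collect name/content pairs from the meta tags of a web page."""

    def __init__(self) -> None:
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        fields = dict(attrs)
        name, content = fields.get("name"), fields.get("content")
        if name and content is not None:
            self.metadata[name.lower()] = content

# Hypothetical page with catalog-style entries at the top of the document.
PAGE = """<html><head>
<title>A Short History of Computing</title>
<meta name="author" content="A. Writer">
<meta name="subject" content="History of computing">
</head><body><p>Body text goes here.</p></body></html>"""

reader = MetadataReader()
reader.feed(PAGE)
print(reader.metadata)
```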
Increasingly, the computer network is the medium through which scientists assemble and exchange knowledge from many sources and train future generations. The Human Genome Project and simulations to train surgeons or aircraft pilots are examples. Many scholars publish directly to the Internet by posting their discoveries to the World Wide Web, newsgroups, or mailing lists. This speeds the process of information exchange, but since such works are not examined by editors, it also increases the chances of error and makes it harder for readers to determine whether the information is reliable. The need to be able to index and describe web pages has led to the development of metadata as a way of categorizing electronic documents. However, with millions of authors publishing to the web, the task of indexing and describing their work is staggering.
Computers continue to become smaller, less expensive, more powerful, and more essential to society. So far, dire predictions of de-skilled workers or massive unemployment due to an increased use of computers in the workplace have not materialized. In the future, computers will be still smaller and many times more powerful as engineers find ways to use nanotechnology to build microscopic machines. Some people predict that computers will eventually use individual molecules, or even subatomic particles, to store and manipulate the ones and zeros that make up digital information.
By building microprocessors into cars, aircraft, and even household devices such as microwave ovens, designers have produced a raft of "smart" devices. Steve Mann and his colleagues at MIT and the University of Toronto have even developed smart clothing, which can detect signs of sudden illness in the wearer. Increasingly, computers will be able to assist people with disabilities. Smart cars and smart houses have obvious social benefits. However, the same technologies can be used to produce smart weapons. Sensors in a smart office can prevent burglaries or announce guests. They can also monitor employees, minute by minute. Will ubiquitous computers have positive or negative effects on society? This is a question for which only the future can provide an answer.
See also: Artificial Intelligence; Computer Software; Computer Software, Educational; Databases, Electronic; Diffusion of Innovations and Communication; Digital Communication; Digital Media Systems; Geographic Information Systems; Internet and the World Wide Web; Knowledge Management; Libraries, Digital; Library Automation; Privacy and Encryption; Ratings for Video Games, Software, and the Internet; Retrieval of Information; Standards and Information; Technology, Adoption and Diffusion of; Webmasters.
Berners-Lee, Tim. (1998). "The World Wide Web: A Very Short Personal History." <http://www.w3.org/People/Berners-Lee/ShortHistory.html>.
Bertalanffy, Ludwig von. (1976). General System Theory, Foundations, Development, Applications. New York: G. Braziller.
Biermann, Alan W. (1997). Great Ideas in Computer Science: A Gentle Introduction, 2nd edition. Cambridge, MA: MIT Press.
Brookshear, J. Glenn. (1999). Computer Science: An Overview. New York: Addison-Wesley.
Carlson, Tom. (2001). "The Obsolete Computer Museum." <http://www.obsoletecomputermuseum.org>.
Gardner, Martin. (1982). Logic Machines and Diagrams. Chicago: University of Chicago Press.
Hiltz, Starr Roxanne, and Turoff, Murray. (1993). The Network Nation: Human Communication Via Computer, revised edition. Cambridge, MA: MIT Press.
Kidder, Tracy. (1997). The Soul of a New Machine. New York: Modern Library.
Negroponte, Nicholas. (1995). Being Digital. New York: Vintage Books.
Raymond, Eric S. (1998). The New Hacker's Dictionary. Cambridge, MA: MIT Press.
Shannon, Claude, and Weaver, Warren. (1949). The Mathematical Theory of Communication. Urbana: University of Illinois Press.
Sudkamp, Thomas A. (1996). Languages and Machines: An Introduction to Theory of Computer Science. New York: Addison-Wesley.
Turing, Alan M. (1936). "On Computable Numbers: With an Application to the Entscheidungsproblem." Proceedings of the London Mathematical Society (2nd series) 42:230-265.
Valovic, Thomas. (2000). Digital Mythologies: The Hidden Complexities of the Internet. New Brunswick, NJ: Rutgers University Press.
Wiener, Norbert. (1965). Cybernetics; Or, Control and Communication in the Animal and the Machine, 2nd edition. Cambridge, MA: MIT Press.
Terri L. Lyons
computer, device capable of performing a series of arithmetic or logical operations. A computer is distinguished from a calculating machine, such as an electronic calculator, by being able to store a computer program (so that it can repeat its operations and make logical decisions), by the number and complexity of the operations it can perform, and by its ability to process, store, and retrieve data without human intervention. Computers developed along two separate engineering paths, producing two distinct types of computer—analog and digital. An analog computer operates on continuously varying data; a digital computer performs operations on discrete data.
Computers are categorized by both size and the number of people who can use them concurrently. Supercomputers are sophisticated machines designed to perform complex calculations at maximum speed; they are used to model very large dynamic systems, such as weather patterns. Mainframes, the largest and most powerful general-purpose systems, are designed to meet the computing needs of a large organization by serving hundreds of computer terminals at the same time. Minicomputers, though somewhat smaller, also are multiuser computers, intended to meet the needs of a small company by serving up to a hundred terminals. Microcomputers, computers powered by a microprocessor, are subdivided into personal computers and workstations, the latter typically incorporating RISC processors. Although microcomputers were originally single-user computers, the distinction between them and minicomputers has blurred as microprocessors have become more powerful. Linking multiple microcomputers together through a local area network or joining multiple microprocessors together in a parallel-processing system has enabled smaller systems to perform tasks once reserved for mainframes, and the techniques of grid computing have enabled computer scientists to utilize the unemployed processing power of computers connected over a network or the Internet.
Advances in the technology of integrated circuits have spurred the development of smaller and more powerful general-purpose digital computers. Not only has this reduced the size of the large, multi-user mainframe computers—which in their early years were large enough to walk through—to that of pieces of furniture, but it has also made possible powerful, single-user personal computers and workstations that can sit on a desktop or be easily carried. These, because of their relatively low cost and versatility, have replaced typewriters in the workplace and rendered the analog computer inefficient. The reduced size of computer components has also led to the development of thin, lightweight notebook computers and even smaller computer tablets and smartphones that have much more computing and storage capacity than that of the desktop computers that were available in the early 1990s.
An analog computer represents data as physical quantities and operates on the data by manipulating the quantities. It is designed to process data in which the variable quantities vary continuously (see analog circuit); it translates the relationships between the variables of a problem into analogous relationships between electrical quantities, such as current and voltage, and solves the original problem by solving the equivalent problem, or analog, that is set up in its electrical circuits. Because of this feature, analog computers were especially useful in the simulation and evaluation of dynamic situations, such as the flight of a space capsule or the changing weather patterns over a certain area. The key component of the analog computer is the operational amplifier, and the computer's capacity is determined by the number of amplifiers it contains. Although analog computers are commonly found in such forms as speedometers and watt-hour meters, they largely have been made obsolete for general-purpose mathematical computations and data storage by digital computers.
A digital computer is designed to process data in numerical form (see digital circuit); its circuits perform directly the mathematical operations of addition, subtraction, multiplication, and division. The numbers operated on by a digital computer are expressed in the binary system; binary digits, or bits, are 0 and 1, so that 0, 1, 10, 11, 100, 101, etc., correspond to 0, 1, 2, 3, 4, 5, etc. Binary digits are easily expressed in the computer circuitry by the presence (1) or absence (0) of a current or voltage. A series of eight consecutive bits is called a "byte"; the eight-bit byte permits 256 different "on-off" combinations. Each byte can thus represent one of up to 256 alphanumeric characters, and such an arrangement is called a "single-byte character set" (SBCS); the de facto standard for this representation is the extended ASCII character set. Some languages, such as Japanese, Chinese, and Korean, require more than 256 unique symbols. The use of two bytes, or 16 bits, for each symbol, however, permits the representation of up to 65,536 characters or ideographs. Such an arrangement is called a "double-byte character set" (DBCS); Unicode is the international standard for such a character set. One or more bytes, depending on the computer's architecture, is sometimes called a digital word; it may specify not only the magnitude of the number in question, but also its sign (positive or negative), and may also contain redundant bits that allow automatic detection, and in some cases correction, of certain errors (see code; information theory). A digital computer can store the results of its calculations for later use, can compare results with other data, and on the basis of such comparisons can change the series of operations it performs.
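The binary counting and byte arithmetic described above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `even_parity_bit` is hypothetical, and even parity is used here as one simple example of a redundant error-detecting bit, not as the scheme any particular computer uses.

```python
# Decimal 0..5 written as binary digit strings.
binary_forms = [format(n, "b") for n in range(6)]
print(binary_forms)

# An n-bit pattern has 2**n distinct "on-off" combinations.
combinations_one_byte = 2 ** 8     # one 8-bit byte
combinations_two_bytes = 2 ** 16   # two bytes, 16 bits

# One character stored in a single byte: the ASCII code of "A" is 65.
letter_bits = format(ord("A"), "08b")


def even_parity_bit(byte_value: int) -> int:
    """Return the extra bit that makes the total number of 1 bits even.

    Hypothetical helper: a single flipped bit changes the parity,
    so the error can be detected (though not corrected).
    """
    return bin(byte_value).count("1") % 2


print(combinations_one_byte, combinations_two_bytes, letter_bits)
```

A single parity bit only detects an odd number of flipped bits; correcting errors requires more redundant bits than this sketch uses.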
Digital computers are now used for a wide range of personal, business, scientific, and government purposes, from electronic games, e-mail, social networking, and data- and word-processing applications to desktop publishing, video conferencing, weather forecasting, simulated nuclear weapons testing, cryptography, and many other purposes.
Processing of Data
The operations of a digital computer are carried out by logic circuits, which are digital circuits whose single output is determined by the conditions of the inputs, usually two or more. The various circuits processing data in the computer's interior must operate in a highly synchronized manner; this is accomplished by controlling them with a very stable oscillator, which acts as the computer's "clock." Typical personal computer clock rates now range from several hundred million cycles per second to several billion. Operating at these speeds, digital computer circuits are capable of performing hundreds of billions of arithmetic or logic operations per second, and supercomputers can perform more than 1 million times as many; such speeds permit the rapid solution of problems that would be impossible for a human to solve by hand. In addition to the arithmetic and logic circuitry and a number of registers (storage locations that can be accessed faster than main storage, or memory, and are used to hold the intermediate results of calculations), the heart of the computer (called the central processing unit, or CPU) contains the circuitry that decodes the set of instructions, or program, and causes it to be executed.
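As a rough illustration of how logic circuits with a single output can be combined to do arithmetic, the following Python sketch models AND, OR, and XOR gates as functions and wires them into a full adder and a ripple-carry adder. The function names are hypothetical, and real hardware evaluates the gates in parallel under control of the clock rather than one call at a time.

```python
def and_gate(a: int, b: int) -> int:
    return a & b


def or_gate(a: int, b: int) -> int:
    return a | b


def xor_gate(a: int, b: int) -> int:
    return a ^ b


def full_adder(a: int, b: int, carry_in: int):
    """Add three bits using only gates; return (sum_bit, carry_out)."""
    partial = xor_gate(a, b)
    sum_bit = xor_gate(partial, carry_in)
    carry_out = or_gate(and_gate(a, b), and_gate(partial, carry_in))
    return sum_bit, carry_out


def add_bits(x_bits, y_bits):
    """Ripple-carry addition of two equal-length bit lists.

    Bits are listed least significant first; the result has one
    extra bit for the final carry.
    """
    carry = 0
    result = []
    for a, b in zip(x_bits, y_bits):
        sum_bit, carry = full_adder(a, b, carry)
        result.append(sum_bit)
    result.append(carry)
    return result


# 3 (binary 011) plus 5 (binary 101), least significant bit first.
print(add_bits([1, 1, 0], [1, 0, 1]))
```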
Storage and Retrieval of Data
Associated with the CPU is the main storage, or memory, where results or other data are stored for periods of time ranging from a small fraction of a second to days or weeks before being retrieved for further processing. Once made up of vacuum tubes and later of small doughnut-shaped ferromagnetic cores strung on a wire matrix, main storage now consists of integrated circuits, each of which may contain billions of semiconductor devices. Where each vacuum tube or core represented one bit and the total memory of the computer was measured in thousands of bytes (or kilobytes, KB), modern computer memory chips represent hundreds of millions of bytes (or megabytes, MB) and the total memory of both personal and mainframe computers is measured in billions of bytes (gigabytes, GB) or more. Read-only memory (ROM), which cannot be written to, maintains its content at all times and is used to store the computer's control information. Random-access memory (RAM), which both can be read from and written to, is lost each time the computer is turned off. Modern computers now include cache memory, which the CPU can access faster than RAM but slower than the registers; data in cache memory also is lost when the computer is turned off.
Programs and data that are not currently being used in main storage can be saved on auxiliary or secondary storage. Although punched paper tape and punched cards once served this purpose, the major materials used today are magnetic tape and disks and flash memory devices, all of which can be read from and written to, and two types of optical disks, the compact disc (CD) and its successor the digital versatile disc (DVD). When compared to RAM, these are less expensive (though flash memory is more expensive than the other two), are not volatile (i.e., data is not lost when the power to the computer is shut off), and can provide a convenient way to transfer data from one computer to another. Thus operating instructions or data output from one computer can be stored and be used later either by the same computer or another.
In a system using magnetic tape the information is stored by a specially designed tape recorder somewhat similar to one used for recording sound. Magnetic tape is now largely used for offsite storage of large volumes of data or major systems backups. In magnetic and optical disk systems the principle is the same; the magnetic or optical medium lies in a path, or track, on the surface of a disk. The disk drive also contains a motor to spin the disk and a magnetic or optical head or heads to read and write the data to the disk. Drives take several forms, the most significant difference being whether the disk can be removed from the drive assembly. Flash memory devices, such as USB flash drives, flash memory cards, and solid-state drives, use nonvolatile memory that can be erased and reprogrammed in blocks.
Removable magnetic disks made of mylar enclosed in a plastic holder (older versions had paper holders) are now largely outdated. These floppy disks have varying capacities, with very high density disks holding 250 MB—more than enough to contain a dozen books the size of Tolstoy's Anna Karenina. Internal and external magnetic hard disks, or hard drives, are made of metal and arranged in spaced layers. They can hold vastly more data than floppies or optical disks, and can read and write data much faster than floppies. As hard disks dropped in price, they became increasingly included as a component of personal computers and replaced floppy disks as the standard media for the storage of operating systems, programs, and data.
Compact discs can hold hundreds of megabytes, and have been used, for example, to store the information contained in an entire multivolume encyclopedia or set of reference works. DVD is an improved optical storage technology capable of storing as much as ten times the data that CD technology can store. CD–Read-Only Memory (CD-ROM) and DVD–Read-Only Memory (DVD-ROM) disks can only be read—the disks are impressed with data at the factory but once written cannot be erased and rewritten with new data. The latter part of the 1990s saw the introduction of new optical storage technologies: CD-Recordable (CD-R) and DVD-Recordable (DVD-R, DVD+R), optical disks that can be written to by the computer to create a CD-ROM or DVD-ROM, but can be written to only once; and CD-ReWritable (CD-RW), DVD-ReWritable (DVD-RW and DVD+RW), and DVD–Random Access Memory (DVD-RAM), disks that can be written to multiple times.
Flash memory devices, a still more recent development, are an outgrowth of electrically erasable programmable read-only memory. Although more expensive than magnetic and optical storage technologies, flash memory can be read and written to much faster, permitting shorter boot times and quicker data access and storage. Because flash memory also is resistant to mechanical shock and has become increasingly compact, a USB flash drive allows for the easy, portable external storage of large quantities of data. Solid-state drives are more easily accessed and written to than magnetic hard drives and use less power, and have become common in high-end, lightweight notebook computers and in high-performance computers. Flash memory is also used in computer tablets and smartphones. Hybrid drives, which combine a smaller amount of flash memory with a large magnetic hard drive, permit the economical storage of large amounts of data while benefitting from a more responsive access to frequently used but only occasionally changed operating system and program files.
Data are entered into the computer and the processed data made available via input/output devices, also called peripherals. All auxiliary storage devices are used as input/output devices. For many years, the most popular input/output medium was the punched card. The most popular input devices are the computer terminal and internal magnetic hard drives, and the most popular output devices are the computer display screen associated with a terminal (typically displaying output that has been processed by a graphics processing unit) and the printer. Human beings can directly communicate with the computer through computer terminals, entering instructions and data by means of keyboards much like the ones on typewriters, by using a pointing device such as a mouse, trackball, or touchpad, or by speaking into a microphone that is connected to a computer running voice-recognition software. The result of the input may be displayed on a liquid-crystal, light-emitting diode, or cathode-ray tube screen or on printer output. Another important input/output device in modern computers is the network card, which allows the computer to connect to a computer network and the Internet using a wired or radio (wireless) connection. The CPU, main storage, auxiliary storage, and input/output devices collectively make up a computer system.
Sharing the Computer's Resources
Generally, the slowest operations that a computer must perform are those of transferring data, particularly when data is received from or delivered to a human being. The computer's central processor is idle for much of this period, and so two similar techniques are employed to use its power more fully.
Time sharing, used on large computers, allows several users at different terminals to use a single computer at the same time. The computer performs part of a task for one user, then suspends that task to do part of another for another user, and so on. Each user only has the computer's use for a fraction of the time, but the task switching is so rapid that most users are not aware of it. Most of the tens of millions of computers in the world are stand-alone, single-user devices known variously as personal computers or workstations. For them, multitasking involves the same type of switching, but for a single user. This permits a user, for example, to have one file printed and another uploaded to an Internet website while editing a third in a word-processing session and listening to a recording streamed over the Internet. Personal computers can also be linked together in a network, where each computer is connected to others, usually by network, coaxial, or fiber-optic cable or by radio signals (wireless), permitting all to share resources such as printers, hard-disk storage devices, and an Internet connection. Cloud computing is another form of resource sharing. Delivering access to both hardware and software over a network, most often the Internet, cloud computing is designed to allow many individuals and organizations using a wide range of devices both ease of access to computing resources and flexibility in changing the type and volume of the resources to which they have access.
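The task switching behind time sharing and multitasking can be sketched as a round-robin schedule, in which each task runs for a short slice and then goes to the back of the queue. This is a simplified model under assumed names (`round_robin`, `time_slice`); real operating systems use more elaborate policies and switch on hardware timer interrupts.

```python
from collections import deque


def round_robin(tasks, time_slice):
    """Return the order in which tasks receive the processor.

    tasks: dict mapping a user or task name to the units of work it needs.
    time_slice: maximum units of work done before switching tasks.
    """
    queue = deque(tasks.items())
    schedule = []
    while queue:
        name, remaining = queue.popleft()
        run = min(time_slice, remaining)
        schedule.append((name, run))
        if remaining > run:
            # Unfinished work is suspended and rejoins the end of the queue.
            queue.append((name, remaining - run))
    return schedule


print(round_robin({"ann": 3, "bob": 1, "cy": 2}, time_slice=2))
```

With a small enough slice, every user sees steady progress even though only one task runs at any instant.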
Computer Programs and Programming Languages
Before a computer can be used for a given purpose, it must first be programmed, that is, prepared for use by loading a set of instructions, or program. The various programs by which a computer controls aspects of its operations, such as those for translating data from one form to another, are known as software, as contrasted with hardware, which is the physical equipment comprising the installation. In most computers the moment-to-moment control of the machine resides in a special software program called an operating system, or supervisor. Other forms of software include assemblers and compilers for programming languages and applications for business and home use (see computer program). Software is of great importance; the usefulness of a highly sophisticated array of hardware can be limited by the lack of adequate software.
Each instruction in the program may be a simple, single step, telling the computer to perform some arithmetic operation, to read the data from some given location in the memory, to compare two numbers, or to take some other action. The program is entered into the computer's memory exactly as if it were data, and on activation, the machine is directed to treat this material in the memory as instructions. Other data may then be read in and the computer can carry out the program to complete the particular task.
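The idea that a program sits in memory exactly as if it were data can be illustrated with a toy machine. The instruction names (`LOAD`, `ADD`, `STORE`, `HALT`) and the memory layout below are invented for this sketch and do not correspond to any real instruction set.

```python
def run(memory):
    """Execute the program stored at the start of `memory`.

    Instructions are (opcode, operand) pairs; plain numbers in later
    cells are data. Both live in the same list.
    """
    accumulator = 0
    program_counter = 0
    while True:
        opcode, operand = memory[program_counter]
        program_counter += 1
        if opcode == "LOAD":        # read a value from a memory location
            accumulator = memory[operand]
        elif opcode == "ADD":       # perform an arithmetic operation
            accumulator += memory[operand]
        elif opcode == "STORE":     # write the result back to memory
            memory[operand] = accumulator
        elif opcode == "HALT":
            return accumulator
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# Cells 0-3 hold instructions; cells 4-6 hold data.
memory = [("LOAD", 4), ("ADD", 5), ("STORE", 6), ("HALT", None), 2, 3, 0]
result = run(memory)
print(result, memory[6])
```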
Since computers are designed to operate with binary numbers, all data and instructions must be represented in this form; the machine language, in which the computer operates internally, consists of the various binary codes that define instructions together with the formats in which the instructions are written. Since it is time-consuming and tedious for a programmer to work in actual machine language, a programming language, or high-level language, designed for the programmer's convenience, is used for the writing of most programs. The computer is programmed to translate this high-level language into machine language and then solve the original problem for which the program was written. Many high-level programming languages are now universal, varying little from machine to machine.
Development of Computers
Although the development of digital computers is rooted in the abacus and early mechanical calculating devices, Charles Babbage is credited with the design of the first modern computer, the "analytical engine," during the 1830s. Vannevar Bush built a mechanically operated device, called a differential analyzer, in 1930; it was the first general-purpose analog computer. John Atanasoff constructed the first electronic digital computing device in 1939; a full-scale version of the prototype was completed in 1942 at Iowa State College (now Iowa State Univ.). In 1941 Konrad Zuse built the Z3, a fully operational electromechanical computer.
During World War II, the Colossus was developed for British codebreakers; it was the first programmable electronic digital computer. The Mark I, or Automatic Sequence Controlled Calculator, completed in 1944 at Harvard by Howard Aiken, was the first machine to execute long calculations automatically, while the first all-purpose electronic digital computer, ENIAC (Electronic Numerical Integrator And Computer), which used thousands of vacuum tubes, was completed in 1946 at the Univ. of Pennsylvania. UNIVAC (UNIVersal Automatic Computer) became (1951) the first computer to handle both numeric and alphabetic data with equal facility; intended for business and government use, this was the first widely sold commercial computer.
First-generation computers were supplanted by the transistorized computers (see transistor) of the late 1950s and early 60s, second-generation machines that were smaller, used less power, and could perform a million operations per second. They, in turn, were replaced by the third-generation integrated-circuit machines of the mid-1960s and 1970s that were even smaller and were far more reliable. The 1970s, 80s, and 90s were characterized by the development of the microprocessor and the evolution of increasingly smaller but powerful computers, such as the personal computer and personal digital assistant (PDA), which ushered in a period of rapid growth in the computer industry.
The World Wide Web was unveiled in 1990, and with the development of graphical web browser programs in succeeding years the Web and the Internet spurred the growth of general purpose home computing and the use of computing devices as a means of social interaction. Smartphones, which integrate a range of computer software with a cellular telephone that now typically has a touchscreen interface, date to 2000 when a PDA was combined with a cellphone. Although computer tablets date to the 1990s, they only succeeded commercially in 2010 with the introduction of Apple's iPad, which built on software developed for smartphones. The increasing screen size on some smartphones has made them the equivalent of smaller computer tablets, leading some to call them phablets.
See S. G. Nash, A History of Scientific Computing (1990); D. I. A. Cohen, Introduction to Computer Theory (2d ed. 1996); P. Norton, Peter Norton's Introduction to Computers (2d ed. 1996); A. W. Biermann, Great Ideas in Computer Science: A Gentle Introduction (2d ed. 1997); R. L. Oakman, The Computer Triangle: Hardware, Software, People (2d ed. 1997); R. Maran, Computers Simplified (4th ed. 1998); A. S. Tanenbaum and J. R. Goodman, Structured Computer Organization (4th ed. 1998).
Nature. The present-day computer derives from British work during the Second World War on cryptographic machines and is the most recent in a line of calculating devices that includes the abacus, the Jacquard loom, Babbage's Analytical Engine, and Hollerith's tab-sorter. Its primary purpose has been to compute, not to compile or converse. There are two kinds of computer: analog and digital. Analog computers, which are related to the slide rule and tables of logarithms (and virtually obsolete), use the strengths of voltages to represent the size of numbers, whereas digital computers use electrical signals only in the on/off form. Currently, digital computers consist of four major parts: (1) A processor or central processing unit (CPU), which executes commands, performing arithmetical, logical, and manipulative operations on the data stored in the second part. (2) A memory, the information store. Most computers have at least two kinds of memory: primary and secondary. Primary memory is usually silicon chips, typically DRAM (dynamic random access memory) chips. ‘Random access’ means that any part may be obtained immediately, as with a book that can be opened to any page. The process is fast, usually less than one microsecond to obtain an item of information. Secondary memory is usually magnetic disk, made of one or more platters rotating under a reading head. It is not random access: a particular part of the disk cannot be read until it rotates under the reading head, which usually takes several milliseconds. Storage is measured in bytes, one byte containing eight bits, and representing storage for one character in European alphabets. See ASCII. (3) Input/output equipment, which enables the user to get information into and out of the machine. The information is entered most commonly through a keyboard but also through removable disks, tapes, and other devices. Output goes to display screens, to printers (which produce text etc., usually known as hard copy), and also to disks and tapes. (4) Communications equipment, which permits a computer to ‘talk’ to other machines and to people located at a distance from it. The equipment includes modems (a blend of ‘modulator’ and ‘demodulator’), which connect computers by telephone line, and networks that let machines talk at high speed to each other, as for example in using the INTERNET and the WORLD-WIDE WEB.
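The gap between primary and secondary memory access times quoted above can be made concrete with a short calculation. The 7,200 revolutions per minute figure is an assumption chosen for illustration (a common speed for magnetic disks), not a value given in the text; on average the wanted data is half a revolution away from the reading head.

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average wait, in milliseconds, for data to rotate under the head."""
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution * 1000.0


# Assumed disk speed; not taken from the text.
latency_ms = average_rotational_latency_ms(7200)

# The text gives "less than one microsecond" for primary memory,
# so one microsecond is used here as an upper bound.
dram_access_us = 1.0
ratio = latency_ms * 1000.0 / dram_access_us

print(f"rotational latency: {latency_ms:.2f} ms")
print(f"disk wait is at least {ratio:.0f} times the memory access time")
```

Under these assumptions the disk wait is a few thousand times longer than a primary memory access, which is consistent with the "several milliseconds" stated above.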
Computer programs. Since computers work very fast, they cannot be directed step by step. Instead, a script must first be written for the computer to follow. The script typically contains sequences to be repeated, so that the script is much shorter than the operation as executed. The computer responds to machine language, which is binary code (strings of 0s and 1s), in which the operations are very simple (such as elementary arithmetic or moving one piece of data from one place to another). Such scripts are called computer programs (BrE following AmE in this spelling, but AmE follows BrE in doubling the m in programming) and are usually written in higher-level languages. A distinction is now universally made between the equipment, as hardware, and the programs, as software, the latter now generally made available as commercial software packages.
Computer languages. Also programming languages, high-level languages. Digital computers can follow directions written in a great variety of artificial languages that provide precise specifications of operations to be done and the order in which they must be done. Although strings of letters are used to name commands in these languages, they are quite different from natural language. Among other things, they must be logical and unambiguous: unlike people, computers do not know that the and in I like bread and jam means ‘both together’, while the and in I like cats and dogs does not imply that both must be present at once (= ‘I like cats and I like dogs’). Compared with natural language, high-level computer languages normally have: (1) Very short words: most programmers save effort by giving variables names such as x, one or two letters long, and by using many abbreviations, such as del for delete. (2) Very short utterances: written English sentences might average 20 words in length, but statements in programming language are typically only six items long. (3) Little syntactic variety: the typical computer language at present has a grammar of about 100 rules, compared with thousands in a formal grammatical description of English.
Specific languages. The many programming languages are divided into business languages (verbose, emphasizing simple operations on complex data) and scientific languages (terse, emphasizing complex operations on simple data). They often have distinctive histories and functions, and names of etymological interest. ALGOL, a language suitable for expressing algorithms, is the computational equivalent of Esperanto, created in 1960 by an international committee. Its name, a reduction of Algorithmic Language, is a homonym of the star Algol (Arabic, ‘the ghoul’). BASIC is short for Beginner's All-Purpose Symbolic Instruction Code, designed at Dartmouth College in New Hampshire in 1964 by J. Kemeny and T. Kurtz. It is often the first programming language learned and is similar to the Basic of BASIC ENGLISH, also an acronym. ADA was designed in a competition run by the US Department of Defense from 1974 to 1980, going through successive refinements with such names as Strawman, Woodenman, Tinman, Ironman. The French computer scientist Jean Ichbiah led the winning team. It was named after Lady Ada Lovelace, daughter of the poet Byron and a supporter of Charles Babbage, the inventor of the Analytical Engine, an early mechanical digital computer. She is often called the first programmer. For some years, the goal of ‘programming in English’ (that is, using a more or less unrestricted subset of the natural language) attracted attention, but it has so far proved unattainable.
Processing text. Computers, among other things, are extensions of writing and print systems, and have therefore been used with greater or less success to do such things as evaluate, index, parse, translate, correct, and ‘understand’ text. When a suitably programmed computer is fed English, it can process it at several levels, but with decreasing competence as the task becomes more complex. The following sequence is typical:
1. The character level. Text can be entered into a computer by three means: keying it, typically into a word processor which will format the text (arranging the line lengths and character positions); scanning it, using a machine which transfers a paper version into an image followed by a program that seeks to recognize the characters in it; transferring it electronically, typically by diskette or telephone, from another compatible computer. Transfer is the fastest and most accurate method, but currently the least used. When a cleanly typed or printed original is available, without too many fonts or typographic complexities, scanning is faster and easier than rekeying. Once the text is entered, computers can print it in a wide variety of typefaces, sizes, and page formats, using either a printer or a desktop publishing system.
2. The word level. A spelling checker can find some kinds of typing mistakes, usually by comparing words with a dictionary list and noting those that are not in that list. Programs can make word lists and concordances (lists of each word with some context before and after it). By noting the most frequent words in a document, and comparing the word frequencies in a particular text with the average word frequencies in English, a program can suggest words that might be used for indexing the document. The counting of relative word frequencies and comparison with word frequencies from a standard sample can also help in guessing the authorship of anonymous works or measuring the readability level of a text.
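The word-level operations just described (dictionary lookup, concordance, and frequency counts) are simple enough to sketch in a few lines of Python. The function names and the tiny sample dictionary are hypothetical; a real spelling checker would use a far larger word list and handle inflected forms.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lower-case words, ignoring punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def unknown_words(text, dictionary):
    """Spelling check: list the words that are not in the dictionary."""
    return [word for word in tokenize(text) if word not in dictionary]


def concordance(text, keyword, width=2):
    """List each occurrence of keyword with `width` words of context."""
    words = tokenize(text)
    lines = []
    for i, word in enumerate(words):
        if word == keyword:
            left = words[max(0, i - width):i]
            right = words[i + 1:i + 1 + width]
            lines.append(" ".join(left + [word.upper()] + right))
    return lines


def word_frequencies(text):
    """Count how often each word occurs."""
    return Counter(tokenize(text))


sample = "The cat sat on the mat. The cta slept."
small_dictionary = {"the", "cat", "sat", "on", "mat", "slept"}

print(unknown_words(sample, small_dictionary))
print(concordance(sample, "cat", width=1))
print(word_frequencies(sample).most_common(1))
```

Comparing such counts with frequencies from a standard sample of English is the further step the text describes for indexing and authorship work; it is not attempted here.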
3. The sentence level. On the level of syntax, PARSING programs can try to define the structure of sentences and relationships among words. This is typically done by applying grammar rules of the form ‘a verb phrase may be a verb followed by an adverb’. Unfortunately many sentences are ambiguous. In the preceding sentence, a computer would not know whether Unfortunately modified the verb (implying that it is sad that ambiguous sentences occur) or the adjective many (suggesting disappointment that ambiguous sentences are so frequent). Adding a comma after Unfortunately could, however, serve as a means of disambiguation. However, some kinds of grammatical and stylistic errors can be diagnosed, and grammar checkers and style checkers have become available to help in the writing of business letters and the propagation of PLAIN ENGLISH.
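How grammar rules expose ambiguity can be shown with a very small parser. The sketch below uses a chart method (a CYK-style table) to count how many distinct structures a toy grammar assigns to a sentence. The grammar, the lexicon, and the example sentence are all invented for illustration and are not drawn from the text; the point is only that a sentence with two readings yields a count of two.

```python
from collections import defaultdict

# Hypothetical toy lexicon: word -> possible categories.
LEXICON = {
    "i": {"NP"}, "saw": {"V"}, "the": {"Det"},
    "man": {"N"}, "telescope": {"N"}, "with": {"P"},
}

# Hypothetical binary rules: (parent, left child, right child).
RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),
    ("PP", "P", "NP"),
]


def count_parses(sentence, goal="S"):
    """Count the parse trees the toy grammar gives to the sentence."""
    words = sentence.lower().split()
    n = len(words)
    # chart[(i, j)][symbol] = number of ways symbol can cover words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for category in LEXICON.get(word, ()):
            chart[(i, i + 1)][category] += 1
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for parent, left, right in RULES:
                    ways = chart[(i, k)][left] * chart[(k, j)][right]
                    if ways:
                        chart[(i, j)][parent] += ways
    return chart[(0, n)][goal]


print(count_parses("I saw the man with the telescope"))
print(count_parses("I saw the man"))
```

In the first sentence the phrase "with the telescope" can attach either to the verb phrase or to "the man", so the grammar alone cannot choose between the two readings.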
4. The message level. At the level of word-and-sentence meaning, semantic analysis can map a sentence into a knowledge-representation language. Some research projects have been able to take such sentences as Which ships are in port? and answer them by looking at a table of ship locations, but such systems currently operate in strictly limited subject areas. Other applications of semantics include machine translation and direct generation of language by computers (that is, the computer produces text without human input).
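The word-level operations described above (concordances and the comparison of word frequencies with a standard sample) can be illustrated with a short Python sketch. It is a minimal illustration, not a description of any particular program: the function names, the `floor` frequency assumed for words missing from the reference sample, and the reference frequencies supplied by the caller are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def relative_frequencies(words):
    """Map each word to its share of the total word count."""
    counts = Counter(words)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def suggest_index_terms(text, reference_freq, top_n=3, floor=1e-4):
    """Rank words by how much more often they occur in `text`
    than in the reference sample.

    `reference_freq` stands in for average English frequencies;
    `floor` is an assumed frequency for words missing from it.
    """
    doc_freq = relative_frequencies(tokenize(text))
    ratio = {w: f / reference_freq.get(w, floor) for w, f in doc_freq.items()}
    # Highest ratio first; ties are broken alphabetically.
    ranked = sorted(ratio, key=lambda w: (-ratio[w], w))
    return ranked[:top_n]


def concordance(text, keyword, width=2):
    """Return each occurrence of `keyword` with `width` words of context."""
    words = tokenize(text)
    lines = []
    for i, word in enumerate(words):
        if word == keyword:
            lines.append(" ".join(words[max(0, i - width): i + width + 1]))
    return lines
```

Words that are frequent in the document but rare in the reference sample rise to the top of the ranking, which is why common function words such as "the" are not suggested as index terms.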
The above levels of activity depend on computational linguists writing rules of analysis, accumulating a GRAMMAR of syntactic and/or semantic rules for such a language as English. An alternative strategy for processing written language, however, uses reference books: the use of a MACHINE-READABLE dictionary or thesaurus may help a computer make reasonable guesses about which sense of an ambiguous word is intended in a particular context. Another strategy relies on the statistical properties of large corpora to determine word relationships. Such methods have allowed parsing without writing a grammar in advance, a higher quality of error correction in spelling, and the automatic recognition of phrases. However, they handle uncommon constructions less well than the grammar-based procedures handle them, and depend for their success on the fact that such constructions are uncommon. See COMPUTERESE, COMPUTER USAGE, CONCORDANCE, CORPUS, EMOTICON, ICON.
Applications relevant to elementary particle and high-energy physics (HEP) computing can be categorized as follows:
- Triggering and data acquisition
- Data handling and storage
- Commodity hardware and software
- Data analysis and visualization
- Control and monitoring systems
- Information systems and multimedia
Triggering and Data Acquisition
In addition to their specialized detection and measurement systems (for example, calorimeters, drift chambers, etc.), the detectors in high-energy physics experiments are, in fact, sophisticated computing systems. Without reliable triggering and data acquisition (DAQ) computing systems present in these detectors, all other experimental computing is of little consequence. Triggering and DAQ systems ensure that the physics events occurring in the detector are observed, measured, and accurately transformed into analyzable data. In a typical experiment, the first level trigger, implemented in hardware, initiates the data collection. Data from the front-end electronics are digitized and collected with electronic data modules. A readout computer reads experiment conditions from the control system, reads event fragments from the data modules over the local network, and builds events from fragments. These event data are then written to buffer storage and/or transmitted via a local network to archival storage for additional processing and eventual analysis.
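The event-building step performed by the readout computer can be sketched as follows. This is a simplified illustration under stated assumptions: the fragment layout `(event_id, module_id, payload)` and the function name `build_events` are hypothetical, and a real DAQ system would also handle timeouts, ordering, and the experiment conditions read from the control system.

```python
from collections import defaultdict


def build_events(fragments, expected_modules):
    """Group fragments by event number and keep only complete events.

    Each fragment is an (event_id, module_id, payload) tuple; an event is
    complete when every module in `expected_modules` has contributed.
    """
    partial = defaultdict(dict)
    for event_id, module_id, payload in fragments:
        partial[event_id][module_id] = payload

    events = {}
    for event_id, by_module in sorted(partial.items()):
        if set(by_module) == set(expected_modules):
            # Concatenate payloads in a fixed module order.
            events[event_id] = b"".join(
                by_module[m] for m in sorted(expected_modules)
            )
    return events
```

Events with missing fragments are simply dropped in this sketch; whether to drop, wait, or flag them is a design decision that depends on the experiment.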
The scale of triggering and DAQ systems can be seen in the design of the ALICE experiment at the Large Hadron Collider (LHC) at the European Laboratory for Particle Physics (CERN) in Geneva, Switzerland. The ALICE detector will measure up to 20,000 particles in a single interaction event, resulting in a data volume of approximately seventy-five million bytes per event. The event rate is limited by the bandwidth of the data storage system. Higher rates are possible by selecting interesting events and subevents or by efficient data compression.
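The relationship between storage bandwidth and event rate can be made concrete with a small calculation. The sketch below assumes the simplest possible model, in which the sustainable event rate is the storage bandwidth divided by the bytes actually stored per event. The bandwidth used in the usage note is a hypothetical round number, not a figure from the ALICE design.

```python
def max_event_rate(storage_bandwidth, event_size,
                   compression_ratio=1.0, stored_fraction=1.0):
    """Upper bound on events per second that storage can absorb.

    storage_bandwidth: bytes per second the storage system accepts
    event_size:        raw bytes per event
    compression_ratio: raw size divided by compressed size (>= 1)
    stored_fraction:   share of each event kept after selecting
                       interesting events or subevents (0 < f <= 1)
    """
    if compression_ratio < 1.0 or not 0.0 < stored_fraction <= 1.0:
        raise ValueError("invalid compression ratio or stored fraction")
    stored_bytes_per_event = event_size / compression_ratio * stored_fraction
    return storage_bandwidth / stored_bytes_per_event
```

With an assumed bandwidth of 1.5 billion bytes per second and seventy-five million bytes per event, `max_event_rate(1.5e9, 75e6)` gives 20 events per second; halving the stored bytes per event, by compression or by selection, doubles that bound.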
A computer simulation (sometimes referred to as a Monte Carlo simulation) of particle interactions in the experimental configuration is essential to most HEP experiments. Computer software providing these simulations plays a fundamental role in the design of detectors and shielding components, in the investigations of the physics capabilities of the experiment, and in the evaluation of background (nonexperimental, for example, cosmic and/or terrestrial radiation) data. Simulation software must be complete and capable of generating simulated experimental data comparable in scope to genuine experimental data. The simulation software must support a description of the experimental detector from the point of view of the materials used and the geometry adopted, both for the structural and the active event-detecting components. The configurations adopted for the data output and the DAQ logic are also modeled in order to evaluate their impact on the quality of the physics results and on the overall performance of the detector. The simulation must be able to describe the properties and the physics processes of the particles involved both in the expected signal/output and in the background. Especially important is the capability to handle physics processes across a wide energy range, which in such experiments may span several orders of magnitude. An ideal simulation system is also flexible and open to evolution and to the integration of external tools. This is particularly important because a number of software tools are already commonly used within the scientific community, and a particular experimental environment may require the simulation to be extended, for instance, to deal with peculiar physical processes. One of the most powerful and widely used simulation toolkits is GEANT4, developed at CERN.
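The Monte Carlo idea can be illustrated with a toy example that estimates how many particles pass through a slab of shielding without interacting. This is only a sketch under strong simplifying assumptions (one dimension, a single interaction process, an exponentially distributed free path); it does not use or resemble the GEANT4 interface, and all names in it are hypothetical.

```python
import math
import random


def transmitted_fraction(thickness, mean_free_path,
                         n_particles=100_000, seed=1):
    """Estimate the fraction of particles crossing a slab uninteracted.

    Each particle's distance to its first interaction is drawn from an
    exponential distribution with the given mean free path.
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    passed = 0
    for _ in range(n_particles):
        depth = rng.expovariate(1.0 / mean_free_path)
        if depth > thickness:
            passed += 1
    return passed / n_particles


def expected_fraction(thickness, mean_free_path):
    """Analytic result for the same toy model: exp(-thickness / lambda)."""
    return math.exp(-thickness / mean_free_path)
```

Comparing the sampled estimate with the analytic value is a simple way to validate the simulation before it is applied to geometries that have no closed-form answer.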
Data Handling and Storage
Particle physics experiments generate enormous amounts of data. For example, the BaBar experiment at the Stanford Linear Accelerator Center (SLAC) is designed to accommodate 200 terabytes (200 million million bytes) of data per year. As of April 2002, the BaBar database contained over 500 terabytes of data in approximately 290,000 files. This database is the largest known in the world (with the possible exception of some with military/government content). Such data rates and database sizes push the limits of state-of-the-art data handling and database technologies.
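The figures quoted above can be turned into more familiar quantities with a little arithmetic. The sketch below assumes a terabyte of 10^12 bytes, as in the text, and a 365-day year of continuous running; the function names are illustrative only.

```python
TERABYTE = 10**12                     # bytes, as defined in the text
SECONDS_PER_YEAR = 365 * 24 * 3600    # assumes continuous running


def average_rate(bytes_per_year):
    """Average bytes per second needed to store a yearly data volume."""
    return bytes_per_year / SECONDS_PER_YEAR


def average_file_size(total_bytes, n_files):
    """Average bytes per file for a database of the given size."""
    return total_bytes / n_files


design_rate = average_rate(200 * TERABYTE)                    # roughly 6.3 million bytes/s
typical_file = average_file_size(500 * TERABYTE, 290_000)     # roughly 1.7 billion bytes
```

Averaged over a year, the design volume corresponds to several million bytes every second, and the April 2002 figures imply files of well over a billion bytes each.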
In order to handle such large volumes of data, experiment data handling and storage systems and databases must be designed to
- provide reliable and robust storage of the raw detector data, simulation data, and other derived data;
- keep up with production processing, processing raw data files within minutes of writing them to tape;
- provide easy, rapid, and intuitive access to data on a variety of systems at a wide variety of locations where processing and data storage resources are available to physicists;
- provide accurate, detailed information on the processing steps that transformed event data, from the trigger through reconstruction and all the way to the creation of individual or group datasets;
- provide mechanisms for policy-based allocation and use of disk, central processing unit (CPU), network, and tape drive resources.
Commodity Hardware and Software
Commodity hardware and software refers to the hardware and software architectures and configurations used to accomplish off-line batch and interactive data processing. In the past such processing was often accomplished by large mainframe computers. In recent years, large compute farms of 200 or more inexpensive computers have become a common replacement for these mainframe systems. These compute farms are fundamentally groups of networked desktop systems (without monitors and keyboards) that are housed in a single location and function as a single entity. A compute farm streamlines internal processes by distributing the workload among the individual components of the farm and expedites computing processes by harnessing the power of multiple CPUs. The farms rely on load-balancing software that accomplishes such tasks as tracking demand for processing power from different machines, prioritizing the tasks, and scheduling and rescheduling them depending on priority and on the demand that users put on the network. When one computer in the farm fails, another can step in as a backup. Combining servers and processing power into a single entity has been relatively common for many years in research and academic institutions. Compute farms provide an effective mechanism for handling the enormous amount of computerization of tasks and services that HEP experiments require. Farms of Intel-based computers running the Linux operating system (OS) have become common at many HEP institutions.
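The prioritizing and scheduling performed by load-balancing software can be sketched with a simple greedy scheme: jobs are taken in order of priority, and each is handed to the node with the least accumulated work. This is one possible strategy chosen for illustration, not the algorithm of any particular farm; the job tuple layout and the function name `schedule` are assumptions.

```python
import heapq


def schedule(jobs, n_nodes):
    """Assign jobs to farm nodes, highest priority first.

    Each job is a (name, priority, cost) tuple; a larger priority is
    served earlier, and each job goes to the least-loaded node.
    Returns a dict mapping node index to the list of job names.
    """
    # Min-heap of (accumulated load, node index).
    nodes = [(0, i) for i in range(n_nodes)]
    heapq.heapify(nodes)
    assignment = {i: [] for i in range(n_nodes)}
    # sorted() is stable, so equal priorities keep submission order.
    for name, priority, cost in sorted(jobs, key=lambda job: -job[1]):
        load, node = heapq.heappop(nodes)
        assignment[node].append(name)
        heapq.heappush(nodes, (load + cost, node))
    return assignment
```

Handling a failed node in this model would amount to removing it from the heap and rescheduling its unfinished jobs on the remaining nodes.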
The computing grid is the next generation of compute farms. A grid is a distributed system of computing resources (a cyberinfrastructure) in which computers, processor farms, disks, major databases, software, information, collaborative tools, and people are linked by a high-speed network. The term grid was coined as a result of the analogy with an electrical power distribution system. Grid resources are made available transparently to a distributed community of users through new middleware that facilitates distributed collaborative working in new ways. The nine-institution Particle Physics Data Grid collaboration (consisting of Fermi National Accelerator Laboratory, SLAC, Lawrence Berkeley National Laboratory, Argonne National Laboratory, Brookhaven National Laboratory, Thomas Jefferson National Accelerator Facility, Caltech, the University of Wisconsin, and the University of California at San Diego) will develop the distributed computing concept for particle physics experiments at the major U.S. high-energy physics research facilities.
Data Analysis and Visualization
Analysis systems are often at the core of an experiment's physics efforts, and the constraints imposed by those systems can heavily influence the physics event reconstruction and analysis framework. An analysis system which lacks key features (or worse, implements them incorrectly) can therefore be a serious handicap. Physicists are constantly searching for new and interesting ways to extract physical information through two-dimensional and three-dimensional computer visualization and modeling, animation, histogram plotting, and similar techniques. Key also is the development of techniques for data interactivity, that is, methods for interacting with a program or data. These techniques often include graphical user interfaces (GUIs) but also scripting, browsing, and other technologies. There have even been some attempts to utilize virtual reality techniques wherein a physicist becomes "immersed" in experimental data. Development of data analysis and visualization tools has been the subject of numerous international collaborations. The result has been the creation of specialized software libraries used, supported, and maintained by these collaborations but generally available to all physicists.
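Histogram plotting, the most basic of these analysis techniques, can be sketched in a few lines of plain Python. The example is a minimal illustration with fixed-width bins and a text rendering; it is not modeled on any of the specialized libraries mentioned above, and its function names are hypothetical.

```python
def histogram(values, n_bins, low, high):
    """Count values falling into n_bins equal-width bins on [low, high).

    Values outside the range are ignored.
    """
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if low <= v < high:
            # min() guards against rounding at the upper edge.
            index = min(int((v - low) / width), n_bins - 1)
            counts[index] += 1
    return counts


def render(counts, low, high):
    """Return one text line per bin: lower bin edge and a bar of stars."""
    width = (high - low) / len(counts)
    return [
        f"{low + i * width:8.2f} | {'*' * c}"
        for i, c in enumerate(counts)
    ]
```

A call such as `print("\n".join(render(histogram(data, 10, 0.0, 5.0), 0.0, 5.0)))` gives a quick look at a distribution before a full graphical tool is brought in; here `data` stands for any list of measured values.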
Control and Monitoring Systems
The infrastructure surrounding experiment detectors is highly complex. The hardware devices used in detectors and the systems of experiments consist of commercial devices used in industry, specific devices used in physics experiments, and custom devices designed for unique application. The control and monitoring system must ensure that these devices interface correctly with one another by providing testing and error diagnostic functionality. The administrative component of a control and monitoring system provides access to the control of an experiment, which is often distributed between supervision and process control functions.
Information Systems and Multimedia
The World Wide Web (WWW) is the best example of how the requirements of physics research and the need for experimental collaboration have led to developments in information systems and multimedia. The Web was developed to allow physicists in international collaborations to access data and information easily, quickly, and in a device-independent (i.e., independent of computer and operating system) manner. There has been increasing use of collaborative environments supporting point-to-point and multipoint videoconferencing; document and application sharing across both local and wide area networks; video on demand (broadcast and playback); and interactive text facilities. Resources such as the HEP preprints database at SLAC and the Los Alamos National Laboratory electronic preprint server, officially known as the e-Print Archive, support physicist research and authoring. The first U.S. web server, at SLAC, was installed to provide access to the preprints database. Other information systems and multimedia applications include electronic logbooks used to improve and replace paper logbooks, and streaming media servers to provide widespread access to seminars and lectures.
See also: Detectors, Collider
com·put·er / kəmˈpyoōtər/ • n. an electronic device for storing and processing data, typically in binary form, according to instructions given to it in a variable program. ∎ a person who makes calculations, esp. with a calculating machine.