Cache memory is a small, fast storage buffer in the central processing unit (CPU) of a computer. It holds data temporarily so that information can be retrieved faster and more efficiently. By keeping frequently used data in this dedicated high-speed memory rather than fetching the same information from the computer's main memory each time, cache memory helps maximize the efficiency of the CPU.
Typically, to execute a command, the processor in a computer places a numeric address for the instruction it is about to execute on the address bus. Once the memory subsystem senses the address, it deposits the code representing that instruction onto the data bus. The processor then collects this code from the data bus and interprets it as a command. Executing the instruction may involve several further memory operations similar to the one that fetched the instruction in the first place. For example, the processor may discover that the instruction it just fetched requires it to get some data from memory and then add that data to a register. Whatever the nature of the instruction, once it is complete, the processor must repeat the instruction fetching cycle for the next instruction in the program it is currently executing.
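The fetch cycle described above can be sketched as a toy model. This is only an illustration: the instruction names (LOAD, ADD, HALT), the single register, and the memory layout are hypothetical and do not correspond to any real processor.

```python
# Toy model of the instruction fetching cycle. The opcodes and the memory
# layout are hypothetical, chosen only for illustration.

def run(memory):
    """Execute a toy program in `memory`; return (register, memory accesses)."""
    pc = 0          # program counter: address of the next instruction
    register = 0    # a single register that receives loaded and added data
    accesses = 0    # counts every trip to memory, for instructions and data

    while True:
        # Fetch: the address goes out, the instruction code comes back.
        opcode, operand = memory[pc]
        accesses += 1
        pc += 1

        # Execute: some instructions need a further memory access for data.
        if opcode == "LOAD":
            register = memory[operand]
            accesses += 1
        elif opcode == "ADD":
            register += memory[operand]
            accesses += 1
        elif opcode == "HALT":
            return register, accesses
        else:
            raise ValueError(f"unknown opcode {opcode!r}")


# Instructions live at addresses 0-2, data at addresses 10 and 11.
program = {
    0: ("LOAD", 10),
    1: ("ADD", 11),
    2: ("HALT", None),
    10: 5,
    11: 7,
}
result, accesses = run(program)
print(result, accesses)
```

Even this three-instruction program makes five trips to memory, which is why the speed of each trip matters so much to overall performance.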
The rate at which the processor can execute instructions partly determines its perceived performance. It would therefore help tremendously if the next instruction that the processor was going to fetch were located and retrieved for it automatically while it was busy executing the previous one. Cache memory allows the processor to do exactly that.
Although fetching one instruction while executing another introduces a little more complexity into the system, the benefits are significant, and most modern processors incorporate a small amount of memory within them. This block of memory, called cache memory, is often built into the processor core itself. Cache memory is managed by another unit, called the cache controller, and is built from high-speed, and therefore comparatively expensive, memory devices.
The intent is to increase the average speed at which a program can be executed. The cache controller accomplishes this by trying to pre-fetch blocks of instructions that are about to be executed by the processor and storing them in its high-speed cache. Because those instructions are then available almost immediately, the processor need not wait for each instruction to be fetched from main memory in sequence before execution.
Despite their advantages, caches are not completely foolproof. Since the cache cannot know with complete certainty which instruction the processor is going to need next, it selects groups of instructions that are located in memory close to the last instruction that was executed. The cache relies on the observation that when processors execute programs, instructions tend to be fetched in the order in which they are stored in memory. However, it is quite possible that the cache controller will, on some occasions, fetch blocks of instructions from the wrong place. There are several reasons why this might happen. For example, the processor may have just executed an instruction that commands it to jump to another part of the program, which might be quite distant from the current point of execution. When the cache controller has correctly predicted the next block of instructions and the processor finds what it needs in the cache, the result is referred to as a cache hit. When the needed instruction is not in the cache and must be fetched from main memory, the result is described as a miss.
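The effect of jumps on hits and misses can be illustrated with a small simulation. This is a deliberately simplified sketch: it assumes a cache that holds a single block of four instructions and loads the block containing an address whenever that address misses. The function name, the block size, and the address traces are all hypothetical.

```python
def simulate(addresses, block_size=4):
    """Count hits and misses for a cache that holds one block at a time."""
    cached_block = None    # index of the block currently in the cache
    hits = misses = 0
    for addr in addresses:
        block = addr // block_size      # block that contains this address
        if block == cached_block:
            hits += 1                   # instruction was already cached
        else:
            misses += 1                 # wrong block: go to main memory
            cached_block = block        # load the whole surrounding block
    return hits, misses


# Sixteen instructions fetched strictly in order.
sequential = list(range(16))

# Sixteen fetches from a program that keeps jumping to distant code.
with_jumps = [0, 1, 40, 41, 2, 3, 80, 81, 4, 5, 120, 121, 6, 7, 160, 161]

print(simulate(sequential))   # (12, 4)
print(simulate(with_jumps))   # (8, 8)
```

In this sketch the sequential trace misses only on the first instruction of each block, while the trace with jumps misses on every second fetch because each jump lands outside the cached block.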
A number of factors affect the hit rate and therefore the average speed of program execution. For example, a larger cache is statistically more likely to hold the information the processor needs. However, a larger cache is also more costly and more complex, since it is somewhat more difficult to manage. Caches tend to work very well when the programs being executed are structured as a straightforward sequence of instructions. This can be accomplished by having development tools such as compilers and interpreters take on the responsibility of organizing the memory image.
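The relationship between cache size, hit rate, and average speed can be made concrete with a short sketch. It assumes a direct-mapped cache, in which each block of memory can occupy exactly one cache line; that organization, the function names, the trace, and the timing figures (1 time unit for the cache, 10 for main memory) are illustrative assumptions rather than figures from any real processor. The average access time is the usual weighted average of the two access times.

```python
def hit_rate(addresses, num_lines, block_size=4):
    """Fraction of accesses that hit in a direct-mapped cache."""
    lines = [None] * num_lines          # block number held by each line
    hits = 0
    for addr in addresses:
        block = addr // block_size
        slot = block % num_lines        # the only line this block may use
        if lines[slot] == block:
            hits += 1
        else:
            lines[slot] = block         # replace whatever was there
    return hits / len(addresses)


def average_access_time(rate, cache_time, memory_time):
    """Weighted average: hits cost cache_time, misses cost memory_time."""
    return rate * cache_time + (1 - rate) * memory_time


# A loop body of 32 consecutive addresses, executed 10 times.
trace = list(range(32)) * 10

small = hit_rate(trace, num_lines=2)    # cache too small to hold the loop
large = hit_rate(trace, num_lines=8)    # cache holds the whole loop

for name, rate in (("small", small), ("large", large)):
    print(name, rate, average_access_time(rate, cache_time=1, memory_time=10))
```

With these assumptions the larger cache misses only during the first pass through the loop, whereas the smaller one keeps evicting blocks it will need again on the next pass, so its average access time is noticeably worse.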
In addition to blocks of instructions, caches apply equally well to blocks of data needed by programs. Many modern processors incorporate separate instruction and data caches of various sizes, depending on which combination helps optimize the performance of the processor. On a larger scale, but employing exactly the same principle, are data caches for fixed disks and for servers in a distributed network. It is possible to attach special cache units to disk drive controllers. These cache units attempt to speed up the average access time of the disk by predicting, from the files currently being accessed, which portions might be needed next and pre-loading them into memory set aside on the disk controller. Similarly, when client computers are accessing a World Wide Web (WWW) server, they might store on their own disk a collection of documents and images that have been recently accessed. Then, if these documents are browsed again soon afterward, they can be reloaded from the local disk cache rather than being transferred from the server again.
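The client-side document cache described above can be sketched in a few lines. This is a minimal illustration, not how any particular browser works: the class name, the `fetch_from_server` stand-in, the capacity, and the choice to discard the least recently used document when the cache is full are all assumptions made for the example.

```python
from collections import OrderedDict


class DocumentCache:
    """Keeps recently accessed documents; evicts the least recently used."""

    def __init__(self, fetch, capacity=3):
        self.fetch = fetch              # function that contacts the server
        self.capacity = capacity        # maximum number of stored documents
        self.store = OrderedDict()      # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.store:
            self.hits += 1
            self.store.move_to_end(url)         # mark as recently used
            return self.store[url]
        self.misses += 1
        document = self.fetch(url)              # slow path: ask the server
        self.store[url] = document
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)      # drop least recently used
        return document


server_requests = []    # records every request that reaches the "server"


def fetch_from_server(url):
    """Hypothetical stand-in for a real network transfer."""
    server_requests.append(url)
    return f"<contents of {url}>"


cache = DocumentCache(fetch_from_server, capacity=2)
for url in ["/a", "/b", "/a", "/c", "/b"]:
    cache.get(url)

print(cache.hits, cache.misses, server_requests)
```

In this run the second request for "/a" is served locally, while the second request for "/b" goes back to the server because "/b" had already been evicted to make room for "/c".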
Whatever the scale and implementation, caching is a statistically based approach that tries to enhance the average performance of a system by attempting to anticipate the information that will be required, and having it ready ahead of time.
Cache Memories in Next Generation CPUs
Intel Corporation of Santa Clara, California, has indicated that there will be three levels of cache memory within its "Madison" Itanium processor, and that the largest cache block will be a staggering 6 megabytes in size. That is roughly the size of the entire memory subsystem of popular microcomputers only a decade earlier.