CPU speeds continue to increase much more rapidly than memory access times decrease. The result is a growing performance gap between the CPU and main memory (RAM). To compensate for the growing difference in speeds, engineers add layers of cache memory between the CPU and main memory. A cache consists of a small, high-speed memory system that holds recently used values. When the CPU fetches or stores a memory value, it sends the request to the cache first. If the item is already present in the cache, the cache can honor the request quickly because the cache operates at higher speed than main memory. For example, if the CPU needs to add two numbers, retrieving the values from the cache can take less than one-tenth as long as retrieving the values from main memory. However, because the cache is smaller than main memory, not all values can fit in the cache at one time. Therefore, if the requested item is not in the cache, the cache must fetch the item from main memory, which takes longer, and keep a copy for later requests.
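The lookup behavior described above can be sketched in a few lines of Python. This is a toy model, not a description of real hardware: the class name SimpleCache, the dictionary standing in for main memory, the capacity of 4 entries, and the least-recently-used eviction rule are all assumptions made for illustration. The text says only that a cache holds recently used values; it does not specify a replacement policy.

```python
from collections import OrderedDict


class SimpleCache:
    """Toy cache placed in front of a slower 'main memory' (a dict here)."""

    def __init__(self, main_memory, capacity):
        self.main_memory = main_memory   # hypothetical backing store
        self.capacity = capacity         # maximum number of cached items
        self.lines = OrderedDict()       # address -> value, oldest first
        self.hits = 0
        self.misses = 0

    def fetch(self, address):
        # Hit: the item is already in the cache, so answer immediately.
        if address in self.lines:
            self.hits += 1
            self.lines.move_to_end(address)   # mark as most recently used
            return self.lines[address]

        # Miss: go to main memory, then keep a copy for later requests.
        self.misses += 1
        value = self.main_memory[address]
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            # Assumed policy: evict the least recently used item.
            self.lines.popitem(last=False)
        return value


# Hypothetical memory contents and access pattern.
memory = {addr: addr * 10 for addr in range(100)}
cache = SimpleCache(memory, capacity=4)
for addr in [1, 2, 1, 3, 1, 2]:
    cache.fetch(addr)

print(cache.hits, cache.misses)
```

In this access pattern the first reference to each address misses and every repeated reference hits, which is the effect that makes a small cache worthwhile when programs reuse values.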
Cache memory cannot replace conventional RAM because it is much more expensive and consumes more power. However, research has shown that even a small cache that can store only 1 percent of the data stored in main memory still provides a significant speedup for memory access. Therefore, most computers include a small, external memory cache placed between the CPU and RAM. More important, multiple caches can be arranged in a hierarchy to lower memory access times even further. In particular, most CPUs now have a cache on the CPU chip itself. The on-chip internal cache is smaller than the external cache, which in turn is smaller than RAM. The advantage of the on-chip cache is that once a data item has been fetched from the external cache, the CPU can use the item again without having to wait for another external cache access.
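To see why a hierarchy of caches lowers access time, it helps to work through the arithmetic. The sketch below uses a simple model in which a request pays the access time of every level it reaches and moves outward only on a miss. The function name average_access_time and all of the hit rates and timings are hypothetical values chosen for illustration; they are not measurements from any real machine.

```python
def average_access_time(levels, memory_time):
    """Estimate the average time per memory request.

    levels: list of (hit_rate, access_time) pairs, ordered from the
            cache closest to the CPU outward.
    memory_time: access time of main memory (same time unit).
    """
    total = 0.0
    reach_probability = 1.0   # fraction of requests that get this far
    for hit_rate, access_time in levels:
        total += reach_probability * access_time
        reach_probability *= (1.0 - hit_rate)   # only misses continue
    # Requests that missed in every cache pay for main memory as well.
    total += reach_probability * memory_time
    return total


# Assumed figures (nanoseconds): on-chip cache 1, external cache 10, RAM 100.
no_cache = average_access_time([], 100)
external_only = average_access_time([(0.9, 10)], 100)
two_level = average_access_time([(0.9, 1), (0.8, 10)], 100)

print(no_cache, external_only, two_level)
```

Under these assumed numbers the average drops from 100 ns with no cache, to about 20 ns with the external cache alone, to about 4 ns once the on-chip cache is added. The exact figures depend entirely on the hit rates, which in practice vary with the program being run.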