What is memory caching?

Memory caching (often simply referred to as caching) is a technique in which computer applications temporarily store data in a computer’s main memory (i.e., random access memory, or RAM) to enable fast retrieval of that data. The RAM used for this temporary storage is known as the cache. Since accessing RAM is significantly faster than accessing other media like hard disk drives or networks, caching helps applications run faster by providing faster access to data. Caching is especially effective when an application repeatedly accesses data it has accessed before. It is also useful for storing the results of calculations that are otherwise time-consuming to compute; by storing those results in a cache, the system avoids repeating the calculation.
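As a simple illustration, the following minimal Java sketch caches the results of a calculation in memory so it is only performed once per input. The expensiveCalculation method is a hypothetical stand-in for any costly computation.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class CalculationCache {
        // In-memory cache of previously computed results (the "cache" lives in RAM).
        private final Map<Integer, Long> cache = new ConcurrentHashMap<>();

        // Hypothetical expensive calculation we would rather not repeat.
        private long expensiveCalculation(int input) {
            long result = 0;
            for (int i = 0; i < 1_000_000; i++) {
                result += (long) input * i;
            }
            return result;
        }

        // Returns the cached result if present; otherwise computes and stores it.
        public long getResult(int input) {
            return cache.computeIfAbsent(input, this::expensiveCalculation);
        }
    }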

Diagram: An Overview of Memory Caching

How Does Memory Caching Work?

Memory caching works by first setting aside a portion of RAM to be used as the cache. As an application tries to read data, typically from a data storage system like a database, it checks to see if the desired record already exists in the cache. If it does, then the application will read the data from the cache, thus eliminating the slower access to the database. If the desired record is not in the cache, then the application reads the record from the source. When it retrieves that data, it also writes the data to the cache so that when the application needs that same data in the future, it can quickly get it from the cache.
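Here is a minimal sketch of that read path (often called the cache-aside pattern), using a plain in-memory map as the cache and a hypothetical Database interface as the slower data source.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class CacheAsideReader {
        // Hypothetical data source; findById would issue the slow database read.
        interface Database {
            Record findById(String id);
        }

        record Record(String id, String payload) {}

        private final Map<String, Record> cache = new ConcurrentHashMap<>();
        private final Database database;

        public CacheAsideReader(Database database) {
            this.database = database;
        }

        public Record read(String id) {
            // 1. Check the cache first.
            Record cached = cache.get(id);
            if (cached != null) {
                return cached;            // cache hit: no database access needed
            }
            // 2. Cache miss: read from the slower source of record.
            Record fromDb = database.findById(id);
            // 3. Populate the cache so future reads of the same id are fast.
            if (fromDb != null) {
                cache.put(id, fromDb);
            }
            return fromDb;
        }
    }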

Since the cache is limited in size, eventually some data already in the cache will have to be removed to make room for new data that the application has most recently accessed. This means the caching system needs a strategy for choosing which records to remove. The strategy will depend on the nature of the application’s data accesses and will generally try to remove records that are not expected to be accessed again soon. For example, a least-recently-used (LRU) strategy removes the record whose last access is older than that of any other record in the cache. The assumption is that if a record has not been accessed for a long time, it is unlikely to be accessed again soon; put another way, the records used most recently are likely to be used again soon. A least-frequently-used (LFU) strategy tracks the number of accesses of each record in the cache and removes the record with the fewest accesses. The assumption here is that an infrequently used record is unlikely to be used again soon.
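A common way to get LRU behavior in Java is to use java.util.LinkedHashMap in access order. The sketch below is one minimal way to do it, with the maximum size chosen by the caller.

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        public LruCache(int maxEntries) {
            // accessOrder = true: iteration order runs from least- to most-recently accessed.
            super(16, 0.75f, true);
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Evict the least-recently-used entry once the cache exceeds its capacity.
            return size() > maxEntries;
        }
    }

For example, new LruCache<String, String>(100) keeps at most 100 entries, silently evicting the one that was accessed least recently whenever a new entry would exceed that limit.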

The challenge with caches is how to minimize “cache misses,” i.e., attempted reads by the application for records that are not in the cache. Too many misses reduce the efficiency of the cache. An application that only reads new data would not benefit from a cache and would in fact exhibit lower performance because of the extra work of checking the cache without finding the desired record in it. One way to mitigate this challenge is to use a larger cache. This is often not practical on a single computer, which is why distributed caches are popular choices for speeding up applications that need to access larger data sets. A distributed cache pools together the RAM of multiple computers connected in a cluster, so you can create a bigger cache that can continue to grow by adding more computers to the cluster. Technologies like Hazelcast IMDG can be used as a distributed cache to accelerate large-scale applications.
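As a rough sketch of what this looks like in code, the snippet below uses Hazelcast’s IMap, whose entries are partitioned across the RAM of the cluster members. The map name "records" and the key/value contents are illustrative, cluster configuration is omitted, and the imports assume a recent (4.x/5.x) Hazelcast release; older IMDG 3.x versions expose IMap under com.hazelcast.core.

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.map.IMap;

    public class DistributedCacheExample {
        public static void main(String[] args) {
            // Start (or join) a Hazelcast cluster member; the entries of an IMap
            // are partitioned across the RAM of all members in the cluster.
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            IMap<String, String> cache = hz.getMap("records"); // "records" is an arbitrary name

            cache.put("user:42", "{\"name\":\"Ada\"}");
            String value = cache.get("user:42"); // may be served from another member's RAM
            System.out.println(value);

            hz.shutdown();
        }
    }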

Another challenge of caches is the risk of reading “stale” data, in which the data in the cache does not reflect the latest data in the underlying source. Oftentimes this risk is an acceptable trade-off for the sake of application performance. In cases where it is not, it is up to the application that updates the underlying data source to update the record in question in the cache.
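One simple way to handle this is for the write path to refresh (or invalidate) the cached record immediately after writing to the source. The sketch below assumes a hypothetical Database interface and a plain in-memory map as the cache.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class WriteThroughWriter {
        // Hypothetical data source; save would issue the slow database write.
        interface Database {
            void save(String id, String payload);
        }

        private final Map<String, String> cache = new ConcurrentHashMap<>();
        private final Database database;

        public WriteThroughWriter(Database database) {
            this.database = database;
        }

        public void update(String id, String payload) {
            database.save(id, payload);   // 1. write to the source of record
            cache.put(id, payload);       // 2. refresh the cached copy so readers do not see stale data
            // Alternatively: cache.remove(id) to invalidate and let the next read repopulate it.
        }
    }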

Strategies for Managing Cache Space

Cache management strategies determine which data to keep in the cache and which data to remove when the cache is full and needs to make room for new data. There are several strategies for managing cache space, including:

  1. Least-recently-used (LRU): This strategy removes the data that was least recently accessed from the cache to make room for new data. The assumption is that data that has not been accessed for a long time is less likely to be accessed again in the near future.
  2. Least-frequently-used (LFU): This strategy removes the data that has been accessed the fewest number of times from the cache. The idea is that data that is rarely used is unlikely to be used again in the near future (a minimal sketch of this strategy follows this list).
  3. First-in, first-out (FIFO): This strategy removes the data that was added to the cache first to make room for new data. The assumption is that the data that has been in the cache the longest is less likely to be accessed again.
  4. Last-in, first-out (LIFO): This strategy removes the data that was added to the cache most recently to make room for new data. The assumption is that the data that was most recently added is the least likely to be accessed again.
  5. Random replacement: This strategy takes data out of the cache at random to make room for new information.
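As referenced in the LFU item above, the following is a minimal and intentionally naive LFU sketch: it keeps an access counter per entry and, when full, scans for the entry with the fewest accesses. A production implementation would use a more efficient eviction structure than a linear scan.

    import java.util.HashMap;
    import java.util.Map;

    public class LfuCache<K, V> {
        private final int maxEntries;
        private final Map<K, V> values = new HashMap<>();
        private final Map<K, Integer> accessCounts = new HashMap<>();

        public LfuCache(int maxEntries) {
            this.maxEntries = maxEntries;
        }

        public V get(K key) {
            V value = values.get(key);
            if (value != null) {
                accessCounts.merge(key, 1, Integer::sum); // count this access
            }
            return value;
        }

        public void put(K key, V value) {
            if (!values.containsKey(key) && values.size() >= maxEntries) {
                evictLeastFrequentlyUsed();
            }
            values.put(key, value);
            accessCounts.merge(key, 1, Integer::sum);
        }

        private void evictLeastFrequentlyUsed() {
            K victim = null;
            int fewest = Integer.MAX_VALUE;
            for (Map.Entry<K, Integer> e : accessCounts.entrySet()) {
                if (e.getValue() < fewest) {
                    fewest = e.getValue();
                    victim = e.getKey();
                }
            }
            values.remove(victim);
            accessCounts.remove(victim);
        }
    }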

The choice of cache management strategy will depend on the nature of the data being cached and the characteristics of the application that is using the cache. Some types of data or applications may work better with some strategies than with others.

Types of Memory Caching

There are several types of memory caching, including:

  1. CPU cache: This is a small amount of memory built into a computer’s central processing unit (CPU) that stores frequently accessed data and instructions. The CPU cache makes it faster for the CPU to get to data and instructions, so it doesn’t have to go to the slower main memory or storage devices as often.
  2. Memory cache: This is a small portion of main memory (RAM) set aside as a temporary storage area for frequently accessed data. Memory caching helps to improve the performance of applications by reducing the time it takes to access data from slower storage media like hard disk drives or networks.
  3. Disk cache: This is a portion of main memory (RAM) used to store data that has been recently read from or written to a disk, such as a hard disk drive or solid-state drive. Disk caching helps reduce the number of read and write operations to the disk, improving the overall performance of the system.
  4. Browser cache: This is a temporary storage area, maintained by a web browser, for web content such as HTML pages, images, and other media. When a user visits a webpage, the browser stores a copy of the page’s content in the cache. When the user revisits the same webpage, the browser can load the content from the cache rather than downloading it again, which can improve the page’s loading time.
  5. Distributed cache: This is a cache that is shared by multiple computers in a network and is used to store frequently accessed data that is distributed across multiple servers. By reducing the need to access data from multiple servers, distributed caching can improve the performance of distributed systems. It can also improve the scalability of an application, since the data can be cached in multiple locations, allowing more concurrent users to access the data with fewer requests.

Memory Caching and System Performance

Memory caching contributes significantly to a computer system’s overall performance by reducing the time it takes to access data and instructions. When a computer program needs to access data, it first checks the cache to see if the data is already there. If it is, the program can access the data from the cache much faster than it can access data from slower storage media, such as hard disk drives or networks. This speeds up the program’s execution and improves the system’s overall performance.

Memory caching can improve overall system performance by reducing the workload on slower storage devices and networks, in addition to improving individual program performance. The system can reduce the number of read and write operations to these devices by storing frequently accessed data in the cache, freeing them up to handle other tasks. This may result in better system performance and responsiveness.

Memory caching can also affect the performance of distributed systems that use distributed cache architectures such as distributed hash tables (DHTs). Caching helps reduce the time it takes to access data that is distributed across multiple servers in these systems, improving overall system performance.

Overall, memory caching is an important technique for improving computer system performance by decreasing the time it takes to access data and instructions and reducing the workload on slower storage devices and networks.

Example Use Cases

One broad use case for memory caching is to accelerate database applications, especially those that perform many database reads. By replacing a portion of database reads with reads from the cache, applications can remove the latency that arises from frequent database accesses. This use case is typically found in environments with a high volume of data accesses, such as a high-traffic web site that serves dynamic content from a database.
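A sketch of this pattern using JCache (JSR-107), the standard Java caching API, might look like the following. It assumes a JCache provider (such as Hazelcast’s) is on the classpath; the cache name "products" and the loadProductFromDatabase helper are illustrative stand-ins.

    import javax.cache.Cache;
    import javax.cache.CacheManager;
    import javax.cache.Caching;
    import javax.cache.configuration.MutableConfiguration;

    public class ProductCacheExample {
        public static void main(String[] args) {
            // Obtain whichever JCache provider is on the classpath (e.g., Hazelcast's).
            CacheManager cacheManager = Caching.getCachingProvider().getCacheManager();

            MutableConfiguration<Long, String> config =
                    new MutableConfiguration<Long, String>().setTypes(Long.class, String.class);
            Cache<Long, String> products = cacheManager.createCache("products", config);

            long productId = 42L;
            String product = products.get(productId);
            if (product == null) {
                product = loadProductFromDatabase(productId); // slow path: database read
                products.put(productId, product);             // populate the cache for next time
            }
            System.out.println(product);
        }

        // Hypothetical stand-in for a real database query.
        private static String loadProductFromDatabase(long id) {
            return "product-" + id;
        }
    }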

Another use case involves query acceleration, in which the results of a complex database query are stored in the cache. Complex queries that run operations such as grouping and ordering can take a significant amount of time to complete. If queries are run repeatedly, as is the case in a business intelligence (BI) dashboard accessed by many users, storing the results in a cache enables greater responsiveness in those dashboards.
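With a framework such as Spring Cache (see “Caching Made Bootiful” under Further Reading), caching a query’s results can be as simple as annotating the query method. The sketch below assumes caching is enabled (@EnableCaching) and a cache provider is configured; the service, cache name, and query are illustrative.

    import java.util.List;
    import org.springframework.cache.annotation.Cacheable;
    import org.springframework.stereotype.Service;

    @Service
    public class SalesReportService {

        // The first call for a given region runs the expensive query; subsequent
        // calls with the same region are served from the "salesReports" cache.
        @Cacheable("salesReports")
        public List<RegionTotal> totalsByRegion(String region) {
            return runExpensiveAggregationQuery(region);
        }

        // Hypothetical stand-in for a long-running GROUP BY / ORDER BY query.
        private List<RegionTotal> runExpensiveAggregationQuery(String region) {
            return List.of(new RegionTotal(region, 1_000_000L));
        }

        public record RegionTotal(String region, long total) {}
    }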

Related Topics

JCache / Java Cache

Cache Miss

Hibernate Second-Level Cache

Distributed Cache

Further Reading

Caching Use Case

Database Caching

Architectural Patterns for Caching Microservices

Caching Made Bootiful: Spring Cache + Hazelcast

Relevant Resources

White Paper

Caching Strategies Explained

This white paper provides a general overview of different strategies for application caching. It explains the advantages and disadvantages, as well as when to apply the appropriate strategy.

Additionally, the paper gives a short introduction to JCache, the standard Java Caching API, as well as insight into the characteristics of the Hazelcast IMDG® JCache implementation and how it helps integrate different caching strategies into your application landscape.

Case Study

In-Memory Caching at the #2 eCommerce Retailer in the World

With $18.3 billion in annual online sales, this global provider of personal computers and electronics has one of the most highly trafficked eCommerce web sites in the world, second only to Amazon.com. Burst traffic during new product introductions (NPI) is at an extreme scale, as are sales on Black Friday, Cyber Monday, and over the holidays.

This unique combination of world-class brand experience and extreme burst performance scaling led this eCommerce giant to examine In-Memory Computing solutions as a way to achieve the highest possible price-performance.
