What Is Memory Caching?

Memory caching (often simply referred to as caching) is a technique in which computer applications temporarily store data in a computer’s main memory (i.e., random access memory, or RAM) to enable fast retrieval of that data. The RAM that is used for the temporary storage is known as the cache. Since accessing RAM is significantly faster than accessing other media like hard disk drives or networks, caching helps applications run faster by giving them faster access to data. Caching is especially effective when the application repeatedly accesses data that it has accessed before. Caching is also useful for storing the results of calculations that are otherwise time-consuming to compute. By storing those results in a cache, the system saves time by not repeating the calculation.
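As a small illustration of caching a time-consuming calculation, the following Python sketch uses the standard library's functools.lru_cache decorator. The function slow_square and the counter calls are hypothetical names invented for this example; the real work being cached would be far more expensive than a multiplication.

```python
from functools import lru_cache

calls = 0  # counts how many times the calculation actually runs


@lru_cache(maxsize=128)  # keep up to 128 results in memory
def slow_square(n):
    """Stand-in for an expensive calculation (hypothetical example)."""
    global calls
    calls += 1
    return n * n


first = slow_square(12)   # computed, then stored in the cache
second = slow_square(12)  # returned from the cache without recomputing
```

After both calls, calls is still 1, because the second call never reached the function body.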

Figure: An Overview of Memory Caching

How Does Memory Caching Work?

Memory caching works by first setting aside a portion of RAM to be used as the cache. As an application tries to read data, typically from a data storage system like a database, it checks to see if the desired record already exists in the cache. If it does, then the application will read the data from the cache, thus eliminating the slower access to the database. If the desired record is not in the cache, then the application reads the record from the source. When it retrieves that data, it also writes the data to the cache so that when the application needs that same data in the future, it can quickly get it from the cache.
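The read path described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: DATABASE is a hypothetical stand-in for a slow data store, and the names get, read_from_source, and source_reads are assumptions made for the example.

```python
# Hypothetical stand-in for a slow data source such as a database.
DATABASE = {"user:1": "Ada", "user:2": "Grace"}

cache = {}        # the in-memory cache: a plain dict held in RAM
source_reads = 0  # counts how often the slow source is consulted


def read_from_source(key):
    """Simulate a slow read from the underlying data store."""
    global source_reads
    source_reads += 1
    return DATABASE[key]


def get(key):
    """Return the record for key, checking the cache first."""
    if key in cache:               # cache hit: skip the slow source
        return cache[key]
    value = read_from_source(key)  # cache miss: read from the source
    cache[key] = value             # keep a copy for future reads
    return value


first = get("user:1")   # miss: read from the source, then cached
second = get("user:1")  # hit: served from the cache
```

Only the first call touches the source, so source_reads ends at 1 even though the record was read twice.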

Since the cache is limited in size, eventually some data already in the cache will have to be removed to make room for new data that the application most recently accessed. This means the caching system needs a strategy for deciding which records to remove. The strategy will depend on the nature of the application’s data accesses, and will generally try to remove records that are not expected to be accessed again soon. For example, a least-recently-used (LRU) strategy will remove the record whose last access is older than that of any other record in the cache. The assumption here is that if it has been a long time since the record was accessed, it will likely not be accessed again soon. Or to put it another way, the records that were used most recently will likely be used again soon. A least-frequently-used (LFU) strategy entails tracking the number of accesses of each record in the cache and removing the record with the fewest accesses. The assumption here is that an infrequently used record will not likely be used again soon.
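The two eviction strategies can be sketched in Python as follows. The class names LRUCache and LFUCache are hypothetical, both assume a capacity of at least 1, and the LFU version makes two simplifying assumptions: a write counts as an access, and the record to evict is found with a linear scan.

```python
from collections import OrderedDict


class LRUCache:
    """Evicts the record whose last access is the oldest."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()  # ordered from least to most recently used

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used


class LFUCache:
    """Evicts the record with the fewest recorded accesses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = {}
        self._counts = {}  # key -> number of accesses so far

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._counts[key] += 1
        return self._data[key]

    def put(self, key, value):
        if key not in self._data and len(self._data) >= self.capacity:
            # Linear scan for the least frequently used key.
            victim = min(self._counts, key=self._counts.get)
            del self._data[victim]
            del self._counts[victim]
        self._data[key] = value
        self._counts[key] = self._counts.get(key, 0) + 1


lru = LRUCache(capacity=2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")     # "a" is now the most recently used record
lru.put("c", 3)  # cache is full, so "b" is evicted

lfu = LFUCache(capacity=2)
lfu.put("a", 1)
lfu.put("b", 2)
lfu.get("a")     # "a" now has more accesses than "b"
lfu.put("c", 3)  # cache is full, so "b" is evicted
```

In both runs the record "b" is the one removed, but for different reasons: under LRU it was accessed longest ago, and under LFU it was accessed the fewest times.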

The challenge with caches is how to minimize “cache misses,” i.e., attempted reads by the application for records that are not in the cache. If you have too many misses, the efficiency of your cache decreases. An application that only reads new data would not benefit from a cache, and in fact, would exhibit lower performance because of the extra work of checking the cache yet not finding the desired record in it. One way this challenge can be mitigated is by leveraging larger caches. This is often not practical on a single computer, which is why distributed caches are popular choices for speeding up applications that need to access larger data sets. A distributed cache pools together the RAM of multiple computers connected in a cluster so that you can create a bigger cache that can continue to grow by adding more computers to the cluster. Technologies like Hazelcast IMDG can be used as a distributed cache to accelerate large-scale applications.
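To make the effect of misses and cache size concrete, the following Python sketch replays a sequence of reads against a small cache and reports the fraction that were hits. The function hit_ratio and the two workloads are hypothetical; the cache here uses simple first-in-first-out eviction purely to keep the example short, and it assumes a capacity of at least 1 and a non-empty list of reads.

```python
def hit_ratio(keys, capacity):
    """Replay reads against a FIFO cache and return hits / total reads."""
    cache = {}  # dicts preserve insertion order, so the first key is the oldest
    hits = 0
    for key in keys:
        if key in cache:
            hits += 1
            continue
        if len(cache) >= capacity:
            oldest = next(iter(cache))  # evict the oldest entry
            del cache[oldest]
        cache[key] = True
    return hits / len(keys)


repeating = [i % 10 for i in range(1000)]  # the same 10 records, over and over
only_new = list(range(1000))               # every read is a record never seen before

large_enough = hit_ratio(repeating, capacity=10)
too_small = hit_ratio(repeating, capacity=5)
never_repeats = hit_ratio(only_new, capacity=10)
```

With room for all 10 records, only the first 10 reads miss, giving a hit ratio of 0.99. With room for only 5, each record is evicted before it is requested again, so the ratio drops to 0.0. The workload that only reads new data never hits regardless of cache size.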

Another challenge of caches is the risk of reading “stale” data, in which the data in the cache does not reflect the latest data in the underlying source. Oftentimes this risk is an acceptable trade-off for the sake of application performance. In cases where it is not, it is up to the application that updates the underlying data source to update the record in question in the cache.
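The following Python sketch shows how a stale read arises and one way the updating code can prevent it. All names here (database, cache, read, update_source_only, update) are hypothetical, and writing the new value into the cache is only one option; removing the cached record so the next read reloads it would also work.

```python
database = {"price:42": 100}  # hypothetical stand-in for the underlying source
cache = {}


def read(key):
    """Read through the cache, loading from the source on a miss."""
    if key not in cache:
        cache[key] = database[key]
    return cache[key]


def update_source_only(key, value):
    """Update the source but forget the cache: later reads may be stale."""
    database[key] = value


def update(key, value):
    """Update the source and refresh the cached copy in the same step."""
    database[key] = value
    cache[key] = value  # alternative: cache.pop(key, None) to invalidate


initial = read("price:42")           # loads 100 into the cache
update_source_only("price:42", 120)
stale = read("price:42")             # still 100: the cache was not told
update("price:42", 130)
fresh = read("price:42")             # 130: cache and source agree again
```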

Example Use Cases

One broad use case for memory caching is to accelerate database applications, especially those that perform many database reads. By replacing a portion of database reads with reads from the cache, applications can remove latency that arises from frequent database accesses. This use case is typically found in environments with a high volume of data accesses, such as a high-traffic web site that features dynamic content from a database.

Another use case involves query acceleration, in which the results of a complex database query are stored in the cache. Complex queries that run operations such as grouping and ordering can take a significant amount of time to complete. If queries are run repeatedly, as is the case in a business intelligence (BI) dashboard accessed by many users, storing results in a cache would enable greater responsiveness in those dashboards.
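A minimal sketch of query acceleration, using Python's built-in sqlite3 module with an in-memory database, might look like the following. The sales table, the cached_query helper, and the executions counter are all hypothetical, and the sketch ignores invalidation: if the table changed, the cached result would be stale.

```python
import sqlite3

# Small in-memory database standing in for a real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10), ("west", 25), ("east", 5), ("west", 15)],
)

query_cache = {}  # (sql, params) -> list of result rows
executions = 0    # counts how often the database actually runs a query


def cached_query(sql, params=()):
    """Run the query once, then serve repeated requests from the cache."""
    global executions
    key = (sql, tuple(params))
    if key not in query_cache:
        executions += 1
        query_cache[key] = conn.execute(sql, params).fetchall()
    return query_cache[key]


REPORT = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
first = cached_query(REPORT)   # runs the grouping and ordering in the database
second = cached_query(REPORT)  # returns the stored rows without a database call
```

Both calls return the same rows, but executions stays at 1, which is the saving a dashboard with many viewers would see on each repeated request.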

Related Topics

JCache / Java Cache

Cache Miss

Hibernate Second-Level Cache

Further Reading

Caching Use Case

Database Caching

Hazelcast Cloud

Caching Made Bootiful: Spring Cache + Hazelcast

Relevant Resources

White Paper

Caching Strategies Explained

This white paper provides a general overview of different strategies for application caching. It explains the advantages and disadvantages, as well as when to apply the appropriate strategy. Additionally, the paper gives a short introduction to JCache, the standard Java Caching API, as well as insight into the characteristics of the Hazelcast IMDG® JCache implementation and how it helps integrate different caching strategies into your application landscape.

Case Study

In-Memory Caching at the #2 eCommerce Retailer in the World

With $18.3 billion in annual online sales, this global provider of personal computers and electronics has one of the most highly trafficked eCommerce web sites in the world, second only to Amazon.com. Burst traffic during new product introductions (NPI) is at an extreme scale, as are sales on Black Friday, Cyber Monday, and over holidays. This unique combination of world-class brand experience and extreme burst performance scaling led this eCommerce giant to examine in-memory computing solutions as a way to achieve the highest possible price-performance.