What Is Caching?

A Guide to How Caching Works

Caching (often referred to as memory caching) is a technique in which computer applications temporarily store data in a computer’s main memory (i.e., random access memory, or RAM) to enable fast retrieval of that data. The RAM used for this temporary storage is known as the cache. Since accessing RAM is significantly faster than accessing other media like solid-state drives (SSDs), hard disk drives, or networks, caching helps applications run faster by providing quicker access to data.

An Overview of Memory Caching

Caching is especially effective when an application repeatedly accesses data that it has accessed before. It is also useful for storing the results of calculations that are time-consuming to compute. By keeping those results in a cache, the system avoids repeating the calculation.
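As a simple illustration of caching computed results, the sketch below keeps the output of an expensive calculation in an in-memory map so it is computed only once per input. The class and method names are illustrative only, not part of any particular product:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class CalculationCache {
        // Results of the expensive calculation, keyed by input value.
        private final Map<Integer, Long> cache = new ConcurrentHashMap<>();

        public long fibonacci(int n) {
            // On a cache miss, run the slow calculation and remember the result;
            // later calls with the same input are answered from memory.
            return cache.computeIfAbsent(n, this::slowFibonacci);
        }

        private long slowFibonacci(int n) {
            // Deliberately naive recursion, standing in for any expensive computation.
            return n <= 1 ? n : slowFibonacci(n - 1) + slowFibonacci(n - 2);
        }
    }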

How Does Memory Caching Work?

Memory caching works by first setting aside a portion of RAM to be used as the cache. When an application tries to read data, typically from a data storage system like a database, it first checks whether the desired record already exists in the cache. If it does, the application reads the data from the cache, eliminating the slower trip to the database. If the desired record is not in the cache, the application reads it from the source and also writes it to the cache, so that future requests for the same data can be served quickly from the cache.
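The read path described above is often called the cache-aside pattern. Below is a minimal sketch of it in Java; the Database interface and the String key and value types are placeholders for whatever data store and record types an application actually uses:

    import java.util.HashMap;
    import java.util.Map;

    public class CacheAsideReader {
        // In-memory cache sitting in front of a slower data source.
        private final Map<String, String> cache = new HashMap<>();
        private final Database database; // hypothetical stand-in for the real data store

        public CacheAsideReader(Database database) {
            this.database = database;
        }

        public String read(String key) {
            String value = cache.get(key);
            if (value != null) {
                return value;               // cache hit: skip the database entirely
            }
            value = database.load(key);     // cache miss: read from the source
            cache.put(key, value);          // remember it for future reads
            return value;
        }
    }

    // Minimal stand-in for the underlying data source.
    interface Database {
        String load(String key);
    }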

Since the cache is limited in size, eventually some data already in the cache will have to be removed to make room for the data the application has most recently accessed. This means the caching system needs an eviction strategy for deciding which records to remove. The strategy depends on the nature of the application’s data accesses and generally tries to remove records that are unlikely to be accessed again soon. For example, a least-recently-used (LRU) strategy removes the record that has gone the longest without being accessed. The assumption is that if a record has not been accessed in a long time, it probably will not be accessed again soon; put another way, the records used most recently are likely to be used again soon. A least-frequently-used (LFU) strategy tracks the number of accesses of each record in the cache and removes the record with the fewest accesses. The assumption here is that an infrequently used record is unlikely to be used again soon.
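Java’s standard library makes a basic LRU cache easy to sketch: a LinkedHashMap can keep entries in access order and evict the eldest one when a size limit is exceeded. The class and capacity value below are illustrative only:

    import java.util.LinkedHashMap;
    import java.util.Map;

    // A small LRU cache: once the capacity is exceeded, the entry that was
    // accessed least recently is evicted automatically.
    public class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        public LruCache(int capacity) {
            super(16, 0.75f, true); // accessOrder = true: iteration order follows recency of use
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity; // evict the least-recently-used entry
        }
    }

A cache holding at most 1,000 records would then be created with new LruCache<String, String>(1000); an LFU strategy would instead track an access counter per entry and evict the entry with the smallest count.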

The challenge with caches is how to minimize “cache misses,” i.e., attempted reads by the application for records that are not in the cache. If you have too many misses, the efficiency of your cache decreases. An application that only reads new data would not benefit from a cache, and in fact, would exhibit lower performance because of the extra work of checking the cache yet not finding the desired record in it. One way this challenge can be mitigated is by leveraging larger caches. This is often not practical on a single computer, which is why distributed caches are popular choices for speeding up applications that need to access larger data sets. A distributed cache pools together the RAM of multiple computers connected in a cluster so that you can create a bigger cache that can continue to grow by adding more computers to the cluster. Technologies like Hazelcast Platform can be used as a distributed cluster to accelerate large-scale applications.
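For a sense of what a distributed cache looks like in code, the sketch below uses Hazelcast’s Java API (assuming a recent Hazelcast 5.x release; the map name and values are arbitrary). Each member started this way joins the cluster and contributes its RAM to the shared cache:

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.map.IMap;

    public class DistributedCacheExample {
        public static void main(String[] args) {
            // Start (or join) a cluster member; members on the same network discover
            // each other and pool their RAM into one distributed cache.
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            // An IMap is a key-value map whose entries are partitioned across the cluster.
            IMap<String, String> cache = hz.getMap("customer-cache");
            cache.put("42", "Jane Doe");
            System.out.println(cache.get("42")); // served from the cluster, not the database

            hz.shutdown();
        }
    }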

Another challenge of caches is the risk of reading “stale” data, in which the data in the cache does not reflect the latest data in the underlying source. Oftentimes this risk is an acceptable trade-off for the sake of application performance. In cases where it is not, it is up to the application that updates the underlying data source to update the record in question in the cache.
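One common way to handle that responsibility is to refresh (or invalidate) the cached entry at the same time the source is written. The sketch below assumes the same kind of hypothetical cache and data store used in the earlier example:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class CacheRefreshingWriter {
        private final Map<String, String> cache = new ConcurrentHashMap<>();
        private final DataStore store; // hypothetical underlying source of truth

        public CacheRefreshingWriter(DataStore store) {
            this.store = store;
        }

        public void update(String key, String newValue) {
            store.save(key, newValue); // write the source of truth first
            cache.put(key, newValue);  // then refresh the cached copy...
            // ...or call cache.remove(key) to invalidate it and force a reload on the next read.
        }
    }

    // Minimal stand-in for the underlying data source.
    interface DataStore {
        void save(String key, String value);
    }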

Types of Caching

  1. CPU cache: This is a small amount of memory built into a computer’s central processing unit (CPU) that stores frequently accessed data and instructions. The CPU cache makes it faster for the CPU to get to data and instructions, so it doesn’t have to go to the slower main memory or storage devices as often.
  2. Memory cache: This is a small portion of main memory (RAM) set aside as a temporary storage area for frequently accessed data. Memory caching helps to improve the performance of applications by reducing the time it takes to access data from slower storage media like hard disk drives or networks.
  3. Disk cache: This is a portion of main memory (RAM) used to store data that has been recently read from or written to a disk, such as a hard disk drive or solid-state drive. Disk caching helps reduce the number of read and write operations to the disk, improving the overall performance of the system.
  4. Browser cache: This is a temporary storage area for web content, such as HTML pages, images, and other media. When a user visits a webpage, the browser stores a copy of the page’s content in its cache. When the user revisits the same webpage, the browser can load the content from the cache rather than downloading it again, which can improve the page’s loading time. Servers influence this behavior with HTTP caching headers; see the sketch after this list.
  5. Distributed cache: This is a cache that is shared by multiple computers in a network and is used to store frequently accessed data that is distributed across multiple servers. By reducing the need to access data from multiple servers, distributed caching can improve the performance of distributed systems. Distributed caching can also improve the scalability of an application as the data can be cached in multiple locations, meaning more concurrent users can access the data with fewer requests.
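As a small illustration of how a server invites browser caching, the sketch below uses the JDK’s built-in com.sun.net.httpserver module to serve a response with a Cache-Control header. The port, path, response body, and max-age value are arbitrary choices for the example:

    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    public class CacheableResponseServer {
        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8000), 0);
            server.createContext("/logo.png", exchange -> {
                byte[] body = "fake image bytes".getBytes(StandardCharsets.UTF_8);
                // Tell the browser it may reuse its cached copy for up to one hour.
                exchange.getResponseHeaders().set("Cache-Control", "max-age=3600");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
        }
    }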

Caching and System Performance

Caching contributes significantly to a computer system’s overall performance by reducing the time it takes to access data and instructions. When a computer program needs to access data, it first checks the cache to see if the data is already there. If it is, the program can access the data from the cache much faster than it can access data from slower storage media, such as hard disk drives or networks. This speeds up the program’s execution and improves the system’s overall performance.

In addition to speeding up individual programs, caching improves overall system performance by reducing the workload on slower storage devices and networks. By storing frequently accessed data in the cache, the system reduces the number of read and write operations to these devices, freeing them up to handle other tasks. This can make the whole system more responsive.

Caching can also affect the performance of distributed systems that use distributed cache architectures such as distributed hash tables (DHTs). It helps reduce the time it takes to access data that is distributed across multiple servers in these systems, improving overall system performance.

Overall, caching is an important technique for improving computer system performance by decreasing the time it takes to access data and instructions and reducing the workload on slower storage devices and networks.

Examples of Caching / Use Cases

Caching can be leveraged in a variety of specific use cases such as:

  • Database acceleration
  • Query acceleration
  • Web/mobile application acceleration
  • Cache-as-a-service
  • Web caching
  • Content delivery network (CDN) caching
  • Session management
  • Microservices caching


Missed Opportunities with Basic Caching

Basic caching also has a data visibility limitation: it does not create any context for your data. Caches are essentially storage units from which you retrieve data, so they are good at returning the raw data you stored in them, but not at giving you any background on that data. Chances are, your data is spread across separate backend systems, so if you need supporting information beyond what is already in the cache, you will need something else that can create context across those diverse forms of data.

Furthermore, while caches do offer speedy access to data, they do not enable you to derive insights from the cached data in real time, as events occur (this is where stream processing comes in). Data loses value as time passes: the lag involved in collecting it, combining it, and then analyzing it, perhaps using several applications besides the cache, consumes valuable seconds that could be used in more productive ways. Caches are not intelligent enough to manipulate, organize, calculate, or analyze data; their only function is returning the data stored in them. For your company, this means missed revenue opportunities that continue to slip through the cracks of your technology infrastructure: the delayed reaction caused by pulling data from the cache and pushing it through many different backend pipelines costs you a large amount of actionable information.

Relying solely on a simple cache also leaves a company without a streamlined, unified presentation of its data, because that data must be pulled from scattered, heterogeneous backend systems and may be stored in different formats and processing languages. Caches offer no standardization of how data is delivered to end users and applications, so the result is seemingly disconnected and hard-to-use data. When a company has many distinct data sources and databases, establishing connections between siloed data sets is crucial for advanced analysis.

The Future of Caching

In today’s data-driven landscape, businesses and consumers alike require instant insights from real-time streaming data. Relying solely on basic caching methods can result in missed opportunities for organizations. Forward-thinking enterprises are moving beyond traditional caching, embracing technologies like Hazelcast Platform. They’ve discovered they can harness the best of both worlds: leveraging a fast data store alongside a robust stream processing engine to achieve lightning-fast application performance while enabling immediate insights and actions on streaming data.
