Caching Best Practices

In most data retrieval patterns, a cache will increase your overall throughput and reduce latency, often even without much deliberate planning. But that does not mean you can simply add a cache and expect significant performance improvements. By following some best practices in your implementation, you can maximize the performance benefits of your cache.

Below are a few best practices for caching that you should consider for your implementation.

Plan for More Than Caching

Caching is a small part of your overall infrastructure requirements, so unless you can plug in a caching solution within a day or two, you are probably spending too much time on an issue that the rest of your business may perceive as low value.

Considering that many “caching” technologies on the market can do so much more, it’s worth viewing caching as just one capability within a broader solution. In other words, plan at the level of the use case you’re trying to accomplish, and choose a technology that not only supports your caching needs but also addresses the other requirements of your data architecture.

Hazelcast Platform is an example of a technology that easily addresses your caching needs. You can deploy it as a cache and solve your short-term challenges. But since it also includes distributed computing and stream processing capabilities, you can go much further and build a powerful system that extracts value from your data more easily, speeding up time-to-market and ROI while reducing costs.

Identify Your Entire Audience

In many cases, you want to deploy a cache that serves multiple groups of users. By doing so, you can capture more repeated data accesses across distinct user groups, which increases the value your cache provides.

If you plan for only one target audience, you might not get enough of a performance benefit to warrant implementing a cache. If there are economies of scale to be gained from a bigger user base, be sure to include that larger audience as users of your implementation.

Identify Your Data Access Patterns

To see how much value caching can provide, identify how data is accessed from your data stores. If most of your data reads are unique, a cache isn’t going to help, and it might actually hurt your overall performance. But if there is a high level of redundancy in your data access, a cache will likely deliver significant performance improvements.
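
To make the redundancy point concrete, here is a minimal cache-aside sketch in Java: repeated reads of the same key are served from memory, and only the first read pays the backend round trip. The loadFromBackend method is a hypothetical stand-in for your database query.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class CacheAsideExample {
        private final Map<String, String> cache = new ConcurrentHashMap<>();

        public String get(String key) {
            // computeIfAbsent only invokes the loader on a cache miss;
            // every subsequent read of the same key is served from memory
            return cache.computeIfAbsent(key, this::loadFromBackend);
        }

        // Hypothetical stand-in for a slow database query
        private String loadFromBackend(String key) {
            return "value-for-" + key;
        }
    }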

Some metrics you should capture when planning for your caching implementation include:

  • Ratio of reads to writes
  • Percentage of repeated reads in various time windows
  • Overall size of data retrieved in repeated reads
  • Percentage of repeated reads from each backend store
  • Latency of retrievals from each backend store

The metrics above will help you determine what to cache and how long to keep that data in the cache. This lets you put more emphasis on the important data accesses and keep workloads that are not time-sensitive out of the cache.
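
As an illustration, here is a rough sketch of how two of these metrics might be derived from an access log. It assumes a simple per-operation record; your actual log format will differ.

    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class AccessMetrics {
        // Hypothetical per-operation log entry
        record Access(String key, boolean isRead) {}

        public static void main(String[] args) {
            List<Access> log = List.of(
                new Access("user:1", true), new Access("user:1", true),
                new Access("user:2", true), new Access("user:1", false));

            long reads = log.stream().filter(Access::isRead).count();
            long writes = log.size() - reads;

            // A read is "repeated" if its key was already read earlier in the window
            Set<String> seen = new HashSet<>();
            long repeated = 0;
            for (Access a : log) {
                if (a.isRead() && !seen.add(a.key())) {
                    repeated++;
                }
            }

            System.out.printf("read/write ratio: %.1f%n", (double) reads / writes);
            System.out.printf("repeated reads: %d%% of reads%n", 100 * repeated / reads);
        }
    }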

Establish Performance Expectations

When planning for your caching solution, be sure to understand current performance numbers without the cache, and also set expectations for performance with the cache. These expectations should be tied to real business requirements (e.g., number of concurrent users, SLA on response time, costs to achieve a given performance level, etc.) so that you can optimize your implementation accordingly.
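
A hedged sketch of capturing a before/after baseline follows. A real benchmark needs warmup, many iterations, and percentile reporting (e.g., with a tool like JMH), but the core idea is to time both read paths for the same key; the backend read here is a hypothetical stand-in.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Supplier;

    public class LatencyBaseline {
        static long timeNanos(Supplier<String> read) {
            long start = System.nanoTime();
            read.get();
            return System.nanoTime() - start;
        }

        public static void main(String[] args) {
            Map<String, String> cache = new ConcurrentHashMap<>();

            long direct = timeNanos(() -> slowBackendRead("user:1")); // no cache
            cache.put("user:1", slowBackendRead("user:1"));           // warm the cache
            long cached = timeNanos(() -> cache.get("user:1"));       // cache hit

            System.out.printf("direct: %,d ns, cached: %,d ns%n", direct, cached);
        }

        // Hypothetical stand-in for a backend query with network latency
        static String slowBackendRead(String key) {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "value-for-" + key;
        }
    }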

Secure Your Cached Data

In certain situations, sensitive data (e.g., personally identifiable information [PII], protected health information [PHI]) is cached, so you need to plan ahead to secure that data. Modern technologies used for caching offer security controls, such as role-based access controls and over-the-wire encryption, to protect cached data from breaches and tampering.

Hazelcast Platform provides the caching capabilities you need, including a full security suite with role-based access controls and encryption for your cached data.
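
As an illustration, here is a minimal member-side sketch of the kind of configuration involved, assuming Hazelcast Enterprise (security and TLS are Enterprise features). The role name, map name, keystore path, and password are all illustrative.

    import com.hazelcast.config.Config;
    import com.hazelcast.config.PermissionConfig;
    import com.hazelcast.config.PermissionConfig.PermissionType;
    import com.hazelcast.config.SSLConfig;

    public class SecureCacheConfig {
        public static void main(String[] args) {
            Config config = new Config();

            // Role-based access: give the "analysts" role read-only access to one map
            PermissionConfig readOnly =
                new PermissionConfig(PermissionType.MAP, "customer-cache", "analysts")
                    .addAction("read");
            config.getSecurityConfig()
                  .setEnabled(true)
                  .addClientPermissionConfig(readOnly);

            // Over-the-wire encryption via TLS
            config.getNetworkConfig().setSSLConfig(new SSLConfig()
                .setEnabled(true)
                .setProperty("keyStore", "/opt/hazelcast/keystore.jks") // illustrative path
                .setProperty("keyStorePassword", "changeit"));          // illustrative secret

            // Hazelcast.newHazelcastInstance(config) would start a secured member
        }
    }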

Enable Business Continuity

Although caching is often associated with ephemeral data, some environments face significant downtime costs if cached data suddenly becomes unavailable due to an unforeseen site-wide failure. Even though the data might reside in a separate system of record, the caching system might be part of an operational system that must always be available, and repopulating the cache from the system of record is too slow.

Disaster recovery and automatic failover capabilities in your caching system should work in conjunction with your overall disaster recovery strategy to deliver zero downtime for environments that require 24/7 operations.

Hazelcast Platform provides WAN Replication and automated disaster recovery failover to meet the recovery point objective (RPO) and recovery time objective (RTO) of your most stringent disaster recovery strategy.
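
For illustration, a rough sketch of what wiring a cached map to WAN Replication can look like, assuming Hazelcast Enterprise; the cluster name, target endpoints, and map name are placeholders.

    import com.hazelcast.config.Config;
    import com.hazelcast.config.WanBatchPublisherConfig;
    import com.hazelcast.config.WanReplicationConfig;
    import com.hazelcast.config.WanReplicationRef;

    public class DrCacheConfig {
        public static void main(String[] args) {
            // Publisher describing the standby cluster to replicate to
            WanBatchPublisherConfig publisher = new WanBatchPublisherConfig()
                .setClusterName("dr-cluster")                         // illustrative standby cluster
                .setTargetEndpoints("10.0.1.10:5701,10.0.1.11:5701"); // illustrative addresses

            WanReplicationConfig wan = new WanReplicationConfig().setName("to-dr-site");
            wan.addBatchReplicationPublisherConfig(publisher);

            Config config = new Config();
            config.addWanReplicationConfig(wan);

            // Replicate one cached map to the standby site
            config.getMapConfig("customer-cache")
                  .setWanReplicationRef(new WanReplicationRef().setName("to-dr-site"));
        }
    }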

Understand the Different Caching Strategies

There are several caching strategies that you should consider when deploying your cache. Be sure to understand them to see which one applies to your requirements.

The caching strategies on that page are request-based, i.e., previously cached data is evicted when cache space must be reclaimed for a newly requested data element. You also need to consider how often cached data is proactively refreshed to minimize the risk of returning stale data. Two good options include:

  • Implement a refresh-ahead cache, which has mechanisms to immediately insert data that is newly updated in the backend store into the cache. This access pattern is described in our Cache Access Patterns page.
  • Set a time-to-live (TTL) value for your cached data elements, which tells your cache to expire elements after a certain length of time. Base this value on the estimated rate of updates to the underlying data: the higher the rate of updates in your system, the lower the TTL value you should set.

Hazelcast Platform has features that easily support a refresh-ahead cache pattern, using change data capture (CDC), distributed processing, and/or stream processing to deliver updated data to the cache in real time. Hazelcast Platform also provides TTL settings to proactively expire cached data.
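
For the TTL option, here is a minimal sketch using a Hazelcast IMap; the map name and TTL values are illustrative.

    import com.hazelcast.config.Config;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.map.IMap;
    import java.util.concurrent.TimeUnit;

    public class TtlExample {
        public static void main(String[] args) {
            Config config = new Config();
            // Default TTL for every entry in this map: 5 minutes
            config.getMapConfig("customer-cache").setTimeToLiveSeconds(300);

            HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
            IMap<String, String> cache = hz.getMap("customer-cache");

            cache.put("user:1", "value");                       // expires after the map default
            cache.put("user:2", "value", 30, TimeUnit.SECONDS); // per-entry TTL override

            hz.shutdown();
        }
    }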

Identify the Downside of Stale Data

The biggest downside of caching is the risk of returning stale data. The original data source might have been updated, but if the cache isn’t updated with that new data, an older value will be returned. The question then becomes: how bad is it if you return an old value?

The TTL setting mentioned in the section above is one way to reduce the risk of stale data. The challenge then becomes how to calculate the right value. A very short TTL value reduces the risk of returning stale data, but it can also reduce the benefits of the cache if entries expire and are reloaded faster than the underlying data actually changes.
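
As a back-of-the-envelope illustration (not a universal rule): with TTL-based expiry, the worst-case staleness of an entry is roughly the TTL itself, so the business staleness budget is a hard cap, while the backend update rate tells you whether that TTL still yields useful reuse.

    public class TtlHeuristic {
        // Conservative rule of thumb, purely illustrative: never exceed the
        // staleness budget (worst-case staleness under TTL expiry is roughly
        // the TTL itself), and don't hold entries much past the point where
        // the underlying data has probably changed.
        static int suggestTtlSeconds(int maxStalenessSec, int meanUpdateIntervalSec) {
            return Math.min(maxStalenessSec, meanUpdateIntervalSec);
        }

        public static void main(String[] args) {
            // e.g., the business tolerates 2 minutes of staleness and the
            // backend data changes roughly every 10 minutes
            System.out.println(suggestTtlSeconds(120, 600) + "s"); // prints 120s
        }
    }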

For systems where stale data can be detrimental, go with a refresh-ahead access pattern so that any data updated in the backend store is immediately inserted into the cache, keeping cached data fresh.

As mentioned in the previous section, Hazelcast Platform easily supports a refresh-ahead cache pattern, using change data capture (CDC), distributed processing, and/or stream processing to deliver updated data to the cache in real time.
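
As one possible shape for that pipeline, here is a sketch that streams MySQL changes into a cached map using the Hazelcast Jet CDC connector. The host, credentials, table, and map names are illustrative, and the hazelcast-jet-cdc-mysql module is assumed to be on the classpath.

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.jet.cdc.CdcSinks;
    import com.hazelcast.jet.cdc.ChangeRecord;
    import com.hazelcast.jet.cdc.mysql.MySqlCdcSources;
    import com.hazelcast.jet.pipeline.Pipeline;
    import com.hazelcast.jet.pipeline.StreamSource;

    public class RefreshAheadCdc {
        public static void main(String[] args) {
            StreamSource<ChangeRecord> source = MySqlCdcSources.mysql("customers-cdc")
                .setDatabaseAddress("db.example.com")     // illustrative host
                .setDatabasePort(3306)
                .setDatabaseUser("cdc_user")              // illustrative credentials
                .setDatabasePassword("changeit")
                .setTableWhitelist("inventory.customers") // table(s) to track
                .build();

            Pipeline pipeline = Pipeline.create();
            pipeline.readFrom(source)
                    .withoutTimestamps()
                    // Upsert each change into the cached map, keyed by primary key
                    .writeTo(CdcSinks.map("customer-cache",
                            r -> r.key().toMap().get("id"),
                            r -> r.value().toJson()));

            HazelcastInstance hz = Hazelcast.bootstrappedInstance();
            hz.getJet().newJob(pipeline);
        }
    }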