Memory caching works by first setting aside a portion of RAM to be used as the cache. When an application tries to read data, typically from a data storage system like a database, it first checks whether the desired record already exists in the cache. If it does, the application reads the data from the cache, eliminating the slower access to the database. If the desired record is not in the cache, the application reads it from the underlying source and then stores a copy in the cache so that future reads can be served quickly.
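The read path described above is often called the cache-aside pattern. Below is a minimal sketch of it in Java, assuming a simple in-process map as the cache and a hypothetical loadFromDatabase method standing in for the slower data store; the names and the backing store are illustrative only.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheAsideExample {
    // In-memory cache held in RAM.
    private static final Map<String, String> cache = new ConcurrentHashMap<>();

    // Hypothetical stand-in for a slower read from the underlying database.
    private static String loadFromDatabase(String key) {
        return "value-for-" + key;
    }

    // Cache-aside read: check the cache first, fall back to the source on a miss,
    // then store the result so future reads are served from RAM.
    public static String read(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;                    // cache hit: no database access needed
        }
        String value = loadFromDatabase(key); // cache miss: read from the source
        cache.put(key, value);                // keep a copy for subsequent reads
        return value;
    }

    public static void main(String[] args) {
        System.out.println(read("user:42")); // miss: loaded from the "database"
        System.out.println(read("user:42")); // hit: served from the cache
    }
}
```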
Cache size is limited, so existing data must eventually be evicted to make room for the data the application has accessed most recently. The cache system therefore needs an eviction strategy for selecting which records to remove; an effective strategy reflects the application's data access patterns and targets records that are rarely accessed.
Caching Strategies
For example, a least-recently-used (LRU) strategy removes the record that has gone the longest without being accessed. The underlying assumption is that a record that hasn't been accessed in a while is less likely to be accessed again soon; in other words, recently accessed records are likely to be used again.
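As an illustration, here is a small LRU cache sketch built on Java's LinkedHashMap; the class name and capacity are arbitrary, and a production cache would add thread safety and richer configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal LRU cache: LinkedHashMap with accessOrder=true keeps entries ordered
// by most recent access, and removeEldestEntry evicts the least-recently-used
// entry once the cache grows past its capacity.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true enables LRU ordering
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict when over capacity
    }
}
```

With a capacity of two, putting keys "a" and "b", reading "a", and then putting "c" evicts "b", because "b" is the entry that has gone the longest without being accessed.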
A least-frequently-used (LFU) strategy, by contrast, tracks how many times each record in the cache has been accessed and evicts the record with the fewest accesses. The assumption here is that a record that is rarely used is unlikely to be accessed again soon.
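The sketch below shows the same idea for LFU, again with illustrative names. It keeps a per-key access counter and scans for the least-used key on eviction; real LFU implementations use more efficient bookkeeping and handle concurrency.

```java
import java.util.HashMap;
import java.util.Map;

// A simple LFU cache sketch: each get/put increments a per-key access counter,
// and when the cache is full the key with the fewest accesses is evicted.
public class LfuCache<K, V> {
    private final int capacity;
    private final Map<K, V> values = new HashMap<>();
    private final Map<K, Integer> counts = new HashMap<>();

    public LfuCache(int capacity) {
        this.capacity = capacity;
    }

    public V get(K key) {
        if (!values.containsKey(key)) {
            return null;                        // cache miss
        }
        counts.merge(key, 1, Integer::sum);     // record the access
        return values.get(key);
    }

    public void put(K key, V value) {
        if (!values.containsKey(key) && values.size() >= capacity) {
            // Evict the least-frequently-used entry to make room.
            K leastUsed = null;
            int minCount = Integer.MAX_VALUE;
            for (Map.Entry<K, Integer> e : counts.entrySet()) {
                if (e.getValue() < minCount) {
                    minCount = e.getValue();
                    leastUsed = e.getKey();
                }
            }
            values.remove(leastUsed);
            counts.remove(leastUsed);
        }
        values.put(key, value);
        counts.merge(key, 1, Integer::sum);
    }
}
```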
Dealing With Cache Misses
The challenge with caches is minimizing “cache misses,” i.e., attempted reads for records that are not in the cache. Too many misses reduce the cache's effectiveness: an application that mostly reads data it has never read before gains little from a cache, and may even perform worse, because it pays the cost of checking the cache and then still has to go to the source. One way to mitigate this is to use a larger cache, but the RAM of a single computer is often too small to hold large data sets, which is why developers turn to distributed caches to speed up their applications.
A distributed cache pools together the RAM of multiple computers connected in a cluster, allowing you to create a bigger cache that can continue to grow by adding more computers to the cluster. Technologies like Hazelcast Platform can be used as a distributed cluster to accelerate large-scale applications.
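As a hedged sketch of what this looks like in code, the snippet below starts an embedded Hazelcast member and uses a distributed map as the cache. It assumes a recent Hazelcast version; the package for IMap differs across major versions, and cluster configuration is omitted.

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap; // package name may differ in older Hazelcast versions

public class DistributedCacheExample {
    public static void main(String[] args) {
        // Start (or join) an embedded Hazelcast member; additional members on the
        // network form a cluster and pool their RAM for the cache.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // An IMap is a distributed key-value map partitioned across the cluster.
        IMap<String, String> cache = hz.getMap("records");
        cache.put("user:42", "Jane Doe");
        System.out.println(cache.get("user:42"));

        hz.shutdown();
    }
}
```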
Handling Stale Data
Another challenge of caches is the risk of reading “stale” data, where the data in the cache no longer reflects the latest data in the underlying source. Often, this risk is an acceptable trade-off for the sake of application performance. In cases where it is not, it is up to the application that updates the underlying data source to also update the record in question in the cache.
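One way to keep the two in sync is sketched below, continuing the earlier in-process cache example; writeToDatabase is a hypothetical stand-in for the real persistence call. The application writes to the source of record first and then refreshes (or invalidates) the cached copy so later reads do not see stale data.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CacheUpdateExample {
    private static final Map<String, String> cache = new ConcurrentHashMap<>();

    // Hypothetical stand-in for the actual database write.
    private static void writeToDatabase(String key, String value) {
        // persist the record to the underlying data store
    }

    // Update the source of record first, then refresh the cached copy so that
    // subsequent reads do not return stale data.
    public static void update(String key, String newValue) {
        writeToDatabase(key, newValue);
        cache.put(key, newValue);
        // Alternative: cache.remove(key); to force a reload from the source
        // on the next read instead of refreshing the entry in place.
    }
}
```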