How Does Caching Work in Django?

Most web applications need to persist data from one request to the next. They are dynamic, performing calculations and database queries to serve each request. This overhead can be expensive, especially when we need the same data repeatedly.

Django-based applications are no exception to this challenge. Ideally, we’d like a place to persist data between requests with quick reads and writes so we can avoid excessive database calls and complex calculations. Fortunately, Django has built-in caching with support for multiple cache backends.

Caching is important to most production Django applications. Often, we need to keep data available between requests but don’t want to write to and read from an external data store explicitly each time. We may also need to store template output, query results, or the results of expensive computations, perhaps produced by a Celery task.

Rather than repeating the same queries or computations for every user who visits the homepage, it makes sense to store those results and only recalculate them periodically. Any data that doesn’t need to change between requests is an excellent candidate for caching.

Django can use cache backends for two purposes: session storage, which stores data private to individual users, and cache storage, which all users share.
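For the session-storage case, a settings fragment along these lines tells Django to keep session data in a cache backend. This is a sketch that assumes the sessions app and middleware are already enabled and that a cache named "default" is defined in CACHES.

```python
# settings.py (fragment)

# Store session data in the cache instead of the database.
SESSION_ENGINE = "django.contrib.sessions.backends.cache"

# Name of the entry in CACHES that should hold session data.
SESSION_CACHE_ALIAS = "default"
```

With a purely cache-backed session store, session data is lost whenever the cache evicts or clears it, so this setup suits data we can afford to lose.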

Caching Options in Django

Local Memory Cache

Unless we explicitly specify another caching method in our settings file, Django defaults to local memory caching. As its name implies, this method stores cached data in RAM on the machine where Django is running. Local memory caching is fast, responsive, and thread-safe. 

The downside is that it works best if we’re only running a single instance of Django. Local memory storage can work across multiple Django instances if we’re using it for session storage. However, we also need to set up a reverse proxy like Nginx with sticky routing support to route all requests from a specific user to the same Django instance. 

Local memory caching falls short, however, if we need to share cached data. Anything added to the cache on a single Django instance will be visible only on that instance. So if we’ve scaled our app out by running multiple instances of it, local memory storage is less than ideal as a shared cache. 

Also, storing cached data as key-value pairs in local memory can be expensive in memory consumption. While this approach lightens our application’s network load, it can still end up causing each Django instance to consume more RAM than necessary.

To set up local memory caching explicitly, the CACHES section of our Django settings file should look like this. LOCATION is just a name that identifies the in-memory store; it only matters when we define more than one local memory cache, so the value below is a placeholder:

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
        'LOCATION': 'unique-snowflake',
    }
}

Filesystem Cache

A filesystem cache uses significantly less memory than in-memory caching. However, it comes at the cost of being considerably slower than an in-memory cache. Aside from those two factors, its trade-offs are similar to those of local memory caching.

The CACHES section of our Django settings file should look something like this, where LOCATION is the absolute path of a directory that Django can read from and write to (the path below is only an example):

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.filebased.FileBasedCache',
        'LOCATION': '/var/tmp/django_cache',
    }
}

Memcached

Memcached is an efficient, entirely memory-based cache server, and it is among the fastest caching methods that Django works with out of the box. Many high-traffic sites, including Facebook and Wikipedia, rely on Memcached to reduce database queries.

Memcached reserves a section of memory for use as cache space. It performs well, making it a popular choice among Django developers. Memcached runs as a daemon, so we must install it separately and set it up independently of our Django instances. This method requires some extra work but reduces our Django app’s memory consumption since it offloads all cached data to Memcached.

We can connect a Django app to a Memcached daemon by setting the backend and the daemon’s location (IP address and port number) in the CACHES section of the Django settings file. The PyMemcacheCache backend used here also requires the pymemcache package to be installed.

One such configuration (with Memcached running on localhost) looks like this:

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.PyMemcacheCache',
        'LOCATION': '127.0.0.1:11211',
    }
}

Note that we are not limited to using Memcached itself. We can use any backend that presents a Memcached-compatible interface. For example, we can connect to Hazelcast using a Memcached client. This approach is a good choice when we expect to eventually need caching capabilities beyond what Memcached provides. We can start out using Hazelcast’s Memcached compatibility, then switch to using Hazelcast directly when we need to.

Database Caching

Django is also capable of leveraging our existing database for caching. Setup is straightforward: we provide the name of the database table that will store cache data, then create that table by running python manage.py createcachetable. A database-backed cache doesn’t perform as well as in-memory caching. Still, it’s the easiest way to enable distributed caching in a Django app because it uses the same database our app is already using.

Use caution: a database-backed cache increases the overall load on our database, which may decrease our application’s overall performance. 

To use database caching, we place the following code in the CACHES section of the settings file:

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.db.DatabaseCache',
        'LOCATION': 'the_table_name_for_the_cache',
    }
}

Custom Cache Implementation

If we don’t like any of Django’s default caching options, we can add our own. We just need to create a class that extends BaseCache. As a starting point, we can look at Django’s dummy cache implementation. Our custom cache provider class should implement the same methods as the dummy cache class, replacing the code in each method with code that interacts with our preferred cache storage backend. 

We can configure a custom cache implementation by providing only the path of our provider class.

To use a custom implementation, the CACHES section in our settings file should look like this, with the appropriate path swapped in:

CACHES = {
    'default': {
        'BACKEND': 'path.to.backend',
    }
}

Which Caching Option is Best?

The best caching option depends on our needs.

It’s often easiest to start with a local memory cache, but as we’ve seen, this approach has trade-offs that might make it a poor fit for our use case. 

A filesystem cache has many of the same trade-offs as a local memory cache, except that it is slower. Still, it uses less RAM. 

Our options are somewhat limited if our application uses more than one Django instance but needs a single shared cache. Database caching is likely the easiest way to move to a caching method that works with multiple Django instances. But an on-disk database cache can quickly become a performance bottleneck to our web app’s overall responsiveness.

Memcached is swift and works well enough with multiple Django instances. Still, being a naïve cache, Memcached has limited extra functionality. It also requires additional work to set up and administer. In most cases, Memcached or a custom cache provider backed by a shared in-memory data store (such as Hazelcast) is the best option for Django apps operating at scale.

Writing a Custom Cache Backed by Hazelcast

Django provides great options out of the box so we can select a cache backend that fits our app’s needs. However, as we explained, we can write a custom cache implementation for Django, swapping the implementation with almost anything that provides the efficiency we want.

A Hazelcast cluster performs well for this purpose, giving our Django application drop-in access to a high-speed in-memory data store for caching. Hazelcast also provides high-performance distributed data structures, features that a plain key-value cache does not offer, which opens the door to additional application features.

When we’re ready to move beyond just using Hazelcast as a cache, we can use the Hazelcast Python client to access even more advanced functionality. The Hazelcast platform is straightforward to learn, and you can sign up for free access on the getting started page.

Conclusion

If your Django application serves dynamic content when users hit endpoints, as most do, the odds are high that it will benefit from caching. Caching recently or frequently accessed data can provide incredible performance benefits, lowering the app’s response time and resource consumption. 

Django is versatile in its cache options, so you can choose the best one for your app. If an existing solution like Memcached is unavailable (or just not good enough), you can swap in your own custom implementation. Hazelcast can serve as a fast cache backend that gives you room to grow, with a rich set of distributed data structures and support for distributed computing. You can quickly get started with Hazelcast, whether you use it in the cloud or locally using Docker, so try it today.
