How to Scale Hazelcast IMDG on Kubernetes
Hazelcast IMDG is well integrated with the Kubernetes environment. Using the Hazelcast Kubernetes Plugin, Hazelcast members can discover each other automatically and be grouped by services to form separate clusters in one namespace. What’s more, using Hazelcast Helm Charts, you can deploy a fully functional Hazelcast cluster with a single command. If you’d like to dive deeper, you can check out a detailed webinar on deploying cloud-native applications in Kubernetes here.
Now, it’s time to focus on the operational part and describe what to do if you want to scale up or down the number of Hazelcast members in a cluster.
Scale Natively!
The good news is that you can scale Hazelcast like any other StatefulSet or Deployment.
As an example, let’s start with a working Hazelcast cluster, which you can deploy using the Helm Chart or the Kubernetes Code Sample.
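For example, with Helm you could deploy it as follows; the cluster.memberCount flag mirrors the one used later in this post, and we explicitly request two members to match the output below:

$ helm install --name my-release --set cluster.memberCount=2 stable/hazelcast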
$ kubectl get statefulset
NAME                   DESIRED   CURRENT   AGE
my-release-hazelcast   2         2         1m
You can scale up the cluster to 6 members using the following command.
$ kubectl scale statefulset.apps/my-release-hazelcast --replicas=6
After a few seconds, you should see that new Hazelcast PODs have been created.
$ kubectl get statefulset
NAME                   DESIRED   CURRENT   AGE
my-release-hazelcast   6         6         5m
And in the logs, you can confirm they all joined the same cluster.
$ kubectl logs pod/my-release-hazelcast-0
...
Members {size:6, ver:6} [
	Member [10.16.1.18]:5701 - dabccec9-cc0b-43ae-83ec-54ffc392c87e this
	Member [10.16.0.16]:5701 - 3777eeba-9de1-4724-aaea-ee2cffa0edde
	Member [10.16.1.19]:5701 - 782c342d-92c6-46f5-934f-85b60fb413ac
	Member [10.16.0.17]:5701 - c1ee9ccc-2942-416a-9125-acc00aecd792
	Member [10.16.2.10]:5701 - cb11a094-01ec-4bef-8485-a25b73d4831b
	Member [10.16.2.11]:5701 - b992b507-291e-4570-85d9-4a91f1624fb9
]
...
Similarly, you could scale down the cluster.
The fact that scaling works out of the box is especially important in the case of Hazelcast embedded in a microservice. Then, simply by scaling the number of microservice replicas, you scale the Hazelcast cluster as well.
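For instance, assuming your microservice runs as a Deployment named my-microservice (a hypothetical name), one command scales both the application and the embedded Hazelcast cluster:

$ kubectl scale deployment.apps/my-microservice --replicas=5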
Scale without Data Loss!
Scaling down a Hazelcast cluster in the way described above can result in data loss. This is related to how both Kubernetes and Hazelcast work. Let’s discuss it in more detail and then present a method that prevents losing any data.
Kubernetes Background
When Kubernetes decides to terminate a POD, it follows this procedure (see the spec fragment below):
- Send the SIGTERM signal
- Wait terminationGracePeriodSeconds (30 seconds by default)
- Send the SIGKILL signal
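For reference, here is a minimal sketch of where the grace period lives in a POD spec; the POD and container names are just placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: hazelcast-member
spec:
  terminationGracePeriodSeconds: 30   # default; time between SIGTERM and SIGKILL
  containers:
    - name: hazelcast
      image: hazelcast/hazelcast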
Hazelcast Background
By default, Hazelcast stores each data entry with one backup (the number of backups can be controlled with the backup-count property). This means that if you suddenly terminate two Hazelcast members at once, you can lose both the primary and the backup copy of an entry.
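As an illustration, here is a sketch of an explicit backup count in the declarative configuration, assuming a Hazelcast version that supports the YAML format (the same backup-count element exists in the XML configuration):

hazelcast:
  map:
    default:
      backup-count: 1   # default; one synchronous backup per entry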
Hazelcast has a feature called Graceful Shutdown, which is disabled by default. It changes how Hazelcast reacts to the SIGTERM signal: instead of terminating the member right away, it first waits until the data migration to the other members completes. You can control Hazelcast Graceful Shutdown with the following JVM parameters (see the example after this list):
- hazelcast.shutdownhook.policy=GRACEFUL: enables the Graceful Shutdown policy
- hazelcast.graceful.shutdown.max.wait=<seconds>: sets the maximum time to wait for the graceful shutdown
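For example, when starting an embedded member from the command line, the parameters could be passed as follows; the JAR name and the 600-second value are placeholders:

$ java -Dhazelcast.shutdownhook.policy=GRACEFUL \
       -Dhazelcast.graceful.shutdown.max.wait=600 \
       -jar my-hazelcast-app.jar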
Solution
With this knowledge, we can set the following three parameters to keep the data safe while scaling down:
- terminationGracePeriodSeconds: in your StatefulSet (or Deployment) configuration; the value should be high enough to cover the data migration process
- -Dhazelcast.shutdownhook.policy=GRACEFUL: in the JVM parameters
- -Dhazelcast.graceful.shutdown.max.wait: in the JVM parameters; the value should be high enough to cover the data migration process
Note that it makes perfect sense to set hazelcast.graceful.shutdown.max.wait to the same value as terminationGracePeriodSeconds. Both parameters have the same semantics, but one is defined from the perspective of Hazelcast and the other from the perspective of Kubernetes. A complete configuration could look as in the sketch below.
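Here is a minimal sketch of the relevant StatefulSet fragment; it assumes the official hazelcast/hazelcast Docker image, which passes the JAVA_OPTS environment variable to the JVM, and uses 600 seconds as an arbitrary example value:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-release-hazelcast
spec:
  # ... replicas, selector, serviceName ...
  template:
    spec:
      terminationGracePeriodSeconds: 600   # Kubernetes waits this long after SIGTERM
      containers:
        - name: hazelcast
          image: hazelcast/hazelcast
          env:
            - name: JAVA_OPTS
              value: "-Dhazelcast.shutdownhook.policy=GRACEFUL -Dhazelcast.graceful.shutdown.max.wait=600"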
Scale with Helm Chart!
The good news is that the Graceful Shutdown is already implemented in Hazelcast Helm Charts. That is why scaling without data loss is super easy. Let’s start a Hazelcast cluster with Helm.
$ helm install --name my-release --set cluster.memberCount=6 stable/hazelcast
Now, to scale the cluster, you can use either Helm or kubectl.
$ helm upgrade my-release --set cluster.memberCount=3 stable/hazelcast
$ kubectl scale statefulset.apps/my-release-hazelcast --replicas=3
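In both cases, you can watch the PODs terminate one by one while the remaining members take over their data:

$ kubectl get pods -w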