# Rolling Upgrade Hazelcast IMDG on Kubernetes

Rafal Leszko | June 25, 2019

Hazelcast IMDG is tightly integrated into the Kubernetes ecosystem thanks to the Hazelcast Kubernetes plugin. In previous blog posts, we shared how to use auto-discovery for embedded Hazelcast and how to scale a cluster up and down using native kubectl commands. In this post, we'll focus on another useful feature: Rolling Upgrade. You can apply it to your Hazelcast cluster whether you use client-server or embedded Hazelcast, and whether you deploy using a Kubernetes StatefulSet or a Deployment. Everything is, as always with Hazelcast, intuitive and straightforward.

## Rolling Upgrade StatefulSet

The preferred way of deploying Hazelcast on Kubernetes is using a StatefulSet. As an example, you can start a cluster using a Helm Chart or the Kubernetes Code Sample. When you decide to update the Hazelcast Docker image version in your Kubernetes configuration, Kubernetes automatically performs the Rolling Upgrade procedure:

1. Send the SIGTERM signal to Pod N
2. Wait at most `terminationGracePeriodSeconds` (30 seconds by default)
3. Send the SIGKILL signal to Pod N
4. Start the new Pod N
5. Wait until the new Pod N is ready
6. Send the SIGTERM signal to Pod N-1
7. Wait at most `terminationGracePeriodSeconds` (30 seconds by default)
8. Send the SIGKILL signal to Pod N-1
9. …

This procedure continues until all Pods are replaced with their new versions. Note that Hazelcast's default reaction to the SIGTERM signal is to terminate the instance immediately. So if Pod N stores some data and the only backup of that data is stored in Pod N-1, Kubernetes may terminate both Pods before they manage to migrate the data to the remaining members. This means the default behavior may result in data loss during the Rolling Upgrade procedure. The solution is to enable Graceful Shutdown for Hazelcast members and to increase the termination grace period to a value that guarantees the migration completes in the given time.
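Updating the image that triggers this procedure can be done by editing the manifest and re-applying it, or imperatively with kubectl. A minimal sketch, assuming both the StatefulSet and its container are named `hazelcast` (adjust the names and target version to your own setup):

```shell
# Point the StatefulSet at the new image; Kubernetes then replaces
# the Pods one by one as described in the steps above
kubectl set image statefulset/hazelcast hazelcast=hazelcast/hazelcast:3.12.1

# Follow the rollout until every Pod runs the new version
kubectl rollout status statefulset/hazelcast
```

These commands operate on a live cluster, so run them only after configuring Graceful Shutdown as described below.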
We can enable Graceful Shutdown with the following Kubernetes configuration:

```yaml
apiVersion: apps/v1
kind: StatefulSet
spec:
  ...
  template:
    spec:
      terminationGracePeriodSeconds: 600
      containers:
      - name: hazelcast
        ...
        env:
        - name: JAVA_OPTS
          value: "-Dhazelcast.shutdownhook.policy=GRACEFUL -Dhazelcast.graceful.shutdown.max.wait=600"
        ...
```

Let's describe the parameters we used:

- `terminationGracePeriodSeconds`: the number of seconds Kubernetes waits before forcing the Pod to terminate
- `hazelcast.shutdownhook.policy=GRACEFUL`: enables graceful shutdown for Hazelcast
- `hazelcast.graceful.shutdown.max.wait`: the number of seconds Hazelcast waits before terminating its process (the same as `terminationGracePeriodSeconds`, but from the Hazelcast process perspective)

After setting these parameters, your data is safe: you can update the Hazelcast Docker image version and apply the new Kubernetes configuration, which results in a successful Rolling Upgrade of your Hazelcast cluster.

## Hazelcast Graceful Shutdown

We used Hazelcast Graceful Shutdown, but how does it work under the hood? The main point of the graceful shutdown is to migrate all data replicas owned by the shutting-down member to the other running cluster members. After this process is complete, the shutting-down member does not own any of the data (neither the primary replicas nor the backups). You can picture the Graceful Shutdown process as follows:

1. The Hazelcast member receives a signal to shut down.
2. It changes its state to SHUTTING_DOWN.
3. It sends information to the master member to start the data migration process.
4. It waits for the data partitions to be migrated (or until the `hazelcast.graceful.shutdown.max.wait` deadline is reached).
5. It changes its state to SHUT_DOWN.

This way, we can be sure that when a member is about to shut down, it first transfers all of its data to other members. The last thing to mention is that using `hazelcast.shutdownhook.policy=GRACEFUL` is not the only way to shut down Hazelcast gracefully.
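As a side note, if you prefer declarative member configuration over JVM flags, the same two properties can also go into the member configuration file, for example in `hazelcast.yaml` (YAML configuration is available since IMDG 3.12; this is a sketch, not the setup used above):

```yaml
hazelcast:
  properties:
    hazelcast.shutdownhook.policy: GRACEFUL
    hazelcast.graceful.shutdown.max.wait: 600
```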
The alternatives are:

- the `HazelcastInstance.shutdown()` method (if you use Hazelcast embedded in your JVM application)
- the JMX API's shutdown method
- the "Shutdown Member" button in the Hazelcast Management Center application

Now that we understand how the Graceful Shutdown procedure works, let's come back to the main subject of this blog post, the Rolling Upgrade process.

## Rolling Upgrade by Minor Version (Enterprise Only)

Hazelcast Enterprise enables Rolling Upgrades between minor versions. In other words, Hazelcast IMDG makes it possible to apply a Rolling Upgrade only between patch versions, for example 3.12 => 3.12.1, whereas Hazelcast IMDG Enterprise lets you upgrade 3.11.4 => 3.12. This is a handy feature because you don't have to stop your cluster to keep Hazelcast up to date. Rolling Upgrade by minor version requires setting one more JVM parameter (in `JAVA_OPTS`) to work automatically on Kubernetes:

```
-Dhazelcast.cluster.version.auto.upgrade.enabled=true
```

This is necessary because the Hazelcast cluster version is not updated by default. For example, we could start a cluster with version 3.11 and perform a Rolling Upgrade to 3.12, and even though all members would be 3.12, the cluster would still use the 3.11 protocol. To prevent this, the additional JVM parameter makes the cluster version upgrade automatically after the Rolling Upgrade procedure is complete.

## Rolling Upgrade Deployment

As mentioned previously, the preferred method of deploying Hazelcast on Kubernetes is to use a StatefulSet. The main reason is that using a Deployment may (in some rare cases) start a Hazelcast cluster with a split brain (which recovers in a few minutes). After all, Hazelcast is not a stateless service but rather a database, and these kinds of applications are usually deployed as StatefulSets in Kubernetes.
Nevertheless, in some cases your system architecture may require using a Deployment, or you may use Hazelcast embedded in your (micro)services and, for some reason, need to deploy them using a Deployment. In such cases, you can still use Rolling Upgrade for Hazelcast, but you must remember one crucial detail: by default, Kubernetes does not perform the Rolling Upgrade Pod by Pod, but instead keeps only a certain percentage of Pods alive. For example, if you have 10 Pods, then by default Kubernetes will, all of a sudden, terminate 2 Pods (25%) and at the same time start 2 new Pods (without waiting for the old Pods to get terminated). If these 2 Pods store both the data and the backup of some data partition, then you may encounter data loss.

To prevent data loss during the Rolling Upgrade procedure for a Deployment, you must ensure that no more Pods than the Hazelcast backup count (1 by default) are terminated at the same time. To do that, you can add the following Kubernetes configuration:

```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  ...
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 0
```

In this case, Kubernetes won't terminate more than 1 Pod at a time (`maxUnavailable: 1`), and it won't start any new Pod (`maxSurge: 0`) before terminating an old one.

## Rolling Upgrade with Helm Chart

The configurations mentioned above are already implemented in the Hazelcast Helm Charts. That is why, if you use Helm to install your applications, you can start a Hazelcast cluster with the following command:

```
$ helm install --name my-release --set image.tag=3.12 hazelcast/hazelcast
```

Then, to perform the Rolling Upgrade, all you need to do is change the image tag:

```
$ helm upgrade my-release --set image.tag=3.12.1 hazelcast/hazelcast
```

Everything happens automatically, and you can enjoy the new Hazelcast cluster version without any downtime or data loss.
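After the `helm upgrade`, you can follow the rollout and confirm the result. A sketch, assuming the chart names the StatefulSet `<release>-hazelcast` (verify the actual name with `kubectl get statefulsets`):

```shell
# Watch the Pods being replaced one by one
kubectl rollout status statefulset/my-release-hazelcast

# Inspect a member's logs to confirm the running version
kubectl logs my-release-hazelcast-0 | grep -m 1 "3.12.1"
```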