Hazelcast has top-notch clients for C++, .NET, Java, Python, and Node.js. Although we also had a Go client, it lacked that level of support until the release of Go client v1 last year.

For those unfamiliar with what we do: Hazelcast is an open-source, memory-first application platform for stream processing and data-intensive workloads, available on-premises or as a cloud service.

No doubt, the biggest highlight of this release is support for Hazelcast 4 and 5. Hazelcast 4 has considerable performance improvements over Hazelcast 3, while Hazelcast 5 introduced several new features, such as a new SQL engine. We already support most of those excellent features and will roll out the rest in the coming releases.

Tailored for the Go Ecosystem

At Hazelcast, developer experience is one of our top priorities. We tailored the Go client to fit well in the Go ecosystem with this significant client release. Let’s go over the highlights.

Go Module Support

Module support was introduced in Go 1.11. Before it was introduced, standard tools did not support dependency management, and we had to depend on third-party tools, such as dep and glide.

Enabling go modules for your project is straightforward:

go mod init github.com/myorg/myproject

The command above uses github.com/myorg/myproject as the module name, but you can choose anything else. I highly recommend reviewing the following article for in-depth coverage of Go modules: Using Go Modules.

Once you have a Go module enabled project, adding the Hazelcast Go client to your project is trivial. You can depend on the latest release using:

go get github.com/hazelcast/hazelcast-go-client

Or depend on a specific release, v1.3.0 in this case:

go get github.com/hazelcast/hazelcast-go-client@v1.3.0

After that, you can import the client in your code as usual:

import "github.com/hazelcast/hazelcast-go-client"

Go Context Support

One of the most wanted features for the new Go client was Go context support in the API. That’s not surprising, since contexts are indispensable when you need to chain calls to several APIs and have a way to break the chain anywhere.

Many articles explain how to use the context package, especially this one from the Go blog: https://blog.golang.org/context.

All Hazelcast Go client functions take context as the first parameter. For instance, we limit the Map set operation to complete in 500 milliseconds or less in the code below:

func mapSetWithTimeout(key, value interface{}) error {
    // create a context with a timeout of 500 milliseconds
    ctx, cancel := context.WithTimeout(context.Background(), 500*time.Millisecond)
    // make sure cancel is called at least once
    defer cancel()
    return myMap.Set(ctx, key, value)
}

Later, we call the sample function:

err := mapSetWithTimeout("my-key", "my-value")
if errors.Is(err, context.Canceled) {
    fmt.Println("context was canceled")
} else if errors.Is(err, context.DeadlineExceeded) {
    fmt.Println("context deadline exceeded")
} else if err != nil {
    fmt.Println("some other error:", err)
} else {
    fmt.Println("value was set")
}
In case you don’t want to use a context with Hazelcast Go Client API functions, you can pass context.TODO(). It will act as a reminder to use the correct context later.

Overhauled Configuration

The default configuration works well when you run a Hazelcast Go client on the same machine with the Hazelcast server. That’s very common when testing the client on your laptop. We have the hazelcast.StartNewClient function, which uses the default configuration:

client, err := hazelcast.StartNewClient(context.TODO())

In all other cases, you would need to configure the client. The workflow is simple: you create the default configuration, update it as necessary, then pass it to hazelcast.StartNewClientWithConfig(ctx, config).

Most fields of the hazelcast.Config struct are exported, and the zero value of that struct is the default configuration. Here’s a sample:

config := hazelcast.Config{}
config.Cluster.Name = "production"
config.Cluster.ConnectionStrategy.Timeout = types.Duration(10*time.Second)
config.Cluster.Network.Addresses = []string{"192.0.2.1:5701", "192.0.2.2:5701"}
// alternatively:
// config.Cluster.Network.SetAddresses("192.0.2.1:5701", "192.0.2.2:5701")
config.Logger.Level = logger.ErrorLevel

This approach makes the configuration API more idiomatic. Also, configuration via JSON is a breeze, as seen below:

{
    "Cluster": {
        "Name": "production",
        "Network": {
            "Addresses": [
                "192.0.2.1:5701",
                "192.0.2.2:5701"
            ]
        },
        "ConnectionStrategy": {
            "Timeout": "10s"
        }
    },
    "Logger": {
        "Level": "error"
    }
}

YAML configuration is available using go-yaml too. Note that go-yaml requires fields to be written in lowercase, in contrast with the title case used in the JSON example:

cluster:
  name: "production"
  network:
    addresses:
      - "192.0.2.1:5701"
      - "192.0.2.2:5701"
  connectionstrategy:
    timeout: "10s"
logger:
  level: "error"

Unsurprisingly, configuration via TOML works well too:

[cluster]
name = "production"

[cluster.network]
addresses = [
    "192.0.2.1:5701",
    "192.0.2.2:5701"
]

[cluster.connectionstrategy]
timeout = "10s"

[logger]
level = "error"

In the programmatic configuration, you’ll notice that the connection timeout is given by casting time.Duration to types.Duration. To be able to use human-readable durations such as "10s" in JSON, YAML, etc., we had to create another type for durations in the configuration.

We included some convenience methods to modify the configuration, such as config.Cluster.Network.SetAddresses, which is equivalent to setting config.Cluster.Network.Addresses.

config.Cluster.Network.SetAddresses("192.0.2.1:5701", "192.0.2.2:5701")

Note that some configuration values cannot be marshalled/unmarshalled, such as Portable factories. Those kinds of settings are only accessible using methods.


Idiomatic Errors

Go errors are simple. In many cases, all one needs to do is return errors.New("some problem"). If the error should be part of the API, it can be assigned a name:

var ErrSomeProblem = errors.New("some problem")

Errors created with errors.New are unique, even if they have the same message. So, in your code, you can check for a particular error with a simple if statement:

if err == ErrSomeProblem { ... }

Error handling gets “interesting” when it should contain more information about the error, such as why it happened. Of course, you can include that information in the error message. But then you can’t assign a name to it, and you won’t be able to check it simply using an if statement. You can always create a new error type (and sometimes you should), but creating a new one just to distinguish between different error types gets tedious.

Fortunately, Go 1.13 introduced a better solution: error wrapping and unwrapping (see Working with Errors in Go 1.13). Using the errors.Is function, it is straightforward to match an error even if it is wrapped inside other errors. Every error returned from the Go client v1 API supports that usage.

Here’s an example. Index attributes should not end with a dot, and trying to use such an attribute with Map.AddIndex results in hzerrors.ErrIllegalArgument:

ic := types.IndexConfig{Attributes: []string{"Foo."}}
ctx := context.Background()
client, _ := hazelcast.StartNewClient(ctx)
m, _ := client.GetMap(ctx, "my-map")
err := m.AddIndex(ctx, ic)
fmt.Println(err)
// Output: attribute name cannot end with dot: Foo.: illegal argument error
fmt.Println(errors.Is(err, hzerrors.ErrIllegalArgument))
// Output: true

The error message clearly explains what went wrong, and it is straightforward to check whether the error matches a known one.

You can check out https://github.com/hazelcast/hazelcast-go-client/blob/master/hzerrors/errors.go for the list of possible error values.

Even though the error mechanism used internally in the client is more complex, the errors package enabled us to deliver a convenient API for our users.

Other Enhancements

External Smart Client Discovery

In smart client mode (the default), the client sends requests directly to cluster members to reduce the number of hops needed to accomplish operations. Because of that, the client needs to know the cluster members’ addresses.

Unisocket Mode
Smart Mode


In cloud-like environments or Kubernetes, there are usually two network interfaces: a private-facing one and a public-facing one. When the client is in the same network as the members, it uses their private network addresses. Otherwise, if the client and the members are on different networks, the client cannot connect to the members using their private network addresses. Hazelcast 4.2 introduced External Smart Client Discovery to solve that issue.

To use this feature, make sure your cluster members are accessible from the network the client resides in, then set config.Cluster.Discovery.UsePublicIP to true. You should specify the address of at least one member in the configuration:

config := hazelcast.Config{}
// public address of at least one member (placeholder address)
config.Cluster.Network.SetAddresses("203.0.113.10:5701")
config.Cluster.Discovery.UsePublicIP = true

This solution works everywhere without further configuration: Kubernetes, AWS, GCP, Azure, etc. as long as the corresponding discovery method is enabled in the Hazelcast server configuration.

Map Locks

Hazelcast provides pessimistic lock support for Maps. With Hazelcast Go client v1, we fully support this API.

When an entry is locked, only the owner of that lock can access that entry in the cluster until it is unlocked by the owner or force-unlocked.

See https://docs.hazelcast.com/imdg/latest/data-structures/map.html#locking-maps for details.

Locks are reentrant. A lock owner can acquire the lock again without waiting for the lock to be unlocked. If the key is locked N times, it should be unlocked N times before another goroutine or another client in the cluster can acquire it.

Lock ownership in the Hazelcast Go client is explicit. The first step to owning a lock is creating a lock context, which is analogous to a key. The lock context is a regular context.Context that carries a special value uniquely identifying it in the cluster. Once the lock context is created, it can be used to lock/unlock entries and with any function that is lock-aware, such as Set.

m, err := client.GetMap(ctx, "my-map")
// create the unique lock context
lockCtx := m.NewLockContext(ctx)
// acquire the lock, blocks until it is acquired
err = m.Lock(lockCtx, "some-key")
// pass lock context to use the locked entry
err = m.Set(lockCtx, "some-key", "some-value")
// release the lock once done with it, so "some-key" is available for others
err = m.Unlock(lockCtx, "some-key")

Check out the lock example. In that example, a map value is incremented by several goroutines using a map lock to prevent data races.

As mentioned before, a lock context is a regular context.Context that carries a unique lock ID. You can pass any context.Context to any Map function that takes a lock context, but in that case the same default lock ID is used for those operations, and there is no way to enforce lock ownership within the same client (lock ownership is still maintained in the cluster). To prevent data races, consider always using a lock context for map operations when using multiple goroutines with the same client.

Conclusions and Pointers

This is a significant milestone for the Hazelcast Go client. But that’s not all: there are more features we didn’t cover in this article, like Near Cache and SQL. We have upcoming articles about those features.

Here are a couple of links you may find helpful:

Your feedback is valuable for us to provide the features you want and fix any pain points. Reach out to us on GitHub or Slack for anything related to Hazelcast and the Go client.

For this blog post, we would like to give special recognition to Software Engineer Intern Mehmet Tokgöz, who was instrumental in this project and the write-up.

We previously explained how to get started with the Hazelcast Python client. In this tutorial, we will show how to use the Python client in a Jupyter Notebook. The notebook demonstrates the SQL support of the Hazelcast Python client. Hazelcast provides in-depth SQL support for Map data structures kept in Hazelcast clusters. Using Hazelcast SQL support, you can create a mapping between your data and a database table and execute SQL queries on the Map. This support provides fast in-memory computing using SQL without writing complex functions that iterate through your maps. Throughout this tutorial, you can use either your local cluster or Hazelcast Viridian. We will use Hazelcast Viridian as our cluster provider so we don’t need to worry about setup or installation. Hazelcast Viridian Serverless offers free registration with 2 GiB of storage. Remember, you can run this notebook in a Google Colaboratory environment without dealing with local installations.

We are looking forward to your feedback and comments about this blog post. Don’t hesitate to share your experience with us in our community Slack or GitHub repository.

Why run Hazelcast in a Jupyter Notebook?

Jupyter Notebook offers an all-in-one, web-based interactive environment (it runs in the web browser) that you can use to combine Python code, formatted text, animated images and graphs, videos, mathematical equations, plots, maps, and interactive figures in a single document. It is also easy to share and collaborate on, as it uses structured text formats, and it offers extensions for various data science and ML projects. On the other hand, Hazelcast, as a real-time data processing platform, provides a simple way to quickly evaluate and process your data. Using the Hazelcast SQL engine, you can skip the plumbing and work directly on the value of your customers’ data. You can extract a lot of information without dealing with hundreds of lines of code and slow executions.

Setup Section

Notebook version and setup

To run the notebook file locally on your computer, you must install Jupyter Notebook by following its installation guide. We used Notebook version 6.4.12, but the exact version isn’t important for our setup, as long as you have a reasonably up-to-date one. Instead of installing Jupyter Notebook locally, you can use the Google Colaboratory environment, a virtual machine that executes IPython notebook files. When you open the notebook link, you can copy the notebook file to your Google Drive and work on a virtual machine created for you. Another advantage of Colaboratory is not having to worry about package installations; they are all bound to this virtual environment only. We will explain the Hazelcast setup later, but note that Google Colaboratory uses the Hazelcast Viridian cloud service for the connection, since Hazelcast is not installed on Google’s servers. If you have installed Hazelcast and Jupyter Notebook locally before, we recommend running the notebook on your local machine for better CPU performance. If you are new to Hazelcast and still learning, use Google Colaboratory and Hazelcast Viridian Serverless to skip all the local installation details.

API setup and how to replace it with a different API

Instead of manually inserting data, we prefer to pull it from an API and simulate a real-time use case of Hazelcast. We used The Movie Database (TMDB) API to pull movie and actor data. TMDB, like most API providers, requires an API key to validate and accept incoming requests to its servers. So, you need to create an account on its website and go to the Settings > API > Create a new API key section. It may ask you some questions about your project; a short answer like “Experimenting with API requests” is enough. We placed a form section at the beginning of the notebook where you can paste your API key. The same process applies to other API providers, too. You can use other API sources for the notebook instead of the TMDB API: change the endpoint URLs under the “Load Data From API” section to load data from a different API. Keep in mind that all the other queries and mappings are configured for the movie scenario; feel free to change them according to your API source and experiment with different scenarios via Hazelcast’s easy-to-use interface.

Hazelcast Viridian Serverless setup

To use one of the six Hazelcast clients, you need a running Hazelcast cluster instance. You can either install Hazelcast locally and run a cluster on localhost, or connect to a Viridian Serverless cluster. To connect to a local cluster, remove the config options from the hazelcast.HazelcastClient(…) call inside the “Connect To Hazelcast Cluster” section; in that case, the client tries to connect to localhost. Alternatively, Hazelcast Viridian Serverless is our service that provides running Hazelcast clusters in the cloud without any local installation. You can create an account and deploy a cluster with up to 2 GiB of storage for free. After creating a cluster, select it and go to the Connect Client > Advanced Setup section. You will see the cluster name, discovery token, and SSL password for your Hazelcast Viridian cluster. Run the “Hazelcast Viridian Authentication Tokens” cell to enter your tokens; it will open text boxes for you to paste them into. Since Hazelcast Viridian Serverless requires a secure connection, it will also ask you to select the ZIP file that contains the SSL certificates generated for your cluster. Select it from the dialog that opens when you run the cell.




SQL queries

Now we have all the ingredients to use Hazelcast’s functionality. Hazelcast provides extensive support for querying your map entries using SQL syntax. To make this possible, we need to create a mapping between your data and a map. These mapping queries are under the “Create Mapping between Map and Table” section. We inserted our data as HazelcastJsonValue, which is our serialization method for JSON objects. We can refer to these JSON fields directly during the mapping and assign them as table columns. There is no restriction on the mapping; you don’t have to select all the fields. After creating the mappings for your maps, you can execute SQL queries on them using the Hazelcast SQL functions. Most SQL features and functions are available in Hazelcast, so you can skip convoluted functions and use SQL syntax to easily search the data in your map. We have provided some sample queries under the “Fun Part: SQL queries” section. Feel free to change them and explore. Since it is an IPython notebook, you can rerun cells repeatedly without having to re-execute the previous ones.

query = """
    SELECT m.title AS name
    FROM movies m
    WHERE m.vote_count > 20000 AND m.vote_average > 7 AND m.release_date < '2015-01-01'
    ORDER BY m.popularity DESC
"""

result = client.sql.execute(query).result()
for row in result:
    # print the selected column of each row
    print(row["name"])

In this tutorial, we explained how to use the Python client in a Jupyter Notebook. The notebook demonstrates the SQL support of the Hazelcast Python client. Hazelcast provides in-depth SQL support for distributed Map data structures kept in Hazelcast clusters. Using Hazelcast SQL support, you can create mappings between your data and a database table and execute SQL queries on the Map. We used Hazelcast Viridian Serverless as our cluster provider to simplify setup and installation. Don’t hesitate to share your experience in our community Slack or GitHub repository.

Finally, we would like to acknowledge our Software Engineer Intern, Mehmet Tokgöz, for his input on this project.

Notebook link

For local file version: https://github.com/mehmettokgoz/hazelcast-python-sql-notebook

Google Colaboratory environment: https://colab.research.google.com/drive/1ujUt_XJI2moWSWMcF5_MPiWPg4LCJuot?usp=sharing


Database transactions often underpin online business transactions. If we compare business transactions to race cars, database transactions are the brakes. Just as the fastest driver is the one who uses the brakes the least, the quickest business transactions are the ones that depend on database transactions the least.

And, of course, you can’t eliminate brakes, as you still need to worry about safety. So, the goal is to use the right amount of braking (or, in our case, the right amount of database transactions).

This blog post will examine how your application can be faster without sacrificing safety.

Action and transaction

We will follow a familiar scenario, moving money between bank accounts.

The “Action” is the business event, the logical view.

Move $10 from account A to account Z.

The “Transaction” is the technical event, the implementation. This may include commit or rollback, be ACID or BASE, and is a way to ensure correctness.

The coding choice

The outcome of the action is that account A has $10 less, and account Z has $10 more.

A transaction is required if two conditions hold.

  1. It is implemented as a two-step operation.
  2. You wish it to appear as a one-step operation. (Atomicity)

Condition 1 is just the choice of book-keeping method.

Condition 2 is usually described as a requirement, but it may just be a wish.

Let’s review the alternatives.

Single-entry and double-entry accounting

As background, consider the classic accounting systems.

Single-entry dates to 3000BC; double-entry is newer, from 1494AD.

The single-entry system might have this list of actions for $50 deposits and the $10 transfer from above.

Item 1 : Account A : $50 deposit
Item 2 : Account Z : $50 deposit
Item 3 : Account A : $10 transfer : to : Account Z

In the double-entry system, actions are recorded twice. Each account has its list of actions.

Account A:
Item 1 : $50 deposit
Item 2 : $10 transfer : to : Account Z


Account Z:
Item 1 : $50 deposit
Item 2 : $10 transfer : from : Account A

Double-entry is easier for humans as the volume of actions increases. Computer systems typically mirror business processes, and the implementation where one action requires two data updates naturally follows.

Implementation 1 – double-entry with transactions

Following from above, what is wrong with the classic approach, the ACID transaction?

Here it would be:

Start transaction.
Decrement account A balance by $10.
Increment account Z balance by $10.
Commit transaction.

It’s an all-or-nothing approach and is very appealing.

Both accounts show their previous balance; then both show the new balance.

In the rare event of some IT crash, the transaction is rolled back.

So let’s review two things that are wrong with this approach.


Speed

Update access to account A and account Z needs to be suspended for everyone else for the duration of updating both data records.

If you were to code this yourself, you’d use locks. Locks stop other processing, hence impact on application speed.

Nothing else may be trying to update accounts A or Z at this time, so you might think nothing is delayed. But there is still the time cost to lock and unlock.

A transaction is essentially just locking, handled for you.

Correctness & Isolation

Imagine while we ran the above transaction that someone else ran a query summing account balances, a scan of all account records.

First, you might start your transaction.

Then the query might obtain the balance of account A. Your transaction is incomplete, so the query gets the old value for account A ($50).

Then your transaction completes successfully.

Then the query might obtain the balance of account Z. Your transaction is complete, so the query gets the new value for account Z ($60).

So the query returns $110 instead of $100, even though your update was transactional.
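The interleaving above can be replayed deterministically in a few lines. This is a plain Go sketch of the scenario, not Hazelcast code:

```go
package main

import "fmt"

// anomalousScan replays the interleaving described above: the scan
// reads account A before the transfer commits and account Z after
// it, so the two values it sums never coexisted.
func anomalousScan() int {
	accounts := map[string]int{"A": 50, "Z": 50}

	// The scan reads account A first: old value, $50.
	readA := accounts["A"]

	// The transfer transaction completes in between.
	accounts["A"] -= 10
	accounts["Z"] += 10

	// The scan then reads account Z: new value, $60.
	readZ := accounts["Z"]

	return readA + readZ
}

func main() {
	fmt.Println(anomalousScan()) // 110, not the true total of 100
}
```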

Here the transaction has “write” isolation: both writes happened atomically, so the transaction itself is correct. But a concurrent read from elsewhere can still see an inconsistent state. “Read” isolation would block the concurrent read while the transaction runs, meaning the data could only have one user at a time, which is unacceptable.

There are many other such scenarios exposing logical flaws in transactions.

Implementation 2 – double-entry without transactions

Imagine we removed the transaction wrapper from the above, so two independent updates to accounts for the one action.

How do we win? What do we lose?

We win obviously on speed. There are now no locks needed.

Concurrent queries are no better or worse than before. The query may still return $100 or $110 as a race condition.

What we think we have lost is guaranteed consistency, but have we?

Guaranteed Consistency

The worry in the above scenario is that account A is updated, then the system fails, so account Z isn’t updated.

A transaction stops this, despite its other problems, as already noted.

What a transaction guarantees is immediate consistency.

Eventual consistency will frequently be acceptable.

We would expect failures to be rare. We would expect to know about them promptly. So on any such failure, we just run a one-off process to complete any half-done business action.

Implementation 3 – single-entry

The third approach would be to implement single-entry accounting, as machines can handle this at scale even if humans can’t.

Reviewing what we saw before, this is just an event journal!

Item 1 : Account A : $50 deposit
Item 2 : Account Z : $50 deposit
Item 3 : Account A : $10 transfer : to : Account Z

Each line item is a single line, either written or not.

We have consistency and no need for locks.

If we wish to know the current balance of account A, it’s just a query against the event journal. But if there is a lot of data, this query may take enough time to run that it is noticeable to the human eye. So we might instead go with a materialized view.
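A balance query over an event journal can be sketched as follows. The Event type and the sample data are illustrative, not Hazelcast APIs:

```go
package main

import "fmt"

// Event is one single-entry journal line: a deposit when To is
// empty, or a transfer from Account to To otherwise.
type Event struct {
	Seq     int
	Account string
	Amount  int
	To      string
}

// balance replays the whole journal to compute an account's balance.
func balance(journal []Event, account string) int {
	total := 0
	for _, e := range journal {
		switch {
		case e.Account == account && e.To == "":
			total += e.Amount // deposit
		case e.Account == account && e.To != "":
			total -= e.Amount // outgoing transfer
		case e.To == account:
			total += e.Amount // incoming transfer
		}
	}
	return total
}

func main() {
	// The journal from the article: two $50 deposits, one $10 transfer.
	journal := []Event{
		{1, "A", 50, ""},
		{2, "Z", 50, ""},
		{3, "A", 10, "Z"},
	}
	fmt.Println(balance(journal, "A")) // 40
	fmt.Println(balance(journal, "Z")) // 60
}
```

The cost of this replay grows with the journal, which is exactly why a materialized view becomes attractive.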

What is stored?

To review, what is stored with the different approaches?

We will store records for each account and all their transactions.

For double-entry, the account holds the current balance.

For single-entry, the account may not hold the current balance; we might calculate it when needed from the transactions.

Or, for single-entry, we might refresh the account with a balance using a materialized view.

Materializing a view

For our single entry, we might refresh the balance continuously or periodically.

A stream processing job could observe the event journal. When a new action is written, the affected accounts can be updated.

Or, a scheduled task could scan the event journal to do a similar thing.

For either mechanism, we need to consider failure. If we replay events, we need to know whether or not to apply them. Events are sequential, so this is as simple as recording the last sequence number that the balance relates to.
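Recording the last applied sequence number might look like this. A minimal sketch under assumed types, not the actual stream-processing job:

```go
package main

import "fmt"

// Balance is a materialized view entry: the current amount plus the
// sequence number of the last event applied to it.
type Balance struct {
	Amount  int
	LastSeq int
}

// Event is a journal line with a monotonically increasing sequence.
type Event struct {
	Seq     int
	Account string
	Delta   int
}

// apply updates the view, skipping events that were already applied,
// which makes replay after a failure safe (idempotent).
func apply(balances map[string]*Balance, e Event) {
	b, ok := balances[e.Account]
	if !ok {
		b = &Balance{}
		balances[e.Account] = b
	}
	if e.Seq <= b.LastSeq {
		return // already applied, e.g. during a replay
	}
	b.Amount += e.Delta
	b.LastSeq = e.Seq
}

func main() {
	balances := map[string]*Balance{}
	events := []Event{{1, "A", 50}, {2, "A", -10}}
	for _, e := range events {
		apply(balances, e)
	}
	// Replaying the same events after a failure changes nothing.
	for _, e := range events {
		apply(balances, e)
	}
	fmt.Println(balances["A"].Amount) // 40
}
```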

Consistency once again

In many of the approaches above, when the action has been applied, the balances for account A and account Z do not update at precisely the same time. One is soon after the other.

From a bank customer’s perspective, this is fine.

It wouldn’t be unknown for account A and Z owners to know each other since one sends money to the other. Account A’s owner would see the cash remaining. If account Z’s owner doesn’t see it arrive immediately but does see the funds come pretty soon, that’s ok. More than a few minutes or even seconds would be poor by modern standards.

Reconciliation is the safety net. A process (or processes) applies the actions to the accounts; if it fails, we can rerun it. But it’s only software, so there should always be some distrust. A diligent bank will have cross-checks running anyway to ensure everything has been applied, at least by the end of the working day.


Transactions slow down processing. Transactions do not ensure correctness in the broader sense. They run correctly but do not guarantee that others do not see inconsistent data.

If you choose a single-entry approach, you don’t need transactions. If you choose a double-entry approach, you might or might not need transactions. It depends on the consistency model you can agree upon with the business user.

Hazelcast Platform 5.1 contains two querying APIs:

  • Predicate API
  • SQL engine

The Predicate API is an older, Java-based API. Even though it contains SqlPredicate, which allows an SQL-like syntax for the WHERE clause, that syntax is non-standard; for example, NULL handling doesn’t support the (in)famous ternary logic. It also fetches the results in one batch, which limits the supported result size.

The SQL engine, on the other hand, is more modern: it uses standard SQL, has a cost-based optimizer, and is available from all client languages. It also supports the JOIN, ORDER BY, GROUP BY, and UNION operators, which have no equivalent in the Predicate API. It streams the results to the client, so the result size isn’t limited (though this is also a restriction, because it’s not possible to restart the query if it fails midway).

In the next major release, we plan to deprecate the Predicate API. For this, we need feature parity and to match the Predicate API’s performance. Performance is the focus of this blog post: we’ll describe our journey in benchmarking and fixing some performance issues we had.

Benchmark details

Here is a list of benchmarks that we did for that comparison:

  1. SQL: SELECT __key, this FROM iMap
     Predicate API: map.entrySet()
  2. SQL: SELECT count(*) FROM iMap (or SELECT count(__key) FROM iMap)
     Predicate API: map.aggregate(Aggregators.count())
  3. SQL: SELECT sum("value") FROM iMap
     Predicate API: map.aggregate(Aggregators.longSum("value"))
  4. SQL: SELECT sum(JSON_VALUE(this, '$.value' RETURNING INTEGER)) FROM iMap
     Predicate API: map.aggregate(Aggregators.longSum("value"))
  5. SQL: SELECT __key, this FROM iMap WHERE "value" = ?
     Predicate API: map.entrySet(Predicates.equal("value", valueMatch))

The fifth benchmark was run both with and without an index on the value field. For benchmark number three, we used three different serialization methods:

  • IdentifiedDataSerializable
  • Portable
  • HazelcastJsonValue with json-flat SQL mapping

For the rest of the benchmarks, we used IdentifiedDataSerializable. If you are not familiar with our serialization options, check this page.

We didn’t benchmark joins and streaming queries since they cannot be executed with Predicate API. These functionalities are available in the SQL engine only.

Testing environment

All the benchmarks were run in throughput and latency modes since we cared about both. In this post, I will cover only the latency part of those benchmarks.

The first results showed that SQL was slower in most of the benchmarks. The next step was to rerun those benchmarks with a profiler attached. I used async-profiler, and since I was tracing latency issues, I chose wall-clock mode. You can read about the different modes in my previous post.

All of those benchmarks were run in our testing lab, so we knew that the results were valid (no noisy-neighbor issues). The test setup was:

  • Hazelcast cluster size: 4
  • Machines with Intel Xeon CPU E5-2687W
  • Heap size: 10 GB
  • JDK17

How does Hazelcast SQL work?

Each cluster member contains two significant modules: IMDG and Jet. IMDG is the module where we store data; Jet is a distributed batch and stream processing engine. Additionally, we use Apache Calcite for parsing SQL queries.

When a client sends a query to a cluster, that query is just a string with parameters. The query is sent to a random member of the cluster. That member is called a coordinator of that query. The coordinator needs to:

  • Create a query plan
  • Convert the query plan to the JET job
  • Distribute the JET job to all the members

After that is done, all the members execute the query, and the coordinator streams the results to the client.

Flame graphs

If you do sampling profiling, you need to visualize the results, which are nothing more than a set of stack traces. My favorite visualization is a flame graph. The easiest way to understand flame graphs is to learn how they are created.

The first part is to draw a rectangle for each frame of each stack trace. The stack traces are drawn bottom-up and sorted alphabetically. For example, this graph:

corresponds to the following set of stack traces:

  • 3 samples – a() -> h()
  • 5 samples – b() -> d() -> e() -> f()
  • 2 samples – b() -> d() -> e() -> g()
  • 2 samples – b() -> d()
  • 2 samples – c()

The next step is joining rectangles with the same method name into one bar:

The flame graph usually shows you how your application utilizes some resource. The resource is used by the top methods of that graph (visualized with a green bar):

So in this example, method b() is not utilizing the resource at all; it just invokes methods that do. Flame graphs are commonly used to present CPU utilization, but the CPU is just one of the resources we can visualize this way. If you use wall-clock mode, your resource is time. If you use allocation mode, your resource is the heap.
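The aggregation that turns raw samples into bars can be sketched in a few lines of Java. This is an illustrative sketch only, using the hypothetical frame names and sample counts from the example above:

```java
import java.util.Map;
import java.util.TreeMap;

public class FlameGraphDemo {

    // Width of each bar = total samples whose stack passes through that
    // frame path. Merging same-named frames that share a parent path is
    // exactly the "joining rectangles into one bar" step.
    static Map<String, Integer> aggregate(int[] counts, String[][] stacks) {
        Map<String, Integer> barWidth = new TreeMap<>();
        for (int i = 0; i < counts.length; i++) {
            StringBuilder path = new StringBuilder();
            for (String frame : stacks[i]) {
                path.append('/').append(frame);
                barWidth.merge(path.toString(), counts[i], Integer::sum);
            }
        }
        return barWidth;
    }

    public static void main(String[] args) {
        // The sample set from the example above (stacks are bottom-up).
        int[] counts = {3, 5, 2, 2, 2};
        String[][] stacks = {
                {"a", "h"},
                {"b", "d", "e", "f"},
                {"b", "d", "e", "g"},
                {"b", "d"},
                {"c"},
        };
        // b() gets a bar 9 samples wide, but for 7 of those 9 samples
        // the resource is really consumed higher up the stack.
        System.out.println(aggregate(counts, stacks));
    }
}
```

Real flame graph tools keep a trie of frames rather than string paths, but the widths they compute are exactly these per-path sample sums.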

Client part

Benchmark details:

  • Query: SELECT __key, this FROM IMap
  • Values in IMap have fields: String and int[20], serialized with IdentifiedDataSerializable
  • IMap size – 100_000 entries
  • Latency test – 24 queries per second

Let’s start with profiling the client side. Here is a wall-clock flame graph of the method that fetches the data on the client’s side:

Let’s look at the bottom of the flame graph: the method testSelect() iterates over the results and, for each row, executes three things:

  • Method accept() – that executes the deserialization of the row – each SQL row travels to the client through the network as a byte array, so we need to deserialize it to the object format
  • Method ClientIterator.hasNext() – that one returns the next row if it is already fetched or waits to receive the next portion of the result set. In the flame graph, we see the waiting part. At this point the execution is done in the cluster – nothing to improve on the client’s side.
  • Some third method, let’s zoom in on that part of the flame graph

Here we can see that the method getColumnValueForClient() (3rd from the top) executes LinkedList.get(), which consumes a lot of time. Let’s go to the source code:

columns is a field of type List<List<?>>, and the problematic method executes columns.get(columnIndex).get(rowIndex). LinkedList.get(index) is known to perform badly, since it has O(n) complexity. The instance is not created here; it is passed in the constructor, so we need to debug the application to find the origin of that LinkedList. Two breakpoints later, I found the place where it was created. One solution is to switch that List to an ArrayList like this:
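The shape of the problem and the fix can be sketched stand-alone. The names below are illustrative, not Hazelcast’s actual classes:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ColumnStoreDemo {

    // Illustrative stand-in for the SQL row storage: a list of columns,
    // each column being a list of values. Hazelcast's real classes differ.
    static Object getColumnValue(List<List<?>> columns, int columnIndex, int rowIndex) {
        // With a LinkedList the inner get() walks nodes one by one: O(rowIndex).
        // With an ArrayList it is a plain array access: O(1).
        return columns.get(columnIndex).get(rowIndex);
    }

    public static void main(String[] args) {
        List<String> slowColumn = new LinkedList<>(List.of("a", "b", "c"));
        // The fix: build (or copy) the column as an ArrayList instead.
        List<String> fastColumn = new ArrayList<>(slowColumn);

        List<List<?>> columns = List.of(fastColumn);
        System.out.println(getColumnValue(columns, 0, 2)); // prints "c"
    }
}
```

For a result set with r rows, the LinkedList version makes row access O(r) each, so iterating the whole set costs O(r²) in list traversal alone.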

But it is not always that easy. The code that created that list was written many years ago and is used by more than the SQL engine, so we needed to be careful. I asked colleagues from the other team whether we could make the change; they ran their own benchmarks and agreed to it:

How did that affect our performance? Well, it depends on the number of columns and whether we do deserialization on the client’s side (it is done lazily). Here is the latency distribution for two columns and no deserialization:

Was it a mistake to choose LinkedList in that code? No. It was a reasonable decision given the state of Hazelcast at the time that code was written.

The thing to remember is that performance is a living organism. It evolves with new features. What was efficient yesterday could be a bottleneck tomorrow.

PR: https://github.com/hazelcast/hazelcast/pull/20398

Cluster side – sum()

Benchmark details:

  • Query: SELECT sum("value") FROM IMap
  • Values in IMap have two fields: one Long and one int[20], serialized with IdentifiedDataSerializable
  • IMap size – 1_000_000 entries
  • Latency test – 6 queries per second

In this benchmark, we need to deserialize all the instances in the IMap, get a single field from each deserialized object, and aggregate it. The majority of the work is deserialization. It was strange to me that the Predicate API performed differently from SQL.

Here is a flame graph for Predicate API:

And here is one for SQL:

Now let’s do something a bit silly. Open both graphs in separate browser tabs (right-click on the graph -> open image in a new tab) and then switch between those two tabs quickly, trying to spot a difference. Did you spot it? It is at the top. The SQL graph has a lot of yellow/red code there; the Predicate API graph does not. Let’s see what it is about:

The frames that appear only in SQL are Class.forName() invocations. We already had a class-definition cache in Hazelcast, and it was strange that it wasn’t used for SQL. We can highlight all invocations of that method to see how much time it consumes:

So class loading consumed ~25% of the time spent deserializing Java objects. Let’s look at the code of our class loading:

We have four layers of caches here; each tryLoadClass() checks whether the definition is in its cache and, if not, loads it. If none of the classloaders loaded the class, we called Class.forName().

The difference between Predicate API and SQL was that the contextClassLoader was null in SQL since you cannot use user classes with SQL queries. We decided to add another level of cache:
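A minimal sketch of such a class-definition cache, assuming a ConcurrentHashMap in front of Class.forName(); the real Hazelcast cache lives in its serialization service and has the extra classloader layers described above:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ClassCacheDemo {

    // Illustrative sketch only: one extra cache level keyed by class name.
    static final Map<String, Class<?>> CLASS_CACHE = new ConcurrentHashMap<>();

    static Class<?> loadClass(String name) {
        // computeIfAbsent caches the result, so the expensive
        // Class.forName() lookup runs only once per class name.
        return CLASS_CACHE.computeIfAbsent(name, n -> {
            try {
                return Class.forName(n);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        });
    }

    public static void main(String[] args) {
        Class<?> first = loadClass("java.lang.String");
        Class<?> second = loadClass("java.lang.String");
        System.out.println(first == second); // prints true: cached instance
    }
}
```

A production version would also need to key the cache per classloader and handle negative lookups; this sketch only shows why a cache hit removes the Class.forName() frames from the flame graph.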

How did that improve the performance?

The thing to remember is that “If it looks stupid but works, it isn’t stupid” – that also applies to performance analysis. Comparing two flame graphs by quickly switching browser tabs looks silly, but hey, it works like a charm for spotting differences between two graphs that should otherwise look the same.

PR: https://github.com/hazelcast/hazelcast/pull/20459

Cluster side – sum() with JSON

Benchmark details:

  • Query: SELECT sum(json_value(this, '$.value' returning integer)) FROM cache
  • Values in IMap have two fields: Long and int[20], serialized as JSON
  • IMap size – 1_000_000 entries
  • Latency test – 5 queries per second

Let’s go straight to the flame graph of evaluation of a value from JSON:

The big left part of that graph is acquiring a lock; it consumes ~70% of the time of evaluating the value. That part of the code uses a Guava cache, which uses a ReentrantLock internally. The cache maps a string JSON path to our object representing that path. The usage pattern for that cache is typically a single insert followed by many reads. For such a pattern, a ReentrantLock is not the best choice; a ConcurrentHashMap, for example, is better. A simple switch from the Guava cache to CHM produced this improvement:
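The read-mostly pattern that favors ConcurrentHashMap can be sketched as follows; JsonPath here is a hypothetical stand-in for our parsed-path object, not the real Hazelcast class:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class JsonPathCacheDemo {

    // Hypothetical stand-in for the parsed JSON path object.
    record JsonPath(String raw) { }

    static final Map<String, JsonPath> CACHE = new ConcurrentHashMap<>();

    static JsonPath parse(String path) {
        // Single-insert, many-reads pattern: get() is lock-free, so after
        // the one insert every query thread takes the fast path. The
        // lock-based cache showed up in the profile as readers contending
        // on a ReentrantLock instead.
        JsonPath cached = CACHE.get(path);
        if (cached != null) {
            return cached;
        }
        return CACHE.computeIfAbsent(path, JsonPath::new);
    }

    public static void main(String[] args) {
        JsonPath p1 = parse("$.value");
        JsonPath p2 = parse("$.value");
        System.out.println(p1 == p2); // prints true: single cached instance
    }
}
```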

In the end, we decided to do two implementations of a cache, one for a single JSON path based on a field, and the second for the rest of the cases.

The thing to remember is that the performance bottleneck may lie in third-party libraries/frameworks. That doesn’t mean their code is terrible; usually we just don’t know the trade-offs made there, and that can hurt us.

PR: https://github.com/hazelcast/hazelcast/pull/20655

Cluster side – scan for a single value by index

Benchmark details:

  • Query: SELECT __key, this FROM iMap where col=...; // Index on col
  • Values in IMap have two fields: String and int[20], serialized with IdentifiedDataSerializable
  • IMap size – 10_000_000 entries
  • Latency test – 7500 queries per second

This benchmark tests the distributed deployment overhead of a job on a busy cluster. The part after deployment is easy; we just need to look up the index and fetch a single row. Unfortunately, in such queries we will never beat the Predicate API, since the SQL execution plan is much bigger than the Predicate API’s, and its serialization and deployment take more time. What we can do is make that part of SQL as fast as possible.

When I first ran that benchmark, the mean latency was around 4.5 ms on a stressed cluster. To fight that kind of latency, we need to look at all the resources that the code needs to run. Let’s look at heap allocation:

Over 90% of the recorded allocation (during the creation of the plan) was done in the constructor of SenderTasklet. That class is responsible for sending computation results to other nodes in the cluster. It created a 32k-byte array as a buffer of data to send. The value 32k was ok for a task manipulating multiple entries, but it was a waste of heap memory for a job that processes only one row.

We went with the solution to have a buffer with two initial sizes:

  • Small initial size: 1k
  • If that buffer is too small, it expands to 32k immediately

That approach barely hurts larger tasks: they waste a 1k array at the beginning, but allocating such an array is much cheaper than allocating a 32k one, since 32k arrays are often allocated outside the TLAB.
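A minimal sketch of that two-size strategy, using the 1k/32k sizes from the post; the real SenderTasklet buffer code differs:

```java
public class AdaptiveBufferDemo {

    // Start small; on first overflow jump straight to the full size.
    static final int SMALL = 1024;
    static final int LARGE = 32 * 1024;

    private byte[] buffer = new byte[SMALL];
    private int position;

    void write(byte[] data) {
        if (position + data.length > buffer.length && buffer.length < LARGE) {
            // Expand once, directly to 32k; single-row jobs never get here.
            byte[] grown = new byte[LARGE];
            System.arraycopy(buffer, 0, grown, 0, position);
            buffer = grown;
        }
        System.arraycopy(data, 0, buffer, position, data.length);
        position += data.length;
    }

    int capacity() {
        return buffer.length;
    }

    public static void main(String[] args) {
        AdaptiveBufferDemo buf = new AdaptiveBufferDemo();
        buf.write(new byte[512]);
        System.out.println(buf.capacity()); // prints 1024
        buf.write(new byte[2048]);
        System.out.println(buf.capacity()); // prints 32768
    }
}
```

This sketch ignores payloads larger than 32k; the point is only that a job processing one row pays for a 1k array instead of a 32k one.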

That change lowered the allocation rate in that benchmark from 3.8 GB/s to 2.3 GB/s, but it was not the only change in that part of our engine. My colleague pointed out that in some cases we created a SenderTasklet that could never send any data through the network. Avoiding the creation of those unnecessary SenderTasklets lowered the heap allocation rate further, to 1.5 GB/s.

There were plenty of PRs for lowering that deployment overhead. The current status (mean) is:

We lowered the mean latency from 4.5ms to 1.8ms on a stressed cluster, and we still have ideas for improving it.

The thing to remember is that allocation on the heap is fast, but no allocation is faster.

PRs: https://github.com/hazelcast/hazelcast/pull/20882 and https://github.com/hazelcast/hazelcast/pull/20940


Our SQL engine in release 5.2 is now faster, with better throughput, than the Predicate API in most of the benchmarks. The only kind of query where the Predicate API is still faster is one that takes just milliseconds to execute. As I mentioned in the last example, we still have ideas for making the SQL engine faster, and we are going to implement them in future releases.

In this post, I focused on four different issues that can teach us something. Let me gather those four things to remember:

  • Performance is a living organism. It evolves with new features. What was efficient yesterday could be a bottleneck tomorrow.
  • “If it looks stupid but works, it isn’t stupid” – every flame graph analysis technique is good as long as it works for you
  • The performance bottleneck may be hidden in 3rd party libraries/frameworks
  • Allocation on the heap is fast, but no allocation is faster

One additional “lesson learned” is that Async-profiler is a handy tool for finding such high-level bottlenecks. But remember that there are issues where such a profiler won’t help: some performance bottlenecks can only be understood by analyzing assembly code or with Top-down Microarchitecture Analysis.

The Hazelcast .NET client provides essential features for scaling .NET applications when speed and performance are needed. We previously discussed how to get started with the Hazelcast .NET client. This blog post will demonstrate how to process real-time streams using the Hazelcast .NET client, SQL, and Kafka. For the demo and setup, we assume that you have already installed Hazelcast and are running it locally, along with the Kafka server and ZooKeeper.

The following diagram explains our demo setup: we have a Kafka topic called trades, which contains a collection of trades that will be ingested into a Hazelcast cluster. Additionally, a companies map represents company data stored in the Hazelcast cluster. We create a new map by aggregating trades and companies into the ingest_trades map.


Demo Source

You can access the demo source code at: https://github.com/zpqrtbnk/hazelcast-dotnetstockdemo

Demo Setup

The application demo has the following steps:

  • Define a Hazelcast map named companies, containing static data about companies (such as their ticker, name, market capitalization, etc.)
  • Define a Hazelcast mapping named trades, targeting a Kafka topic
  • Define a Hazelcast JET job that runs a SQL query joining trades and companies, feeding the result into a Hazelcast map named trades_map
  • Query trades_map periodically and update a web UI via SignalR

The trades Kafka mapping is a streaming mapping, which means that a SQL query over that mapping does not complete but instead returns new rows as they come. Therefore, the JET job keeps running and constantly inserts rows into the trades_map as rows are received from Kafka.

In real life, rows would be pushed to the Kafka topic by a totally independent service. However, for the sake of the demonstration, a background task pushes rows to Kafka periodically. This is the only reason for the demo to access Kafka: a typical application would only access Hazelcast, and Hazelcast itself does all the Kafka-related work.

From an application perspective, the demo is a background worker within a .NET 6 MVC application. The worker queries the trades_map periodically and pushes updates to a SignalR hub, which refreshes the UI in real-time. The only thing that is not true real-time is the trades_map query, as Hazelcast maps are batch SQL sources: a SQL query over a map returns the requested rows and completes; it does not run continuously as the query of the Kafka mapping does.

Packaging the Demo

Although the demo can run on the local host (via a simple dotnet run command), we decided to bundle it as a Docker image and run it as a container. For that purpose, a Dockerfile sits at the root of the demo directory. The demo image has been pushed to the public Docker repository. This is all achieved with the following commands (where zpqrtbnk is the Docker user name):

docker build -t zpqrtbnk/dotnetstockdemo -f Dockerfile .

docker tag zpqrtbnk/dotnetstockdemo:latest zpqrtbnk/dotnetstockdemo:1.0.0

docker login --username=zpqrtbnk

docker push zpqrtbnk/dotnetstockdemo:latest

docker logout

Running the Demo

The demo is entirely composed of Docker containers because that is the simplest way to run Kafka and Hazelcast. The demo application itself could run on a local machine but, again for the sake of simplicity, we run it in a Docker container too. The containers are operated via the docker-compose tool, which can create a complete set of containers from a single YAML description file; for this demo, that file is docker-compose.yml at the root of the demo directory.

Kafka Service

Kafka requires a pair of services to operate: the ZooKeeper service and the Kafka broker itself. Their configuration is entirely defined in docker-compose.yml and consists mainly of the networking setup. ZooKeeper runs on port 2181 within the Docker network (not exposed to the host). Kafka operates on port 29092 within the Docker network (not exposed to the host) and on port 9092, which can be exposed to the host. Note: when port 9092 is exposed to the host, the KAFKA_ADVERTISED_LISTENERS line in docker-compose.yml must be updated so that the EXTERNAL listener points to the actual IP of the host.

Hazelcast Service

Hazelcast runs as a single service on port 5701 within the Docker network (and can be exposed to the host). Its configuration is entirely defined in docker-compose.yml and consists of setting the Hazelcast cluster name via an environment variable.

Demo Service

The demo itself is one single service. Its configuration is entirely defined in docker-compose.yml and consists of:

  • Mapping port 7001 (the demo port) from host to service
  • Defining some environment variables which specify how to reach the Kafka and Hazelcast services within the Docker network

The demo service will connect to the Kafka and Hazelcast services within the Docker network and serve a web UI on port 7001.

Running the Demo

The complete demo can be started by running the following command on a Docker host within the directory containing the docker-compose.yml file.

docker-compose up -d

Then direct a browser to http://<host>:7001/, where <host> is the name of the host running the Docker containers.


Note that docker-compose supports various options such as stop and down to stop everything, ps to list running containers, etc. In some rare cases, the demo can hang due to timing issues when starting the containers. It is possible to fix this by restarting the demo container with docker-compose restart dotnetstockdemo. Note: this means that any user can reproduce the demo by simply fetching the docker-compose.yml file, adjusting settings in the file, and using docker-compose.

Running on AWS

The demo currently runs on AWS, using an EC2 Linux instance as a Docker host. This means that we connect to the instance via SSH, copy the docker-compose.yml file, and run the docker-compose commands there. There is a way to run docker-compose locally, target AWS ECS (Elastic Container Service), and even run the containers on AWS Fargate, i.e., without any EC2 instance. However, this requires a complex configuration on the AWS side first (in terms of permissions, networking, etc.), and we have not used it for the demo.

Next steps

If you are interested, you can change the Kafka source to any other source; even if a source is not implemented in Hazelcast, you can create your own. The same applies to Hazelcast sinks. Furthermore, we only demonstrated combining the Kafka stream with an IMap, but it is possible to run complex SQL queries or create advanced data pipelines.



This demo showed how you can connect your .NET application to Hazelcast using the .NET client. The application ingests trades from a Kafka topic into Hazelcast and aggregates the trades into a new IMap. The aggregation results are sent back to the .NET clients and displayed in the web browser. It is possible to change your input to any of the supported sources and your output to any of the supported sinks. If you found this post helpful, or if you have any feedback or questions about the setup or demo, you can reach out to us at [email protected]

Many Hazelcast customers keep copies of their data in multiple data centers. This strategy is commonly used for business continuity, should a data center suffer an outage. But there are also geographic reasons: the business may have geo-specific data needs (e.g., for users in different continents) and want to ensure consistent performance by storing the data as close to the customer as possible.

This blog post will look at how this is managed for applications working with the data.


The crucial part is that the data centers must be far apart. For geographic distribution this is pretty obvious, but less so for business continuity. Indeed, there would be several advantages to having them close together; for example, one team could look after both for mechanical tasks like plugging in new hardware. Ultimately, though, business continuity dictates, now or in the future, that they not be adjacent, because adjacent data centers are exposed to the same catastrophic events, such as floods.

Although the exact distance is unspecified in the likes of ISO 27001, a separation of more than 100 miles / 160 kilometers is a standard guideline.

Why not span data centers?

Clustered applications, such as Hazelcast, need fast communications to operate: under a millisecond from point to point, constantly. This is unlikely to be achievable over the longer distances mandated above. With two such data centers, there need to be two Hazelcast clusters (two copies of the data) rather than one large Hazelcast cluster spanning both.

Clients not embedded

The most helpful topology here is the client-server model. Applications are clients of the Hazelcast cluster: they connect to a cluster to load and save data, in the same style as connecting to a database. Should that first cluster go offline, they can connect to a different cluster. Users of that application, whether people or other applications, are unaware of the change in the data source.

The alternative topology is server-only (embedded). Here, applications run in the same process as Hazelcast, so if the process goes offline, the application goes offline along with Hazelcast. Users of that application, whether people or other applications, are impacted and need to be diverted.

These topologies may be mixed: streaming analytics runs in Hazelcast servers, while embedded is appropriate for a purely event-driven workload.

Red/black and blue/green

Red/black and blue/green color pairings are frequently used to describe cutover models.

In red/black, the cutover is total. All workload is sent to one cluster, with the other cluster sitting in reserve. Then, on cutover, all workload is diverted to the reserve cluster. Typically this is used for disaster recovery (DR).

In blue/green, the cutover is phased. Again, all workload is sent initially to one cluster. Then, some workload is diverted to the second cluster, but some remains on the first cluster. Typically this is done to validate the new cluster, for example, after a code release, before committing all workload to it.


Red/black is handled by failover configuration. Blue/green also needs access control.

Automatic failover for Disaster Recovery

For all cutover models, clients are configured with a list of Hazelcast clusters to use.

On start-up, a client will connect to the first cluster in the list.

If that cluster goes offline (red/black), the client will automatically reconnect to the next cluster in the list.

Equally, if the cluster rejects the client (blue/green), the client again will automatically connect to the next cluster in the list.
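In the Java client, such a cluster list is configured with a ClientFailoverConfig. The sketch below uses placeholder cluster names and addresses; note that the blue/green failover feature itself requires Hazelcast Enterprise:

```java
import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.client.config.ClientFailoverConfig;
import com.hazelcast.core.HazelcastInstance;

public class FailoverClientExample {
    public static void main(String[] args) {
        // Cluster names and addresses are placeholders for your deployment.
        ClientConfig primary = new ClientConfig();
        primary.setClusterName("primary-dc");
        primary.getNetworkConfig().addAddress("10.0.1.1:5701");

        ClientConfig secondary = new ClientConfig();
        secondary.setClusterName("secondary-dc");
        secondary.getNetworkConfig().addAddress("10.1.1.1:5701");

        ClientFailoverConfig failoverConfig = new ClientFailoverConfig();
        failoverConfig.setTryCount(3);           // passes over the cluster list
        failoverConfig.addClientConfig(primary);   // first cluster to try
        failoverConfig.addClientConfig(secondary); // next cluster in the list

        // Connects to primary-dc; reconnects to secondary-dc if the first
        // cluster goes offline or rejects the client.
        HazelcastInstance client = HazelcastClient.newHazelcastFailoverClient(failoverConfig);
    }
}
```

This is a configuration fragment: running it requires the Hazelcast client library on the classpath and reachable clusters at the configured addresses.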

Diverting clients

Blue/green access control forces selected clients to be disconnected from a cluster.

The Management Center or REST API allows you to specify access control lists for each cluster.

Clients can be given a name, one or more labels, and be identified by their IP address. These can be used as a selector to shunt a client from a cluster.

Management Center showing connected clients

Here we might use a specific label “Szyslak” to select a client to move.


It would be wise to assume your data center or your cloud provider’s data center may be impacted by a catastrophic event.

If you configure your applications with the locations of two clusters, they will use the first but divert automatically to the second when disaster strikes.

Or you can instruct some or all applications to reconnect from the first to the second cluster if you wish it to happen on demand.

Logging is a key tool for the application developer, with many considerations such as what detail to log, logging level, etc. Here we’ll consider where to log, and use Hazelcast as a log destination for SLF4J, the standard logging framework for Java. Specifically, we’ll use an IMap as our log store and keep it in the cloud. An IMap is easier to work with than logging to the filesystem: there may be many host machines involved in your application, and log collation is solved by hosting the logs in Hazelcast.

Why an IMap ?

An IMap is essentially a random-access key-value store. Key hashing means that consecutive keys are probably not stored adjacent to each other, which allows easy rebalancing as data grows. What matters is how we query the log store. Most likely we will be doing range scans, for example all the logs for the last few minutes. This requires a sort, since the data is not stored sorted. So, by using an IMap we gain storage scalability at the expense of query speed, which seems a reasonable trade-off.

The log structure

For querying, we define the IMap content as:

        String mapping = "CREATE OR REPLACE MAPPING \"" + logMap.getName() + "\""
                + " ("
                + "    \"socketAddress\" VARCHAR EXTERNAL NAME \"__key.socketAddress\","
                + "    \"timestamp\" BIGINT EXTERNAL NAME \"__key.timestamp\","
                + "    \"level\" VARCHAR EXTERNAL NAME \"this.level\","
                + "    \"message\" VARCHAR EXTERNAL NAME \"this.message\","
                + "    \"threadName\" VARCHAR EXTERNAL NAME \"this.threadName\","
                + "    \"loggerName\" VARCHAR EXTERNAL NAME \"this.loggerName\""
                + " )"
                + " TYPE IMap "
                + " OPTIONS ( "
                + " 'keyFormat' = 'json-flat',"
                + " 'valueFormat' = 'json-flat'"
                + " )";

The key is JSON composed of the IP address of the process that writes to the log, plus a timestamp. The value is also JSON: the more familiar logging fields of log level, message, and so on. This blog post just demonstrates the concept. For real use, we would need to extend it to distinguish between multiple processes on the same host, and perhaps use a JSON array to capture multiple messages produced with the same timestamp.
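With the mapping above in place, collated logs can be range-scanned with plain SQL from any client. The query below is an illustrative sketch using the Hazelcast SQL API; it assumes a connected hazelcastClient and the "log" mapping defined earlier:

```java
import com.hazelcast.sql.SqlResult;
import com.hazelcast.sql.SqlRow;

// Fetch the last five minutes of WARN/ERROR logs, newest first.
long fiveMinutesAgo = System.currentTimeMillis() - 5 * 60 * 1000;
String query = "SELECT \"timestamp\", \"level\", \"message\""
        + " FROM \"log\""
        + " WHERE \"timestamp\" > ? AND \"level\" IN ('WARN', 'ERROR')"
        + " ORDER BY \"timestamp\" DESC";

try (SqlResult result = hazelcastClient.getSql().execute(query, fiveMinutesAgo)) {
    for (SqlRow row : result) {
        System.out.println(row.getObject("level") + " " + row.getObject("message"));
    }
}
```

As noted above, the ORDER BY forces a sort because the IMap does not store entries in key order; that is the query-speed side of the trade-off.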

Show Me the Code

Our application code does this:

       HazelcastInstance hazelcastClient = HazelcastClient.newHazelcastClient(clientConfig);
       org.slf4j.Logger logger = IMapLoggerFactory.getLogger(Application.class);
       logger.info("hello world");

We create a connection to a Hazelcast cluster using some client configuration. This may be to Hazelcast Viridian or an already active cluster. Then we obtain an SLF4J Logger that uses Hazelcast as a destination, log a message, and shut down. Crucially, the rest of the application need not have anything to do with Hazelcast. What we’re using the Hazelcast client for here is to connect to the logging store (the Hazelcast cluster), so at a minimum it’s a write-only client.

Show Me the Code in Detail

Two classes do the work.


IMapLoggerFactory returns IMapLogger instances, implementing the ILoggerFactory interface. It looks like:

public class IMapLoggerFactory implements ILoggerFactory {

    private static IMap logMap;
    private static String socketAddress;
    private static Level level = Level.INFO;

    public static synchronized Logger getLogger(Class klass) {
        if (logMap == null) {
            Iterator<HazelcastInstance> serverIterator = Hazelcast.getAllHazelcastInstances().iterator();
            Iterator<HazelcastInstance> clientIterator = HazelcastClient.getAllHazelcastClients().iterator();

            HazelcastInstance hazelcastInstance = serverIterator.hasNext() ? serverIterator.next() : clientIterator.next();

            socketAddress = hazelcastInstance.getLocalEndpoint().getSocketAddress().toString();
            logMap = hazelcastInstance.getMap("log");
        }
        return new IMapLogger(klass.getName(), logMap, socketAddress, level);
    }

    public static void setLevel(Level arg0) {
        level = arg0;
    }

    @Override
    public Logger getLogger(String name) {
        return LoggerFactory.getLogger(name);
    }
}

Most of the work is in finding the current Hazelcast instance.

We use Hazelcast.getAllHazelcastInstances() to find all server instances in this JVM, and HazelcastClient.getAllHazelcastClients() to find all clients.

We assume this JVM has exactly one Hazelcast instance. If this isn’t going to be true, you’d need to adjust the implementation.

From that Hazelcast instance, we create a logger passing it the IMap to log to.


IMapLogger implements the Logger interface, which has more than 60 methods.

public class IMapLogger implements Logger {

    private final String name;
    private final IMap logMap;
    private final String socketAddress;
    private final Level level;

    public IMapLogger(String arg0, IMap arg1, String arg2, Level arg3) {
        this.name = arg0;
        this.logMap = arg1;
        this.socketAddress = arg2;
        this.level = arg3;
    }

    private void saveToHazelcast(Level lvl, String msg) {
        StringBuffer keyStringBuffer = new StringBuffer();
        keyStringBuffer.append("{ \"socketAddress\" : \"" + this.socketAddress + "\"");
        keyStringBuffer.append(", \"timestamp\" : " + System.currentTimeMillis());
        keyStringBuffer.append(" }");

        StringBuffer valueStringBuffer = new StringBuffer();
        valueStringBuffer.append("{ \"loggerName\" : \"" + this.name + "\"");
        valueStringBuffer.append(", \"threadName\" : \""
                + Thread.currentThread().getName() + "\"");
        valueStringBuffer.append(", \"level\" : \"" + lvl + "\"");
        valueStringBuffer.append(", \"message\" : \"" + msg + "\"");
        valueStringBuffer.append(" }");

        this.logMap.set(new HazelcastJsonValue(keyStringBuffer.toString()),
                new HazelcastJsonValue(valueStringBuffer.toString()));
    }
The main method added is saveToHazelcast. This turns the logging detail into a JSON key and a JSON value to insert into the logging IMap.

For “INFO” logging, we have this:

    public boolean isInfoEnabled() {
        return this.level.toInt() <= Level.INFO.toInt();
    }

    public void info(String msg) {
        if (this.isInfoEnabled()) {
            this.saveToHazelcast(Level.INFO, msg);
        }
    }

So when we do logger.info("hello world"), it invokes our method if INFO logging is on. The methods for WARN, DEBUG, etc. are omitted for brevity; they follow a similar pattern.


From the Management Center or a client, we can query our logs.

SQL query from Management Center


So there you have it: logs from anywhere can be saved to the cloud for later inspection.

If you’re interested in a fuller example, see here.

To avoid running out of space, you might want to configure the IMap for eviction and/or expiry.


You should absolutely consider security here. Confidential data may appear in logs, and you don’t want accidental or malicious connections.

We are proud to announce the beta launch of Hazelcast Viridian Serverless, our fully managed, scale-on-demand, real-time data platform, now available for beta registration.

In our increasingly interconnected world today, organizations and developers are migrating their workloads onto the cloud to take advantage of cloud services that are more scalable, more available, and more agile than ever before. Hazelcast enables these organizations to immediately act on their real-time data in the cloud to respond faster to customer needs and capture untapped business value. With Hazelcast Viridian Serverless, we want to take the next evolutionary leap in granting developers unfettered access to real-time data capabilities without the operational complexity.

Serverless Architecture

How do we make real-time data platforms simpler to operate? More scalable? More agile? One approach we can take is serverless architecture, a way of building applications without needing to think about the underlying infrastructure i.e. servers. In Hazelcast Viridian Serverless, you just select your cloud provider region, create your serverless cluster, and start building your application. You don’t need to manage servers, virtual machines, or containers. Of course, there are still servers working behind the scenes to run your application, but we manage them for you instead. That means we ensure the service is always available and the infrastructure properly secured, patched, and upgraded. 

We will provision and dynamically scale the cluster based on your workload, eliminating the need for up-front sizing planning, and you only pay for what you use instead of paying for entire servers. The more storage you use, the more we automatically allocate, and – again – you only pay for what you’ve used. Conversely, the less you use, the less you pay. 

But wait, there’s more! Hazelcast is providing users with a limited-time offer of up to 2 Gibibytes (GiB) of data storage to celebrate the beta launch. 

By eliminating the need to manage infrastructure and reducing operational complexity, you can go from sign-up to deployment of a fully functional cluster in seconds. 

Built By Developers, For Developers

Hazelcast Viridian Serverless is designed with developers in mind. With no infrastructure to set up, you can rapidly iterate, experiment with your code, and build solutions for your business. For even faster development, you can create a development cluster with only essential features enabled instead of the standard production cluster, allowing you to speed through a cluster start/stop cycle in a few seconds. Additionally, Hazelcast Viridian Serverless has a built-in developer learning center with information, videos, and tutorials on topics such as connecting to other cloud data sources and building sample solutions like a write-through cache.

Because the Hazelcast Platform powers the Hazelcast Viridian cloud portfolio, developers have access to many features, including:

  • Familiar, declarative API for building applications that leverage real-time streaming data
  • Support for streaming SQL to enable a large base of developers to run queries on real-time data 
  • Out-of-the-box connectors to multiple cloud data sources and an API for building custom connectors to any other data source
  • Seamless integration with cloud deployments of popular data technologies, such as Apache Kafka
  • Robust WAN replication capabilities that enable data integration across cloud deployments to support geo-distributed systems and disaster recovery strategies
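To give a taste of the streaming SQL support listed above, a query over a live Kafka topic might look like the following. The topic name, columns, broker address, and format options here are illustrative; the exact mapping depends on your data format:

```sql
-- Map a Kafka topic into SQL (hypothetical topic and columns):
CREATE MAPPING trades (
    symbol VARCHAR,
    price DECIMAL
)
TYPE Kafka
OPTIONS (
    'keyFormat' = 'varchar',
    'valueFormat' = 'json-flat',
    'bootstrap.servers' = 'broker:9092'
);

-- A streaming query: runs continuously, emitting rows as events arrive.
SELECT symbol, price
FROM trades
WHERE price > 100;
```

Because the source is a stream rather than a table, the `SELECT` does not terminate; it keeps producing results in real time as new events land on the topic.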

When Should You Use Serverless Architecture?

You should consider serverless architecture in any of the following settings:

  • You have unpredictable workload sizes – Serverless can scale up and down automatically and react in real time to workload changes. By using a serverless cluster, you won’t have to worry about unpredictable spikes in demand.
  • You want to only pay for the resources you’ve consumed – Serverless consumption pricing means that you will never overpay for more resources than you need. 
  • You do not want to manage your cloud infrastructure – By leaving the infrastructure management to us, you won’t need to worry about server breakdowns, upgrades, or any other operational tasks that will take precious development time from your team. 
  • You want to build a prototype or Proof-of-Concept very quickly – You can spin up a serverless cluster and connect a sample client in a few minutes without needing a credit card. You also have open access to enterprise features ready for experimentation or evaluation. 
  • You want to future-proof your application – Serverless will grow with your application; the more workloads you need, the more we will allocate. Additionally, we will ensure your Serverless infrastructure is always up-to-date, so there is no need to retrofit features or migrate. 

Register for Hazelcast Viridian Serverless Beta Today

We started this journey with Hazelcast Viridian Serverless Beta because we wanted to make Hazelcast accessible to all developers and democratize the power of real-time data. From this point on, we will continue to listen closely to user feedback and add features such as additional connectors and a SQL builder as we move toward our GA release.

You can register for Hazelcast Viridian Serverless Beta and start building your real-time app or service for free; no credit card is necessary.

Hazelcast has launched the next chapter in its journey to make real-time capabilities even easier for customers to deploy, with its new cloud-managed service offering, the Hazelcast Viridian serverless data platform. This new service simplifies IT deployments more than ever by eliminating the up-front planning for sizing and configuration (i.e., no servers to think about) that is otherwise required for… well, “serverful” architectures.

Our Approach

This new release is another step in helping Hazelcast customers more easily achieve the important technology and business goals that are driven by data. You can think of Hazelcast as taking a two-pronged approach to helping its customers: as alluded to above, simplifying the deployment effort is one part, and the other is providing an innovative real-time data platform with capabilities that other technologies were not built to handle.

That second part is important because businesses today are turning to their IT teams more than ever to gain competitive advantage through digital transformation. One critical objective in such initiatives is the delivery of real-time capabilities to respond faster to customers and capitalize on business opportunities. But with so many IT systems built around batch-oriented technologies, companies struggle to gain real-time capabilities with their existing IT infrastructure. This should not be surprising; after all, most systems were not built to keep up with the fast data workloads inherent in real-time operations. That is why they run in batch: they wait for a time, such as the end of the day, when data creation is temporarily paused and the overall workload drops, so the collected data can be processed.

“Real-Time” is Hard with the Wrong Technologies

In the past, businesses that pursued real-time capabilities with existing infrastructure ended up with significant compromises. One is reduced business efficiency, in which companies spend more on human and technology resources than expected to try to address the bottlenecks that cause slow responsiveness. Another is the deliberate dismissal of innovation—that is, setting a lower bar for business capabilities, because of the perceived costs and effort to deliver on grander visions.

Businesses are realizing that the implementation of technologies that were specifically built to handle real-time responsiveness can help them deliver on their real-time aspirations. That’s where Hazelcast fits in. With the Hazelcast Platform, businesses can deploy high-powered business applications that process data in real time and take action as soon as the data is created to get a jump on fleeting opportunities as they arise. This is accomplished by combining a stream processing engine that acts on real-time data in motion with a tiered in-memory data store that provides extremely fast access to data.

Big Trends to Consider

Companies cannot pursue real-time operations without also considering two other long-running, data-focused trends. The first is the migration of on-premises systems to the cloud. This trend is evidenced by the ongoing rapid growth of cloud vendors, as more companies take advantage of the agility and flexibility of public cloud deployments. Alongside that trend is the growth of cloud-managed services, as more businesses look to further simplify the management of their IT infrastructure. Businesses see a huge benefit from a cloud-managed service model since it allows them to put more resources into creating business value and fewer into maintaining infrastructure.

Sign Up for Beta Access Today

The Hazelcast Viridian serverless data platform includes the power of the Hazelcast Platform as the foundational technology in the cloud-managed service, along with the simplified implementation model of a serverless deployment. You now get a technology built for adding real-time capabilities, available as a cloud-managed service, in an easy-to-get-started deployment. For a closer look, sign up here to participate in the beta and watch this video on getting started with running applications on this service. Serverless is great for all types of deployments (test/dev/prod), and you can get started for free with a 2 GiB allocation of data storage. Licensing is billed per hour per GiB, but you can get a discount with an annual commitment.

That’s all for now and stay tuned as we continue expanding our cloud capabilities to better enable real-time businesses.

We are pleased to announce the recent releases below. At the end of the list, you will find the Command-Line Client, an exciting new addition to our product portfolio! ♥ Please don’t hesitate to join the Hazelcast Community on Slack to raise questions and provide feedback. Also, feel free to check our community page for other options. Happy Hazelcasting!

Hazelcast Platform 5.1.2

This release is the second patch release of Hazelcast Platform v5.1. 


Hazelcast Platform Operator 5.3

Hazelcast Platform Operator automates common management tasks such as configuring, creating, scaling, and recovering Hazelcast clusters on Kubernetes and Red Hat OpenShift. By taking care of manual deployment and life-cycle management, Hazelcast Platform Operator makes it simpler to work with Hazelcast clusters.

New Features

  • Added the Hazelcast Map CR and support for resource limits and requests for the Platform and Management Center CRs.
  • Added agent support for external backup and restore.


  • Added the following statuses: hot backup, restore, and client connection message.
  • Implemented map persistence using a ConfigMap.
  • Refactored the logger usage and removed unnecessary logging entries.


Hazelcast Cloud 3.10.0

This release of Hazelcast Cloud mostly fixes UI issues.


Hazelcast .NET Client 5.1.1 and 5.1

New Features

  • Added blue-green (failover) deployment support.
  • Added support for Hazelcast’s FencedLock data structure.


  • The client can now work with the .NET 6.0 framework.


Hazelcast Go Client 1.3.0

New Features


Hazelcast Command-Line Client 1.0 (Beta-1)

This is the first release of Hazelcast Command-Line Client (CLC) – a command-line tool to connect to and operate on Hazelcast Platform and Hazelcast Cloud.