Introducing Pipelining in Hazelcast IMDG 3.12

Hazelcast IMDG 3.12 contains a new performance optimization called pipelining.

If you have a client and a server with a 1 ms round-trip time, a single thread making synchronous calls can complete at most 1/0.001 = 1000 operations per second. The most obvious way to improve performance is to add a second thread, because two threads can do 2*(1/0.001) = 2000 operations per second. But it isn’t always possible or convenient to introduce multiple threads, and sometimes even multiple threads cannot reach optimal throughput.

The critical part for increased performance is having multiple operations in flight. This can be done not only with parallel threads but also with asynchronous calls such as IMap.getAsync/putAsync or many of the other asynchronous calls the Hazelcast API exposes. If a single thread keeps two asynchronous calls in flight at any given moment, the throughput is also 2000 operations per second. And with ten asynchronous calls, it could in theory increase to 10,000 operations per second!
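The arithmetic above generalizes to a simple upper bound: the number of operations in flight divided by the round-trip time. Here is a minimal sketch of that calculation; the class and method names are made up for illustration, and the result is a theoretical ceiling that ignores server-side and serialization costs.

```java
public class ThroughputEstimate {

    // Upper bound on operations per second when `inFlight` requests are
    // always outstanding and each one takes `roundTripMillis` to complete.
    static double maxOpsPerSecond(int inFlight, double roundTripMillis) {
        return inFlight * (1000.0 / roundTripMillis);
    }

    public static void main(String[] args) {
        // 1 ms round trip with 1, 2 and 10 operations in flight.
        for (int inFlight : new int[] {1, 2, 10}) {
            System.out.printf("%d in flight: at most %.0f ops/s%n",
                    inFlight, maxOpsPerSecond(inFlight, 1.0));
        }
    }
}
```

It makes no difference to this bound whether the in-flight operations come from several threads or from one thread issuing asynchronous calls.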

So by increasing the depth of the pipeline, we can get better throughput. This is where the new pipelining API comes into the picture. An example:

Pipelining<String> pipelining = new Pipelining<String>(10);
for (long k = 0; k < 100; k++) {
    int key = random.nextInt(keyDomain);
    pipelining.add(map.getAsync(key));
}
// wait for completion
List<String> results = pipelining.results();

In the above example, the pipeline depth is 10: we execute 100 asynchronous requests through this pipeline and then wait for the results. Let’s look at a more comprehensive example and see what kind of performance improvement we can get.
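Before moving on to the full demo, it may help to see what a depth-limited pipeline does conceptually. The following is a simplified sketch that uses only the JDK. BoundedPipeline, fakeGetAsync and fetchAll are hypothetical names, the "remote call" is simulated with a timer, and none of this is Hazelcast's actual implementation. It only illustrates the idea of capping the number of unfinished calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class BoundedPipelineSketch {

    // Daemon timer thread that stands in for the network round trip.
    static final ScheduledExecutorService TIMER =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "fake-network");
                t.setDaemon(true);
                return t;
            });

    // Hypothetical stand-in for a pipeline with a fixed depth.
    static class BoundedPipeline<T> {
        private final Semaphore permits;
        private final List<CompletableFuture<T>> futures = new ArrayList<>();

        BoundedPipeline(int depth) {
            this.permits = new Semaphore(depth);
        }

        // Blocks while `depth` registered calls are still unfinished. The call
        // passed in is already running, so add() throttles the *next* call.
        void add(CompletableFuture<T> future) {
            permits.acquireUninterruptibly();
            futures.add(future);
            future.whenComplete((value, error) -> permits.release());
        }

        // Waits for every registered call and returns results in add() order.
        List<T> results() {
            List<T> out = new ArrayList<>(futures.size());
            for (CompletableFuture<T> f : futures) {
                out.add(f.join());
            }
            return out;
        }
    }

    // Simulated asynchronous get that completes after `delayMillis`.
    static CompletableFuture<String> fakeGetAsync(int key, long delayMillis) {
        CompletableFuture<String> f = new CompletableFuture<>();
        TIMER.schedule(() -> f.complete("value-" + key), delayMillis, TimeUnit.MILLISECONDS);
        return f;
    }

    static List<String> fetchAll(int depth, int requests, long delayMillis) {
        BoundedPipeline<String> pipeline = new BoundedPipeline<>(depth);
        for (int k = 0; k < requests; k++) {
            pipeline.add(fakeGetAsync(k, delayMillis));
        }
        return pipeline.results();
    }

    public static void main(String[] args) {
        for (int depth : new int[] {1, 10}) {
            long start = System.nanoTime();
            List<String> results = fetchAll(depth, 100, 5);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("depth " + depth + ": " + results.size()
                    + " results in " + elapsedMs + " ms");
        }
    }
}
```

Running main with different depths should show the elapsed time shrinking as the depth grows, since more simulated round trips overlap. The complete Hazelcast-based demo follows.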

import com.hazelcast.client.HazelcastClient;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;
import com.hazelcast.core.Pipelining;

import java.util.List;
import java.util.Random;

public class PipeliningDemo {

    private HazelcastInstance member;
    private HazelcastInstance client;
    private IMap<Integer, String> map;
    private int keyDomain = 100000;
    private int iterations = 500;
    private int getsPerIteration = 1000;

    public static void main(String[] args) throws Exception {
        PipeliningDemo main = new PipeliningDemo();
        main.init();
        main.pipelined(5);
        main.pipelined(10);
        main.pipelined(100);
        main.nonPipelined();
        System.exit(0);
    }

    // Baseline: one blocking get at a time, so one round trip per operation.
    private void nonPipelined() {
        System.out.println("Starting non pipeling");
        long startMs = System.currentTimeMillis();
        Random random = new Random();
        for (int i = 0; i < iterations; i++) {
            for (long k = 0; k < getsPerIteration; k++) {
                int key = random.nextInt(keyDomain);
                map.get(key);
            }
        }
        long endMs = System.currentTimeMillis();
        System.out.println("Non pipeling duration:"
                + (endMs - startMs) + " ms");
    }

    // Same workload, but with up to `depth` asynchronous gets in flight.
    private void pipelined(int depth) throws Exception {
        System.out.println("Starting pipeling with depth:" + depth);
        long startMs = System.currentTimeMillis();
        Random random = new Random();
        for (int i = 0; i < iterations; i++) {
            Pipelining<String> pipelining = new Pipelining<String>(depth);
            for (long k = 0; k < getsPerIteration; k++) {
                int key = random.nextInt(keyDomain);
                pipelining.add(map.getAsync(key));
            }

            // wait for all gets of this iteration to complete
            List<String> results = pipelining.results();
            if (results.size() != getsPerIteration) {
                throw new RuntimeException("Expected " + getsPerIteration
                        + " results, got " + results.size());
            }
        }
        long endMs = System.currentTimeMillis();
        System.out.println("Pipelined with depth:" + depth
                + ", duration:" + (endMs - startMs) + " ms");
    }

    private void init() {
        member = Hazelcast.newHazelcastInstance();
        client = HazelcastClient.newHazelcastClient();
        map = client.getMap("map");

        // Populate with Integer keys so the entries match the Integer keys used by the gets.
        IMap<Integer, String> memberMap = member.getMap(map.getName());
        for (int k = 0; k < keyDomain; k++) {
            memberMap.put(k, "" + k);
        }
    }
}

When I run the above demo, I get the following output on my desktop (AMD Ryzen 1800X):

Starting pipeling with depth:5
Pipelined with depth:5, duration:6458 ms
Starting pipeling with depth:10
Pipelined with depth:10, duration:3738 ms
Starting pipeling with depth:100
Pipelined with depth:100, duration:1127 ms
Starting non pipeling
Non pipeling duration:28525 ms

This means that pipelining with a depth of five is about 342% faster than non-pipelined access, which is roughly 4.4 times the throughput. With a pipeline depth of 10, pipelining is 663% faster (about 7.6 times the throughput). And with a pipeline depth of 100, it is 2431% faster (about 25 times the throughput)!
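The speedup figures can be reproduced, to within rounding, from the durations printed above. The sketch below assumes "X% faster" means (baseline duration / pipelined duration - 1) * 100. The class and method names are made up for illustration, and the operation count of 500 * 1000 comes from the iterations and getsPerIteration fields in the demo.

```java
public class SpeedupCalc {

    // Percentage by which a run of `fasterMs` beats a run of `baselineMs`.
    static double percentFaster(long baselineMs, long fasterMs) {
        return ((double) baselineMs / fasterMs - 1.0) * 100.0;
    }

    // Throughput in operations per second.
    static double opsPerSecond(long operations, long durationMs) {
        return operations * 1000.0 / durationMs;
    }

    public static void main(String[] args) {
        long baselineMs = 28525;          // non-pipelined duration from the run above
        long operations = 500L * 1000L;   // iterations * getsPerIteration
        long[][] runs = {{5, 6458}, {10, 3738}, {100, 1127}}; // {depth, durationMs}

        System.out.printf("non-pipelined: %.0f ops/s%n",
                opsPerSecond(operations, baselineMs));
        for (long[] run : runs) {
            System.out.printf("depth %d: %.0f%% faster, %.0f ops/s%n",
                    run[0], percentFaster(baselineMs, run[1]),
                    opsPerSecond(operations, run[1]));
        }
    }
}
```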

If you want to play with the above demo, check it out in our GitHub repository.