Hazelcast with Hundreds of Nodes

by Enes Akar — Feb 13, 2014

One of the frequent question about hazelcast:

Will it scale up to hundreds of nodes?

Answer from Talip (founder of Hazelcast) in mail group:

Forming 100+ node cluster is not hard and it will be as stable as your networking infrastructure. Running distributed executions is fine too as long as your application can handle execution failures and you don’t make a distributed call within another distributed call. 

To get less affected by the split-brain issues, smaller clusters are recommended. Try to run less number of execution nodes (cluster members) and submit your executions from any number of clients. Clients can scale to thousands easily.

Out of Memory and Hazelcast

by Enes Akar — Dec 30, 2013

OOM is a real trouble for distributed environment. A node having trouble may pull down whole cluster’s performance.

Hazelcast’ general policy on OOM condition is trying to isolate the problem and shutdown instance as soon as possible.

Here as you can see below, it will first try close connections, stopping threads and shutting down JVM instances. Logging the error is the last thing; so you may not see the OOM log.

This is the default action. Closing connections is the first thing being tried. Otherwise other nodes can still consider the problematic node alive till operations’ timeout. 

You can have your own action executed when OOM. You can do some cleanup, log error, call System.exit() etc.

Here an example to define a custom OOM handler:

Problem: Hazelcast can detect OOM, when hazelcast threads experience the problem. So if OOME is thrown by an external thread; it can not be handled by hazelcast.

For Faster Hazelcast Queries

by Enes Akar — Dec 27, 2013

Here some tips on improving hazelcast query performance:

1- Add indexes for queried fields. For queries with ranges (gt, lt, between) you can use ordered index.

IMap imap = Hazelcast.getMap("employees");
// ordered, since we have ranged queries for this field
imap.addIndex("age", true); 
// not ordered, because boolean field cannot have range
imap.addIndex("active", false); 

2- Set optimizeQuery = true, in map config. By enabling this flag, hazelcast will cache a deserialized (object) form, and use it when necessary. Disadvantage of this approach is memory overhead of cached object.

MapConfig mapConfig = config.getMapConfig(“mapName”);
mapConfig.setOptimizeQueries(true);

3- Object “in memory format”: By default hazelcast stores objects in their serialized form. This is an optimization: an extra serialisation step is skipped for remote operations. But if the majority of a map’s operations is query; then you can consider changing in-memory-format to object. With object format, queries will be run on directly objects, getting rid of de-serialisation. Disadvantage of this approach: The remote get operations from this map will have a serialisation step. 

MapConfig mapConfig = config.getMapConfig(“mapName”);
mapConfig.setInMemoryFormat(InMemoryFormat.OBJECT);

Note: If you use 3rd approach (in-memory-format=OBJECT) then no need for optimizeQuery setting (2nd one) because objects will be already stored in deserialized.