NoSQL Data Store
In dynamically scalable partitioned storage systems, whether it is a NoSQL database, filesystem or in-memory data grid, changes in the cluster (adding or removing a node) can lead to big data moves in the network to re-balance the cluster. Re-balancing will be needed for both primary and backup data on those nodes. If a node crashes for example, the dead node’s data has to be re-owned (become primary) by other node(s) and also its backup has to be taken immediately to be fail-safe again. Shuffling megabytes of data around has a negative effect in the cluster as it consumes your valuable resources such as network, CPU and RAM. It might also lead to higher latency of your operations during that period.
JRebel Labs annual Java Developer Survey 2014 shows Hazelcast as the up-and-coming NoSQL solution beating Riak, Memcached, and a couple dozen others.
Hazelcast’s NoSQL Data Store focuses on latency and makes it easier to cache/share/operate terabytes of data in-memory. Storing terabytes of data in-memory is not a problem but avoiding garbage collection to achieve predictable, low latency and being resilient to crashes are big challenges. By default, Hazelcast stores your distributed data (map entries, queue items) into Java heap which is subject to garbage collection. As your heap gets bigger, garbage collection might cause your application to pause tens of seconds, badly effecting your application performance and response times. High-Density Memory Store is Hazelcast with native memory storage to avoid garbage collection pauses. Even if you have terabytes of cache in-memory with lots of updates, garbage collection will have almost no effect; resulting in more predictable latency and throughput.
Elastic Memory implementation uses NIO DirectByteBuffers and doesn’t require any defragmentation. Here is how things work: User defines the number of gigabyte storage to have in native memory per Java Virtual Machine, let’s say it is 40GB. Hazelcast will create 40 DirectBuffers, each with 1GB capacity. If you have, say 100 nodes, then you have total of 4TB native memory storage capacity. Each buffer is divided into configurable chunks (blocks) (default chunk-size is 1KB). Hazelcast uses a queue of available (writable) blocks. 3KB value, for example, will be stored into 3 blocks. When the value is removed, these blocks are returned back into the available blocks queue so that they can be reused to store another value.
With new backup implementation, data owned by a node is divided into chunks and evenly backed up by all the other nodes. In other words, every node takes equal responsibility to backup every other node. This leads to better memory usage and less influence in the cluster when you add/remove nodes.