
By Peter Veentjer

Aug 25, 2013

Controlled Partitioning

One of the cool features of the soon-to-be-released Hazelcast 3.1 is the ability to control the partition of certain DistributedObjects, such as the IAtomicLong or ILock. This makes it easy to collocate data, which improves performance and scalability because there is a stronger locality of reference. Let’s start with a basic example. Imagine there are 2 counters:

IAtomicLong counter1 = hz.getAtomicLong("counter1");
IAtomicLong counter2 = hz.getAtomicLong("counter2");

If these 2 counters are always used together in some operation, it is very likely that there will be unwanted remoting. The cause is that Hazelcast uses the name not only for identification but also to determine the partition, so these 2 counters will probably end up in different partitions. To resolve this problem, Hazelcast 3.1 makes it possible to add the partition-key as part of the name:

IAtomicLong counter1 = hz.getAtomicLong("counter1@hazelcast");
IAtomicLong counter2 = hz.getAtomicLong("counter2@hazelcast");
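To get a feel for why the suffix matters, here is a minimal sketch of the naming rule in plain Java. It is not Hazelcast's actual code: the class PartitionNameSketch and its methods are made up for illustration, and String.hashCode() is only a stand-in for the hash Hazelcast really uses. The one real number is 271, Hazelcast's default partition count.

```java
// Sketch only: mimics the naming rule described in this post, not Hazelcast internals.
class PartitionNameSketch {
    // Hazelcast's default partition count.
    static final int PARTITION_COUNT = 271;

    // The part after '@' decides the partition; without '@' the whole name does.
    static String partitionKeyOf(String name) {
        int at = name.indexOf('@');
        return at == -1 ? name : name.substring(at + 1);
    }

    // Assumption: String.hashCode() stands in for Hazelcast's real hash function.
    static int partitionIdOf(String name) {
        return Math.floorMod(partitionKeyOf(name).hashCode(), PARTITION_COUNT);
    }

    public static void main(String[] args) {
        String[] names = {"counter1", "counter2", "counter1@hazelcast", "counter2@hazelcast"};
        for (String name : names) {
            System.out.println(name + " -> partition " + partitionIdOf(name));
        }
    }
}
```

In this sketch the two suffixed names always map to the same partition id, because only the 'hazelcast' part is hashed. The two plain names are hashed as a whole and will typically map to different ids.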

By adding an ‘@partition-key’ suffix, you tell Hazelcast to determine the partition based on the ‘partition-key’ section. If you have configured a DistributedObject in the Hazelcast configuration, Hazelcast uses the base-name of the object, i.e. the name without the @partition-key section, to find the configuration. So you don’t need to change your configuration to make use of @partition-key. It is important to understand that the name of a DistributedObject includes the @partition-key section:

IAtomicLong counter1 = hz.getAtomicLong("counter1@hazelcast");
IAtomicLong counter2 = hz.getAtomicLong("counter1");

counter1 is a different counter than counter2, even though their base-names are the same. In the earlier examples the ‘hazelcast’ partition-key is used, but you are free to use any value you see fit, as long as you make sure that the partition-keys are evenly spread among the partitions. If you don’t care about the actual value, you can use the ‘randomPartitionKey’ method to generate a partition-key for you:

String partitionKey = hz.getPartitionService().randomPartitionKey();
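If you pick your own partition-keys instead of generating them, it can be worth checking how evenly they spread. The following is a rough sketch, not Hazelcast code: the class KeySpreadSketch and the ‘customer-N’ keys are hypothetical, and String.hashCode() is again only a stand-in for Hazelcast's own hash, so the real distribution in a cluster may differ. The partition count of 271 is Hazelcast's default.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: buckets candidate partition-keys to see how evenly they spread.
class KeySpreadSketch {
    // Counts how many of the given keys fall into each partition.
    // Assumption: String.hashCode() stands in for Hazelcast's real hash function.
    static int[] countPerPartition(Iterable<String> partitionKeys, int partitionCount) {
        int[] counts = new int[partitionCount];
        for (String key : partitionKeys) {
            counts[Math.floorMod(key.hashCode(), partitionCount)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical candidate keys, e.g. one per customer.
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            keys.add("customer-" + i);
        }
        int[] counts = countPerPartition(keys, 271);
        int min = Arrays.stream(counts).min().getAsInt();
        int max = Arrays.stream(counts).max().getAsInt();
        System.out.println("keys per partition: min=" + min + ", max=" + max);
    }
}
```

A large gap between the minimum and maximum suggests that some partitions would carry far more objects than others.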

Other good candidates are a random value, e.g. one generated with the Random class, or a UUID, although a UUID causes more overhead due to its length. In some cases you can’t create all DistributedObjects up front. Sometimes you only have a reference to a DistributedObject, but you want to create a new DistributedObject in the same partition. This is possible using the ‘getPartitionKey’ method:

String partitionKey = counter1.getPartitionKey();
IAtomicLong counter3 = hz.getAtomicLong("counter3@"+partitionKey);

As you can see, counter3 uses the same partitionKey as counter1 and therefore ends up in the same partition. In practice you want to send the function to be executed to the partition where the data is stored. Otherwise your data is in the same partition, but you will still be doing remote calls to access it. To send the function to the right partition, have a look at the executeOnKeyOwner method of the IExecutorService:

executor.executeOnKeyOwner(someTask,"hazelcast");
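The idea of sending the task to the data can be pictured with nothing more than java.util.concurrent. This is a toy model and not how Hazelcast is implemented: the class KeyOwnerRoutingSketch is hypothetical, each single-threaded executor plays the role of a member that owns some keys, and String.hashCode() stands in for the real partition hash.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model: every "owner" is a single-threaded executor; a task is routed by its key.
class KeyOwnerRoutingSketch {
    private final ExecutorService[] owners;

    KeyOwnerRoutingSketch(int ownerCount) {
        owners = new ExecutorService[ownerCount];
        for (int i = 0; i < ownerCount; i++) {
            owners[i] = Executors.newSingleThreadExecutor();
        }
    }

    // Assumption: String.hashCode() stands in for the real partition hash.
    int ownerOf(String partitionKey) {
        return Math.floorMod(partitionKey.hashCode(), owners.length);
    }

    // Runs the task on the owner of the given key, so it executes next to that key's data.
    <T> Future<T> submitToKeyOwner(Callable<T> task, String partitionKey) {
        return owners[ownerOf(partitionKey)].submit(task);
    }

    void shutdown() {
        for (ExecutorService owner : owners) {
            owner.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        KeyOwnerRoutingSketch cluster = new KeyOwnerRoutingSketch(3);
        try {
            Future<String> result =
                    cluster.submitToKeyOwner(() -> Thread.currentThread().getName(), "hazelcast");
            System.out.println("task for key 'hazelcast' ran on " + result.get());
        } finally {
            cluster.shutdown();
        }
    }
}
```

Every task submitted for the same key ends up on the same owner, which is the property you rely on when the task touches several collocated objects.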

In the executeOnKeyOwner call above, ‘someTask’ is executed on the member that owns the partition for partition-key ‘hazelcast’. The IMap can also be used in combination with controlled partitioning. Normally the key is used to determine the actual partition, but in some cases this is undesirable because you want different keys to be stored in the same partition. This can be done by letting the key implement the PartitionAware interface, for example:

public class PartitionAwareKey 
        implements PartitionAware, DataSerializable {
    public final String key;
    public final String partitionKey;

    public PartitionAwareKey(String key, String partitionKey) {
        this.key = key;
        this.partitionKey = partitionKey;
    }

    @Override
    public String getPartitionKey() {
        return partitionKey;
    }
    ...
}
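How such a key could influence placement can be sketched without Hazelcast on the classpath. Everything below is an assumption for illustration: the nested PartitionAware interface is a local stand-in for the Hazelcast interface of the same name, the Key class is a cut-down version of the key above, and String.hashCode() replaces the real partition hash.

```java
// Sketch only: shows the routing decision for PartitionAware keys, not Hazelcast internals.
class PartitionAwareRoutingSketch {
    // Local stand-in for Hazelcast's PartitionAware so this compiles on its own.
    interface PartitionAware<T> {
        T getPartitionKey();
    }

    // Cut-down version of the key class from this post.
    static final class Key implements PartitionAware<String> {
        final String key;
        final String partitionKey;

        Key(String key, String partitionKey) {
            this.key = key;
            this.partitionKey = partitionKey;
        }

        @Override
        public String getPartitionKey() {
            return partitionKey;
        }
    }

    // If the key is PartitionAware, hash its partition-key; otherwise hash the key itself.
    // Assumption: hashCode() stands in for Hazelcast's real hash function.
    static int partitionIdOf(Object key, int partitionCount) {
        Object effective = (key instanceof PartitionAware)
                ? ((PartitionAware<?>) key).getPartitionKey()
                : key;
        return Math.floorMod(effective.hashCode(), partitionCount);
    }

    public static void main(String[] args) {
        System.out.println(partitionIdOf(new Key("a", "hazelcast"), 271));
        System.out.println(partitionIdOf(new Key("b", "hazelcast"), 271));
    }
}
```

Both calls in main print the same partition id, since only the partition-key takes part in the hash, while the ‘key’ field is still what distinguishes the two entries.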

Now data can be inserted like this:

IMap<PartitionAwareKey, String> map = hz.getMap("map");
map.put(new PartitionAwareKey("a","hazelcast"),"foo");
map.put(new PartitionAwareKey("b","hazelcast"),"foo");

Now both map entries will be placed in the partition of partition-key ‘hazelcast’. We are thinking about making the partitioning schema for the map more flexible using an injectable partitioning strategy, so that you can say:

map.put("a@hazelcast","foo");

More about this in the near future. With the new @partition-key syntax, you now have full control over the partition where a DistributedObject is stored. My experience is that figuring out a correct partitioning schema is one of the most important choices you need to make if you want a high-performance, scalable system. So make sure you settle on a correct partitioning schema as early as possible.
