An Easy Performance Improvement with EntryProcessor
A lot of a developer’s work is about transforming and aggregating data:
- Increasing the quantity of a product in a shopping cart
- Applying VAT on the price of a product
- Computing the price of a shopping cart
- Etc…
Sometimes this calls for a full-fledged stream processing engine such as Hazelcast Jet; sometimes it does not. For this post, let’s use Hazelcast IMDG to store the content of an e-commerce cart, just like we did in our recent blog post.
Straightforward Implementation
Specifically, when the user increases the quantity, we need to update the value contained in the relevant IMap. The implementation code looks like the following:
IMap<Long, Integer> cart = hazelcast.getMap("cart");
Integer quantity = cart.get(productId);
Integer newQuantity = quantity + 1;
cart.set(productId, newQuantity);
This snippet:
- Gets the data from the IMap
- Updates it
- And puts the new value in the IMap
Yet, there’s one issue. The API hides a key characteristic of Hazelcast IMDG: it’s distributed. When two or more Hazelcast nodes are started, they will by default use multicast to discover each other and form a cluster. Among other things, that allows data to be duplicated and backed up on more than one node. Thus, if one node fails, data is still available on another one. In the above example, the code says nothing about the number of underlying nodes, or about which node holds the data.
Now, imagine the above code runs on node A, while the cart entry is stored on another node, B. Execution then requires two network round trips:
- One to move the data from B to A
- Another to put it back from A to B
This is inefficient, especially in scenarios with demanding performance requirements, such as e-commerce.
EntryProcessor to the Rescue
Enter the EntryProcessor interface. From the JavaDoc:
An EntryProcessor passes you a Map.Entry. At the time you receive it the entry is locked and not released until the EntryProcessor completes. This obviates the need to explicitly lock as would be required with a ExecutorService.
Performance can be very high as the data is not moved off the Member partition. This avoids network cost and, if the storage format is InMemoryFormat.OBJECT, then there is no de-serialization or serialization cost.
EntryProcessors execute on the partition thread in a member.
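To make the network-cost argument concrete, here is a minimal single-JVM sketch that only counts simulated round trips. The RoundTripSketch class and all of its methods are hypothetical names invented for illustration; they are not part of the Hazelcast API, and real clusters involve serialization, partitions and backups that this toy model ignores.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Toy model of a map entry owned by a remote node "B".
// Hypothetical class for illustration only; NOT the Hazelcast API.
final class RoundTripSketch {
    private final Map<Long, Integer> ownedByB = new HashMap<>();
    private int roundTrips = 0;

    // Test setup helper: stores a value without counting any network traffic.
    void seed(long key, int value) {
        ownedByB.put(key, value);
    }

    // One simulated round trip: the value travels from B to the caller.
    Integer get(long key) {
        roundTrips++;
        return ownedByB.get(key);
    }

    // One simulated round trip: the value travels from the caller to B.
    void set(long key, int value) {
        roundTrips++;
        ownedByB.put(key, value);
    }

    // One simulated round trip: only the function and its result travel.
    Integer executeOnKey(long key, UnaryOperator<Integer> processor) {
        roundTrips++;
        Integer updated = processor.apply(ownedByB.get(key));
        ownedByB.put(key, updated);
        return updated;
    }

    int roundTrips() {
        return roundTrips;
    }

    public static void main(String[] args) {
        RoundTripSketch naive = new RoundTripSketch();
        naive.seed(42L, 1);
        naive.set(42L, naive.get(42L) + 1);      // get, then set
        System.out.println(naive.roundTrips());   // prints 2

        RoundTripSketch shipped = new RoundTripSketch();
        shipped.seed(42L, 1);
        shipped.executeOnKey(42L, q -> q + 1);    // computation sent to the owner
        System.out.println(shipped.roundTrips()); // prints 1
    }
}
```

In this model, the get-then-set sequence costs two trips, while sending the function to the owner costs one.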
This is a good fit for our use case. Migrating from the above code to an EntryProcessor is straightforward:
IMap<Long, Integer> cart = hazelcast.getMap("cart");
cart.executeOnKey(productId, new AbstractEntryProcessor<Long, Integer>() {
    @Override
    public Object process(Map.Entry<Long, Integer> entry) {
        // Runs on the node that owns the entry
        Integer quantity = entry.getValue();
        Integer newQuantity = quantity + 1;
        entry.setValue(newQuantity);
        return null;
    }
});
Note that the entry’s value must be set in all cases, even if the value is mutable and was modified in place. Hazelcast needs to know that the value has changed in order to propagate the new value to the backup nodes if necessary.
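The following sketch illustrates why the explicit setValue call matters, using a deliberately simplified model of one primary copy and one backup copy. BackupSyncSketch, Line and TrackedEntry are hypothetical names, and the replication logic is an assumption made for illustration; Hazelcast’s actual internals are more involved.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Simplified model of a primary and a backup copy of the same map.
// All names are hypothetical; this is not how Hazelcast is implemented.
final class BackupSyncSketch {
    // Mutable value, to show that in-place mutation alone is not enough.
    static final class Line {
        int quantity;
        Line(int quantity) { this.quantity = quantity; }
    }

    // Entry handed to the processor; calling setValue is the only change signal.
    static final class TrackedEntry {
        private Line value;
        private boolean modified;
        TrackedEntry(Line value) { this.value = value; }
        Line getValue() { return value; }
        void setValue(Line value) { this.value = value; this.modified = true; }
    }

    final Map<Long, Line> primary = new HashMap<>();
    final Map<Long, Line> backup = new HashMap<>();

    // Stores independent copies on the primary and on the backup.
    void put(long key, int quantity) {
        primary.put(key, new Line(quantity));
        backup.put(key, new Line(quantity));
    }

    // Runs the processor on the primary, then replicates only if setValue was called.
    void executeOnKey(long key, Consumer<TrackedEntry> processor) {
        TrackedEntry entry = new TrackedEntry(primary.get(key));
        processor.accept(entry);
        if (entry.modified) {
            primary.put(key, entry.value);
            backup.put(key, new Line(entry.value.quantity));
        }
    }

    public static void main(String[] args) {
        BackupSyncSketch store = new BackupSyncSketch();
        store.put(1L, 1);

        // In-place mutation without setValue: the backup is left stale.
        store.executeOnKey(1L, e -> e.getValue().quantity++);
        System.out.println(store.primary.get(1L).quantity); // prints 2
        System.out.println(store.backup.get(1L).quantity);  // prints 1

        // Mutation followed by setValue: the backup catches up.
        store.executeOnKey(1L, e -> { e.getValue().quantity++; e.setValue(e.getValue()); });
        System.out.println(store.backup.get(1L).quantity);  // prints 3
    }
}
```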
Finally, the entry processor can return an arbitrary object. In our use case, we can return the new quantity, for instance to display it:
IMap<Long, Integer> cart = hazelcast.getMap("cart");
Integer newQuantity = (Integer) cart.executeOnKey(productId, new AbstractEntryProcessor<Long, Integer>() {
    @Override
    public Object process(Map.Entry<Long, Integer> entry) {
        Integer quantity = entry.getValue();
        Integer newQuantity = quantity + 1;
        entry.setValue(newQuantity);
        // The returned value is sent back to the caller
        return newQuantity;
    }
});
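The JavaDoc quoted earlier also mentions that the entry is locked while the processor runs, so the read-modify-write happens as one step per key. As a rough single-JVM analogy only, the standard library’s ConcurrentHashMap.merge gives a similar per-key guarantee. The sketch below uses it to increment one key from several threads. AtomicIncrementSketch and incrementConcurrently are hypothetical names, and nothing here involves Hazelcast.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Single-JVM analogy for an atomic per-key update; not Hazelcast code.
final class AtomicIncrementSketch {
    // Starts `threads` workers that each increment the same key `perThread` times,
    // then returns the final quantity stored under that key.
    static int incrementConcurrently(int threads, int perThread) {
        ConcurrentMap<Long, Integer> cart = new ConcurrentHashMap<>();
        long productId = 42L; // hypothetical product identifier
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    // Atomic read-modify-write; also handles a product not yet in the cart.
                    cart.merge(productId, 1, Integer::sum);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return cart.getOrDefault(productId, 0);
    }

    public static void main(String[] args) {
        System.out.println(incrementConcurrently(4, 1000)); // prints 4000
    }
}
```

A separate get followed by set offers no such guarantee: two callers could read the same quantity and both write back the same incremented value.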
Conclusion
In this focused post, we learned about the EntryProcessor interface. Instead of fetching the data locally and putting it back, it allows sending the computation to the cluster, to be executed on the node(s) that hold the data. It’s easy to set up, and it saves a network round trip for each update. If you want a quick performance improvement, you should consider using it.