Hazelcast Platform 5.2 – Show Me the Code!

This blog is written for those who are familiar with Hazelcast and prefer to look at code snippets to get started quickly. For a list of new features in Hazelcast Platform 5.2, I recommend starting with this blog.

Compact Serialization

Historically, Hazelcast has supported multiple serialization schemes (Serializable, Portable, etc.). We are happy to announce that compact serialization is now GA and is the recommended object serialization mechanism, as it offers several advantages.

Compact serialization separates the schema from the data and stores it per type, not per object, which results in lower memory and bandwidth usage compared to other formats. It also supports schema evolution and partial deserialization, so individual fields can be read during queries or indexing without deserializing the whole object.
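
To make the "schema per type, not per object" idea concrete, here is a minimal, JDK-only sketch that compares two toy encodings of the same record. It is not Hazelcast's wire format: the class name SchemaSizeSketch, both layouts and the 4-byte schemaId are assumptions made purely for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Toy illustration only: this is NOT Hazelcast's actual wire format.
final class SchemaSizeSketch {

    // Self-describing layout: the field names are written with every object.
    static byte[] selfDescribing(long id, String name) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF("id");
            out.writeLong(id);
            out.writeUTF("name");
            out.writeUTF(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Schema-separated layout: each object carries only a (hypothetical)
    // schema id; the field names live once per type, outside the object.
    static byte[] schemaSeparated(int schemaId, long id, String name) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(schemaId);
            out.writeLong(id);
            out.writeUTF(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println("self-describing:  " + selfDescribing(1L, "Ada").length + " bytes");
        System.out.println("schema-separated: " + schemaSeparated(7, 1L, "Ada").length + " bytes");
    }
}
```

Even in this two-field toy, the per-object payload shrinks once the field names move out of the object, and the gap widens with more fields or longer field names.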

Let’s look at how we would use compact serialization for an “Employee” class. 

public class Employee {
    private long id;
    private String name;
    public Employee(long id, String name) {
        this.id = id;
        this.name = name;
    }

    public long getId() {
        return id;
    }

    public String getName() {
        return name;
    }
}

As you might’ve noticed already, compact serialization does not require any changes to the user class, as the class doesn’t need to implement any interface! Hazelcast automatically extracts the schema from classes and Java records using reflection, then caches and reuses it for subsequent serialization and deserialization, at no extra cost.
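
The "extract once via reflection, then cache and reuse" idea can be pictured with a small JDK-only sketch. This is a conceptual illustration, not Hazelcast's reflective serializer: ReflectiveSchemaSketch, schemaOf and the nested stand-in Employee class are hypothetical names, and the choice of which fields to skip is an assumption of the sketch.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

// Conceptual sketch only: not Hazelcast's reflective serializer.
final class ReflectiveSchemaSketch {

    // Stand-in for the Employee class from the article.
    static final class Employee {
        private final long id;
        private final String name;

        Employee(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // One cached schema (field name -> field type) per class.
    private static final Map<Class<?>, Map<String, Class<?>>> CACHE = new ConcurrentHashMap<>();

    // Returns the cached schema, extracting it on first use only.
    static Map<String, Class<?>> schemaOf(Class<?> type) {
        return CACHE.computeIfAbsent(type, ReflectiveSchemaSketch::extract);
    }

    private static Map<String, Class<?>> extract(Class<?> type) {
        // TreeMap gives a stable field order regardless of reflection order.
        Map<String, Class<?>> fields = new TreeMap<>();
        for (Field field : type.getDeclaredFields()) {
            int modifiers = field.getModifiers();
            // Skipping static, transient and synthetic fields is a choice made for this sketch.
            if (Modifier.isStatic(modifiers) || Modifier.isTransient(modifiers) || field.isSynthetic()) {
                continue;
            }
            fields.put(field.getName(), field.getType());
        }
        return Collections.unmodifiableMap(fields);
    }

    public static void main(String[] args) {
        System.out.println(schemaOf(Employee.class));
    }
}
```

The second call to schemaOf for the same class returns the cached map without touching reflection again, which is the reuse the paragraph above describes.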

The underlying format of the Compact serialized objects is platform and language independent.

For more details, please check out our latest documentation on compact serialization.

JDBC SQL Connector

Instead of programmatically configuring an external data store as a JDBC source / sink, it is now possible to define the JDBC connector declaratively using XML / YAML as shown below. This connector is currently in beta.

external-data-store:
  mysql-database: 
    class-name: com.hazelcast.datastore.JdbcDataStoreFactory
    properties:
      jdbcUrl: jdbc:mysql://dummy:3306
      username: xyz
      password: xyz
    shared: true


To query these external data stores, you can simply create a mapping with the JDBC SQL connector. In the example below, we are creating a mapping to a JDBC connector that references the mysql-database as the external data store.

CREATE MAPPING people
TYPE JDBC 
OPTIONS (
  'externalDataStoreRef'='mysql-database' 
)

Zero-Code Connector (MapStore / MapLoader)

Many of you are already familiar with using MapStore and MapLoader with Hazelcast. MapStore can now be configured to handle operations asynchronously, which avoids blocking partition threads. Platform 5.2 also introduces a low-code approach with the generic MapStore, which requires little or no Java code. The generic MapStore is a pre-built implementation that connects to an external data store using the external data store configuration.

This connector is currently in beta and supports AWS RDS for MySQL and PostgreSQL.

hazelcast:
  map:
    default:
      map-store:
        enabled: true
        class-name: com.hazelcast.mapstore.GenericMapStore
        properties:
          external-data-store-ref: my-mysql-database
          table-name: test

Streaming SQL 

In Platform 5.1, we added SQL support for aggregating streams into tumbling and hopping windows. This made it possible to run functions and aggregations such as sum and count over events grouped into time windows.

We’ve now added support for combining multiple data streams. In the example below, we combine the data from the shipments and orders topics in Kafka.

We start by creating a mapping for each Kafka topic, for example shipments.

CREATE OR REPLACE MAPPING shipments(
  id VARCHAR,
  ship_ts TIMESTAMP WITH TIME ZONE,
  order_id INT,
  warehouse VARCHAR
)
TYPE Kafka
OPTIONS (
  'keyFormat' = 'varchar',
  'valueFormat' = 'json-flat',
  'auto.offset.reset' = 'earliest',
  'bootstrap.servers' = 'broker:9092')

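Then, we create a view over each mapping that requests all events from the underlying Kafka topic and allows a 1-minute lag for incoming events. The orders_ordered view assumes a mapping named orders, with an order_ts column, created in the same way as shipments.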

CREATE OR REPLACE VIEW shipments_ordered AS
  SELECT * FROM TABLE(IMPOSE_ORDER(
    TABLE shipments,
    DESCRIPTOR(ship_ts),
    INTERVAL '1' MINUTE));

CREATE OR REPLACE VIEW orders_ordered AS
  SELECT * FROM TABLE(IMPOSE_ORDER(
    TABLE orders,
    DESCRIPTOR(order_ts),
    INTERVAL '1' MINUTE));
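
One way to picture what an allowed lag means is the following JDK-only sketch. It assumes the common watermark model, in which an event counts as late once it trails the newest timestamp seen so far by more than the lag; it is not Hazelcast's implementation. AllowedLagSketch and accept are hypothetical names, and treating an event exactly at the limit as on time is a choice made for this sketch.

```java
import java.time.Duration;
import java.time.Instant;

// Conceptual sketch of an allowed-lag check; not Hazelcast's implementation.
final class AllowedLagSketch {
    private final Duration lag;
    private Instant maxSeen; // newest event time observed so far

    AllowedLagSketch(Duration lag) {
        this.lag = lag;
    }

    // Returns true if the event is within the allowed lag, false if it is late.
    boolean accept(Instant eventTime) {
        if (maxSeen == null || eventTime.isAfter(maxSeen)) {
            maxSeen = eventTime;
        }
        // How far this event trails the newest one (zero for the newest itself).
        Duration behind = Duration.between(eventTime, maxSeen);
        return behind.compareTo(lag) <= 0;
    }

    public static void main(String[] args) {
        AllowedLagSketch check = new AllowedLagSketch(Duration.ofMinutes(1));
        Instant start = Instant.parse("2022-11-01T10:00:00Z");
        System.out.println(check.accept(start.plusSeconds(120))); // newest event so far
        System.out.println(check.accept(start.plusSeconds(70)));  // 50 seconds behind
        System.out.println(check.accept(start.plusSeconds(30)));  // 90 seconds behind
    }
}
```

With a 1-minute lag, the event that is 50 seconds behind is still accepted, while the one that is 90 seconds behind is treated as late.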

Finally, we join the events from the orders_ordered and shipments_ordered streams to get, in real time, all orders that shipped within 7 days of being placed.

SELECT o.id AS order_id,
  o.order_ts,
  o.total_amount,
  o.customer_name,
  s.id AS shipment_id,
  s.ship_ts,
  s.warehouse
FROM orders_ordered o JOIN shipments_ordered s
  ON o.id = s.order_id AND s.ship_ts BETWEEN o.order_ts AND o.order_ts + INTERVAL '7' DAYS;

More details on this example can be found in our joining multiple streams tutorial.
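
The join condition in the query above can also be read as a plain predicate over one order and one shipment. The JDK-only sketch below mirrors that condition, including the inclusive bounds of BETWEEN. ShipmentJoinSketch and matches are hypothetical names, and the integer order id is an assumption based on the order_id column of the shipments mapping.

```java
import java.time.Duration;
import java.time.Instant;

// Plain-Java reading of the SQL join condition; names are illustrative only.
final class ShipmentJoinSketch {
    static final Duration WINDOW = Duration.ofDays(7);

    // Mirrors: o.id = s.order_id
    //          AND s.ship_ts BETWEEN o.order_ts AND o.order_ts + INTERVAL '7' DAYS
    static boolean matches(int orderId, Instant orderTs, int shipmentOrderId, Instant shipTs) {
        return orderId == shipmentOrderId
                && !shipTs.isBefore(orderTs)               // ship_ts >= order_ts
                && !shipTs.isAfter(orderTs.plus(WINDOW));  // ship_ts <= order_ts + 7 days
    }

    public static void main(String[] args) {
        Instant ordered = Instant.parse("2022-11-01T10:00:00Z");
        System.out.println(matches(1, ordered, 1, ordered.plus(Duration.ofDays(3))));
        System.out.println(matches(1, ordered, 1, ordered.plus(Duration.ofDays(8))));
    }
}
```

The time bound is also what makes a join between two unbounded streams practical: once an order is more than 7 days (plus the allowed lag) old, no future shipment can match it.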

You can try all these SQL examples using the SQL Browser in the latest Hazelcast Management Center. 

Tiered Storage

Tiered Storage lets you store datasets in Hazelcast that are much larger than the available memory. Previously, you had to configure an eviction policy to remove the least frequently used (LFU) or least recently used (LRU) entries from memory as it approached its limit. With Tiered Storage, Hazelcast automatically moves data between the memory and disk tiers.

If your Enterprise license was generated before Hazelcast Platform version 5.2, you’ll need a new Enterprise license that enables the Tiered Storage feature and you will need to run Platform 5.2 or later.

Configuration (YAML) for tiered storage:

hazelcast:
  native-memory:
    enabled: true 
    ... 
  local-device:
    my-disk: 
      base-dir: "tiered-store" 
  map:
    my-map:
      in-memory-format: NATIVE 
      tiered-store:
        enabled: true 
        memory-tier:
          capacity: 
            unit: MEGABYTES
            value: 256
        disk-tier:
          enabled: true 
          device-name: "my-disk"

Tiered Storage is disabled by default; set the “enabled” parameter to true to enable it.

The capacity of the memory tier, which holds the frequently accessed data, cannot be set to 0. The default value is 256 MB. Supported units are BYTES, KILOBYTES, MEGABYTES and GIGABYTES.

The disk-tier section enables disk as an additional (overflow) storage tier and specifies the name of the device (disk) to use.

For more details on configuring Tiered Storage, you can refer to this page.

Wrapping Up

In addition to these features, we’ve made improvements to our CP subsystem and split-brain healing. The automated cluster state management for persistence on Kubernetes was also enhanced to support the cluster-wide shutdown, rolling restart and partial member recovery from failures.

You can read more about what’s new and what’s been fixed in our release notes. What’s more, in GitHub, we’ve closed 250 issues and merged 540 PRs with Hazelcast Platform 5.2.

If you’d like to become more involved in our community or just ask some questions about Hazelcast, please join us on our Slack channel, and also check out our new Developers homepage.

Again, don’t forget to try this new version and let us know what you think. Our engineers love hearing your feedback and we’re always looking for ways to improve.  You can automatically receive a 30-day Enterprise license key by completing the form at: https://hazelcast.com/trial-request/

Finally, I’d like to thank Frantisek Hartman and Sasha Syrotenko for their contributions to this blog.