Deploying Hazelcast Jet

This guide provides a basic introduction for deploying and operating a successful Hazelcast Jet installation.

We plan to make this guide the first in a series of posts about deploying Hazelcast Jet. This will be a near bare-metal installation, with most of the work done manually. Gradually, we will introduce how to make deployment even easier using integrations with Amazon Web Services and Docker.

Hazelcast Jet supports two modes of deployment: “embedded member”, where the JVM containing application code participates in the Hazelcast Jet cluster directly, and “client-server”, whereby a secondary JVM (which may be on the same host, or on a different one) is responsible for joining the Hazelcast Jet cluster. These two approaches to topology are shown in the following diagrams.

Embedded topology:

Advantages

The main advantage of using embedded topology is simplicity. Because Hazelcast Jet runs in the same JVMs as your application, there are no extra servers to deploy, manage, or maintain. This applies especially when the Hazelcast Jet cluster is tied directly to the embedded application. Since Hazelcast Jet and the application are co-located, reads and writes are faster because another network hop is not required (as opposed to client-server topology).

Disadvantages

When Hazelcast Jet is embedded in an application, it must be started and shut down in concert with its co-resident application instance, and vice-versa. In some cases this may lead to increased operational complexity.

Client-server topology

Advantages

The main advantage of client-server topology is isolation. The Hazelcast Jet cluster will use its own resources instead of sharing it with an application. Therefore, the client application and Hazelcast Jet cluster can be turned on and off independently. Additionally, it’s easier to identify problems if the Hazelcast Jet cluster and application are isolated.

Disadvantages

Because Hazelcast Jet nodes run in dedicated machines, there are costs to deploy, manage, or maintain extra servers.

Installing and Running Hazelcast Jet

This guide assumes that you have four machines that are accessible via SSH. We proceed to set up a 3-node Hazelcast Jet cluster on the machines and then submit the example word-count job from the remaining machine.

Note In this blog post we are going to use iTerm2, a terminal emulator capable of sending keyboard input to multiple sessions. If your terminal doesn’t have this capability you have to apply all of those commands on all of your machines.

The only prerequisite for this guide is to have Java 8 installed on the machines. Check out official [installation guide] (https://www.java.com/en/download/help/linux_install.xml) if you don’t have Java 8 installed.

  1. Download the latest version of Hazelcast Jet, which is 0.4, when this blog post was written:

$ wget http://download.hazelcast.com/jet/hazelcast-jet-0.4.zip

  1. Unzip the archive:
unzip hazelcast-jet-0.4.zip
Archive:  hazelcast-jet-0.4.zip
   creating: hazelcast-jet-0.4/
   creating: hazelcast-jet-0.4/bin/
  inflating: hazelcast-jet-0.4/bin/cluster.sh  
  inflating: hazelcast-jet-0.4/bin/stop-jet.sh  
  inflating: hazelcast-jet-0.4/bin/common.sh  
  inflating: hazelcast-jet-0.4/bin/start-jet.bat  
…
  1. Change the working directory to hazelcast-jet-0.4:
cd hazelcast-jet-0.4

Configuring Discovery

If your network permits multicast communication, all you need to do to form a Hazelcast Jet cluster is start the servers, they will find each other automatically. Most of the cloud infrastructure providers do not allow multicast traffic in their network so we will fallback to use TCP/IP discovery in this post.

To use TCP/IP you need to know the IP addresses or hostnames of the members in advance. Hazelcast provides other options for discovering members via integrating with service discovery brokers or cloud providers. These options don’t require member addresses to be known in advance. We will talk about discovery more in the next blog post.

We will enable TCP/IP discovery by adding the IP addresses of the members to the hazelcast.xml file which can be found in the config directory.

  1. Add the following snippet between the tags of the hazelcast.xml file, which resides in the config directory:
<tcp-ip enabled="true">
    <member>172.30.2.253</member>
    <member>172.30.2.158</member>
    <member>172.30.2.132</member>
</tcp-ip>
  1. Set the enabled property of multicast tag to false:
<multicast enabled="false">
    <multicast-group>224.2.2.3</multicast-group>
    <multicast-port>54327</multicast-port>
</multicast>
  1. The hazelcast.xml files for each member should look like the following:
<hazelcast xsi_schemaLocation="https://hazelcast.com/schema/config hazelcast-config-3.8.xsd"
           
           xmlns_xsi="http://www.w3.org/2001/XMLSchema-instance">
    <group>
        <name>jet</name>
        <password>jet-pass</password>
    </group>
    <network>
        <port auto-increment="true" port-count="100">5701</port>
        <join>
                <multicast enabled="false">
                   <multicast-group>224.2.2.3</multicast-group>
                   <multicast-port>54328</multicast-port>
        </multicast>
        <tcp-ip enabled="true">
            <member>172.30.2.132</member>
            <member>172.30.2.253</member>
            <member>172.30.2.158</member>
        </tcp-ip>
        </join>
    </network>
</hazelcast>

7) Run the start-jet.sh starter script in the ./bin/ directory to start the Hazelcast Jet cluster:

./bin/start-jet.sh

As you can see from the logs, as shown below, they formed a 3-node Hazelcast Jet Cluster.

  • server1 – 172.30.2.132:
...
Previous logs are omitted for brevity
...
Sep 21, 2017 9:39:21 AM com.hazelcast.core.LifecycleService
INFO: [172.30.2.132]:5701 [jet] [0.4] [3.8.2] [172.30.2.132]:5701 is STARTED
Sep 21, 2017 9:39:26 AM com.hazelcast.internal.cluster.ClusterService
INFO: [172.30.2.132]:5701 [jet] [0.4] [3.8.2]

Members [3] {
    Member [172.30.2.132]:5701 - f2ed6338-c2e7-424b-b800-57e3f3c8c5af this
    Member [172.30.2.253]:5701 - 3c122b9e-ada5-4a2b-aa8e-d6af99a55d54
    Member [172.30.2.158]:5701 - c1ae010f-924d-46ec-8583-45397e3f9730
}   
  • server2 – 172.30.2.253:
...
Previous logs are omitted for brevity
...
Sep 21, 2017 9:39:26 AM com.hazelcast.internal.cluster.ClusterService
INFO: [172.30.2.253]:5701 [jet] [0.4] [3.8.2]

Members [3] {
    Member [172.30.2.132]:5701 - f2ed6338-c2e7-424b-b800-57e3f3c8c5af
    Member [172.30.2.253]:5701 - 3c122b9e-ada5-4a2b-aa8e-d6af99a55d54 this
    Member [172.30.2.158]:5701 - c1ae010f-924d-46ec-8583-45397e3f9730
}

Sep 21, 2017 9:39:28 AM com.hazelcast.core.LifecycleService
INFO: [172.30.2.253]:5701 [jet] [0.4] [3.8.2] [172.30.2.253]:5701 is STARTED
  • server3 – 172.30.2.158:
Sep 21, 2017 9:39:26 AM com.hazelcast.internal.cluster.ClusterService
INFO: [172.30.2.158]:5701 [jet] [0.4] [3.8.2]

Members [3] {
    Member [172.30.2.132]:5701 - f2ed6338-c2e7-424b-b800-57e3f3c8c5af
    Member [172.30.2.253]:5701 - 3c122b9e-ada5-4a2b-aa8e-d6af99a55d54
    Member [172.30.2.158]:5701 - c1ae010f-924d-46ec-8583-45397e3f9730 this
}

Sep 21, 2017 9:39:28 AM com.hazelcast.core.LifecycleService
INFO: [172.30.2.158]:5701 [jet] [0.4] [3.8.2] [172.30.2.158]:5701 is STARTED

Now we have a 3-node Hazelcast Jet cluster running. You can either

connect to the cluster using the Client API inside your application, or create an executable JAR file which contains a Hazelcast Jet job, and then submit this JAR file to the cluster via helper scripts.

In this post we are going to do the latter. The following steps will be taken on server4, which is not running anything yet.

  1. Download the job JAR file, via the web, that was prepared for this blog post:
wget https://s3-us-west-2.amazonaws.com/jet-word-count/jet-word-count-0.1-SNAPSHOT.jar
  1. Put the Hazelcast Jet cluster addresses to the hazelcast-client.xml in the config folder. It should look like the following:
<hazelcast-client xsi_schemaLocation="https://hazelcast.com/schema/client-config hazelcast-client-config-3.8.xsd"
                  
                  xmlns_xsi="http://www.w3.org/2001/XMLSchema-instance">
    <group>
        <name>jet</name>
        <password>jet-pass</password>
    </group>
    <network>
        <cluster-members>
            <address>172.30.2.253</address>
            <address>172.30.2.158 </address>
            <address>172.30.2.132</address>
        </cluster-members>
    </network>
</hazelcast-client>
  1. Submit the job using the submit-jet.sh. When the job is submitted with the submit-job.sh, it sends the specified jet-word-count-0.1-SNAPSHOT.jar to the Jet cluster and all the classes inside the JAR file will be available when running the job. There is no need to add the custom classes to the classpath of the Jet cluster; they will be submitted along with the job.
./bin/submit-jet.sh jet-word-count-0.1-SNAPSHOT.jar

As it can be seen from the output below, the job has been submitted and executed on the cluster. The results are shown on the last line of the output.

[ec2-user@ip-172-30-2-45 hazelcast-jet-0.4]$ ./bin/submit-jet.sh jet-word-count-0.1-SNAPSHOT.jar
########################################
# JAVA=/home/ec2-user/jdk1.8.0_144//bin/java
# JAVA_OPTS= -Dhazelcast.config=/home/ec2-user/hazelcast-jet-0.4/config/hazelcast.xml -Dhazelcast.client.config=/home/ec2-user/hazelcast-jet-0.4/config/hazelcast-client.xml -Dhazelcast.jet.config=/home/ec2-user/hazelcast-jet-0.4/config/hazelcast-jet.xml
########################################
Sep 21, 2017 9:52:36 AM com.hazelcast.jet.impl.config.XmlJetConfigLocator
INFO: Loading configuration /home/ec2-user/hazelcast-jet-0.4/config/hazelcast-client.xml from property hazelcast.client.config
Sep 21, 2017 9:52:37 AM com.hazelcast.jet.impl.config.XmlJetConfigLocator
INFO: Using configuration file at /home/ec2-user/hazelcast-jet-0.4/config/hazelcast-client.xml
Sep 21, 2017 9:52:37 AM com.hazelcast.core.LifecycleService
INFO: hz.client_0 [jet] [0.4] [3.8.2] HazelcastClient 3.8.2 (20170518 - a60f944) is STARTING
Sep 21, 2017 9:52:37 AM com.hazelcast.core.LifecycleService
INFO: hz.client_0 [jet] [0.4] [3.8.2] HazelcastClient 3.8.2 (20170518 - a60f944) is STARTED
Sep 21, 2017 9:52:37 AM com.hazelcast.client.spi.impl.ClusterListenerSupport
INFO: hz.client_0 [jet] [0.4] [3.8.2] Trying to connect to [172.30.2.253]:5701 as owner member
Sep 21, 2017 9:52:37 AM com.hazelcast.client.connection.ClientConnectionManager
INFO: hz.client_0 [jet] [0.4] [3.8.2] Setting ClientConnection{alive=true, connectionId=1, socketChannel=DefaultSocketChannelWrapper{socketChannel=java.nio.channels.SocketChannel[connected local=/172.30.2.45:40434 remote=/172.30.2.253:5701]}, remoteEndpoint=[172.30.2.253]:5701, lastReadTime=2017-09-21 09:52:37.798, lastWriteTime=2017-09-21 09:52:37.766, closedTime=never, lastHeartbeatRequested=never, lastHeartbeatReceived=never, connected server version=3.8.2} as owner  with principal ClientPrincipal{uuid='1bb23ea3-37f7-4ad7-9617-b8694b9091ff', ownerUuid='3c122b9e-ada5-4a2b-aa8e-d6af99a55d54'}
Sep 21, 2017 9:52:37 AM com.hazelcast.client.connection.ClientConnectionManager
INFO: hz.client_0 [jet] [0.4] [3.8.2] Authenticated with server [172.30.2.253]:5701, server version:3.8.2 Local address: /172.30.2.45:40434
Sep 21, 2017 9:52:37 AM com.hazelcast.client.spi.impl.ClientMembershipListener
INFO: hz.client_0 [jet] [0.4] [3.8.2]

Members [3] {
    Member [172.30.2.132]:5701 - f2ed6338-c2e7-424b-b800-57e3f3c8c5af
    Member [172.30.2.253]:5701 - 3c122b9e-ada5-4a2b-aa8e-d6af99a55d54
    Member [172.30.2.158]:5701 - c1ae010f-924d-46ec-8583-45397e3f9730
}

Sep 21, 2017 9:52:37 AM com.hazelcast.core.LifecycleService
INFO: hz.client_0 [jet] [0.4] [3.8.2] HazelcastClient 3.8.2 (20170518 - a60f944) is CLIENT_CONNECTED
Sep 21, 2017 9:52:37 AM com.hazelcast.client.connection.ClientConnectionManager
INFO: hz.client_0 [jet] [0.4] [3.8.2] Authenticated with server [172.30.2.158]:5701, server version:3.8.2 Local address: /172.30.2.45:36006
Sep 21, 2017 9:52:38 AM com.hazelcast.client.connection.ClientConnectionManager
INFO: hz.client_0 [jet] [0.4] [3.8.2] Authenticated with server [172.30.2.132]:5701, server version:3.8.2 Local address: /172.30.2.45:35884

[in=2, on=1, was=11, far=1, season=2, winter=1, wisdom=1, comparison=1, age=2,
had=2, epoch=2, good=1, it=10, all=2, its=2, degree=1, spring=1, before=2, were=2,
hope=1, received=1, superlative=1, we=4, short=1, or=1, to=1, nothing=1, period=2, 
being=1, way=1, darkness=1, of=12, incredulity=1, direct=2, like=1, only=1, 
the=14, worst=1, evil=1, everything=1, for=2, light=1, belief=1, present=1, so=1, 
noisiest=1, despair=1, heaven=1, going=2, some=1, that=1, foolishness=1, best=1, 
insisted=1, times=2, authorities=1, us=2, other=1]

We have successfully submitted a simple job and retrieved the results.

  1. The cluster is no longer needed and can be shut down via stop-jet.sh:
./bin/stop-jet.sh

Stay tuned for the next post where we will discuss how to deploy Hazelcast Jet on Amazon EC2 with Docker.