Introducing CLC: The New Hazelcast Command-Line Experience

We’ve been hard at work and are excited for this moment: CLC v5.2.0 is released!

What is Hazelcast CLC?

Hazelcast CLC is a command line tool that empowers developers to do various Hazelcast-related tasks, such as running SQL queries or sending commands to a Hazelcast cluster. Since CLC is a single binary without dependencies, it is small, fast and easy to install.

Before carrying on, let’s address one thing: what does “C-L-C” actually stand for? With so many acronyms floating around, it’s easy to get terms mixed up. Here, CLC stands for “Command Line Client.” CLC does act as a client to a Hazelcast cluster in many use cases, but it will gain features in the future that go beyond the scope of a client.

CLC provides three modes of operation, explained in the Operating Modes section: command-line (or non-interactive) mode, shell (or interactive) mode, and batch mode.

CLC can connect to Hazelcast Viridian clusters, as well as Hazelcast IMDG 4.x and Hazelcast Platform clusters (5.x and up) running locally or on a remote server. To run SQL queries, the cluster must be version 5.0 or later. JSON support in SQL was introduced in Hazelcast Platform 5.1.

CLC was developed using the Go programming language, which allowed us to make CLC a single self-contained native binary without dependencies. That has several advantages:

  • You do not need a JRE or the Hazelcast distribution to run CLC. Downloading the appropriate binary is sufficient.
  • Being a native binary, CLC starts immediately. That makes it very suitable to use within shell scripts.
  • Since it is a static binary, it doesn’t use or require dynamic libraries. That makes it very convenient to use in Docker scratch images.
  • As it runs in a terminal, CLC can be executed within your favourite IDE.

CLC communicates with a Hazelcast cluster in client/server mode, so on a licensed cluster that supports ACL permissions, its access can be restricted on the server side.

Having talked about the benefits, here are the current limitations of CLC:

  • CLC cannot connect to all Hazelcast Platform clusters. For instance, it does not have Kerberos support at the moment.
  • It currently does not support distributed data structures besides the Map.
  • CLC currently cannot decode all values, particularly ones encoded using language-specific serializers, such as Java serialization. It does, however, always show which keys/values it cannot decode, along with their types.
  • The types that CLC can write are limited at the moment. They are as follows for both Map keys and values: boolean, 32 and 64-bit floats, 8, 16, 32 and 64-bit integers, strings and JSON.

In most cases, these are not fundamental limitations, and we will introduce some of the missing features gradually.

Use Cases

Let us check out two use cases where CLC is the perfect match. We will have more articles in the future about more complex use cases.

Running SQL Queries

Running SQL is one of the most important use cases for CLC, so it is not a surprise that there’s good support for that.

When you run CLC in the shell mode, you can directly run SQL queries or use shortcut commands such as dm, which help with data exploration.

The SQL can be multiline and it is syntax highlighted. You can use the up/down arrow keys to navigate the shell history.

In the shell mode, the results are displayed as tables.

CLC> select __key, name from dessert order by name limit 2;
-----------------------------------------------
    __key | name                            
-----------------------------------------------
        4 | Apple cake                      
        0 | Baklava                         
-----------------------------------------------
OK (19 ms)

Of course you can run SQL queries against streaming data sources as well. The simplest example would be:

CLC> select * from table(generate_stream(1));
---------------------
                v
---------------------
                0
                1
                2
                3
^C---------------------
OK (4370 ms)

Running SQL queries in the command-line mode enables using more output formats, such as JSON. You can feed the output to a specialized tool, like Visidata for easier exploration:

$ clc sql "select * from dessert" -f json -q | visidata -f json

Finally, you can write a bunch of SQL statements to a file and run them in batch. That’s a nice and easy way of creating mappings and populating maps. We discuss that more in the Operating Modes section, but here’s a sample for Linux/MacOS:

$ cat my-script.sql | clc

Diagnosing and Fixing Map Entry Problems

Hazelcast Maps do not have a schema; both the keys and values can be of any supported type. The keys and values can also be serialized by any serializer, such as Compact Serialization or Avro. This is both a blessing and a curse: it is trivial to write to a Map, but to read the data back, you usually need to know the type of the object stored there.

That becomes a challenge when different programs, or different versions of the same program, write keys and values using different types or serializers. For instance, Program A, written in Java, sets the key “person1.age” to a 32-bit integer, but Program B, written in JavaScript, sets the same key to a float64. If Program A then reads the value back, it raises an exception, since the type it receives is not the type it expects.
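
The mismatch can be reproduced in miniature without a cluster. The sketch below is only an illustration: a plain Python dict stands in for the Map, the two writer functions are hypothetical stand-ins for Program A and Program B, and read_int32 is an illustrative strict reader, not a Hazelcast API.

```python
# A plain dict stands in for a schemaless Hazelcast Map (assumption for illustration).
store = {}


def program_a_write(store):
    # Program A stores the age as an integer.
    store["person1.age"] = 32


def program_b_write(store):
    # Program B overwrites the same key with a float (a float64 in JavaScript terms).
    store["person1.age"] = 32.0


def read_int32(store, key):
    # Hypothetical strict reader: accepts only integers in the int32 range.
    value = store[key]
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"{key}: expected int, got {type(value).__name__}")
    if not -2**31 <= value < 2**31:
        raise ValueError(f"{key}: {value} does not fit in 32 bits")
    return value


program_a_write(store)
age_after_a = read_int32(store, "person1.age")  # succeeds: the value is an int

program_b_write(store)
try:
    read_int32(store, "person1.age")
    error_after_b = None
except TypeError as exc:
    # The same read now fails because Program B changed the stored type.
    error_after_b = str(exc)
```

Nothing about the key changed between the two reads; only the type of the stored value did, which is what makes this kind of problem hard to notice from the writing side.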

A similar problem happens when you have a SQL mapping on a Map. If the Map is updated with set/put calls and the key or value doesn’t match the mapping, you’ll get an error similar to the one below when you try to read, update or even delete the data using SQL:

com.hazelcast.jet.JetException: Execution on a member failed: com.hazelcast.jet.JetException: Exception in ProcessorTasklet{09ae-c83a-f582-0001/Project(IMap[public.dessert])#3}: com.hazelcast.sql.impl.QueryException: Failed to extract map entry key because of type mismatch [expectedClass=java.lang.Integer, actualClass=java.lang.String]

It can be challenging to find out which entries in the Map cause the problem, but CLC makes it trivial to spot them with its --show-type flag, which can be used with the map entry-set and map get commands:

CLC> \map entry-set -n dessert --show-type
-----------------------------------------
    __key | __key_type | this | this_type
-----------------------------------------
        2 | INT32      | >    | JSON
        0 | INT32      | >    | JSON
       14 | INT32      | >    | JSON
  Baklava | STRING     | YUM  | STRING
        6 | INT32      | >    | JSON

In the output above, it is immediately obvious that the key and value types of the entry with key Baklava don’t match the others. You may opt to remove that entry altogether:

CLC> \map -n dessert remove -k string Baklava
------
 this
------
YUM

Or replace it if you know the correct value:

CLC> \map -n dessert set -k i32 0 \
    -v json '{"name":"Baklava", "theme":9, "crossteam":0, "jit":1}'
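
Spotting the odd entry by eye works for a handful of rows. For larger Maps, the same check could be scripted. The following is a minimal sketch that parses the --show-type table shown earlier and flags rows whose key/value type pair differs from the most common one. The helper names are made up for this example, and the parser naively splits on the | character, so a value containing | would confuse it.

```python
from collections import Counter

# Output copied from the `\map entry-set -n dessert --show-type` example above.
OUTPUT = """\
-----------------------------------------
    __key | __key_type | this | this_type
-----------------------------------------
        2 | INT32      | >    | JSON
        0 | INT32      | >    | JSON
       14 | INT32      | >    | JSON
  Baklava | STRING     | YUM  | STRING
        6 | INT32      | >    | JSON
"""


def parse_rows(text):
    """Turn the table text into a list of dicts keyed by column name."""
    # Keep non-empty lines that are not pure dash separators.
    lines = [
        ln for ln in text.splitlines()
        if ln.strip() and not set(ln.strip()) <= {"-"}
    ]
    header = [cell.strip() for cell in lines[0].split("|")]
    return [
        dict(zip(header, (cell.strip() for cell in ln.split("|"))))
        for ln in lines[1:]
    ]


def odd_entries(rows):
    """Return rows whose (key type, value type) pair is not the most common pair."""
    pairs = Counter((r["__key_type"], r["this_type"]) for r in rows)
    expected = pairs.most_common(1)[0][0]
    return [r for r in rows if (r["__key_type"], r["this_type"]) != expected]


suspects = odd_entries(parse_rows(OUTPUT))
for row in suspects:
    print(row["__key"], row["__key_type"], row["this_type"])
```

Treating the majority type pair as the expected one is a heuristic; if you know the types your mapping declares, comparing against those directly is more reliable.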

A Tour of CLC

Here’s a short tour of CLC. We will have more articles detailing some of the features mentioned in this section. In the meantime, you can check out our documentation.

Installation

We provide binaries for the popular platforms at our Releases page. Since CLC is a single binary, you can just download the release package for your platform, extract it and optionally move it to somewhere in your PATH.

Currently we provide precompiled binaries of CLC for the following platforms and architectures:

  • Linux/amd64
  • Windows/amd64
  • MacOS/amd64
  • MacOS/arm64

If your platform is not one of the above, you may want to compile CLC yourself. Our build process is very simple and doesn’t have many dependencies. In most cases just running make is sufficient to build CLC if you have the latest Go compiler installed.

Additionally, we provide an installer for Windows 10 and up. The installer can install CLC either system-wide or just for the current user.

Home Directory

CLC keeps all of its files in a well-known location, which we will call $CLC_HOME.
$CLC_HOME includes known configurations, logs and other directories and files.
You can find out $CLC_HOME by running clc home.

Configuration

The configuration contains information about how to connect to a cluster, along with other settings.
The directories in $CLC_HOME/configs are named (or known) configurations. It is enough to refer to them by name using the --config (or -c for short) flag.

You can also pass the absolute or relative path of a configuration to CLC using --config, even if it is not a named configuration.

If a configuration is specified, or a default named configuration exists, CLC tries to load it. Otherwise:

  • In the command-line mode, it displays an error.
  • In the shell mode, it displays a list of configurations you can choose from. If there are no configurations, it shows the config wizard so you can add a Viridian cluster configuration.
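
The rules above amount to a small decision table. The sketch below restates them in Python purely as a reading aid. It is not CLC’s actual implementation, and the function name, parameters and return strings are all invented for this example. Batch mode is left out because its behaviour is not described here.

```python
def resolve_config(explicit, default_exists, mode, known_configs=()):
    """Restate the rules above; return a short description of the outcome."""
    if explicit is not None:
        # A configuration was given with --config, by name or by path.
        return f"load {explicit}"
    if default_exists:
        return "load default"
    if mode == "command-line":
        return "error"
    if mode == "shell":
        # Offer the known configurations, or fall back to the wizard.
        if known_configs:
            return "choose from " + ", ".join(known_configs)
        return "config wizard"
    raise ValueError(f"behaviour not described for mode: {mode}")


print(resolve_config(None, False, "shell"))
```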

You can check out clc config --help for help on the configuration commands.

Getting Help

You can use the help command or the --help flag to display the help in the command-line mode:


$ clc --help
$ clc map --help
$ clc help map

In the shell mode, try the help and \help commands.

Operating Modes

As mentioned before, CLC can operate in one of three modes: command-line, shell or batch.

The command-line mode is suitable for running one-off commands in the terminal, or for using CLC in shell scripts (Bash, PowerShell, etc.). In this mode, CLC exits after completing a single operation.

$ clc map -n dessert size
15
OK

You can use the -q flag to suppress unnecessary output:

$ clc map -n dessert size -q
15
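
Because the quiet output is just the value, it is easy to consume from a program as well as from a shell script. Here is a minimal Python sketch that wraps the exact command shown above. It assumes clc is on your PATH and that, with -q, standard output contains only the number. The map_size name and the injectable run parameter are choices made for this example.

```python
import subprocess


def map_size(name, run=subprocess.run):
    """Return the size of Map `name` by invoking `clc map -n NAME size -q`."""
    # The argument list mirrors the command shown above; -q suppresses extra output.
    argv = ["clc", "map", "-n", name, "size", "-q"]
    # check=True raises CalledProcessError if clc exits with a non-zero status.
    result = run(argv, capture_output=True, text=True, check=True)
    # Assumption: stdout holds only the size, e.g. "15\n".
    return int(result.stdout.strip())
```

Calling map_size("dessert") against the cluster used in this article would be expected to return the same number the command prints. Passing a different run callable makes the wrapper testable without a cluster.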

Auto-completion files for CLC commands can be generated for many shells. To see your options, check out the output of the clc completion --help command.

In the shell mode, CLC starts a command shell, similar to Python, Bash or PowerShell. It displays the CLC> prompt and waits for you to input a SQL statement or a command.

You can enter SQL statements directly, but you must end each one with a semicolon so CLC knows where the statement ends:

CLC> select * from dessert limit 1;
-----------------------------------------------------------------
    __key | name        |      theme |  crossteam |        jit
-----------------------------------------------------------------
        2 | Carrot cake |          9 |          9 |          1
-----------------------------------------------------------------
OK (13 ms)

CLC commands can be entered by prefixing them with a backslash (\) character. These commands must fit on a single line:

CLC> \version --verbose
-------------------------------------------------------------------
Name                   | Version                                 
-------------------------------------------------------------------
Hazelcast CLC          | v5.2.0                                  
Latest Git Commit Hash | XXXXX
Hazelcast Go Client    | 1.4.0                                   
Go                     | go1.20.2 linux/amd64                    
------------------------------------------------------------------

You can cancel long running operations by pressing Ctrl+C, and exit the shell by pressing Ctrl+D or typing exit.

CLC connects to the cluster once, when first needed, and stays connected until you end the session by exiting the shell. Because of that, commands in interactive mode that require a cluster connection take less time to complete.

In some cases, it is useful to run a batch of SQL statements and commands from a file, for example to create mappings and pre-defined data in a cluster.

In order to run a batch file with CLC, just save the SQL statements and commands in a file and pipe it to CLC.

On Linux/MacOS:

$ cat examples/sql/dessert.sql | clc

On Windows:

$ type examples\sql\dessert.sql | clc

Just like with the shell mode, the batch file can consist of SQL statements ending with semicolons and commands prefixed with a backslash. You can use SQL comments (--) as well. Here are a few lines from dessert.sql, which can be found in the CLC repository:


-- (c) 2023, Hazelcast, Inc. All Rights Reserved.

CREATE OR REPLACE MAPPING dessert(
    __key int,
    name varchar,
    theme int,
    crossteam int,
    jit int
) TYPE IMAP OPTIONS (
    'keyFormat' = 'int',
    'valueFormat' = 'json-flat'
);
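
Batch files like the one above can also be generated by a program when there are many rows to load. The sketch below builds the text of such a file for the dessert mapping. It assumes Hazelcast SQL’s SINK INTO statement for writing Map entries, so check the SQL reference for your cluster version before relying on it. The helper names are invented for this example, and the two rows reuse values that appear elsewhere in this article.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal (strings quoted, quotes doubled)."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    # Only ints and strings are used in this sketch.
    return str(value)


def build_batch(rows):
    """Return batch-file text: a comment line followed by one statement per row."""
    lines = ["-- generated dessert rows"]
    for row in rows:
        values = ", ".join(sql_literal(v) for v in row)
        # Each statement ends with a semicolon, as batch mode requires.
        lines.append(
            f"SINK INTO dessert(__key, name, theme, crossteam, jit) VALUES ({values});"
        )
    return "\n".join(lines) + "\n"


script = build_batch([(0, "Baklava", 9, 0, 1), (2, "Carrot cake", 9, 9, 1)])
print(script, end="")
```

Saving the generated text to a file and piping it to CLC, as shown above, would then run the statements in order.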

Output Formats

When it comes to output, one size doesn’t fit all, so CLC supports a few formats. You can specify the output format using the --format flag (or -f for short) for all commands.

When using the delimited format, the fields in the output are separated by tab characters. This is useful when the output is fed to another command that expects simple text. The output is trimmed if it doesn’t fit on a single line. This is the default in command-line mode.

$ clc sql "select name, jit from dessert order by name limit 2" -f delimited
Apple cake  0
Baklava 1

With the json format, each row of output is converted to a JSON document. This format is preferred if the output is fed to another command which can read JSON input.

$ clc sql "select name, jit from dessert order by name limit 2" -f json
{"jit":0,"name":"Apple cake"}
{"jit":1,"name":"Baklava"}
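
Since each line is a complete JSON document, a consumer can parse the output line by line. Here is a minimal Python sketch that uses the two rows above as sample input. In practice the text would come from CLC’s standard output, and the parse_json_lines name is just a choice for this example.

```python
import json

# Sample output copied from the `-f json` example above: one JSON document per line.
OUTPUT = '{"jit":0,"name":"Apple cake"}\n{"jit":1,"name":"Baklava"}\n'


def parse_json_lines(text):
    """Parse one JSON document per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


rows = parse_json_lines(OUTPUT)
# Values keep their JSON types, so jit can be compared as a number.
jit_names = [row["name"] for row in rows if row["jit"] == 1]
print(jit_names)
```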

CSV is a near-universal data exchange format, so many tools can read it, including Microsoft Excel and LibreOffice. Setting the format to csv makes the output a CSV table.

$ clc sql "select name, jit from dessert order by name limit 2" -f csv
name,jit
Apple cake,0
Baklava,1
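
The CSV output can be read with any standard CSV library. The sketch below uses Python’s csv module on the sample output above; the parse_csv name is illustrative. Note that, unlike the JSON format, every CSV field arrives as a string and has to be converted explicitly.

```python
import csv
import io

# Sample output copied from the `-f csv` example above; the first line is the header.
OUTPUT = "name,jit\nApple cake,0\nBaklava,1\n"


def parse_csv(text):
    """Return one dict per data row, keyed by the header names."""
    return list(csv.DictReader(io.StringIO(text)))


rows = parse_csv(OUTPUT)
# Fields are strings, so convert before doing arithmetic.
total_jit = sum(int(row["jit"]) for row in rows)
print(total_jit)
```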

The table format is the best choice for viewing output directly. It is the default in the shell mode.

$ clc sql -c local "select name, jit from dessert order by name limit 2" -f table
----------------------------------------
name                      |        jit
----------------------------------------
Apple cake                |          0
Baklava                   |          1
----------------------------------------

We’ll end the tour here, but there’s more to CLC which we are going to explore in future articles.

What’s Next?

We have just started and we plan lots of new features and functionality for CLC. One of the most important new features is Jet job submission.

Currently, you can create a Jet streaming pipeline by either executing some SQL (which CLC handles very well) or coding a pipeline with Java and submitting it with the hz-cli tool. The latter way of creating a pipeline is currently not possible with CLC, but I have good news: We are working on it and the new feature will be released with CLC v5.3.0.

Just to be clear, the pipeline still has to be written in Java, but CLC will be able to submit it as a Jet job. This feature will work with Viridian clusters and with Hazelcast Platform clusters v5.3.x and above.

Conclusions

Hazelcast CLC is a command line tool primarily for developers. It starts fast, runs fast and plays well with other command-line tools.

We are always looking for more feedback! Consider joining the #clc channel at Hazelcast Community Slack. You can get an invite at: https://slack.hazelcast.com and participate in our survey.

We recently had a chat about CLC; here’s the recording of the chat.

Some useful links about CLC: