Blog ›Hazelcast Python Client 4.2 is Released!

By Metin Dumandag

Software Engineer

Metin is a Software Engineer working on the Clients Team. Since joining Hazelcast as an intern, he has been working on various clients such as Python, Node.js, and Java.

View all blogs by the author

Jun 24, 2021

Back to Blog

Hazelcast Python Client 4.2 is Released!

We have released the Python client 4.2, featuring the brand new SQL support! You can start using it to query your data efficiently with the widely-known standard SQL syntax.

You can install the latest version through PyPI by issuing the following command.

pip install -U hazelcast-python-client

SQL Support

For a long time, Hazelcast IMDG and the Hazelcast Python client have supported querying the data you store in distributed maps using the Predicates with SQL-like syntax. Yet, the existing query engine had its limitations and there were strong voices from the community for additional features and support for standard SQL syntax.

Listening to these, we introduced a preview version in IMDG 4.1 of the brand-new, state-of-the-art SQL engine. Although there is a long way to go, we have improved it a lot in IMDG 4.2 and started working on the client languages to bring that feature to widespread use.

The new SQL engine:

supports many new expressions.
supports sorting and limiting.
is able to query large amounts of data without putting too much strain on the client and server by fetching the results page by page.
has an option to make use of the new, high-performance, concurrent off-heap B+ tree+
has more advanced query optimizations.

The 4.2 version of the Python client brings these new functionalities to the Python world!

Note: The SQL feature for the Python client and IMDG is in beta state and we only guarantee compatibility with the same minor versions of the client and server.

Usage

For the rest of the post, we will assume a map called employees is filled with values of the following Portable class.

class Employee(Portable):
    FACTORY_ID = 1
    CLASS_ID = 1

    def __init__(self, employee_id=None, first_name=None, last_name=None, age=None, salary=None):
        self.employee_id = employee_id
        self.first_name = first_name
        self.last_name = last_name
        self.age = age
        self.salary = salary

    def write_portable(self, writer):
        writer.write_int("employee_id", self.employee_id)
        writer.write_string("first_name", self.first_name)
        writer.write_string("last_name", self.last_name)
        writer.write_int("age", self.age)
        writer.write_int("salary", self.salary)

    def read_portable(self, reader):
        self.employee_id = reader.read_int("employee_id")
        self.first_name = reader.read_string("first_name")
        self.last_name = reader.read_string("last_name")
        self.age = reader.read_int("age")
        self.salary = reader.read_int("salary")

    def get_factory_id(self):
        return Employee.FACTORY_ID

    def get_class_id(self):
        return Employee.CLASS_ID


client = hazelcast.HazelcastClient(portable_factories={
    Employee.FACTORY_ID: {
        Employee.CLASS_ID: Employee,
    },
})

employees = client.get_map("employees").blocking()

# Fill the employees map

employees.set(1, Employee(1, "John", "Doe", 42, 60000))
...

In Python client 4.2, there is a new service called SQL, that is capable of executing queries on distributed maps.

You can simply use the execute() method to run queries, by just passing the query string and parameters, if any. Note that, the parameters are specified with the question marks in query strings.

result = client.sql.execute("SELECT employee_id, first_name FROM employees WHERE age < ?", 50)

Or, you can construct a SqlStatement to have a greater amount of control over the query execution by configuring cursor buffer size (page size), timeout, expected result type, and parameters. A SqlStatement can be passed into the execute_statement() method.

statement = SqlStatement("SELECT * FROM employees")
statement.cursor_buffer_size = 200

result = client.sql.execute_statement(statement)

Both of these methods return an SqlResult, which implements the iterator interface for ease of use. You can iterate over the rows of the result and get the column values of rows either by the column name or index.

result = client.sql.execute("SELECT employee_id, first_name FROM employees WHERE age < ?", 50)

for row in result:
    # Get the column value with the column name
    employee_id = row.get_object("employee_id")

    # Find the column index and get the column value with that index.
    # You can also know the column indexes beforehand, as they have the
    # same order as the order given in the query string.
    row_metadata = row.metadata
    first_name_index = row_metadata.find_column("first_name")
    first_name = row.get_object_with_index(first_name_index)

    print(employee_id, first_name)

SqlResult also has the following methods:

.is_row_set() that returns whether or not the result has rows to iterate.
.update_count() that returns the number of rows affected by the query or -1, if the result is a row-set.
.get_row_metadata() that returns the metadata about the rows where you can get the column metadata, count, and indexes.
.close() that closes the query execution and makes sure that the resources associated with the query on the server-side are released.
iterator() to iterate over the result in a non-blocking manner. Most of the time, the iteration method shown above should be used, instead of this. See this code sample to understand how to use it.

All of these methods return Future objects, and you must call .result() on them to wait for their results or add callbacks that will run when their result is set.

Sorting

You can use the standard SQL clauses ORDER BY, LIMIT, and OFFSET to sort and limit the result set.

Note that, you must have SORTED indexes on the fields that you ORDER BY. You can add indexes dynamically with the client as below, ideally before putting data into the map.

employees.add_index(["salary"], "SORTED")

The following code snippet gets the top five employees with the highest salaries. The code snippet also shows the usage of the with statement (context manager) to automatically call .close() on the result to make sure that the server-side resources are cleaned properly.

with client.sql.execute("""
SELECT
    employee_id, first_name, last_name, salary
FROM
    employees
ORDER BY salary DESC
LIMIT 5
""") as result:
    for row in result:
        print(row)

Unsupported Features

As of IMDG 4.2, we only support SELECT clauses but plan to add support for the following features in upcoming releases:

INSERT / UPDATE / DELETE
JOIN
GROUP BY / HAVING
Set operators (UNION, INTERSECT, MINUS)
Subqueries (SELECT … FROM table WHERE x = (SELECT …))

See the following links for more information about the SQL feature.

Other Enhancements and Bug Fixes

As usual, the 4.2 Python client release includes some enhancements and bug fixes, apart from the SQL implementation. Refer to the release notes for more information.

Closing Words

We are very excited about the SQL feature and think that it is a great addition to the capabilities of the IMDG and the Python client. We will continue working hard to improve it. In the meantime, don’t hesitate to try it and share your feedback with us.

We will be waiting for your feedback on the Slack channel, Google Group, or Github repository. If you want to contribute to the Python client, please take a look at our issue list or talk with us through the communication channels listed above.

Stay safe, be well, and see you soon!

Keep Reading

Blog

Decisions at the Speed of Memory: Hazelcast on IBM® LinuxONE 5

Network hops cost milliseconds; milliseconds cost money. Put data, compute, and AI on one platform, and both bills shrink, whether…

Blog

Understanding the Value of Distributed Compute

Introduction Hazelcast is a powerful platform. It delivers the power of a highly reliable, distributed cache. Equally important is the…

Blog

Resilience That Holds Under Load: Hazelcast Platform 5.7

A major release for institutions where the operational state must remain correct during degradation, not just be restored afterward. The…

Blog

Testing distributed resilient applications powered by Hazelcast

Applications powered by Hazelcast and that use it to drive business logic need tests that go beyond happy-path validation. Serialization,…

Datasheet

/ PDF

/ 2 pages

Resilient, Continuous, Active Data – without Compromise Datasheet

The unified in-memory and stream processing platform for resilient, continuous active data at sub-millisecond speed.

Webinar

/ Video

/ 45 min

Zero Downtime, Real Pain: Schema Evolution in Cached, Live Systems

Zero-downtime upgrades aren’t the hard part—schema evolution is. Learn how mixed service versions interact with shared cached data, why subtle inconsistencies cause failures, and how to design forward-compatible changes using Hazelcast and real Java examples.

Platform

Cloud Deployment Options

Key Solutions

By Industry

By Use Case

By Architecture

A cloud-agnostic architecture for your applications

Resource Center

Content Types

Learn

33% Reduction in Operational Costs

Developers

Community

Learn

Toolbox

A cloud-agnostic architecture for your applications

By Metin Dumandag

Hazelcast Python Client 4.2 is Released!

SQL Support

Usage

Sorting

Unsupported Features

Other Enhancements and Bug Fixes

Closing Words

Keep Reading

Decisions at the Speed of Memory: Hazelcast on IBM® LinuxONE 5

Understanding the Value of Distributed Compute

Resilience That Holds Under Load: Hazelcast Platform 5.7

Testing distributed resilient applications powered by Hazelcast

Resilient, Continuous, Active Data – without Compromise Datasheet

Zero Downtime, Real Pain: Schema Evolution in Cached, Live Systems

Why Hazelcast

About Us

Platform

Solutions

Developers

Learn

Connect

Platform

Cloud Deployment Options

Key Solutions

By Industry

By Use Case

By Architecture

A cloud-agnostic architecture for your applications

Resource Center

Content Types

Learn

33% Reduction in Operational Costs

Developers

Community

Learn

Toolbox

A cloud-agnostic architecture for your applications

By Metin Dumandag

Spread the Word

Hazelcast Python Client 4.2 is Released!

SQL Support

Usage

Sorting

Unsupported Features

Other Enhancements and Bug Fixes

Closing Words

Keep Reading

Decisions at the Speed of Memory: Hazelcast on IBM® LinuxONE 5

Understanding the Value of Distributed Compute

Resilience That Holds Under Load: Hazelcast Platform 5.7

Testing distributed resilient applications powered by Hazelcast

Resilient, Continuous, Active Data – without Compromise Datasheet

Zero Downtime, Real Pain: Schema Evolution in Cached, Live Systems

Why Hazelcast

About Us

Platform

Solutions

Developers

Learn

Connect